From DOE’s ASCR Discovery: “Bargain proteins”

May 2022

A Flatiron Institute biologist uses supercomputers and their quantum cousins to streamline the search for promising drugs.

An X-ray crystal structure of a self-assembling peptide helical bundle. The structure was designed on a quantum annealer and validated with a classical simulation on large-scale high-performance computing resources. Lab researchers later synthesized and characterized it experimentally. The two mirror-image subunits (one made from L-amino acids in cyan, the other made from D-amino acids in orange) were designed to pack together. Image courtesy of Vikram Mulligan, Flatiron Institute.

Devising a drug to treat a disease isn’t easy. It means screening hundreds of thousands of compounds, testing a small pool of promising candidates first in tissue culture and later in animals, and then, after many intermediate steps, finding perhaps one molecule worthy of human testing.

Using powerful computers to design novel proteins with the best properties can save much of this time, effort and expense. Vikram Mulligan, a research scientist at the Flatiron Institute in New York, aims to make the process even quicker and cheaper.

“A protein’s function is uniquely determined by the way it folds, which in turn is determined by its sequence of amino-acid building blocks,” Mulligan says. “Untangling the sequence-fold-function relationship is challenging due to the vastness of both the possible sequence space and the possible conformational space” – the huge number of configurations a protein could fold into.

Unfortunately, most researchers lack the computational resources to model protein folding. Mulligan hopes to address that limit. With allocations of supercomputer time from the Department of Energy (DOE), he and collaborators are developing machine-learning methods and quantum computing technology, which relies on the strange physics that dominates at the tiniest scales, to improve protein-folding models and make them accessible to the average scientist.

Mulligan’s quest to understand the sequence-fold-function relationship began when he was a doctoral student at the University of Toronto, where he used lab experiments to investigate the kinetics of protein folding and misfolding in late-onset neurodegenerative diseases. As a postdoctoral researcher, he sought additional computational skills and joined David Baker’s University of Washington lab, one of the preeminent hubs for computationally designed proteins.

Starting in the early 2000s, Baker developed Rosetta, an open-source software suite that hundreds of labs worldwide now use to predict and design protein structures.

Until Mulligan joined Baker’s lab, Rosetta usually was used to model proteins made from the 20 naturally occurring amino acids. Mulligan, however, saw potential in designing peptides – amino acid chains – made from nonnatural amino acids that differ from the natural ones in various ways, such as an extra chlorine atom here or a fully reconfigured side chain there. “This flexibility allows us to make structures of mixed handedness, where we have helices that spiral in opposite directions packing against one another and things like that,” Mulligan says. “Ultimately, it allows us to access much more diverse structures, which in turn means we can access more diverse functions.”

Increasing structural diversity, however, also means increasing the challenge of computationally exploring the space of possible protein sequences. Mulligan embraced the challenge. “The ultimate test of whether we really understand how proteins fold is when we try to make something new out of building blocks that nature doesn’t use.”

The first proof-of-principle compound derived from nonnatural amino acids that Mulligan computationally designed at Baker’s lab was a peptide that binds to and inhibits New Delhi metallo-beta-lactamase 1, an enzyme implicated in antibiotic-resistant bacteria. “If we can make something that inhibits this antibiotic-resistance factor then we can treat antibiotic-resistant infection using a combination of the inhibitor and existing antibiotics,” Mulligan says. “This would make all our old antibiotics useful again.”

Since joining Flatiron in 2018, Mulligan has worked on methods that use ever fewer computational resources to design proteins with ever greater precision. Now he’s pursuing a pair of projects with a million node-hours on Theta, a Cray XC40 supercomputer at the Argonne Leadership Computing Facility, via a DOE INCITE (Innovative and Novel Computational Impact on Theory and Experiment) allocation.

“On Theta, we can do a calculation in a day that might take us a week or two on smaller clusters,” Mulligan says. “That allows us to iterate very fast.” Theta’s mix of GPU- and CPU-based nodes is well suited to the research, he adds. “There are a number of computational problems related to protein folding that don’t map all that well to GPUs, so it is valuable to also have access to a lot of CPUs.”

For his first INCITE project, Mulligan will develop machine-learning methods with low computational cost that can approximate the output of demanding validation simulations. “As we develop new methods, we need to validate the method against a reliable, established method that might be more computationally expensive,” Mulligan says. “So, we will do a one-time run of a ton of calculations on Theta to produce the data on which we will train the machine-learning method.” Once the method is trained, researchers can use it to perform design and validation tasks on smaller computing systems. “It’s a one-time expensive use of computation to enable a lot of cheap computations down the road.”
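The "train once, reuse cheaply" idea can be illustrated with a small sketch. It does not reproduce Mulligan's method, which the article does not describe. The function `expensive_score` is a hypothetical stand-in for a costly validation simulation, and a simple nearest-neighbor average stands in for the machine-learning model.

```python
import math

def expensive_score(x):
    """Hypothetical stand-in for a costly validation simulation."""
    return math.sin(3.0 * x) + 0.5 * x * x

def build_training_set(n_points):
    """One-time expensive step: run the costly function on a grid in [0, 1]."""
    xs = [i / (n_points - 1) for i in range(n_points)]
    return [(x, expensive_score(x)) for x in xs]

def surrogate_score(training_set, x, k=2):
    """Cheap approximation: average the k nearest precomputed results."""
    nearest = sorted(training_set, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Pay the large cost once...
training_set = build_training_set(201)

# ...then answer new queries without calling the costly function again.
# (It is called here only to measure how far off the surrogate is.)
queries = [0.1234, 0.5, 0.8765]
max_error = max(
    abs(surrogate_score(training_set, q) - expensive_score(q)) for q in queries
)
print(f"largest surrogate error on the queries: {max_error:.5f}")
```

In practice the surrogate would be a trained model evaluated on many input features, but the trade-off is the same: a large up-front cost buys many inexpensive predictions later.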

The second INCITE project focuses on two areas. First, Mulligan and his colleagues will use simulations of quantum computers running on Theta to design proteins using an energy function from Rosetta that’s based on classical physics. Second, they’ll attempt to improve computations of energies performed on standard computers by incorporating quantum mechanical energy calculations to complement the Rosetta energy function.

Quantum computers could implicitly consider every possible amino acid sequence and let researchers efficiently sample from the best ones. With collaborators Hans Melo at drug-design company Menten AI and Brian Weitzner at protein-engineering firm Outpace Bio, Mulligan has successfully mapped the protein design problem to quantum annealers, special-purpose quantum computers that solve optimization problems, such as finding the most stable folding state for proteins with specific amino acid sequences.
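One common way to pose such a problem for an annealer is as a quadratic unconstrained binary optimization (QUBO). The toy sketch below assumes that encoding; the article does not give the team's actual formulation. The energies are invented for illustration and do not come from Rosetta. Exhaustive enumeration stands in for the annealer, which is feasible only because the example has six binary variables.

```python
from itertools import product

# Hypothetical toy energies (arbitrary units), keyed by (position, residue).
ONE_BODY = {
    (0, "A"): 0.0, (0, "L"): -1.0,
    (1, "A"): -0.5, (1, "L"): 0.0,
    (2, "A"): 0.0, (2, "L"): -0.2,
}
PAIR = {
    ((0, "L"), (1, "L")): 2.0,   # unfavorable clash
    ((1, "A"), (2, "L")): -1.0,  # favorable contact
}
PENALTY = 10.0  # larger than the energy scale, so invalid bitstrings lose

VARIABLES = sorted(ONE_BODY)                      # one bit per (position, residue)
POSITIONS = sorted({pos for pos, _ in VARIABLES})

def build_qubo():
    """Return {(u, v): weight}; the diagonal (u == v) holds linear terms."""
    qubo = {}

    def add(u, v, weight):
        key = (u, v) if VARIABLES.index(u) <= VARIABLES.index(v) else (v, u)
        qubo[key] = qubo.get(key, 0.0) + weight

    for var, energy in ONE_BODY.items():
        add(var, var, energy)
    for (u, v), energy in PAIR.items():
        add(u, v, energy)
    # One residue per position: PENALTY * (sum of bits - 1)^2, constant dropped.
    for pos in POSITIONS:
        group = [v for v in VARIABLES if v[0] == pos]
        for i, u in enumerate(group):
            add(u, u, -PENALTY)
            for v in group[i + 1:]:
                add(u, v, 2.0 * PENALTY)
    return qubo

def qubo_energy(qubo, bits):
    return sum(w * bits[u] * bits[v] for (u, v), w in qubo.items())

def solve_by_enumeration(qubo):
    """Classical stand-in for the annealer: try every bitstring."""
    best_bits, best_energy = None, float("inf")
    for values in product((0, 1), repeat=len(VARIABLES)):
        bits = dict(zip(VARIABLES, values))
        energy = qubo_energy(qubo, bits)
        if energy < best_energy:
            best_bits, best_energy = bits, energy
    return best_bits, best_energy

def decode(bits):
    """Map the switched-on bits back to a residue choice per position."""
    return {pos: res for (pos, res), on in bits.items() if on}

best_bits, best_energy = solve_by_enumeration(build_qubo())
design = decode(best_bits)
print(design)
```

The penalty term lets an unconstrained optimizer respect the rule that each position holds exactly one building block. A bitstring that selects zero or two residues at a position pays a cost larger than any energy it could gain.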

With Menten AI’s Alexey Galda and Gavin Crooks at the Berkeley Institute for Theoretical Science, Mulligan also is beginning to map the problem to general-purpose gate-based quantum computers. The team has validated the first real proteins that were designed on a quantum computer, working with New York University’s Paramjit “Bobby” Arora and his graduate student, Haley Merritt, to synthesize molecules, and with UCLA researchers Michael Sawaya and Todd Yeates to solve structures.

Although its rewards seem far off, quantum computing will open the door to exploring, for the first time, the full palette of thousands of nonnatural amino acids and other chemical building blocks available to researchers, Mulligan expects. “We hope this will be the extra little boost we need to design proteins that get across the cell membrane and bind to a target and have all the other properties we’d like to see in a drug.”


ASCR Discovery is a publication of the U.S. Department of Energy.

The United States Department of Energy (DOE) is a cabinet-level department of the United States Government concerned with the United States’ policies regarding energy and safety in handling nuclear material. Its responsibilities include the nation’s nuclear weapons program; nuclear reactor production for the United States Navy; energy conservation; energy-related research; radioactive waste disposal; and domestic energy production. It also directs research in genomics; the Human Genome Project originated in a DOE initiative. DOE sponsors more research in the physical sciences than any other U.S. federal agency, the majority of which is conducted through its system of National Laboratories. The agency is led by the United States Secretary of Energy, and its headquarters are located in Southwest Washington, D.C., on Independence Avenue in the James V. Forrestal Building, named for James Forrestal, as well as in Germantown, Maryland.

Formation and consolidation

In 1942, during World War II, the United States started the Manhattan Project, a project to develop the atomic bomb, under the eye of the U.S. Army Corps of Engineers. After the war, in 1946, the Atomic Energy Commission (AEC) was created to control the future of the project. The Atomic Energy Act of 1946 also created the framework for the first National Laboratories. Among other nuclear projects, the AEC produced fabricated uranium fuel cores at locations such as Fernald Feed Materials Production Center in Cincinnati, Ohio. In 1974, the AEC gave way to two agencies: the Nuclear Regulatory Commission, which was tasked with regulating the nuclear power industry, and the Energy Research and Development Administration, which was tasked with managing the nuclear weapon, naval reactor, and energy development programs.

The 1973 oil crisis called attention to the need to consolidate energy policy. On August 4, 1977, President Jimmy Carter signed into law the Department of Energy Organization Act of 1977 (Pub.L. 95–91, 91 Stat. 565), which created the Department of Energy. The new agency, which began operations on October 1, 1977, consolidated the Federal Energy Administration, the Energy Research and Development Administration, the Federal Power Commission, and programs of various other agencies. Former Secretary of Defense James Schlesinger, who served under Presidents Nixon and Ford during the Vietnam War, was appointed as the first secretary.

President Carter created the Department of Energy with the goal of promoting energy conservation and developing alternative sources of energy. He wanted to end the country’s dependence on foreign oil and reduce the use of fossil fuels. With the future of international energy supplies uncertain for America, Carter acted quickly to bring the department into operation in the first year of his presidency. This was an extremely important issue at the time, as the oil crisis was causing shortages and inflation. After the Three Mile Island accident, Carter was able to intervene with the help of the department, making changes within the Nuclear Regulatory Commission to fix its management and procedures. This was possible because nuclear energy and weapons are the responsibility of the Department of Energy.


On March 28, 2017, a supervisor in the Office of International Climate and Clean Energy asked staff to avoid the phrases “climate change,” “emissions reduction,” or “Paris Agreement” in written memos, briefings or other written communication. A DOE spokesperson denied that phrases had been banned.

In a May 2019 press release concerning natural gas exports from a Texas facility, the DOE used the term “freedom gas” to refer to natural gas. The phrase originated from a speech made by Secretary Rick Perry in Brussels earlier that month. Washington Governor Jay Inslee called the term “a joke.”


The Department of Energy operates a system of national laboratories and technical facilities for research and development, as follows:

Ames Laboratory
Argonne National Laboratory
Brookhaven National Laboratory
Fermi National Accelerator Laboratory
Idaho National Laboratory
Lawrence Berkeley National Laboratory
Lawrence Livermore National Laboratory
Los Alamos National Laboratory
National Renewable Energy Laboratory
Oak Ridge National Laboratory
Pacific Northwest National Laboratory
Princeton Plasma Physics Laboratory
Sandia National Laboratories
Savannah River National Laboratory
SLAC National Accelerator Laboratory
Thomas Jefferson National Accelerator Facility

Other major DOE facilities include:
Albany Research Center
Bannister Federal Complex
Bettis Atomic Power Laboratory – focuses on the design and development of nuclear power for the U.S. Navy
Kansas City Plant
Knolls Atomic Power Laboratory – operates for Naval Reactors Program Research under the DOE (not a National Laboratory)
National Petroleum Technology Office
Nevada Test Site
New Brunswick Laboratory
Office of Fossil Energy
Office of River Protection
Radiological and Environmental Sciences Laboratory
Y-12 National Security Complex
Yucca Mountain nuclear waste repository
Pahute Mesa Airstrip – Nye County, Nevada, in support of the Nevada National Security Site