Tagged: The Baker Lab

  • richardmitnick 8:45 am on July 22, 2016
    Tags: Proteins, The Baker Lab, This protein designer aims to revolutionize medicines and materials

    From Science: “This protein designer aims to revolutionize medicines and materials” 



    David Baker shows off models of some of the unnatural proteins his team has designed and made.

    Jul. 21, 2016
    Robert F. Service

    David Baker appreciates nature’s masterpieces. “This is my favorite spot,” says the Seattle native, admiring the views from a terrace at the University of Washington (UW) here. To the south rises Mount Rainier, a 4400-meter glacier-draped volcano; to the west, the white-capped Olympic Mountain range.

    But head inside to his lab and it’s quickly apparent that the computational biochemist is far from satisfied with what nature offers, at least when it comes to molecules. On a low-slung coffee table lie eight toy-sized, 3D-printed replicas of proteins. Some resemble rings and balls, others tubes and cages—and none existed before Baker and his colleagues designed and built them. Over the last several years, with a big assist from the genomics and computer revolutions, Baker’s team has all but solved one of the biggest challenges in modern science: figuring out how long strings of amino acids fold up into the 3D proteins that form the working machinery of life. Now, he and colleagues have taken this ability and turned it around to design and then synthesize unnatural proteins intended to act as everything from medicines to materials.


    Already, this virtuoso proteinmaking has yielded an experimental HIV vaccine, novel proteins that aim to combat all strains of the influenza viruses simultaneously, carrier molecules that can ferry reprogrammed DNA into cells, and new enzymes that help microbes suck carbon dioxide out of the atmosphere and convert it into useful chemicals. Baker’s team and collaborators report making cages that assemble themselves from as many as 120 designer proteins, which could open the door to a new generation of molecular machines.

    If the ability to read and write DNA spawned the revolution of molecular biology, the ability to design novel proteins could transform just about everything else. “Nobody knows the implications,” because it has the potential to impact dozens of different disciplines, says John Moult, a protein-folding expert at the University of Maryland, College Park. “It’s going to be totally revolutionary.”

    Baker is by no means alone in this pursuit. Efforts to predict how proteins fold, and use that information to fashion novel versions, date back decades. But today he leads the charge. “David has really inspired the field,” says Guy Montelione, a protein structure expert at Rutgers University, New Brunswick, in New Jersey. “That’s what a great scientist does.”

    Baker, 53, didn’t start out with any such vision. Though both his parents were professors at UW—in physics and atmospheric sciences—Baker says he wasn’t drawn to science growing up. As an undergraduate at Harvard University, Baker tried studying philosophy and social studies. That was “a total waste of time,” he says now. “It was a lot of talk that didn’t necessarily add content.” Biology, where new insights can be tested and verified or discarded, drew him instead, and he pursued a Ph.D. in biochemistry. During a postdoc at the University of California, San Francisco, when he was studying how proteins move inside cells, Baker found himself captivated instead by the puzzle of how they fold. “I liked it because it’s getting at something fundamental.”

    In the early 1960s, biochemists at the U.S. National Institutes of Health (NIH) recognized that each protein folds itself into an intrinsic shape. Heat a protein in a solution and its 3D structure will generally unravel. But the NIH group noticed that the proteins they tested refold themselves as soon as they cool, implying that their structure stems from the interactions between different amino acids, rather than from some independent molecular folding machine inside cells. If researchers could determine the strength of all those interactions, they might be able to calculate how any amino acid sequence would assume its final shape. The protein-folding problem was born.

    From DNA to proteins

    The machinery for building proteins is essential for all life on earth.

    One way around the problem is to determine protein structures experimentally, through methods such as x-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. But that’s slow and expensive. Even today, the Protein Data Bank, an international repository, holds the structures of only roughly 110,000 proteins out of the hundreds of millions or more thought to exist.

    Knowing the 3D structures of those other proteins would offer biochemists vital insights into each molecule’s function, such as whether it serves to ferry ions across a cell membrane or catalyze a chemical reaction. It would also give chemists valuable clues to designing new medicines. So, instead of waiting for the experimentalists, computer modelers such as Baker have tackled the folding problem with computer models.

    They’ve come up with two broad kinds of folding models. So-called homology models compare the amino acid sequence of a target protein with that of a template—a protein with a similar sequence and a known 3D structure. The models adjust their prediction for the target’s shape based on the differences between its amino acid sequence and that of the template. But there’s a major drawback: There simply aren’t enough proteins with known structures to provide templates—despite costly efforts to perform industrial-scale x-ray crystallography and NMR spectroscopy.
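
    The template comparison described above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and scores each candidate template by percent identity with the target. The sequences, the template names, and the functions `percent_identity` and `best_template` are hypothetical and are not taken from Rosetta or any other modeling package; real homology modeling uses far more elaborate alignment and scoring.

```python
def percent_identity(target, template):
    """Percent of aligned, non-gap positions where the two sequences agree."""
    if len(target) != len(template):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(target, template):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_template(target, templates):
    """Return the name of the template most similar to the target."""
    return max(templates, key=lambda name: percent_identity(target, templates[name]))


# Hypothetical toy data: one target and three candidate templates.
target = "MKTAYIAKQR"
templates = {
    "template_A": "MKTAYLAKQR",  # differs from the target at one position
    "template_B": "MRSAYIGKHR",  # differs at four positions
    "template_C": "GSSAWLPKNE",  # mostly unrelated
}

chosen = best_template(target, templates)
```

    In this toy case `chosen` is `template_A`. The fewer close matches there are among proteins with known structures, the weaker the starting point, which is the drawback the article describes.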

    Templates were even scarcer more than 2 decades ago, when Baker accepted his first faculty position at UW. That prompted him to pursue a second path, known as ab initio modeling, which calculates the push and pull between neighboring amino acids to predict a structure. Baker also set up a biochemistry lab to study amino acid interactions, in order to improve his models.

    Early on, Baker and Kim Simons, one of his first students, created an ab initio folding program called Rosetta, which broke new ground by scanning a target protein for short amino acid stretches that typically fold in known patterns and using that information to help pin down the molecule’s overall 3D configuration. Rosetta required such extensive computations that Baker’s team quickly found themselves outgrowing their computer resources at UW.

    Seeking more computing power, they created a crowdsourcing extension called Rosetta@home, which allows people to contribute idle computer time to crunching the calculations needed to survey all the likely protein folds. Later, they added a video game extension called Foldit, allowing remote users to apply their instinctive protein-folding insights to guide Rosetta’s search. The approach has spawned an international community of more than 1 million users and nearly two dozen related software packages that do everything from designing novel proteins to predicting the way proteins interact with DNA.

    “The most brilliant thing David has done is build a community,” says Neil King, a former Baker postdoc, now an investigator at UW’s Institute for Protein Design (IPD). Some 400 active scientists continually update and improve the Rosetta software. The program is free for academics and nonprofit users, but there’s a $35,000 fee for companies. Proceeds are plowed back into research and an annual party called RosettaCon in Leavenworth, Washington, where attendees mix mountain hikes and scientific talks.

    Despite this success, Rosetta was limited. The software was often accurate at predicting structures for small proteins, fewer than 100 amino acids in length. Yet, like other ab initio programs, it struggled with larger proteins. Several years ago, Baker began to doubt that he or anyone else would ever manage to solve most protein structures. “I wasn’t sure whether I would get there.”

    Now, he says, “I don’t feel that way anymore.”

    What changed his outlook was a technique first proposed in the 1990s by computational biologist Chris Sander, then with the European Molecular Biology Laboratory in Heidelberg, Germany, and now with Harvard. Those were the early days of whole genome sequencing, when biologists were beginning to decipher the entire DNA sequences of microbes and other organisms. Sander and others wondered whether gene sequences could help identify pairs of amino acids that, although distant from each other on the unfolded proteins, have to wind up next to each other after the protein folds into its 3D structure.

    Clues from genome sequences

    Comparing the DNA of similar proteins from different organisms shows that certain pairs of amino acids evolve in tandem—when one changes, so does the other. This suggests they are neighbors in the folded protein, a clue for predicting structure.

    Sander reasoned that the juxtaposition of those amino acids must be crucial to a protein’s function. If a mutation occurs, changing one of the amino acids so that it no longer interacts with its partner, the protein might no longer work, and the organism could suffer or die. But if both neighboring amino acids are mutated at the same time, they might continue to interact, and the protein might work as well or even better.

    The upshot, Sander proposed, was that certain pairs of amino acids necessary to a protein’s structure would likely evolve together. And researchers would be able to read out that history by comparing the DNA sequences of genes from closely related proteins in different organisms. Whenever such DNA revealed pairs of amino acids that appeared to evolve in lockstep, it would suggest that they were close neighbors in the folded protein. Put enough of those constraints on amino acid positions into an ab initio computer model, and the program might be able to work out a protein’s full 3D structure.
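
    A minimal sketch of the coevolution idea, under loud assumptions: it takes a toy alignment of the same protein from several organisms and ranks pairs of positions by mutual information, a simple measure of how strongly two columns change together. This is a simplification chosen for illustration, not the statistical algorithm developed by Sander and Marks and not Baker's software; the alignment and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(alignment, i):
    """Residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in count_ab.items():
        p_joint = count / n
        p_independent = (count_a[a] / n) * (count_b[b] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


def rank_coevolving_pairs(alignment, min_separation=3):
    """Score every pair of positions at least min_separation apart, best first."""
    length = len(alignment[0])
    scores = []
    for i, j in combinations(range(length), 2):
        if j - i < min_separation:
            continue  # neighbors along the chain are close anyway
        score = mutual_information(column(alignment, i), column(alignment, j))
        scores.append((score, i, j))
    scores.sort(reverse=True)
    return scores


# Hypothetical alignment: positions 1 and 6 change in lockstep (K with E, R with D).
toy_alignment = [
    "AKGLVSEW",
    "AKGLVSEW",
    "ARGLVSDW",
    "ARGIVSDW",
    "AKGIVSEW",
    "ARGLVSDW",
]

top_score, top_i, top_j = rank_coevolving_pairs(toy_alignment)[0]
```

    Here the highest-scoring pair is positions 1 and 6, which would be passed to a folding program as a hint that the two residues sit close together in the folded protein. Real alignments need many more sequences than this toy example.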

    Unfortunately, Sander says, his idea “was a little ahead of its time.” In the 1990s, there weren’t enough high-quality DNA sequence data from enough similar proteins to track coevolving amino acids.

    By the early part of this decade, however, DNA sequences were flooding in thanks to new gene-sequencing technology. Sander had also teamed up with Debora Marks at Harvard Medical School in Boston to devise a statistical algorithm capable of teasing out real coevolving pairs from the false positives that plagued early efforts. In a 2011 article in PLOS ONE, Sander, Marks, and colleagues reported that the coevolution technique could constrain the position of dozens of pairs of amino acids in 15 proteins—each from a different structural family—and work out their structures. Since then, Sander and Marks have shown that they can decipher the structure of a wide variety of proteins for which there are no homology templates. “It has changed the protein-folding game,” Sander says.

    It certainly did so for Baker. When he and colleagues realized that scanning genomes offered new constraints for Rosetta’s ab initio calculations, they seized the opportunity. They were already incorporating constraints from NMR and other techniques. So they rushed to write a new software program, called Gremlin, to automatically compare gene sequences and come up with all the likely coevolving amino acid pairs. “It was a natural for us to put them into Rosetta,” Baker says.

    The results have been powerful. Rosetta was already widely considered the best ab initio model. Two years ago, Baker and colleagues used their combined approach for the first time in an international protein-folding competition, the 11th Critical Assessment of protein Structure Prediction (CASP). The contest asks modelers to compute the structures of a suite of proteins for which experimental structures are just being worked out by x-ray crystallography or NMR. After modelers submit their predictions, CASP’s organizers then reveal the actual experimental structures. One submission from Baker’s team, on a large protein known as T0806, came back nearly identical to the experimental structure. Moult, who heads CASP, says the judge who reviewed the predicted structure immediately fired off an email to him saying “either someone solved the protein-folding problem, or cheated.”

    “We didn’t [cheat],” Sergey Ovchinnikov, a grad student in Baker’s lab, says with a chuckle.

    The implications are profound. Five years ago, ab initio models had determined structures for just 56 proteins of the estimated 8000 protein families for which there is no template. Since then, Baker’s team alone has added 900 and counting, and Marks believes the approach will already work for 4700 families. With genome sequence data now pouring into scientific databases, it will likely only be a couple years before protein-folding models have enough coevolution data to solve structures for nearly any protein, Baker and Sander predict. Moult agrees. “I have been waiting 10 years for a breakthrough,” he says. “This seems to me a breakthrough.”

    For Baker, it’s only the beginning. With Rosetta’s steadily improving algorithms and ever-greater computing power, his team has in essence mastered the rules for folding—and they’ve begun to use that understanding to try to one-up nature’s creations. “Almost everything in biomedicine could be impacted by an ability to build better proteins,” says Harvard synthetic biologist George Church.

    Baker notes that for decades researchers pursued a strategy he refers to as “Neandertal protein design,” tweaking the genes for existing proteins to get them to do new things. “We were limited by what existed in nature. … We can now short-cut evolution and design proteins to solve modern-day problems.”

    Take medicines, such as drugs to combat the influenza virus. Flu viruses come in many strains that mutate rapidly, which makes it difficult to find molecules that can knock them all out. But every strain contains a protein called hemagglutinin that helps it invade host cells, and a portion of the molecule, known as the stem, remains similar across many strains. Earlier this year, Baker teamed up with researchers at the Scripps Research Institute in San Diego, California, and elsewhere to develop a novel protein that would bind to the hemagglutinin stem and thereby prevent the virus from invading cells.

    The effort required 80 rounds of designing the protein, engineering microbes to make it, testing it in the lab, and reworking the structure. But in the 4 February issue of PLOS Pathogens, the researchers reported that when they administered their final creation to mice and then injected them with a normally lethal dose of flu virus, the rodents were protected. “It’s more effective than 10 times the dose of Tamiflu,” an antiviral drug currently on the market, says Aaron Chevalier, a former Baker Ph.D. student who now works at a Seattle biotech company called Virvio that is working to commercialize the protein as a universal antiflu drug.

    Another potential addition to the medicine cabinet: a designer protein that chops up gluten, the infamous substance in wheat and other grains that people with Celiac disease or gluten sensitivity have trouble digesting. Ingrid Swanson Pultz began crafting the gluten-breaker even before joining Baker’s lab as a postdoc and is now testing it in animals and working with IPD to commercialize the research. And those self-assembling cages that debut this week could one day be filled with drugs or therapeutic snippets of DNA or RNA that can be delivered to disease sites throughout the body.

    The potential of these unnatural proteins isn’t limited to medicines. Baker, King, and their colleagues have also attached up to 120 copies of a molecule called green fluorescent protein to the new cages, creating nano-lanterns that could aid research by lighting up as they move through tissues.

    Church says he believes that designer proteins might soon rewrite the biology inside cells. In a paper last year in eLife, he, Baker, and colleagues designed proteins to bind to either a hormone or a heart disease drug inside cells, and then regulate the activity of a DNA-cutting enzyme, Cas9, that is part of the popular CRISPR genome-editing system. “The ability to design sensors [inside cells] is going to be big,” Church says. The strategy could allow researchers or physicians to target the powerful gene-editing system to a specific set of cells—those that are responding to a hormone or drug. Biosensors could also make it possible to switch on the expression of specific genes as needed to break down toxins or alert the immune cells to invaders or cancer.

    Protein for every purpose

    The ability to predict how an amino acid sequence will fold—and hence how the protein will function—opens the way to designing novel proteins that can catalyze specific chemical reactions or act as medicines or materials. Genes for these proteins can be synthesized and inserted into microbes, which build the proteins.

    2D arrays can be used as nanomaterials in various applications.
    Information can be coded into protein sequences, like DNA.
    Antagonists bind to a target protein, blocking its activation.
    Channels through membranes act as gateways.
    Cages can contain medicinal cargo or carry it on their surfaces.
    Sensors travel throughout the body to detect various signals.


    Baker’s lab is abuzz with other projects. Last year, his group and collaborators reported engineering into bacteria a completely new metabolic pathway, complete with a designer protein that enabled the microbes to convert atmospheric carbon dioxide into fuels and chemicals. Two years ago, they unveiled in Science proteins that spontaneously arrange themselves in a flat layer, like interlocking tiles on a bathroom floor. Such surfaces may lead to novel types of solar cells and electronic devices.

    In perhaps the most thought-provoking project, Baker’s team has designed proteins to carry information, imitating the way DNA’s four nucleic acid letters bind and entwine in the genetic molecule’s famed double helix. For now, these protein helixes can’t convey genetic information that cells can read. But they symbolize something profound: Protein designers have shed nature’s constraints and are now only limited by their imagination. “We can now build a whole new world of functional proteins,” Baker says.

    See the full article here.


    Rosetta@home runs on software from Berkeley Open Infrastructure for Network Computing (BOINC).
    Visit the BOINC website, download and install the BOINC software, and attach to the Rosetta@home project. It is that simple. The project will use the available CPU cycles of your computer, tablet or cell phone to “crunch” data for the Baker Lab.

    While you are at the BOINC website, check out some of the other really important projects running at universities and institutions all over the world. They could all use your help and would run simultaneously with no conflicts on your devices.


    The American Association for the Advancement of Science is an international non-profit organization dedicated to advancing science for the benefit of all people.

    Please help promote STEM in your local schools.
    Stem Education Coalition

  • richardmitnick 8:43 pm on June 13, 2012
    Tags: The Baker Lab

    From Berkeley Lab: “Berkeley Lab Scientists Help Define the Healthy Human Microbiome” 

    Berkeley Lab

    Computing, bioinformatics, and microbial ecology resources play key role in mapping our microbial make-up

    June 13, 2012
    Dan Krotz

    You’re outnumbered. There are ten times as many microbial cells in you as there are your own cells.

    The human microbiome—as scientists call the communities of microorganisms that inhabit your skin, mouth, gut, and other parts of your body by the trillions—plays a fundamental role in keeping you healthy. These communities are also thought to cause disease when they’re perturbed. But our microbiome’s exact function, good and bad, is poorly understood. That could change.

    The bacterium, Enterococcus faecalis, which lives in the human gut, is just one type of microbe studied in NIH’s Human Microbiome Project. (Courtesy: United States Department of Agriculture)

    A National Institutes of Health (NIH)-organized consortium that includes scientists from the U.S. Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) has for the first time mapped the normal microbial make-up of healthy humans. [Human Microbiome Project (HMP) is a United States National Institutes of Health initiative with the goal of identifying and characterizing the microorganisms which are found in association with both healthy and diseased humans (i.e. their microbial flora). Launched in 2008, it is a five-year project, best characterized as a feasibility study, and has a total budget of $115 million. The ultimate goal of this and similar NIH-sponsored microbiome projects is to test if changes in the human microbiome are associated with human health or disease. This topic is currently not well-understood.]

    The research will help scientists understand how our microbiome carries out vital tasks such as supporting our immune system and helping us digest food. It’ll also shed light on our microbiome’s role in diseases such as ulcerative colitis, Crohn’s disease, and psoriasis, to name a few.

    See the full article here.

    For those interested – and you should be interested – the Human Proteome Folding project (HPF2) at the Bonneau Lab, New York University, is a participant in the HMP. HPF2 is a project in Public Distributed Computing under the aegis of the World Community Grid (WCG), running on software from the Berkeley Open Infrastructure for Network Computing (BOINC) and using the project products of the Rosetta@home project from the Baker Lab, University of Washington.

    That is a pretty long sentence. What it means is, if you visit WCG, or BOINC, and download the BOINC agent software for Windows, Linux, or Mac, you can attach to the HPF2 project and process data for HMP. While you are at it, look around the WCG website; there are about a dozen very worthwhile projects, all aimed at curing illnesses and solving fundamental problems for mankind. Also, at the BOINC website there is a vast variety of projects in Biology, Chemistry, Physics, Mathematics, and Astronomy.

    Here are some pretty pictures.

    So, you know, when you see graphics, these are serious guys. Give them (us) a look.

    My BOINC stats.

  • richardmitnick 9:32 am on December 6, 2011
    Tags: The Baker Lab

    From the New York Times: “Computer Scientists May Have What It Takes to Help Cure Cancer” – Another Blown Opportunity to boost BOINC 

    December 5, 2011

    This is copyright protected, so just a couple of hints.
    “The war against cancer is increasingly moving into cyberspace. Computer scientists may have the best skills to fight cancer in the next decade — and they should be signing up in droves….An inspirational example is the Foldit game — developed by the computer scientist Zoran Popovic at the University of Washington.”

    Very nice, great article, but, huge gap. No mention of the roots of Dr Popovic’s successful adventure.

    Dr Popovic worked with The Baker Laboratory, the locus of Rosetta@home, a project which runs on BOINC software from UC Berkeley. Rosetta@home currently has 37,456 “users” on 60,162 “hosts”. The project currently averages 58 TeraFLOPS over a 24-hour period.

    On the one hand, you can certainly visit the Foldit web site to participate. If, on the other hand, you are not fond of games, you can visit the BOINC web site, download and install the small piece of software, and attach to the Rosetta project. You will receive small packs of data called “work units” or “WU’s” to “crunch”. As each WU is finished, your computer will return the results and you will receive more work.
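
    For the curious, the work-unit cycle described above can be sketched in a few lines. This is a toy simulation only: `ToyProjectServer`, `crunch`, and `run_client` are hypothetical names invented for illustration and are not part of the BOINC software or its API, and the “science” here is just a sum of squares.

```python
from collections import deque


class ToyProjectServer:
    """Hypothetical stand-in for a project server; not the BOINC API."""

    def __init__(self, work_units):
        self.pending = deque(work_units)  # work units waiting to be handed out
        self.results = {}                 # finished results, keyed by work-unit id

    def request_work(self):
        """Hand out the next work unit, or None when nothing is left."""
        return self.pending.popleft() if self.pending else None

    def report_result(self, wu_id, result):
        """Record the result returned by a volunteer's computer."""
        self.results[wu_id] = result


def crunch(payload):
    """Placeholder computation standing in for the real scientific work."""
    return sum(x * x for x in payload)


def run_client(server):
    """Fetch a work unit, crunch it, return the result, and ask for more."""
    completed = 0
    while True:
        work_unit = server.request_work()
        if work_unit is None:
            break
        wu_id, payload = work_unit
        server.report_result(wu_id, crunch(payload))
        completed += 1
    return completed


server = ToyProjectServer([("wu-1", [1, 2, 3]), ("wu-2", [4, 5]), ("wu-3", [])])
done = run_client(server)
```

    The real client does all of this in the background, so you never have to look at it unless you want to.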

    Rosetta software is also used by World Community Grid (WCG) project Human Proteome Folding. This project is based at New York University in the Bonneau Laboratory.


    At both the WCG and BOINC web sites you will find many other really exciting projects in which you may participate. All WCG projects run on the BOINC software, along with the many independent projects at the BOINC web site.

    Once you have installed the BOINC software and attached to your chosen projects, you can be as active or passive in this process as you wish. You can pretty much simply let the stuff happen in the background and pay it scant attention. However, each project has its own forum covering many topics, including the science involved and the operation of the software. You can also check to see how you are doing by signing on at BOINCstats.com.

    There are currently 286,105 “users” (people) on 515,015 “hosts” (computers) in all of BOINC. Currently we are doing 5,337 TeraFLOPS of work, averaged over a 24-hour period. That’s more than 5 PetaFLOPS, which would put us near the top of the TOP500 list of supercomputers in the world. Except, in that world, we don’t count. WCG currently has 94,007 users on 211,163 hosts. We are currently at 278 TeraFLOPS.

    BOINC software will run on Windows, Mac and Linux based computers. So, whatever your flavor, why don’t you visit BOINC and WCG, give us a look, and try us out? The BOINC process never interferes with anything else that you are doing on the computer. If on occasion you require huge amounts of resources, such as “storming the castle”, BOINC will instantaneously give up its resources and pause until your battle is finished. I hope to run into you in a forum.

    Mr. Patterson’s work is an example of why I started this blog.

  • richardmitnick 5:29 pm on November 25, 2011
    Tags: The Baker Lab

    From WCG Project Human Proteome Folding (HPF2) Exciting Updates 

    Human Proteome Folding (HPF2), a WCG project in The Bonneau Lab at New York University, has posted some very exciting news. The report is copyright protected, so I will not trespass on that.

    Depictions of proteins

    HPF2 utilizes software developed by BOINC project Rosetta@home, in The Baker Lab at University of Washington.

    You can see the report here.

    But WCG crunchers can be proud of the fact that we have contributed – this from the WCG web site – 96,695 years, 223 days, 09 hours, 26 minutes, 30 seconds to this effort. This is the power of Public Distributed Computing via the BOINC software on which our projects are run.

    I cannot begin to contemplate how this work would have gotten to this point without us, except at the high cost of processing time on some supercomputer.


    You, too, dear reader, can be a part of this incredible process. Visit either WCG or BOINC, download and install the software, and attach to this and other worthy projects at the WCG web site and also at the BOINC website. Your financial cost is about the same as running a 100-150 watt light bulb. Your personal satisfaction at being a part of this is immeasurable.
