Tagged: Genomics

  • richardmitnick 11:41 am on December 22, 2015 Permalink | Reply
    Tags: Genomics

    From Harvard: “Researchers help cells forget who they are” 

    Harvard University

    December 21, 2015
    Hannah Robbins, Harvard Stem Cell Institute Communications

    Erasing a cell’s memory makes it easier to manipulate it into becoming another type of cell

    Induced pluripotent stem cell colonies generated after researchers at Harvard Stem Cell Institute suppressed the CAF1 gene. Photo by Sihem Cheloufi

    They say we can’t escape our past — no matter how much we change, we still have the memory of what came before. The same can be said of our cells.

    Mature cells, such as skin or blood cells, have a cellular “memory,” or record of how the cell changed as it developed from an uncommitted embryonic cell into a specialized adult cell. Now, Harvard Stem Cell Institute researchers at Massachusetts General Hospital (MGH), in collaboration with scientists from the Institutes of Molecular Biotechnology (IMBA) and Molecular Pathology (IMP) in Vienna, have identified genes that, when suppressed, effectively erase a cell’s memory, making it more susceptible to reprogramming and, consequently, making the process of reprogramming quicker and more efficient.

    The study was recently published in Nature.

    “We began this work because we wanted to know why a skin cell is a skin cell, and why does it not change its identity the next day, or the next month, or a year later?” said co-senior author Konrad Hochedlinger, an HSCI principal faculty member at MGH and Harvard’s Department of Stem Cell and Regenerative Biology, and a world expert in cellular reprogramming.

    Every cell in the human body has the same genome, or DNA blueprint, explained Hochedlinger, and it is how those genes are turned on and off during development that determines what kind of adult cell each becomes. By manipulating those genes and introducing new factors, scientists can unlock dormant parts of an adult cell’s genome and reprogram it into another cell type.

    However, “a skin cell knows it is a skin cell,” said IMBA’s Josef Penninger, even after scientists reprogram those skin cells into induced pluripotent stem cells (iPS cells) — a process that would ideally require a cell to “forget” its identity before assuming a new one.

    Cellular memory is often conserved, acting as a roadblock to reprogramming. “We wanted to find out which factors stabilize this memory and what mechanism prevents iPS cells from forming,” Penninger said.

    To identify potential factors, the team established a genetic library targeting known chromatin regulators — genes that control the packaging and bookmarking of DNA, and are involved in creating cellular memory.

    Hochedlinger and Sihem Cheloufi, co-first author and a postdoc in Hochedlinger’s lab, designed a screening approach that tested each of these factors.

    Of the 615 factors screened, the researchers identified four chromatin regulators, three of which had not yet been described, as potential roadblocks to reprogramming. In comparison to the three- to fourfold increase seen by suppressing previously known roadblock factors, inhibiting the newly described chromatin assembly factor 1 (CAF1) made the process 50- to 200-fold more efficient. Moreover, in the absence of CAF1, reprogramming turned out to be much faster: While the process normally takes nine days, the researchers could detect the first iPS cell after four days.
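
    For a feel of how hits from such a screen are ranked, here is a minimal Python sketch that orders suppressed factors by fold change in iPS colony counts over a control. Every number in it is an invented stand-in, not data from the study.

    ```python
    # Toy ranking of reprogramming-screen hits by fold change over a control.
    # All counts are illustrative stand-ins, not the study's actual numbers.

    control_colonies = 10  # iPS colonies seen with a non-targeting control

    # colonies observed when each candidate roadblock factor is suppressed
    screen_counts = {
        "CAF1_suppressed": 900,
        "known_factor_A": 35,
        "known_factor_B": 40,
        "neutral_factor": 11,
    }

    fold_change = {g: n / control_colonies for g, n in screen_counts.items()}

    for gene, fc in sorted(fold_change.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{gene:16s} {fc:6.1f}x")
    ```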

    “The CAF1 complex ensures that during DNA replication and cell division, daughter cells keep their memory, which is encoded on the histones that the DNA is wrapped around,” said Ulrich Elling, a co-first author from IMBA. “When we block CAF1, daughter cells fail to wrap their DNA the same way, lose this information, and convert into blank sheets of paper. In this state, they respond more sensitively to signals from the outside, meaning we can manipulate them much more easily.”

    By suppressing CAF1, the researchers were also able to facilitate the conversion of one type of adult cell directly into another, skipping the intermediary step of forming iPS cells, via a process called direct reprogramming, or transdifferentiation. Thus, CAF1 appears to act as a general guardian of cell identity whose depletion facilitates both the interconversion of one adult cell type to another and the conversion of specialized cells into iPS cells.

    In finding CAF1, the researchers identified a complex that allows cell memory to be erased and rewritten. “The cells forget who they are, making it easier to trick them into becoming another type of cell,” said Cheloufi.

    CAF1 may provide a general key to facilitate the “reprogramming” of cells to model disease and test therapeutic agents, IMP’s Johannes Zuber explained. “The best-case scenario,” he said, “is that with this insight, we hold a universal key in our hands that will allow us to model cells at will.”

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    Harvard University campus

    Harvard is the oldest institution of higher education in the United States, established in 1636 by vote of the Great and General Court of the Massachusetts Bay Colony. It was named after the College’s first benefactor, the young minister John Harvard of Charlestown, who upon his death in 1638 left his library and half his estate to the institution. A statue of John Harvard stands today in front of University Hall in Harvard Yard, and is perhaps the University’s best known landmark.

    Harvard University has 12 degree-granting Schools in addition to the Radcliffe Institute for Advanced Study. The University has grown from nine students with a single master to an enrollment of more than 20,000 degree candidates including undergraduate, graduate, and professional students. There are more than 360,000 living alumni in the U.S. and over 190 other countries.

     
  • richardmitnick 5:49 pm on December 7, 2015 Permalink | Reply
    Tags: Genomics

    From Caltech: “Unlocking the Chemistry of Life” 

    Caltech

    12/07/2015
    Jessica Stoller-Conrad


    In just the span of an average lifetime, science has made leaps and bounds in our understanding of the human genome and its role in heredity and health—from the first insights about DNA structure in the 1950s to the rapid, inexpensive sequencing technologies of today. However, the 20,000 genes of the human genome are more than DNA; they also encode proteins to carry out the countless functions that are key to our existence. And we know much less about how this collection of proteins supports the essential functions of life.

    In order to understand the role each of these proteins plays in human health—and what goes wrong when disease occurs—biologists need to figure out what these proteins are and how they function. Several decades ago, biologists realized that to answer these questions on the scale of the thousands of proteins in the human body, they would have to leave the comfort of their own discipline to get some help from a standard analytical-chemistry technique: mass spectrometry. Since 2006, Caltech’s Proteome Exploration Laboratory (PEL) has been building on this approach to bridge the gap between biology and chemistry, in the process unlocking important insights about how the human body works.

    Scientists can easily sequence an entire genome in just a day or two, but sequencing a proteome—all of the proteins encoded by a genome—is a much greater challenge, says Ray Deshaies, protein biologist and founder of the PEL. “One challenge is the amount of protein. If you want to sequence a person’s DNA from a few of their cheek cells, you first amplify—or make copies of—the DNA so that you’ll have a lot of it to analyze. However, there is no such thing as protein amplification,” Deshaies says. “The number of protein molecules in the cells that you have is the number that you have, so you must use a very sensitive technique to identify those very few molecules.”

    The best means available for doing this today is called shotgun mass spectrometry, Deshaies says. In general, mass spectrometry allows researchers to identify the amount and types of molecules present in a biological sample by separating and analyzing the molecules as gas ions, based on mass and charge; shotgun mass spectrometry—a combination of several techniques—applies this separation process specifically to digested, broken-down proteins, allowing researchers to identify the types and amounts of proteins present in a heterogeneous mixture.

    The first step of shotgun mass spectrometry entails digesting a mixture of proteins into smaller fragments called peptides. The peptides are then separated based on their physical properties, and then they are sprayed into a mass spectrometer and blasted apart via collisions with gas molecules such as helium or nitrogen—a process that creates a unique fragmentation pattern for each peptide. This pattern, or “fingerprint,” of each peptide’s fragmentation can then be searched against a database and used to identify the protein the peptide came from.
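
    As a rough illustration of that fingerprint-matching step, the toy Python sketch below computes theoretical b- and y-ion masses for candidate peptides and scores each against observed peaks. Real search engines are far more elaborate; the peptide database, tolerance, and "observed" peaks here are invented.

    ```python
    # Highly simplified spectrum-to-peptide matching: compute theoretical
    # b/y fragment masses and count how many observed peaks each peptide
    # explains. Monoisotopic masses; singly charged ions only.

    RESIDUE_MASS = {  # monoisotopic residue masses (Da), a small subset
        "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
    }
    PROTON, WATER = 1.00728, 18.01056

    def fragment_masses(peptide):
        """Singly charged b- and y-ion m/z values for one peptide."""
        masses = [RESIDUE_MASS[aa] for aa in peptide]
        b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
        y = [sum(masses[i:]) + WATER + PROTON for i in range(1, len(masses))]
        return b + y

    def score(peptide, peaks, tol=0.02):
        """Observed peaks within `tol` Da of any theoretical fragment."""
        theory = fragment_masses(peptide)
        return sum(any(abs(p - t) <= tol for t in theory) for p in peaks)

    database = ["GASP", "VLKR", "GAVL"]      # invented peptide database
    observed = fragment_masses("VLKR")[:4]   # pretend these were measured
    print(max(database, key=lambda pep: score(pep, observed)))  # -> VLKR
    ```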



    “Up until this technique was invented, people had to take a mixture of proteins, run a current through a polyacrylamide gel to separate the proteins by size, stain the proteins, and then physically cut the stained bands out of the gel to have each individual protein species sequenced,” says Deshaies. “But mass spectrometry technology has gotten so good that we can now cast a broader net by sequencing everything, then use data analysis to figure out what specific information is of interest after the dust settles down.”

    Deshaies began using shotgun mass spectrometry in the late 1990s, but because the technology was still very new, all of the protein analysis had to be done at outside laboratories that were inventing the methodology.

    In 2001, after realizing the potential of this field-changing technology, he and colleague Barbara Wold, the Bren Professor of Molecular Biology, applied for and received a Department of Energy grant for their very own mass spectrometer. When the instrument arrived on campus, demand began to surge. “Barbara and I were first just doing experiments for our own labs, but then other people on campus wanted us to help them apply this technology to their research problems,” Deshaies says.

    So he and Wold began campaigning for a larger, ongoing center where anyone could begin using mass spectrometry resources for protein research. In 2006, Deshaies and Elliot Meyerowitz, then chair of the Division of Biology (now the Division of Biology and Biological Engineering), petitioned the Gordon and Betty Moore Foundation to secure funding for a formal Proteome Exploration Laboratory, as part of the foundation’s commitment to Caltech.

    The influx of cash dramatically expanded the capabilities and resources that were available to the PEL, allowing it to purchase the best and fastest mass spectrometry instruments available. But just as importantly, it also meant that the PEL could expand its human resources, Deshaies adds. Mostly students were running the instruments in the Deshaies lab, he says, so when they graduated or moved on, gaps were left in expertise. Sonja Hess came to Caltech in 2007 to fill that gap as director of the PEL.

    Hess, who came from a proteomics lab at the National Institutes of Health, knew the challenges of running an interdisciplinary center such as the PEL. Although the field of proteomics holds great promise for understanding big questions in many fields, including biology and medicine, mass spectrometry is still a highly technical method involving analytical chemistry and data science—and it’s a technique that many biologists were never trained in. Conversely, many chemists and mass spectrometry technicians don’t necessarily understand how to apply the technique to biological processes.

    By encouraging dialogue between these two sides, Hess says that the PEL crosses that barrier, helping apply mass spectrometry techniques to diverse research questions from more than 20 laboratories on campus. Creating this interdisciplinary and resource-rich environment has enabled a wide breadth of discoveries, says Hess. One major user of the PEL, chemist David Tirrell, has used the center for many collaborations involving a technique he developed with former colleagues Erin Schuman and Daniela Dieterich called BONCAT (for “bioorthogonal noncanonical amino-acid tagging”). BONCAT uses synthetic molecules that are not normally found in proteins in nature and that carry particular chemical tags. When these artificial amino acids are incubated with certain cells, they are taken up by the cells and incorporated into all newly formed proteins in those cells.

    The tags then allow researchers to identify and pull out proteins from the cells, thus enabling them to wash away all of the other untagged proteins from other cells that aren’t of interest. When this method is combined with mass spectrometry techniques, it enables researchers to achieve specificity in their results and determine which proteins are produced in a particular subset of cells during a particular time. “In my own laboratory, we work at making sure the method is adapted appropriately to the specifics of a biological problem. But we rely on collaborations with other laboratories to help us understand what the demands on the method are and what kinds of questions would be interesting to people in those fields,” Tirrell says.

    For example, Tirrell collaborated with biologist Paul Sternberg and the PEL, using BONCAT and mass spectrometry to analyze specific proteins from a few cells within a whole organism, a feat that had never been accomplished before. Using the nematode C. elegans, Sternberg and his team applied the BONCAT technique to tag proteins in the 20 cells of the worm’s pharynx, and then used the PEL resources to analyze proteome-wide information from just those 20 cells. The results, including identification of proteins that were not previously associated with the pharynx, were published in PNAS in 2014.

    The team is now trying to target the experiment to a single pair of neurons that help the worm to sense and avoid harmful chemicals—a first step in learning which proteins are essential to producing this responsive behavior. But analyzing protein information from just two cells is a difficult experiment, says Tirrell. “The challenge comes in separating out the proteins that are made in those two cells from the proteins in the rest of the hundreds of cells in the worm’s body. You’re only interested in two cells, but to get the proteins from those two cells, you’re essentially trying to wash away everything else— about 500 times as much ‘junk’ protein as the protein that you’re really interested in,” he says. “We’re working on these separation methods now because the ultimate experiment would be to find a way to use BONCAT and mass spec to pull out proteomic information from a single cell in an animal.”

    This next step is a big one, but Tirrell says that an advantage of the PEL is that the laboratory’s staff can focus on optimizing the very technical mass spectrometry aspects of an experiment, while researchers using the PEL can focus more holistically on the question they’re trying to answer. This was also true for biologist Mitch Guttman, who asked the laboratory to help him develop a mass spectrometry–based technique for identifying the proteins that hitchhike on a class of RNA genes called lncRNAs. Long noncoding RNAs—or lncRNAs (pronounced “link RNAs”) for short—are abundant in the human genome, but scientists know very little about how they work or what they do.

    Although it’s known that protein-coding genes start out as DNA, which is transcribed into RNA, which is then translated into the gene product, a protein, lncRNAs are never translated into proteins. Instead, they’re thought to act as scaffolds, corralling important proteins and bringing them to where they’re needed in the cell. In a study published in April 2015 in Nature, Guttman used a specific example of a lncRNA, a gene called Xist, to learn more about these hitchhiking proteins.

    “The big challenge to doing this was technical; we’ve never had a way to identify what proteins are actually interacting with a lncRNA molecule. By working with the PEL, we were able to develop a method based on mass spectrometry to actually purify and identify this complex of proteins interacting with a lncRNA in living cells,” Guttman says. “Once we had that information, we could really start to ask ourselves questions about these proteins and how they are working.”

    Using this new method, called RNA antisense purification with mass spectrometry (RAP-MS), Guttman’s lab determined that 10 proteins associate with the lncRNA Xist, and that three of those 10 are essential to the gene’s function—inactivating the second X chromosome in women, a necessary process that, if interrupted, results in the death of female embryos early in development. Guttman’s findings marked the first time that anyone had uncovered the detailed mechanism of action for a lncRNA gene. For decades, other research groups had been trying to solve this problem; however, the collaborative development of RAP-MS in the PEL provided the missing piece.

    Even Deshaies, who began doing shotgun mass spectrometry experiments in his own laboratory, now exclusively uses the PEL’s resources and says that the laboratory has played an essential support role in his work. He studies the normal balance of proteins in a cell and how this balance changes during disease. In a 2013 study published in Cell, his laboratory focused on a dynamic network of protein complexes called SCF complexes, which go through cycles of assembly and dissociation in a cell, depending on when they are needed.

    Because there was no insight into how these complexes form and disassemble, Deshaies and his colleagues used the PEL to quantitatively monitor how this protein network’s dynamics were changing within cells. They determined that SCF complexes are normally very stable, but in the presence of a protein called Cand1 they become very dynamic and rapidly exchange subunits. Because some components of the SCF complex have been implicated in the development of human diseases such as cancers, work is now being done to see if Cand1 holds promise as a target for a cancer therapeutic.

    Although Deshaies says that the PEL resources have become invaluable to his work, he adds that what makes the laboratory unique is how it benefits the entire institute—a factor that he hopes will encourage further support for its mission. “The value of the PEL is not just about what it contributes to my lab or to Dave Tirrell’s lab or to anyone else’s,” he says. “It’s about the breadth of PEL’s impact—the 20 or so labs that are bringing in samples and using this operation every year to do important work, like solving the mechanism of X-chromosome inactivation in females.”

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    The California Institute of Technology (commonly referred to as Caltech) is a private research university located in Pasadena, California, United States. Caltech has six academic divisions with strong emphases on science and engineering. Its 124-acre (50 ha) primary campus is located approximately 11 mi (18 km) northeast of downtown Los Angeles. “The mission of the California Institute of Technology is to expand human knowledge and benefit society through research integrated with education. We investigate the most challenging, fundamental problems in science and technology in a singularly collegial, interdisciplinary atmosphere, while educating outstanding students to become creative members of society.”
    Caltech buildings

     
  • richardmitnick 11:12 am on October 19, 2015 Permalink | Reply
    Tags: 4D, Genomics

    From U Washington: “Researchers win $12-million to study the human genome in 4-D” 

    University of Washington

    10.15.2015
    Michael McCarthy

    A computer-generated three-dimensional model of the yeast genome, which UW researchers described in a paper in the journal Nature in 2010.

    In order to fit within the nucleus of a cell, the human genome must bend, fold and coil into an unimaginably compact shape – and still function. This is no mean feat: The human genome is about 6.5 feet long, and the average cell nucleus is only 6 to 10 micrometers (millionths of a meter) in diameter.
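
    The compaction implied by those numbers is easy to check with one line of arithmetic. The sketch below is a crude end-to-end ratio using the midpoint of the quoted nucleus range, ignoring the geometry of folding.

    ```python
    # Back-of-the-envelope compaction factor implied by the figures above.
    genome_length_m = 2.0      # ~6.5 feet of DNA per nucleus
    nucleus_diameter_m = 8e-6  # midpoint of the 6-10 micrometer range

    print(f"{genome_length_m / nucleus_diameter_m:,.0f}x")  # -> 250,000x
    ```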

    How this happens and the genome’s three-dimensional shape within the nucleus are unknown. Nor is it known how the shape changes over time – the fourth dimension – as a cell develops, grows and goes about its specialized functions.

    “There’s a tendency to talk about the genome as a linear sequence and to forget about the fact that it’s folded,” said Dr. Jay Shendure, University of Washington associate professor of genome sciences and investigator with the Howard Hughes Medical Institute.

    William Noble, left, and Jay Shendure will co-direct the UW Center for Nuclear Organization and Function.

    “To understand how the different parts of the genome talk to each other to control gene expression, we need to understand how the different elements are arranged in relation to each other in three-dimensional space.”

    To puzzle out this information and its effect on cell function in health and disease, UW researchers will join peers at five other academic institutions to create the Nuclear Organization and Function Interdisciplinary Consortium.

    Underwriting the consortium is the National Institutes of Health’s 4D Nucleome program. The UW was awarded $12 million over five years to conduct research in its new Center for Nuclear Organization and Function. Shendure and William Stafford Noble, a professor of genome sciences and computer science, will co-direct the center.

    UW researchers will first develop tools to work out the three- and four-dimensional architecture of the nucleome and to create computer models that predict changes in the architecture as cells grow, divide and differentiate into different types.

    The results of this work will then be tested in mouse and human cell lines and, if confirmed, be used to understand how changes in nuclear architecture affect development of normal and abnormal heart muscle.

    All tools and data developed by the project will be shared with researchers in and outside of the 4D Nucleome network of researchers and with the public.

    Other investigators who will be working on the project include: Cole Trapnell, assistant professor of genome sciences; Christine Disteche, professor of pathology; Zhijun Duan, research assistant professor of medicine (hematology); and Dr. Charles Murry, professor of pathology and interim director of the UW Institute for Stem Cell and Regenerative Medicine.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    The University of Washington is one of the world’s preeminent public universities. Our impact on individuals, on our region, and on the world is profound — whether we are launching young people into a boundless future or confronting the grand challenges of our time through undaunted research and scholarship. Ranked number 10 in the world in Shanghai Jiao Tong University rankings and educating more than 54,000 students annually, our students and faculty work together to turn ideas into impact and in the process transform lives and our world.

    So what defines us — the students, faculty and community members at the University of Washington? Above all, it’s our belief in possibility and our unshakable optimism. It’s a connection to others, both near and far. It’s a hunger that pushes us to tackle challenges and pursue progress. It’s the conviction that together we can create a world of good. Join us on the journey.

     
  • richardmitnick 12:28 pm on October 16, 2015 Permalink | Reply
    Tags: Genomics, MIT Whitehead Institute

    From Broad Institute: “Screen of human genome reveals set of genes essential for cellular viability” 

    Broad Institute

    October 15th, 2015

    Whitehead Institute Communications

    Using two complementary analytical approaches, scientists at Whitehead Institute and the Broad Institute of MIT and Harvard have for the first time identified the universe of genes in the human genome essential for the survival and proliferation of human cell lines, or cultured human cells.

    Their findings and the materials they developed in conducting the research will not only serve as invaluable resources for the global research community but should also have application in the discovery of drug-targetable genetic vulnerabilities in a variety of human cancers.

    Scientists have long known the essential genes in microorganisms, such as yeast, whose genomes are smaller and more easily manipulated. Most common yeast strains, for example, are haploid, meaning that genes exist in single copies, making it fairly simple for researchers to eliminate or “knock out” individual genes and assess the impact on the organism. However, owing to their greater complexity, diploid mammalian genomes, including the human genome, have been resistant to such knockout techniques—including RNA interference, which is hampered by off-target effects and incomplete gene silencing.

    Diploid cells have two homologous copies of each chromosome.

    Now, however, through use of the breakthrough CRISPR (for clustered regularly interspaced short palindromic repeats) genome editing system, researchers in the labs of Whitehead Member David Sabatini and Broad Institute Director Eric Lander have been able to generate a genome-wide library of single-guide RNAs (sgRNAs) to screen for and identify the genes required for cellular viability.

    Diagram of the CRISPR prokaryotic viral defense mechanism

    The sgRNA library targeted slightly more than 18,000 genes, of which approximately 10% proved to be essential. These findings are reported online this week in the journal Science.

    “This is the first report of human cell-essential genes,” says Tim Wang, a graduate student in the Sabatini and Lander labs and first author of the Science paper. “This answers a question people have been asking for quite a long time.”

    As might have been expected, Wang says that many of the essential genes are involved in fundamental biological processes, including DNA replication, RNA transcription, and translation of messenger RNA. But, as Wang also notes, approximately 300 of these essential genes are of a class not previously characterized, are largely located in the cellular compartment known as the nucleolus, and are associated with RNA processing. Wang says the precise function of these genes is the subject of future investigation.

    Nucleus. The nucleolus is contained within the cell nucleus.

    To validate the results of the CRISPR screens, the group took the added step of screening for essential genes in a unique line of haploid human cells. Using an approach known as gene-trap mutagenesis (a method pioneered in part by former Whitehead Fellow Thijn Brummelkamp) in the haploid cells and comparing it to the CRISPR results, the researchers found significant, consistent overlap in the gene sets found to be essential. In a final step, the group tested their approaches in cell lines derived from two cancers, chronic myelogenous leukemia (CML) and Burkitt’s lymphoma, both of which have been extensively studied. The novel method not only identified the essentiality of the known genes—in the case of CML, it hit on the BCR and ABL1 genes, whose translocation is the target of the successful drug Gleevec—but also highlighted additional genes that may be therapeutic targets in these cancers.
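
    One standard way to ask whether two screens overlap more than chance would allow is a hypergeometric tail test, sketched below. Only the ~18,000-gene library size and the ~10% essential fraction come from the article; the gene-trap counts are invented, and SciPy is assumed to be installed.

    ```python
    # Hypergeometric test for overlap between two essential-gene sets.
    from scipy.stats import hypergeom

    genes_tested = 18000  # genes in the sgRNA library (from the article)
    crispr_hits = 1800    # ~10% called essential by CRISPR (from the article)
    trap_hits = 2000      # essential by gene-trap mutagenesis (invented)
    overlap = 1500        # called essential by both methods (invented)

    # probability of seeing at least this much overlap by chance alone
    p = hypergeom.sf(overlap - 1, genes_tested, crispr_hits, trap_hits)
    print(f"P(overlap >= {overlap}) = {p:.3g}")
    ```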

    “The ability to zero in on the essential genes in the highly complex human system will give us new insight into how diseases, such as cancer, continue to resist efforts to defeat them,” Lander says.

    Wang, Lander, and Sabatini are enthusiastic about the potential applications of their work, as it should accelerate the identification of cancer drug targets while enhancing our understanding of the evolution of drug resistance, a major contributor to therapeutic failure. The researchers attribute this vast potential to the rigor that CRISPR brings to human genetics.

    “This is really the first time we can reliably, accurately, and systematically study genetics in mammalian cells,” Sabatini says. “It’s remarkable how well it’s working.”

    This work was supported by the National Institutes of Health (grant CA103866), the National Human Genome Research Institute (grant 2U54HG003067-10), the National Science Foundation, the MIT Whitaker Health Sciences Fund, and the Howard Hughes Medical Institute.

    About Whitehead Institute

    The Whitehead Institute is a world-renowned non-profit research institution dedicated to improving human health through basic biomedical research. Wholly independent in its governance, finances, and research programs, Whitehead shares a close affiliation with Massachusetts Institute of Technology through its faculty, who hold joint MIT appointments. For more information about the Whitehead Institute, go to wi.mit.edu.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    Broad Institute Campus

    The Eli and Edythe L. Broad Institute of Harvard and MIT is founded on two core beliefs:

    This generation has a historic opportunity and responsibility to transform medicine by using systematic approaches in the biological sciences to dramatically accelerate the understanding and treatment of disease.
    To fulfill this mission, we need new kinds of research institutions, with a deeply collaborative spirit across disciplines and organizations, and having the capacity to tackle ambitious challenges.

    The Broad Institute is essentially an “experiment” in a new way of doing science, empowering this generation of researchers to:

    Act nimbly. Encouraging creativity often means moving quickly, and taking risks on new approaches and structures that often defy conventional wisdom.
    Work boldly. Meeting the biomedical challenges of this generation requires the capacity to mount projects at any scale — from a single individual to teams of hundreds of scientists.
    Share openly. Seizing scientific opportunities requires creating methods, tools and massive data sets — and making them available to the entire scientific community to rapidly accelerate biomedical advancement.
    Reach globally. Biomedicine should address the medical challenges of the entire world, not just advanced economies, and include scientists in developing countries as equal partners whose knowledge and experience are critical to driving progress.


     
  • richardmitnick 10:16 am on October 15, 2015 Permalink | Reply
    Tags: Genomics

    From The Uncovering Genome Mysteries project at WCG: “Analyzing a wealth of data about the natural world” 

    14 Oct 2015
    Wim Degrave, Ph.D.
    Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz – Fiocruz

    Summary
    The Uncovering Genome Mysteries project has already amassed data on over 200 million proteins, with the goal of understanding the common features of life everywhere on earth. There are tens of millions of calculations still to run, but the team is also making preparations for analysis and eventual publication of the data.

    For almost a year now, Uncovering Genome Mysteries has been comparing protein sequences derived from the genomes of nearly all living organisms analyzed to date. Thanks to the volunteers who contribute computer time to World Community Grid, more than 34 million results have been returned with data on functional identification and protein similarities. Along with our collaborators in Australia, we’ve paid particular attention to microorganisms from different ecosystems, with special emphasis on marine organisms. More than 200 million proteins have been compared thus far, during the equivalent of 15,000 years of computation. The resulting data are sent to our computer servers at the Fiocruz Foundation in Rio de Janeiro, Brazil and now also to the University of New South Wales, Sydney, Australia. A final set of around 20 million protein sequences, determined over the last year, is now being added to the dataset and will be run on World Community Grid in the coming months.
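
    The core operation behind those numbers, all-against-all protein comparison, can be illustrated with a toy k-mer similarity measure. The real project used alignment-based comparison tools; the sequences below are invented.

    ```python
    # Toy all-against-all protein comparison via k-mer Jaccard similarity.

    def kmers(seq, k=3):
        """Set of all overlapping k-letter substrings of a sequence."""
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def similarity(a, b, k=3):
        ka, kb = kmers(a, k), kmers(b, k)
        return len(ka & kb) / len(ka | kb)

    proteins = {  # invented sequences, not real database entries
        "p1": "MKTAYIAKQRQISFVKSHFSRQ",
        "p2": "MKTAYIAKQRQISFVKSHLSRQ",
        "p3": "MLSDEDFKAVFGMTRSAFANLP",
    }

    names = sorted(proteins)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, b, f"{similarity(proteins[a], proteins[b]):.2f}")
    ```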

    However, the task of functional mapping and comparison between proteins from all these organisms does not end there. Our team of scientists is, in the meantime, investing more efforts to optimize the algorithms for further analysis and representation of the data generated by World Community Grid volunteers, and preparing for the database systems that will make the results available to the scientific community. Once our data is public, we expect that the scientific community’s understanding of the intricate network of life will gain a completely new perspective, and that results will also contribute to the development of many new applications in health, agriculture and life sciences in general.

    This project is a cooperation between World Community Grid, the laboratory of Dr. Torsten Thomas and his team from the School of Biotechnology and Biomolecular Sciences & Centre for Marine Bio-Innovation at the University of New South Wales, Sydney, Australia, and our team at the Laboratory for Functional Genomics and Bioinformatics, at the Oswaldo Cruz Foundation – Fiocruz, in Brazil.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    World Community Grid (WCG) brings people together from across the globe to create the largest non-profit computing grid benefiting humanity. It does this by pooling surplus computer processing power. We believe that innovation combined with visionary scientific research and large-scale volunteerism can help make the planet smarter. Our success depends on like-minded individuals – like you.

    WCG projects run on BOINC software from UC Berkeley.

    BOINC is a leader in the fields of distributed computing, grid computing, and citizen cyberscience. BOINC stands for Berkeley Open Infrastructure for Network Computing.

    CAN ONE PERSON MAKE A DIFFERENCE? YOU BET!

    “Download and install secure, free software that captures your computer’s spare power when it is on, but idle. You will then be a World Community Grid volunteer. It’s that simple!” You can download the software at either WCG or BOINC.

    Please visit the project pages-

    Outsmart Ebola Together

    Mapping Cancer Markers

    Uncovering Genome Mysteries

    Say No to Schistosoma

    GO Fight Against Malaria

    Drug Search for Leishmaniasis

    Computing for Clean Water

    The Clean Energy Project

    Discovering Dengue Drugs – Together

    Help Cure Muscular Dystrophy

    Help Fight Childhood Cancer

    Help Conquer Cancer

    Human Proteome Folding

    FightAIDS@Home

    World Community Grid is a social initiative of IBM Corporation.

     
  • richardmitnick 10:52 am on August 21, 2015 Permalink | Reply
    Tags: Genomics

    From UCSC: “Seagate gift supports UC Santa Cruz research on genomic data storage” 

    UC Santa Cruz

    August 20, 2015
    Tim Stephens

    Researchers in the Baskin School of Engineering at UC Santa Cruz are working with industry partner Seagate Technology on new ways to structure and store massive amounts of genomic data. Seagate has donated data storage devices with a total capacity of 2.5 petabytes to support this effort.

    “This gift provides the basis for a major research program on storage of genomic data,” said Andy Hospodor, executive director of the Storage Systems Research Center (SSRC) at UC Santa Cruz.

    “Seagate is pleased to be a part of this important research effort. The storage requirements for genomics are staggering and the potential for medical breakthroughs even larger,” said Mark Re, senior vice president and CTO at Seagate.

    The gift, valued at $250,000, includes 1 petabyte of Seagate’s new Kinetic disk drives for object-based storage, plus an additional 1.5 petabytes of traditional Seagate SATA disk drives for use in existing clusters within the UC Santa Cruz Genomics Institute.

    Ethan Miller, professor of computer science, directs the Center for Research in Storage Systems (CRSS). (Photo by Elena Zhukova)

    Large-scale test bed

    “This gives us a large-scale test bed that we can use to explore the organization of data for large-scale disk-based storage systems. We need to develop better ways to store and organize the vast quantities of data we’re generating,” said Ethan Miller, professor of computer science and director of the Center for Research in Storage Systems (CRSS) at UCSC.

    Miller and other storage systems researchers at UC Santa Cruz work closely with industry partners such as Seagate, and several of the center’s alumni and graduate students have been working at Seagate on the company’s latest disk technology. The Seagate storage donation will support research on new ways to structure and store genomic data using object stores and newly proposed open-source standards (APIs) for genomic data that are being developed by the Global Alliance for Genomics and Health.
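
    To make "object-based storage" of genomic data concrete, here is a hedged sketch in which a Python dict stands in for a key-value device such as a Kinetic drive. The key scheme, chunk size, and sample name are invented for illustration, not taken from Seagate's actual API.

    ```python
    # Genomic chunks stored as objects under structured keys, the access
    # pattern of a key-value ("object") store. A dict stands in for the device.

    store = {}
    CHUNK = 1000  # bases per stored object (invented)

    def put_chunk(sample, chrom, start, seq):
        store[f"{sample}/{chrom}/{start}"] = seq

    def get_region(sample, chrom, start, end):
        """Concatenate the stored chunks covering [start, end)."""
        first = start - start % CHUNK
        keys = [f"{sample}/{chrom}/{pos}" for pos in range(first, end, CHUNK)]
        return "".join(store.get(k, "") for k in keys)

    put_chunk("sampleA", "chr1", 0, "ACGT" * 250)       # one 1 kb chunk
    put_chunk("sampleA", "chr1", 1000, "TTGA" * 250)    # the next chunk
    print(len(get_region("sampleA", "chr1", 0, 2000)))  # -> 2000
    ```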

    “Genomic data storage is one of several areas of emerging interest where we’ll be looking at using Seagate’s new intelligent disks to build large-scale storage systems,” Miller said.

    Genomics Institute

    The donation also adds over a petabyte of storage capacity to the genomics data storage cluster maintained by the UC Santa Cruz Genomics Institute at the San Diego Supercomputer Center. For Benedict Paten, a research scientist at the Genomics Institute, it’s all about speeding up the processing of genomic data.

    “We in genomics know that we have a big data problem,” Paten said. “We need to be able to compute on much larger volumes of data than we have before. The amount of genomic data is growing exponentially, and we haven’t been keeping up.”

    Part of the solution, he said, is distributed processing of large data sets in which the processing is done where the data are stored, instead of downloading the data over a network for processing. “Now we can put a lot of disks on the compute nodes for efficient distributed computation over large amounts of data. This donation is really important for our big data genomics efforts at UC Santa Cruz,” Paten said.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition
    The University of California, Santa Cruz, opened in 1965 and grew, one college at a time, to its current (2008-09) enrollment of more than 16,000 students. Undergraduates pursue more than 60 majors supervised by divisional deans of humanities, physical & biological sciences, social sciences, and arts. Graduate students work toward graduate certificates, master’s degrees, or doctoral degrees in more than 30 academic fields under the supervision of the divisional and graduate deans. The dean of the Jack Baskin School of Engineering oversees the campus’s undergraduate and graduate engineering programs.

     
  • richardmitnick 11:02 am on July 23, 2015 Permalink | Reply
    Tags: Genomics

    From UCSC: “Keck Foundation awards UC Santa Cruz $2 million for human genome variation project” 

    UC Santa Cruz

    July 22, 2015
    Tim Stephens

    The UC Santa Cruz Genomics Institute has received a $2 million grant from the W. M. Keck Foundation for ongoing research to develop a comprehensive map of human genetic variation. The Human Genome Variation Map will be a valuable new resource for medical researchers, as well as for basic research on human evolution and diversity.

    Human Genome Variation Map

    The Keck grant provides funding over two years for UC Santa Cruz researchers to create a full-scale map, building on the results of a one-year pilot project funded by the Simons Foundation.

    “We’ve been experimenting with pilot regions of the genome and evaluating a variety of methods. The next steps will be to take it from a prototype to a full-scale genome reference that we can release to the community,” said Benedict Paten, a research scientist at the Genomics Institute and co-principal investigator of the project.

    Benedict Paten (Photo by Summer Stiegman)

    The Human Genome Variation Map is needed to overcome the limitations of using a single reference sequence for the human genome. Currently, new data from sequencing human genomes is analyzed by mapping the new sequences to one reference set of 24 human chromosomes to identify variants. But this approach leads to biases and mapping ambiguities, and some variants simply cannot be described with respect to the reference genome, according to David Haussler, distinguished professor of biomolecular engineering and scientific director of the Genomics Institute at UC Santa Cruz.

    Global Alliance

    Haussler and Paten are coordinating their work on the new map with the Global Alliance for Genomics and Health (GA4GH), which involves more than 300 collaborating institutions that have agreed to work together to enable secure sharing of genomic and clinical data. The overall vision of the global alliance includes a genomics platform based on something akin to the planned Human Genome Variation Map, along with open-source software tools to enable researchers to mine the data for new scientific and medical breakthroughs. In the long run, the map will be used to identify genomic variants encountered in precision medical care as well, Haussler said.

    The UCSC team has been collaborating with leading genomics researchers at other institutions to develop the map, which Paten began working on in 2014 as co-chair of the GA4GH Reference Variation Task Team. The new Human Genome Variation Map will replace the current assortment of isolated, incompatible databases of human genetic variation with a single, fundamental representation formalized as a very large mathematical graph. The clean mathematical formulation is a major strength of this new approach, Paten said.

    The primary reference genome is a linear sequence of DNA bases (represented by the letters A, C, T, and G). To build the Human Genome Variation Map, each new genome will be merged into the reference genome at the points where it matches the primary sequence, with variations appearing as additional alternate paths in the map.
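
    A minimal sketch of that merge idea: a variant becomes an alternate branch through a small sequence graph rather than an edit against one linear string. The node IDs, sequences, and path enumeration below are invented for illustration.

    ```python
    # Tiny variation graph: nodes hold sequence, edges define possible paths,
    # and a single-base variant appears as two alternate branch nodes (2 and 3).

    nodes = {1: "ACGT", 2: "A", 3: "G", 4: "TTGC"}
    edges = {(1, 2), (1, 3), (2, 4), (3, 4)}

    def paths(start, end, prefix=()):
        """Enumerate every node path from start to end."""
        prefix = prefix + (start,)
        if start == end:
            yield prefix
            return
        for a, b in edges:
            if a == start:
                yield from paths(b, end, prefix)

    for p in paths(1, 4):
        print("".join(nodes[n] for n in p))  # ACGTATTGC and ACGTGTTGC
    ```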

    Mathematical structure

    This mathematical graph-based structure will augment the existing human reference genome with all common human variations, providing a means to name, identify, and analyze variations precisely and reproducibly. “The original human reference genome project gave us a detailed picture of one human genome. This map will give us a detailed picture of the world’s variety of human genomes,” Paten said.

    In the spirit of the original human genome project, the Human Genome Variation Map will be publicly and freely available to all. Haussler’s team at UC Santa Cruz made the first human genome sequence publicly available on the Internet 15 years ago. This new project has many parallels with that earlier work, in which UCSC genomics researchers assembled and posted the first human genome sequence and went on to create the widely used UCSC Genome Browser.

    “This is an infrastructure project for genomics that everyone agrees is important,” Paten said. “It is ambitious, and it requires a fundamental shift from thinking of the reference as one sequence to thinking of it as this structure that incorporates all variation. But now is the time to do it. We need to build a model that works, and make it easy enough to use to get community acceptance.”

    The UC Santa Cruz Genomics Institute is a fundraising priority of the $300-million Campaign for UC Santa Cruz.

    W. M. Keck Foundation

    Based in Los Angeles, the W. M. Keck Foundation was established in 1954 by the late W. M. Keck, founder of the Superior Oil Company. The Foundation’s grant making is focused primarily on pioneering efforts in the areas of medical, science and engineering research. The Foundation also maintains an undergraduate education program that promotes distinctive learning and research experiences for students in the sciences and in the liberal arts, and a Southern California Grant Program that provides support for the Los Angeles community, with a special emphasis on children and youth from low-income families, special needs populations and safety-net services. For more information, please visit www.wmkeck.org.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition
    The University of California, Santa Cruz, opened in 1965 and grew, one college at a time, to its current (2008-09) enrollment of more than 16,000 students. Undergraduates pursue more than 60 majors supervised by divisional deans of humanities, physical & biological sciences, social sciences, and arts. Graduate students work toward graduate certificates, master’s degrees, or doctoral degrees in more than 30 academic fields under the supervision of the divisional and graduate deans. The dean of the Jack Baskin School of Engineering oversees the campus’s undergraduate and graduate engineering programs.

     
  • richardmitnick 7:50 am on May 2, 2015 Permalink | Reply
    Tags: Genomics

    From Princeton: “Digging for Meaning in the Big Data of Human Biology” 

    Princeton University

    April 28, 2015
    No Writer Credit

    Since the Human Genome Project drafted the human body’s genetic blueprint more than a decade ago, researchers around the world have generated a deluge of information related to genes and the role they play in diseases like hypertension, diabetes, and various cancers.

    Although thousands of studies have made discoveries that promise a healthier future, crucial questions remain. An especially vexing challenge has been to identify the function of genes in specific cells, tissues, and organs. Because tissues in living people cannot be studied by direct experimentation, and many disease-relevant cell types cannot be isolated for analysis, the data have emerged in bits and pieces through studies that produced mountains of disparate signals.

    A multi-year effort by researchers from Princeton and other universities and medical schools has taken a big step toward extracting knowledge from these big data collections and opening the door to new understanding of human illnesses. Their paper, published online by the prestigious biology journal Nature Genetics, demonstrates how computer science and statistical methods can comb broad expanses of diverse data to identify how genetic circuits function and change in different tissues relevant to disease.

    Led by Olga Troyanskaya, professor in the Department of Computer Science and the Lewis-Sigler Institute of Integrative Genomics and deputy director for genomics at the Simons Center for Data Analysis in New York, the team used integrative computational analysis to dig out interconnections and relationships buried in the data pile. The study collected and integrated about 38,000 genome-wide experiments from an estimated 14,000 publications. Their findings produced molecular-level functional maps for 144 different human tissues and cell types, including many that are difficult or impossible to uncover experimentally.

    “A key challenge in human biology is that genetic circuits in human tissues and cell types are very difficult to study experimentally,” Troyanskaya said. “For example, the podocyte cells in the kidneys, which are the cells that perform the filtering that the kidneys are responsible for, cannot be isolated and studied experimentally. Yet we must understand how proteins interact in these cells if we want to understand and treat chronic kidney disease. Our approach mines big data collections to build a map of how genetic circuits function in the podocyte cells, as well as in many other disease-relevant tissues and cell types.”

    These networks allow biomedical researchers to understand the function and interactions of genes in specific cellular contexts and can illuminate the molecular basis of many complex human diseases. The researchers developed an algorithm, which they call a network-guided association study, or NetWAS, that combines these tissue-specific functional maps with standard genome-wide association studies (GWAS) in order to identify genes that are causal drivers of human disease. Because the technique is completely data-driven, NetWAS avoids biases toward well-studied genes and diseases — enabling discovery of completely new disease-associated genes, processes, and pathways.
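
    The NetWAS recipe as described — genes with nominally significant GWAS p-values as positive labels, a gene's tissue-network connectivity as its features, and a classifier to re-rank every gene — can be sketched roughly as below. The data are random stand-ins, the linear SVM is one plausible classifier choice rather than the paper's exact pipeline, and scikit-learn and NumPy are assumed.

    ```python
    # Rough sketch of a network-guided association study (NetWAS-style rerank).
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n_genes = 500
    network = rng.random((n_genes, n_genes))  # tissue-network edge weights
    gwas_p = rng.random(n_genes)              # per-gene GWAS p-values

    labels = (gwas_p < 0.05).astype(int)      # nominally significant = positive

    clf = SVC(kernel="linear", probability=True)
    clf.fit(network, labels)

    # rerank all genes by classifier confidence rather than raw p-value
    scores = clf.predict_proba(network)[:, 1]
    print(np.argsort(scores)[::-1][:10])      # indices of the top 10 genes
    ```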

    To put NetWAS and the tissue-specific networks in the hands of biomedical researchers around the world, the team created an interactive server called GIANT (for Genome-scale Integrated Analysis of Networks in Tissues). GIANT allows users to explore these networks, compare how genetic circuits change across tissues, and analyze data from genetic studies to find genes that cause disease.

    Aaron K. Wong, a data scientist at the Simons Center for Data Analysis and formerly a graduate student in the computer science department at Princeton, played the lead role in creating GIANT. “Our goal was to develop a resource that was accessible to biomedical researchers,” he said. “For example, with GIANT, researchers studying Parkinson’s disease can search the substantia nigra network, which represents the brain region affected by Parkinson’s, to identify new genes and pathways involved in the disease.” Wong is one of three co-first authors of the paper.

    The paper’s other two co-first authors are Arjun Krishnan, a postdoctoral fellow at the Lewis-Sigler Institute; and Casey Greene, an assistant professor of genetics at Dartmouth College, who was a postdoctoral fellow at Lewis-Sigler from 2009 to 2012. The team also included Ran Zhang, a graduate student in Princeton’s Department of Molecular Biology, and Kara Dolinski, assistant director of the Lewis-Sigler Institute.

    Looking to the future, Troyanskaya sees practical therapeutic uses for the group’s findings about the interrelatedness of genetic actions. “Biomedical researchers can use these networks and the pathways that they uncover to understand drug action and side effects, and to repurpose drugs,” she said. “They can also be useful for understanding how various therapies work and how to develop new ones.”

    Other contributors to the study were Emanuela Ricciotti, Garret A. FitzGerald, and Tilo Grosser of the Department of Pharmacology and the Institute for Translational Medicine and Therapeutics at the Perelman School of Medicine, University of Pennsylvania; Rene A. Zelaya, of Dartmouth; Daniel S. Himmelstein, of the University of California, San Francisco; Boris M. Hartmann, Elena Zaslavsky, and Stuart C. Sealfon, of the Department of Neurology at the Icahn School of Medicine at Mount Sinai, in New York; and Daniel I. Chasman, of Brigham and Women’s Hospital and Harvard Medical School in Boston.

    The Simons Center for Data Analysis was formed in 2013 by the Simons Foundation, a private organization dedicated to advancing research in mathematics and the basic sciences.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition
    Princeton University Campus

    About Princeton: Overview

    Princeton University is a vibrant community of scholarship and learning that stands in the nation’s service and in the service of all nations. Chartered in 1746, Princeton is the fourth-oldest college in the United States. Princeton is an independent, coeducational, nondenominational institution that provides undergraduate and graduate instruction in the humanities, social sciences, natural sciences and engineering.

    As a world-renowned research university, Princeton seeks to achieve the highest levels of distinction in the discovery and transmission of knowledge and understanding. At the same time, Princeton is distinctive among research universities in its commitment to undergraduate teaching.

    Today, more than 1,100 faculty members instruct approximately 5,200 undergraduate students and 2,600 graduate students. The University’s generous financial aid program ensures that talented students from all economic backgrounds can afford a Princeton education.

     
  • richardmitnick 4:04 pm on April 16, 2015 Permalink | Reply
    Tags: Genomics

    From Quanta: “How Structure Arose in the Primordial Soup” 

    Quanta Magazine

    Life’s first epoch saw incredible advances — cells, metabolism and DNA, to name a few. Researchers are resurrecting ancient proteins to illuminate the biological dark ages.

    April 16, 2015
    Emily Singer

    Olena Shmahalo/Quanta Magazine

    About 4 billion years ago, molecules began to make copies of themselves, an event that marked the beginning of life on Earth. A few hundred million years later, primitive organisms began to split into the different branches that make up the tree of life. In between those two seminal events, some of the greatest innovations in existence emerged: the cell, the genetic code and an energy system to fuel it all. All three of these are essential to life as we know it, yet scientists know disappointingly little about how any of these remarkable biological innovations came about.

    “It’s very hard to infer even the relative ordering of evolutionary events before the last common ancestor,” said Greg Fournier, a geobiologist at the Massachusetts Institute of Technology. Cells may have appeared before energy metabolism, or perhaps it was the other way around. Without fossils or DNA preserved from organisms living during this period, scientists have had little data to work from.

    Fournier is leading an attempt to reconstruct the history of life in those evolutionary dark ages — the hundreds of millions of years between the time when life first emerged and when it split into what would become the endless tangle of existence.

    He is using genomic data from living organisms to infer the DNA sequence of ancient genes as part of a growing field known as paleogenomics. In research published online in March in the Journal of Molecular Evolution, Fournier showed that the last chemical letter added to the code was a molecule called tryptophan — an amino acid most famous for its presence in turkey dinners. The work supports the idea that the genetic code evolved gradually.

    Using similar methods, he hopes to decipher the temporal order of more of the code — determining when each letter was added to the genetic alphabet — and to date key events in the origins of life, such as the emergence of cells.

    Dark Origins

    Life emerged so long ago that even the rock formations covering the planet at that time have been destroyed — and with them, most chemical and geological clues to early evolution. “There’s a huge chasm between the origins of life and the last common ancestor,” said Eric Gaucher, a biologist at the Georgia Institute of Technology in Atlanta.

    The stretch of time between the origins of life and the last universal common ancestor saw a series of remarkable innovations — the origins of cells, metabolism and the genetic code. But scientists know little about when they happened or the order in which they occurred. Olena Shmahalo/Quanta Magazine

    Scientists do know that at some point in that time span, living creatures began using a genetic code, a blueprint for making complex proteins. It is those proteins that carry out the vital functions of the cell. (The structure of DNA and RNA also enables genetic information to be replicated and passed on from generation to generation, but that’s a separate process from the creation of proteins.) The components of the code and the molecular machinery that assembles them “are some of the oldest and most universal aspects of cells, and biologists are very interested in understanding the mechanisms by which they evolved,” said Paul Higgs, a biophysicist at McMaster University in Hamilton, Ontario.
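
    The "blueprint" is concrete enough to sketch in code. Below is a minimal Python toy that translates an RNA snippet into protein using five real entries from the standard codon table (the full table has 64 codons; the input string is invented for illustration):

```python
# Five real entries from the standard genetic code; the full table has 64.
# UGG is tryptophan's only codon, one hint of its late arrival in the code.
CODON_TABLE = {"AUG": "M", "UGG": "W", "UAU": "Y", "GGC": "G", "UAA": "*"}

def translate(rna):
    """Read an RNA string three letters at a time and build a protein."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE.get(rna[i:i + 3], "?")
        if amino_acid == "*":  # a stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGUGGUAUGGCUAA"))  # -> "MWYG"
```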

    How the code came into being presents a chicken-and-egg problem. The key players in the code — DNA, RNA, amino acids, and proteins — are chemically complicated structures that work together to make proteins. But in modern cells, proteins are used to make the components of the code. So how did a highly structured code emerge?

    Most researchers believe that the code began simply with basic proteins made from a limited alphabet of amino acids. It then grew in complexity over time, as these proteins learned to make more sophisticated molecules. Eventually, it developed into a code capable of creating all the diversity we see today. “It’s long been hypothesized that life’s ‘standard alphabet’ of 20 amino acids evolved from a simpler, earlier alphabet, much as the English alphabet has accumulated extra letters over its history,” said Stephen Freeland, a biologist at the University of Maryland, Baltimore County.

    The earliest amino acid letters in the code were likely the simplest in structure, those that can be made by purely chemical means, without the assistance of a protein helper. (For example, the amino acids glycine, alanine and glutamic acid have been found on meteorites, suggesting they can form spontaneously in a variety of environments.) These are like the letters A, E and S — primordial units that served as the foundation for what came later.

    Tryptophan, in comparison, has a complex structure and is comparatively rare in the protein code, like a Y or Z, leading scientists to theorize that it was one of the latest additions to the code.

    That chemical evidence is compelling, but circumstantial. Enter Fournier. He suspected that by extending his work on paleogenomics, he would be able to prove tryptophan’s status as the last letter added to the code.

    The Last Letter

    Scientists have been reconstructing ancient proteins for more than a decade, primarily to figure out how ancient proteins differed from modern ones — what they looked like and how they functioned. But these efforts have focused on the period of evolution after the last universal common ancestor (or LUCA, as researchers call it). Fournier’s work delves further back than any previous effort. To do so, he had to move beyond the standard application of comparative genomics, which analyzes the differences between branches on the tree of life. “By definition, anything pre-LUCA lies beyond the deepest split in the tree,” he said.

    Fournier started with two related proteins, TrpRS (tryptophanyl tRNA synthetase) and TyrRS (tyrosyl tRNA synthetase), which help decode RNA letters into the amino acids tryptophan and tyrosine. TrpRS and TyrRS are more closely related to each other than to any other protein, indicating that they evolved from the same ancestor protein. Sometime before LUCA, that parent protein mutated slightly to produce these two new proteins with distinct functions. Fournier used computational techniques to decipher what that ancestral protein must have looked like.
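
    Real ancestral-sequence work of this kind fits probabilistic substitution models over a phylogenetic tree; the Python toy below is only a cartoon of the underlying idea, guessing an ancestor by majority vote at each column of an alignment. The "homolog" sequences are invented stand-ins, not real TrpRS or TyrRS fragments:

```python
# A deliberately naive stand-in for ancestral sequence reconstruction:
# take aligned descendant sequences and vote column by column.
from collections import Counter

def consensus_ancestor(aligned_seqs):
    """Naive per-column majority vote over equal-length aligned sequences."""
    assert len({len(s) for s in aligned_seqs}) == 1, "sequences must be aligned"
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_seqs)
    )

# Hypothetical aligned fragments standing in for related synthetases.
homologs = [
    "MKT-LLAVG",
    "MKTALLSVG",
    "MRTALLAVG",
]
print(consensus_ancestor(homologs))  # -> "MKTALLAVG"
```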

    4
    Greg Fournier, a geobiologist at MIT, is searching for the origins of the genetic code. Helen Hill

    He found that the ancestral protein had all the amino acids except tryptophan, suggesting that its addition was the finishing touch to the genetic code. “It shows convincingly that tryptophan was the last amino acid added, as has been speculated before but not really nailed as has been done here,” said Nigel Goldenfeld, a physicist at the University of Illinois, Urbana-Champaign, who was not involved in the study.

    Fournier now plans to use tryptophan as a marker to date other major pre-LUCA events such as the evolution of metabolism, cells and cell division, and the mechanisms of inheritance. These three processes form a sort of biological triumvirate that laid the foundation for life as we know it today. But we know little about how they came into existence. “If we understand the order of those basic steps, it creates an arrow pointing to possible scenarios for the origins of life,” Fournier said.

    For example, if the ancestral proteins involved in metabolism lack tryptophan, some form of metabolism probably evolved early. If proteins that direct cell division are studded with tryptophan, it suggests those proteins evolved comparatively late.
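
    That reasoning can be caricatured in a few lines of Python. The sketch below scores a reconstructed protein by its tryptophan content against the roughly 1 percent frequency typical of modern proteins; the threshold, labels and test sequence are all illustrative assumptions, not values from the study:

```python
MODERN_TRP_FREQ = 0.011  # tryptophan is roughly 1 percent of modern residues

def trp_signal(ancestral_seq):
    """Crudely classify a reconstructed protein by its tryptophan content."""
    freq = ancestral_seq.count("W") / len(ancestral_seq)
    if freq == 0:
        return "no tryptophan: family may predate the completed code"
    if freq >= MODERN_TRP_FREQ:
        return "tryptophan-rich: family likely arose after the code was complete"
    return "sparse tryptophan: ambiguous"

print(trp_signal("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))  # no W -> early signal
```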

    Different models for the origins of life make different predictions for which of these three processes came first. Fournier hopes his approach will provide a way to rule out some of these models. However, he cautions that it won’t definitively sort out the timing of these events.

    Fournier plans to use the same techniques to figure out the order in which other amino acids were added to the code. “It really reinforces the idea that evolution of the code itself was a progressive process,” said Paul Schimmel, a professor of molecular and cell biology at the Scripps Research Institute, who was not involved in the study. “It speaks to the refinement and subtlety that nature was using to perfect these proteins and the diversity it needed to form this vast tree of life.”

    See the full article here.

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Formerly known as Simons Science News, Quanta Magazine is an editorially independent online publication launched by the Simons Foundation to enhance public understanding of science. Why Quanta? Albert Einstein called photons “quanta of light.” Our goal is to “illuminate science.” At Quanta Magazine, scientific accuracy is every bit as important as telling a good story. All of our articles are meticulously researched, reported, edited, copy-edited and fact-checked.

     
  • richardmitnick 11:42 am on March 8, 2015 Permalink | Reply
    Tags: , , , Genomics,   

    From NYT: “Is Most of Our DNA Garbage?” 

    The New York Times

    MARCH 5, 2015
    CARL ZIMMER

    T. Ryan Gregory’s lab at the University of Guelph in Ontario is a sort of genomic menagerie, stocked with creatures, living and dead, waiting to have their DNA laid bare. Scorpions lurk in their terrariums. Tarantulas doze under bowls. Flash-frozen spiders and crustaceans — collected by Gregory, an evolutionary biologist, and his students on expeditions to the Arctic — lie piled in beige metal tanks of liquid nitrogen. A bank of standing freezers holds samples of mollusks, moths and beetles. The cabinets are crammed with slides splashed with the fuchsia-stained genomes of fruit bats, Siamese fighting fish and ostriches.

    1
    Moths in the lab of T. Ryan Gregory at the University of Guelph. Credit Jamie Campbell for The New York Times

    Gregory’s investigations into all these genomes have taught him a big lesson about life: At its most fundamental level, it’s a mess. His favorite way to demonstrate this is through what he calls the “onion test,” which involves comparing the size of an onion’s genome to that of a human. To run the test, Gregory’s graduate student Nick Jeffery brought a young onion plant to the lab from the university greenhouse. He handed me a single-edged safety razor, and then the two of us chopped up onion stems in petri dishes. An emerald ooze, weirdly luminous, filled my dish. I was so distracted by the color that I slashed my ring finger with the razor blade, but that saved me the trouble of poking myself with a syringe — I was to supply the human genome. Jeffery raised a vial, and I wiped my bleeding finger across its rim. We poured the onion juice into the vial as well and watched as the green and red combined to produce a fluid with both the tint and viscosity of maple syrup.

    3
    T. Ryan Gregory in his lab at University of Guelph. Credit Jamie Campbell for The New York Times

    After adding a fluorescent dye that attaches to DNA, Jeffery loaded the vial into a boxy device called a flow cytometer, which sprayed the onion juice and blood through a laser beam. Each time a cell was hit, its DNA gave off a bluish glow; bigger genomes glowed more brightly. On a monitor, we watched the data accumulate on a graph. The cells produced two distinct glows, one dim, one bright, which registered on the graph as a pair of peaks.

    One peak represented my genome, or the entirety of my DNA. Genomes are like biological books, written in genetic letters known as bases; the human genome contains about 3.2 billion bases. Print them out as letters on a page, and they would fill a book a thousand times longer than “War and Peace.” Gregory leaned toward the screen. At 39, with a chestnut-colored goatee and an intense gaze, he somewhat resembles a pre-Heisenberg Walter White. He pointed out the onion’s peak. It showed that the onion’s genome was five times bigger than mine.

    “The onion wins,” Gregory said. The onion always does.
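
    The arithmetic behind the verdict is straightforward: fluorescence scales with DNA content, so the unknown genome's size is the reference size multiplied by the ratio of the two peaks. A minimal sketch, with invented peak intensities standing in for the cytometer's readout:

```python
HUMAN_GENOME_BP = 3.2e9  # reference genome, about 3.2 billion bases

def genome_size(peak_unknown, peak_reference, reference_bp=HUMAN_GENOME_BP):
    """Scale the reference genome size by the ratio of fluorescence peaks."""
    return reference_bp * (peak_unknown / peak_reference)

# Hypothetical peak values: the onion peak glows about five times brighter.
onion_bp = genome_size(peak_unknown=500.0, peak_reference=100.0)
print(f"onion: ~{onion_bp / 1e9:.0f} billion bases")  # -> ~16 billion
```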

    But why? Why does an onion carry around so much more genetic material than a human? Or why, for that matter, do the broad-footed salamander (65.5 billion bases), the African lungfish (132 billion) and the Paris japonica flower (149 billion)? These organisms don’t appear to be more complex than we are, so Gregory rejects the idea that they’re accomplishing more with all their extra DNA. Instead, he champions an idea first developed in the 1970s but still startling today: that the size of an animal’s or plant’s genome has essentially no relationship to its complexity, because a vast majority of its DNA is — to put it bluntly — junk.

    The human genome contains around 20,000 genes, that is, the stretches of DNA that encode proteins. But these genes account for only about 1.2 percent of the total genome. The other 98.8 percent is known as noncoding DNA. Gregory believes that while some noncoding DNA is essential, most probably does nothing for us at all, and until recently, most biologists agreed with him. Surveying the genome with the best tools at their disposal, they believed that only a small portion of noncoding DNA showed any evidence of having any function.
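
    Those figures are easy to sanity-check with round numbers (a 3.2-billion-base genome, 20,000 genes, 1.2 percent coding), which imply roughly two kilobases of protein-coding sequence per gene, about right for a typical coding region:

```python
genome_bp = 3.2e9        # total human genome
genes = 20_000           # protein-coding genes
coding_fraction = 0.012  # the "1.2 percent" figure

coding_bp = genome_bp * coding_fraction   # ~38 million coding bases in total
avg_coding_per_gene = coding_bp / genes   # ~1,900 coding bases per gene
print(f"{coding_bp / 1e6:.0f} Mb coding, ~{avg_coding_per_gene:.0f} bp per gene")
```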

    But in the past few years, the tide has shifted within the field. Recent studies have revealed a wealth of new pieces of noncoding DNA that do seem to be as important to our survival as our more familiar genes. Many of them may encode molecules that help guide our development from a fertilized egg to a healthy adult, for example. If these pieces of noncoding DNA become damaged, we may suffer devastating consequences like brain damage or cancer, depending on what pieces are affected. Large-scale surveys of the genome have led a number of researchers to expect that the human genome will turn out to be even more full of activity than previously thought.

    In January, Francis Collins, the director of the National Institutes of Health, made a comment that revealed just how far the consensus has moved. At a health care conference in San Francisco, an audience member asked him about junk DNA. “We don’t use that term anymore,” Collins replied. “It was pretty much a case of hubris to imagine that we could dispense with any part of the genome — as if we knew enough to say it wasn’t functional.” Most of the DNA that scientists once thought was just taking up space in the genome, Collins said, “turns out to be doing stuff.”

    For Gregory and a group of like-minded biologists, this idea is not just preposterous but also perilous, something that could yield bad science. The turn against the notion of junk DNA, they argue, is based on overinterpretations of wispy evidence and a willful ignorance of years of solid research on the genome. They’ve challenged their opponents face to face at scientific meetings. They’ve written detailed critiques in biology journals. They’ve commented on social media. When the N.I.H.’s official Twitter account relayed Collins’s claim about not using the term “junk DNA” anymore, Michael Eisen, a professor at the University of California, Berkeley, tweeted back with a profanity.

    The junk DNA wars are being waged at the frontiers of biology, but they’re really just the latest skirmish in an intellectual struggle that has played out over the past 200 years. Before Charles Darwin articulated his theory of evolution, most naturalists saw phenomena in nature, from an orchid’s petal to the hook of a vulture’s beak, as things literally designed by God. After Darwin, they began to see them as designs produced, instead, by natural selection. But some of our greatest biologists pushed back against the idea that everything we discover in an organism had to be an exquisite adaptation. To these biologists, a fully efficient genome would be inconsistent with the arbitrariness of our genesis, with the fact that every species emerged through pure happenstance, over eons of false starts. Where some look at all those billions of bases and see a finely tuned machine, others, like Gregory, see a disorganized, glorious mess.

    In 1953, Francis Crick and James Watson published a short paper in the journal Nature setting out the double-helix structure of DNA. That brief note sent biologists into a frenzy of discovery, leading eventually to multiple Nobel Prizes and to an unprecedented depth of understanding about how living things grow and reproduce. To make a protein from DNA, they learned, a cell makes a single-stranded copy of the relevant gene, using a molecule called RNA. It then builds a corresponding protein using the RNA as a guide.

    This research led scientists to assume that the genome was mostly made up of protein-coding DNA. But eventually scientists found this assumption hard to square with reality. In 1964, the German biologist Friedrich Vogel did a rough calculation of how many genes a typical human must carry. Scientists had already discovered how big the human genome was by staining the DNA in cells, looking at the cells through microscopes and measuring its size. If the human genome was made of nothing but genes, Vogel found, it would need to have an awful lot of them — 6.7 million genes by his estimate, a number that, when he published it in Nature, he admitted was “disturbingly high.” There was no evidence that our cells made 6.7 million proteins or anything close to that figure.
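
    Vogel's reasoning can be replayed in two lines: divide the measured genome size by an assumed average gene length. The gene length below is a guess chosen to reproduce his order of magnitude, not his published parameter:

```python
genome_bp = 3.2e9   # measured size of the human genome
avg_gene_bp = 480   # assumed gene length: ~160 codons, a smallish protein

naive_gene_count = genome_bp / avg_gene_bp
print(f"~{naive_gene_count / 1e6:.1f} million genes")  # -> ~6.7 million
```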

    Vogel speculated that a lot of the genome was made up of essential noncoding DNA — possibly operating as something like switches, for example, to turn genes on and off. But other scientists recognized that even this idea couldn’t make sense mathematically. On average, each baby is born with roughly 100 new mutations. If every piece of the genome were essential, then many of those mutations would lead to significant birth defects, with the defects only multiplying over the course of generations; in less than a century, the species would become extinct.
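
    The back-of-the-envelope version of that argument: with roughly 100 new mutations per birth, the expected number landing in functional DNA is simply 100 times the functional fraction. (A real mutational-load calculation would weigh selection coefficients; this sketch deliberately ignores them.)

```python
new_mutations_per_birth = 100  # rough average per newborn

# If nearly the whole genome mattered, every child would take dozens of
# hits to functional DNA; at ~1 percent functional, only about one.
for functional_fraction in (1.0, 0.5, 0.1, 0.01):
    hits = new_mutations_per_birth * functional_fraction
    print(f"functional fraction {functional_fraction:>4}: ~{hits:.0f} hits to functional DNA")
```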

    4
    Cells are gathered from spiders for DNA studies at the lab of T. Ryan Gregory at the University of Guelph. Credit Jamie Campbell for The New York Times

    Faced with this paradox, Crick and other scientists developed a new vision of the genome during the 1970s. Instead of being overwhelmingly packed with coding DNA, the genome was made up mostly of noncoding DNA. And, what’s more, most of that noncoding DNA was junk — that is, pieces of DNA that do nothing for us. These biologists argued that some pieces of junk started out as genes, but were later disabled by mutations. Other pieces, called transposable elements, were like parasites, simply making new copies of themselves that were usually inserted harmlessly back in the genome.

    The recognition of junk DNA was part of a bigger trend in biology at the time. A number of scientists were questioning the assumption that biological systems are invariably “well designed” by evolution. In a 1979 paper in The Proceedings of the Royal Society of London, Stephen Jay Gould and Richard Lewontin, both of Harvard, groused that too many scientists indulged in breezy storytelling to explain every trait, from antlers to jealousy, as an adaptation honed by natural selection for some essential function. Gould and Lewontin referred to this habit as the Panglossian paradigm, a reference to Voltaire’s “Candide,” in which the foolish Professor Pangloss keeps insisting, in the face of death and disaster, that we live in “the best of all possible worlds.” Gould and Lewontin did not deny that natural selection was a powerful force, but they stressed that it was not the only explanation for why species are the way they are. Male nipples are not adaptations, for example; they’re just along for the ride.

    Gould and Lewontin called instead for a broader vision of evolution, with room for other forces, for flukes and historical contingencies, for processes unfolding at different levels of life — what Gould often called “pluralism.” At the time, geneticists were getting their first glimpses of the molecular secrets of the human genome, and Gould and Lewontin saw more evidence for pluralism and against the Panglosses. Any two people may have millions of differences in their genomes. Most of those differences aren’t a result of natural selection’s guiding force; they just arise through random mutations, without any effect for good or ill.

    When Crick and others began to argue for junk DNA, they were guided by a similar vision of nature as slipshod. Just as male nipples are a useless vestige of evolution, so, in their theory, is a majority of our genome. Far from the height of machine-like perfection, the genome is largely a palimpsest of worthless instructions, a den of harmless parasites. Crick and his colleagues argued that transposable elements were common in our genome not because they did something essential for us, but because they could exploit us for their own replication. Gould delighted at this good intellectual company, arguing that transposable elements behaved like miniature organisms, evolving to become better at adding new copies to their host genomes. Our genomes were their ocean, their savanna. “They are merely playing Darwin’s game, but at the ‘wrong level,’ ” Gould wrote in 1981.

    Soon after Gould wrote those words, scientists set out to decipher the precise sequence of the entire human genome. It wasn’t until 2001, shortly before Gould’s death, that they published their first draft. They identified thousands of segments that had the hallmarks of dead genes. They found transposable elements by the millions. The Human Genome Project team declared that our DNA consisted of isolated oases of protein-coding genes surrounded by “vast expanses of unpopulated desert where only noncoding ‘junk’ DNA can be found.” Junk DNA had started out as a theoretical argument, but now the messiness of our evolution was laid bare for all to see.

    If you want to see the genome in a fundamentally different way, the best place to go is the third floor of Harvard’s Department of Stem Cell and Regenerative Biology, in a maze of cluttered benches, sequencing machines and microscopes. This is the lab of John Rinn, a 38-year-old former competitive snowboarder who likes to ponder biological questions on top of a skateboard, which he rides from one wall of his office to the other and back. Rinn is overseeing more than a dozen research projects looking for pieces of noncoding DNA that might once have been classified as junk but actually are essential for life.

    5
    John Rinn in his lab at Harvard. Credit Jamie Campbell for The New York Times

    Rinn studies RNA, but not the RNA that our cells use as a template for making proteins. Scientists have long known that the human genome contains some genes for other types of RNA: strands of bases that carry out other jobs in the cell, like helping to weld together the building blocks of proteins. In the early 2000s, Rinn and other scientists discovered that human cells were reading thousands of segments of their DNA, not just the coding parts, and producing RNA molecules in the process. They wondered whether these RNA molecules could be serving some vital function.

    As a postdoctoral fellow at Stanford University, Rinn decided he would try to show that one of these new RNA molecules had some important role. After a couple years of searching, he and a professor there, Howard Chang, settled on an RNA molecule that, somewhat bizarrely, was produced widely by skin cells below the waist but not above. Rinn and Chang were well aware that this pattern might be meaningless, but they set out to investigate it nevertheless. They had to give their enigmatic molecule a name, so they picked one that was a joke at their own expense: hotair. (“If it ends up being hot air, at least we tried,” Rinn said.)

    Rinn ran a series of experiments on skin cells to figure out what, if anything, hotair was doing. He carefully pulled hotair molecules out of the cells and examined them to see if they had attached to any other molecules. They had, in fact: they were stuck to a protein called Polycomb.

    Polycomb belongs to a group of proteins that are essential to the development of animals from a fertilized egg. They turn genes on and off in different patterns, so that a uniform clump of cells can give rise to bone, muscle and brain. Polycomb latches onto a number of genes and muzzles them, preventing them from making proteins. Rinn’s research revealed that hotair acts as a kind of guide for Polycomb, attaching to it and escorting it through the jungle of the cell to the precise spots on our DNA where it needs to silence genes.

    When Rinn announced this result in 2007, other geneticists were stunned. Cell, the journal that released it, hailed it as a breakthrough, calling Rinn’s paper one of the most important they had ever published. In the years since, Chang and other researchers have continued to examine hotair, using even more sophisticated tools. They bred engineered mice that lack the hotair gene, for example, and found that the mice developed a constellation of deformities, like stunted wrists and jumbled vertebrae. It appears very likely that hotair performs important jobs throughout the body, not just in the skin but in the skeleton and in other tissues too.

    In 2008, having been lured to Harvard, Rinn set up his new lab entirely in hopes of finding more hotair-like molecules. The first day I visited, a research associate named Diana Sanchez was dissecting mouse embryos the size of pinto beans. In a bowl of ice next to her were tubes for the parts she delicately removed — liver, leg, kidney, lung — that would be searched for cells making RNA molecules. After Rinn and I left Sanchez to her dissections, we ran into Martin Sauvageau, a blue-eyed Quebecer carrying a case of slides, each affixed with a slice of a mouse’s brain, with stains revealing cells making different RNA molecules. I tagged along with Sauvageau as he headed to a darkened microscope room to look at the slides with a pink-haired grad student named Abbie Groff. On one slide, a mouse’s brain looked as if it wore a cerulean mustache. To Groff, every pattern comes as a surprise. She once discovered an RNA molecule that created thousands of tiny rings on a mouse’s body, each encircling a hair follicle. “You come in in the morning, and it’s like Christmas,” she said.

    In December 2013, Rinn and his colleagues published the first results of their search: three potential new genes for RNA that appear to be essential for a mouse’s survival. To investigate each potential gene, the scientists removed one of the two copies in mice. When the mice mated, some of their embryos ended up with two copies of the gene, some with one and some with none. If these mice lacked any of these three pieces of DNA, they died in utero or shortly after birth. “You take away a piece of junk DNA, and the mouse dies,” Rinn said. “If you can come up with a criticism of that, go ahead. But I’m pretty satisfied. I’ve found a new piece of the genome that’s required for life.”

    As the scientists find new RNA molecules that look to be important, they are picking out a few to examine in close molecular detail. “I’m totally in love with this one,” Rinn said, standing at a whiteboard wall and drawing a looping line to illustrate yet another RNA molecule, one that he calls “firre.” The experiments that Rinn’s team has run on firre suggest that it performs a spectacular lasso act, grabbing onto three different chromosomes at once and drawing them together. Rinn suspects that there are thousands of RNA molecules encoded in our genomes that perform similar feats: bending DNA, unspooling it, bringing it in contact with certain proteins and otherwise endowing it with a versatility it would lack on its own.

    “It’s genomic origami,” Rinn said about this theory. “In every cell, you have the same piece of paper. Stem cell, brain cell, liver cell, it’s all made from the same piece of paper. How you fold that paper determines if you get a paper airplane or a duck. It’s the shape that you fold it into that matters. This has to be the 3-D code of biology.”

    To some biologists, discoveries like Rinn’s hint at a hidden treasure house in our genome. Because a few of these RNA molecules have turned out to be so crucial, they think, the rest of the noncoding genome must be crammed with riches. But to Gregory and others, that is a blinkered optimism worthy of Dr. Pangloss. They, by contrast, are deeply pessimistic about where this research will lead. Most of the RNA molecules that our cells make will probably not turn out to perform the sort of essential functions that hotair and firre do. Instead, they are nothing more than what happens when RNA-making proteins bump into junk DNA from time to time.

    “You say, ‘I found it — America!’ ” says Alex Palazzo, a biochemist at the University of Toronto who co-wrote a spirited defense of junk DNA with Gregory last year in the journal PLOS Genetics. “But probably what you found is a little bit of noise.”

    Palazzo and his colleagues also roll their eyes at the triumphant declarations being made about recent large-scale surveys of the human genome. One news release from an N.I.H. project declared, “Much of what has been called ‘junk DNA’ in the human genome is actually a massive control panel with millions of switches regulating the activity of our genes.” Researchers like Gregory consider this sort of rhetoric to be leaping far beyond the actual evidence. Gregory likens the search for useful pieces of noncoding DNA to using a metal detector to find gold buried at the beach. “The idea of combing the beach is a great idea,” he says. But you have to make sure your metal detector doesn’t go off at every stray scrap of metal. “You’re going to find bottle caps and nails,” Gregory says.

    He expects that as we examine the genome more closely, we’ll find many bottle caps and nails. It’s a prediction based, he and others argue, on the deep evolutionary history of our genome. Over millions of years, essential genes haven’t changed very much, while junk DNA has picked up many harmless mutations. Scientists at the University of Oxford have measured evolutionary change over the past 100 million years at every spot in the human genome. “I can today say, hand on my heart, that 8 percent, plus or minus 1 percent, is what I would consider functional,” Chris Ponting, an author of the study, says. And the other 92 percent? “It doesn’t seem to matter that much,” he says.
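
    The logic behind such estimates can be sketched crudely: sites that change far more slowly than the neutral rate are scored as constrained, and the constrained share stands in for the functional fraction. The rates and the 0.5 cutoff below are invented for illustration; the actual analysis fits much richer models across whole-genome alignments:

```python
neutral_rate = 1.0  # substitution rate normalized so that neutral = 1.0

# Invented per-site rates: one site evolving far slower than neutral,
# the rest drifting at roughly the neutral rate.
site_rates = [0.05, 0.92, 1.08, 1.01, 0.97, 1.04, 0.99, 1.02, 0.95, 1.03]
constrained = [r for r in site_rates if r < 0.5 * neutral_rate]
share = len(constrained) / len(site_rates)
print(f"~{share:.0%} of sites look constrained")  # -> ~10%, near the 8% figure
```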

    It’s no coincidence, researchers like Gregory argue, that bona fide creationists have used recent changes in the thinking about junk DNA to try to turn back the clock to the days before Darwin. (The recent studies on noncoding DNA “clearly demonstrate we are ‘fearfully and wonderfully made’ by our Creator God,” declared the Institute for Creation Research.) In a sense, this debate stretches back to Darwin himself, whose 1859 book, “On the Origin of Species,” set the course for our understanding of natural selection as a natural “designer.” Later in his life, Darwin took pains to stress that there was more to evolution than natural selection. He was frustrated to see how many of his readers thought he was arguing that natural selection was the only force behind life’s diversity. “Great is the power of steady misrepresentation,” Darwin grumbled when he updated the book for its sixth edition in 1872. In fact, he wrote, he was quite open-minded about other forces that might drive evolution, like “variations that seem to us in our ignorance to arise spontaneously.”

    Darwin was certainly ignorant about genomes, as scientists would continue to be for decades after his death. But Gregory argues that genomes embody the very mix of adaptation and arbitrariness that Darwin had in mind. Over millions of years, the human genome has spontaneously gotten bigger, swelling with useless copies of genes and new transposable elements. Our ancestors tolerated all that extra baggage because it wasn’t actually all that heavy. It didn’t make them inordinately sick. Copying all that extra DNA didn’t drain energy needed for other tasks. They couldn’t add an infinite amount of junk to the genome, but they could accept an awful lot. To subtract junk, meanwhile, would require swarms of proteins to chop out every single dead gene or transposable element — without chopping out an essential gene. A genome evolving away its junk would lose the race to sloppier genomes, which left more resources for fighting diseases or having children.

    The blood-drenched slides that pack Gregory’s lab with their giant genomes only make sense, he argues, if we give up thinking about life as always evolving to perfection. To him, junk DNA isn’t a sign of evolution’s failure. It is, instead, evidence of its slow and slovenly triumph.

    See the full article here.

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

     