Tagged: Genomics

  • richardmitnick 7:24 am on March 4, 2018 Permalink | Reply
    Tags: Barbara Engelhardt, Genomics, GTEx-Genotype-Tissue Expression Consortium

    From Quanta Magazine: “A Statistical Search for Genomic Truths” 

    Quanta Magazine

    February 27, 2018
    Jordana Cepelewicz

    Barbara Engelhardt, a Princeton University computer scientist, wants to strengthen the foundation of biological knowledge in machine-learning approaches to genomic analysis. Sarah Blesener for Quanta Magazine.

    “We don’t have much ground truth in biology.” According to Barbara Engelhardt, a computer scientist at Princeton University, that’s just one of the many challenges that researchers face when trying to prime traditional machine-learning methods to analyze genomic data. Techniques in artificial intelligence and machine learning are dramatically altering the landscape of biological research, but Engelhardt doesn’t think those “black box” approaches are enough to provide the insights necessary for understanding, diagnosing and treating disease. Instead, she’s been developing new statistical tools that search for expected biological patterns to map out the genome’s real but elusive “ground truth.”

    Engelhardt likens the effort to detective work, as it involves combing through constellations of genetic variation, and even discarded data, for hidden gems. In research published last October [Nature], for example, she used one of her models to determine how mutations relate to the regulation of genes on other chromosomes (referred to as distal genes) in 44 human tissues. Among other findings, the results pointed to a potential genetic target for thyroid cancer therapies. Her work has similarly linked mutations and gene expression to specific features found in pathology images.

    The applications of Engelhardt’s research extend beyond genomic studies. She built a different kind of machine-learning model, for instance, that makes recommendations to doctors about when to remove their patients from a ventilator and allow them to breathe on their own.

    She hopes her statistical approaches will help clinicians catch certain conditions early, unpack their underlying mechanisms, and treat their causes rather than their symptoms. “We’re talking about solving diseases,” she said.

    To this end, she works as a principal investigator with the Genotype-Tissue Expression (GTEx) Consortium, an international research collaboration studying how gene regulation, expression and variation contribute to both healthy phenotypes and disease.


    Right now, she’s particularly interested in working on neuropsychiatric and neurodegenerative diseases, which are difficult to diagnose and treat.

    Quanta Magazine recently spoke with Engelhardt about the shortcomings of black-box machine learning when applied to biological data, the methods she’s developed to address those shortcomings, and the need to sift through “noise” in the data to uncover interesting information. The interview has been condensed and edited for clarity.

    What motivated you to focus your machine-learning work on questions in biology?

    I’ve always been excited about statistics and machine learning. In graduate school, my adviser, Michael Jordan [at the University of California, Berkeley], said something to the effect of: “You can’t just develop these methods in a vacuum. You need to think about some motivating applications.” I very quickly turned to biology, and ever since, most of the questions that drive my research are not statistical, but rather biological: understanding the genetics and underlying mechanisms of disease, hopefully leading to better diagnostics and therapeutics. But when I think about the field I am in — what papers I read, conferences I attend, classes I teach and students I mentor — my academic focus is on machine learning and applied statistics.

    We’ve been finding many associations between genomic markers and disease risk, but except in a few cases, those associations are not predictive and have not allowed us to understand how to diagnose, target and treat diseases. A genetic marker associated with disease risk is often not the true causal marker of the disease — one disease can have many possible genetic causes, and a complex disease might be caused by many, many genetic markers possibly interacting with the environment. These are all challenges that someone with a background in statistical genetics and machine learning, working together with wet-lab scientists and medical doctors, can begin to address and solve. Which would mean we could actually treat genetic diseases — their causes, not just their symptoms.

    You’ve spoken before about how traditional statistical approaches won’t suffice for applications in genomics and health care. Why not?

    First, because of a lack of interpretability. In machine learning, we often use “black-box” methods — [classification algorithms called] random forests, or deeper learning approaches. But those don’t really allow us to “open” the box, to understand which genes are differentially regulated in particular cell types or which mutations lead to a higher risk of a disease. I’m interested in understanding what’s going on biologically. I can’t just have something that gives an answer without explaining why.

    The goal of these methods is often prediction, but given a person’s genotype, it is not particularly useful to estimate the probability that they’ll get Type 2 diabetes. I want to know how they’re going to get Type 2 diabetes: which mutation causes the dysregulation of which gene to lead to the development of the condition. Prediction is not sufficient for the questions I’m asking.

    A second reason has to do with sample size. Most of the driving applications of statistics assume that you’re working with a large and growing number of data samples — say, the number of Netflix users or emails coming into your inbox — with a limited number of features or observations that have interesting structure. But when it comes to biomedical data, we don’t have that at all. Instead, we have a limited number of patients in the hospital, a limited number of genotypes we can sequence — but a gigantic set of features or observations for any one person, including all the mutations in their genome. Consequently, many theoretical and applied approaches from statistics can’t be used for genomic data.

    What makes the genomic data so challenging to analyze?

    The most important signals in biomedical data are often incredibly small and completely swamped by technical noise. It’s not just about how you model the real, biological signal — the questions you’re trying to ask about the data — but also how you model it in the presence of this incredibly heavy-handed noise that’s driven by things you don’t care about, like which population the individuals came from or which technician ran the samples in the lab. You have to get rid of that noise carefully. And we often have a lot of questions that we would like to answer using the data, and we need to run an incredibly large number of statistical tests — literally trillions — to figure out the answers: for example, to identify an association between a mutation in a genome and some trait of interest, where that trait might be the expression level of a specific gene in a tissue. So how can we develop rigorous, robust testing mechanisms where the signals are really, really small and sometimes very hard to distinguish from noise? How do we correct for all this structure and noise that we know is going to exist?
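    The multiple-testing burden Engelhardt describes can be sketched in a few lines: with enough tests, a naive significance threshold admits a flood of chance hits, while a corrected threshold (plain Bonferroni here, a simple stand-in rather than the consortium's actual procedure) keeps only the planted signal. All data and effect sizes below are simulated for illustration.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_samples, n_tests = 400, 10_000      # real scans run trillions of tests

    # Simulated genotypes (0/1/2 allele counts) and one expression trait,
    # with a single true association planted at test 0.
    genotypes = rng.integers(0, 3, size=(n_tests, n_samples))
    expression = rng.normal(size=n_samples) + 0.5 * genotypes[0]

    pvals = np.array([stats.pearsonr(g, expression)[1] for g in genotypes])

    # A naive 0.05 cutoff admits hundreds of chance hits among the nulls;
    # dividing the threshold by the number of tests controls them.
    print("hits at p < 0.05:      ", int((pvals < 0.05).sum()))
    print("hits after correction: ", int((pvals < 0.05 / n_tests).sum()))
    ```

    The trade-off the interview points to is visible here: the stricter the threshold, the more genuinely small signals fall below it.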

    So what approach do we need to take instead?

    My group relies heavily on what we call sparse latent factor models, which can sound quite mathematically complicated. The fundamental idea is that these models partition all the variation we observe in the samples with respect to only a very small number of features. One of these partitions might include 10 genes, for example, or 20 mutations. And then as a scientist, I can look at those 10 genes and figure out what they have in common, and determine what this given partition represents in terms of a biological signal that affects sample variance.

    So I think of it as a two-step process: First, build a model that separates all the sources of variation as carefully as possible. Then go in as a scientist to understand what all those partitions represent in terms of a biological signal. After this, we can validate those conclusions in other data sets and think about what else we know about these samples (for instance, whether everyone of the same age is included in one of these partitions).
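    As a rough illustration of this two-step process (not the group's actual models), scikit-learn's SparsePCA can stand in for a sparse latent factor model: it partitions variance into components with only a few nonzero gene loadings, which a scientist could then inspect. The block structure and every parameter choice below are invented for the example.

    ```python
    import numpy as np
    from sklearn.decomposition import SparsePCA

    rng = np.random.default_rng(1)
    n_samples, n_genes = 200, 50

    # Two hidden signals, each driving a small block of genes (a "partition").
    factor_a = rng.normal(size=n_samples)
    factor_b = rng.normal(size=n_samples)
    data = rng.normal(scale=0.5, size=(n_samples, n_genes))
    data[:, :10] += factor_a[:, None]      # genes 0-9 share signal A
    data[:, 10:20] += factor_b[:, None]    # genes 10-19 share signal B

    # Step 1: partition the observed variation across a few sparse factors.
    model = SparsePCA(n_components=2, alpha=0.5, random_state=0)
    model.fit(data)

    # Step 2: the scientist inspects which genes load on each factor and
    # asks what those genes have in common biologically.
    for i, loadings in enumerate(model.components_):
        print(f"factor {i}: nonzero genes {np.flatnonzero(loadings)}")
    ```

    The sparsity penalty is what makes step 2 tractable: a handful of loaded genes per factor is something a human can interpret, where a dense loading over thousands of genes is not.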

    When you say “go in as a scientist,” what do you mean?

    I’m trying to find particular biological patterns, so I build these models with a lot of structure and include a lot about what kinds of signals I’m expecting. I establish a scaffold, a set of parameters that will tell me what the data say, and what patterns may or may not be there. The model itself has only a certain amount of expressivity, so I’ll only be able to find certain types of patterns. From what I’ve seen, existing general models don’t do a great job of finding signals we can interpret biologically: They often just determine the biggest influencers of variance in the data, as opposed to the most biologically impactful sources of variance. The scaffold I build instead represents a very structured, very complex family of possible patterns to describe the data. The data then fill in that scaffold to tell me which parts of that structure are represented and which are not.

    So instead of using general models, my group and I carefully look at the data, try to understand what’s going on from the biological perspective, and tailor our models based on what types of patterns we see.

    How does the latent factor model work in practice?

    We applied one of these latent factor models to pathology images [pictures of tissue slices under a microscope], which are often used to diagnose cancer. For every image, we also had data about the set of genes expressed in those tissues. We wanted to see how the images and the corresponding gene expression levels were coordinated.

    We developed a set of features describing each of the images, using a deep-learning method to identify not just pixel-level values but also patterns in the image. We pulled out over a thousand features from each image, give or take, and then applied a latent factor model and found some pretty exciting things.

    For example, we found sets of genes and features in one of these partitions that described the presence of immune cells in the brain. You don’t necessarily see these cells on the pathology images, but when we looked at our model, we saw a component there that represented only genes and features associated with immune cells, not brain cells. As far as I know, no one’s seen this kind of signal before. But it becomes incredibly clear when we look at these latent factor components.

    Video: Barbara Engelhardt, a computer scientist at Princeton University, explains why traditional machine-learning techniques have often fallen short for genomic analysis, and how researchers are overcoming that challenge. Sarah Blesener for Quanta Magazine

    You’ve worked with dozens of human tissue types to unpack how specific genetic variations help shape complex traits. What insights have your methods provided?

    We had 44 tissues, donated from 449 human cadavers, and their genotypes (sequences of their whole genomes). We wanted to understand more about the differences in how those genotypes expressed their genes in all those tissues, so we did more than 3 trillion tests, one by one, comparing every mutation in the genome with every gene expressed in each tissue. (Running that many tests on the computing clusters we’re using now takes about two weeks; when we move this iteration of GTEx to the cloud as planned, we expect it to take around two hours.) We were trying to figure out whether the [mutant] genotype was driving distal gene expression. In other words, we were looking for mutations that weren’t located on the same chromosome as the genes they were regulating. We didn’t find very much: a little over 600 of these distal associations. Their signals were very low.
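    The scan described above can be sketched as a pair of nested loops with a same-chromosome filter. Everything here (coordinates, gene names, sample counts) is toy data, and a simple linear regression stands in for the consortium's association test.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n_samples = 100    # toy size; GTEx had 449 donors across 44 tissues

    # Hypothetical mutations and expressed genes, keyed by chromosome.
    mutations = {("chr1", 1000): rng.integers(0, 3, n_samples),
                 ("chr2", 5000): rng.integers(0, 3, n_samples)}
    genes = {("chr1", "GENE_A"): rng.normal(size=n_samples),
             ("chr7", "GENE_B"): rng.normal(size=n_samples)}

    distal_tests = []
    for (mut_chrom, pos), genotype in mutations.items():
        for (gene_chrom, gene), expr in genes.items():
            if gene_chrom == mut_chrom:
                continue    # distal scan: skip same-chromosome (local) pairs
            result = stats.linregress(genotype, expr)
            distal_tests.append((gene, mut_chrom, pos, result.pvalue))

    print(distal_tests)
    ```

    With every mutation crossed against every gene in every tissue, this loop structure is what balloons to the trillions of tests mentioned above.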

    But one of the signals was strong: an exciting thyroid association, in which a mutation appeared to distally regulate two different genes. We asked ourselves: How is this mutation affecting expression levels in a completely different part of the genome? In collaboration with Alexis Battle’s lab at Johns Hopkins University, we looked near the mutation on the genome and found a gene called FOXE1, for a transcription factor that regulates the transcription of genes all over the genome. The FOXE1 gene is only expressed in thyroid tissues, which was interesting. But we saw no association between the mutant genotype and the expression levels of FOXE1. So we had to look at the components of the original signal we’d removed before — everything that had appeared to be a technical artifact — to see if we could detect the effects of the FOXE1 protein broadly on the genome.

    We found a huge impact of FOXE1 in the technical artifacts we’d removed. FOXE1, it seems, regulates a large number of genes only in the thyroid. Its variation is driven by the mutant genotype we found. And that genotype is also associated with thyroid cancer risk. We went back to the thyroid cancer samples — we had about 500 from the Cancer Genome Atlas — and replicated the distal association signal. These things tell a compelling story, but we wouldn’t have learned it unless we had tried to understand the signal that we’d removed.

    What are the implications of such an association?

    Now we have a particular mechanism for the development of thyroid cancer and the dysregulation of thyroid cells. If FOXE1 is a druggable target — if we can go back and think about designing drugs to enhance or suppress the expression of FOXE1 — then we can hope to prevent people at high thyroid cancer risk from getting it, or to treat people with thyroid cancer more effectively.

    The signal from broad-effect transcription factors like FOXE1 actually looks a lot like the effects we typically remove as part of the noise: population structure, or the batches the samples were run in, or the effects of age or sex. A lot of those technical influences are going to affect approximately similar numbers of genes — around 10 percent — in a similar way. That’s why we usually remove signals that have that pattern. In this case, though, we had to understand the domain we were working in. As scientists, we looked through all the signals we’d gotten rid of, and this allowed us to find the effects of FOXE1 showing up so strongly in there. It involved manual labor and insights from a biological background, but we’re thinking about how to develop methods to do it in a more automated way.
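    The FOXE1 story, a genotype-driven signal hiding inside the "noise" that was projected out, can be mimicked with a small principal-component sketch. A hypothetical broad-effect genotype shifts about 10 percent of genes; removing the top component as if it were a batch effect discards that signal, but correlating the removed component against the genotype recovers it. Purely simulated, with invented effect sizes.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    n_samples, n_genes = 300, 1000

    # A hypothetical broad-effect genotype that shifts ~10% of genes at once,
    # the same footprint as a batch effect or other technical artifact.
    genotype = rng.integers(0, 3, size=n_samples).astype(float)
    data = rng.normal(size=(n_samples, n_genes))
    data[:, :100] += 0.8 * genotype[:, None]

    # "Noise removal": project out the top principal component.
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    cleaned = centered - np.outer(u[:, 0] * s[0], vt[0])

    # Going back into the discarded component: is it genotype-driven?
    corr = np.corrcoef(u[:, 0], genotype)[0, 1]
    print(f"|correlation| of removed component with genotype: {abs(corr):.2f}")
    ```

    The point of the sketch is the last step: the standard pipeline keeps `cleaned` and throws the component away, whereas checking the removed component against known genotypes is what surfaced the FOXE1-like signal.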

    So with traditional modeling techniques, we’re missing a lot of real biological effects because they look too similar to noise?

    Yes. There are a ton of cases in which the interesting pattern and the noise look similar. Take these distal effects: Pretty much all of them, if they are broad effects, are going to look like the noise signal we systematically get rid of. It’s methodologically challenging. We have to think carefully about how to characterize when a signal is biologically relevant or just noise, and how to distinguish the two. My group is working fairly aggressively on figuring that out.

    Why are those relationships so difficult to map, and why look for them?

    There are so many tests we have to do; the threshold for the statistical significance of a discovery has to be really, really high. That creates problems for finding these signals, which are often incredibly small; if our threshold is that high, we’re going to miss a lot of them. And biologically, it’s not clear that there are many of these really broad-effect distal signals. You can imagine that natural selection would eliminate the kinds of mutations that affect 10 percent of genes — that we wouldn’t want that kind of variability in the population for so many genes.

    But I think there’s no doubt that these distal associations play an enormous role in disease, and that they may be considered as druggable targets. Understanding their role broadly is incredibly important for human health.

    See the full article here.

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Formerly known as Simons Science News, Quanta Magazine is an editorially independent online publication launched by the Simons Foundation to enhance public understanding of science. Why Quanta? Albert Einstein called photons “quanta of light.” Our goal is to “illuminate science.” At Quanta Magazine, scientific accuracy is every bit as important as telling a good story. All of our articles are meticulously researched, reported, edited, copy-edited and fact-checked.

  • richardmitnick 12:51 pm on February 13, 2018 Permalink | Reply
    Tags: DropSynth, Genomics

    From UCLA Newsroom: “UCLA scientists develop low-cost way to build gene sequences” 

    UCLA Newsroom

    February 12, 2018
    Sarah C.P. Williams

    UCLA scientists used DropSynth to make thousands of bacterial genes with different versions of phosphopantetheine adenylyltransferase, or PPAT (pictured). Sriram Kosuri/UCLA.

    A new technique pioneered by UCLA researchers could enable scientists in any typical biochemistry laboratory to make their own gene sequences for only about $2 per gene. Researchers now generally buy gene sequences from commercial vendors for $50 to $100 per gene.

    The approach, DropSynth, which is described in the January issue of the journal Science, makes it possible to produce thousands of genes at once. Scientists use gene sequences to screen for genes’ roles in diseases and important biological processes.

    “Our method gives any lab that wants the power to build its own DNA sequences,” said Sriram Kosuri, a UCLA assistant professor of chemistry and biochemistry and senior author of the study. “This is the first time that, without a million dollars, an average lab can make 10,000 genes from scratch.”

    Increasingly, scientists studying a wide range of subjects in medicine — from antibiotic resistance to cancer — are conducting “high-throughput” experiments, meaning that they simultaneously screen hundreds or thousands of groups of cells. Analyzing large numbers of cells, each with slight differences in their DNA, for their ability to carry out a behavior or survive a drug treatment can reveal the importance of particular genes, or sections of genes, in those abilities.

    Such experiments require not only large numbers of genes but also that those genes are sequenced. Over the past 10 years, advances in sequencing have enabled researchers to simultaneously determine the sequences of many strands of DNA. So the cost of sequencing has plummeted, even as the process of generating genes has remained comparatively slow and expensive.

    “There’s an ongoing need to develop new gene synthesis techniques,” said Calin Plesa, a UCLA postdoctoral research fellow and co-first author of the paper. “The more DNA you can synthesize, the more hypotheses you can test.”

    The current methods for synthesizing genes, he said, either limit the length of a gene to about 200 base pairs — the paired nucleotides that make up DNA — or are prohibitively expensive for most labs.

    The new method involves isolating small sections of thousands of genes in tiny droplets of water suspended in an oil. Each section of DNA is assigned a molecular “bar code,” which identifies the longer gene to which it belongs.

    Then, the sections, which initially are present in only very small amounts, are copied many times to increase their number. Finally, small beads are used to sort the mixture of DNA fragments into the right combinations to make longer genes, and the sections are combined. The result is a mixture of thousands of the desired genes, which can be used in experiments.
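    The barcode-directed sorting described above amounts to grouping fragments by barcode and concatenating them in order. This toy sketch uses made-up six-base fragments and barcode names; the real protocol performs this step biochemically with beads, not in software.

    ```python
    from collections import defaultdict

    # (barcode, position, fragment) tuples, as if read from the pooled mix.
    fragments = [
        ("bc1", 0, "ATGGCT"), ("bc2", 1, "GGTACC"),
        ("bc1", 1, "TTCAGA"), ("bc2", 0, "ATGAAA"),
        ("bc1", 2, "TAATGA"), ("bc2", 2, "CCCTGA"),
    ]

    # Group fragments by the barcode naming their parent gene...
    grouped = defaultdict(dict)
    for barcode, position, seq in fragments:
        grouped[barcode][position] = seq

    # ...then concatenate each group's fragments in positional order.
    assembled = {bc: "".join(parts[i] for i in sorted(parts))
                 for bc, parts in grouped.items()}
    print(assembled)
    ```

    The barcode plays the same role as a dictionary key here: it lets thousands of genes be assembled from one mixed pool without ever physically separating the pool gene by gene.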

    To show that the technique worked, the scientists used DropSynth to make thousands of bacterial genes, each up to 669 base pairs long. Each gene encoded a different bacterium’s version of the metabolic protein phosphopantetheine adenylyltransferase, or PPAT, which bacteria need to survive. Because PPAT is critical to bacteria that cause everything from sinus infections to pneumonia and food poisoning, it’s being studied as a potential antibiotic target.

    The researchers created a mixture of the thousands of versions of PPAT with DropSynth, and then added each gene to a version of E. coli that lacked PPAT and tested which ones allowed E. coli to survive. The surviving cells could then be used to screen potential antibiotics very quickly and at a low cost.

    DropSynth could potentially also be useful in engineering new proteins. Currently, scientists can use computer programs to design proteins that meet certain parameters, such as the ability to bind to certain molecules, but DropSynth could offer researchers hundreds or even thousands of options from which to choose the proteins that best fit their needs.

    The team is still working on reducing DropSynth’s error rate. In the meantime, though, the scientists have made the instructions publicly available on their website. All of the chemical substances needed to replicate the approach are commercially available.

    The study’s other authors are graduate students Nathan Lubock and Angus Sidore of UCLA, and Di Zhang of the University of Pennsylvania.

    Funding for the study was provided by the Netherlands Organisation for Scientific Research, the Human Frontier Science Program, the National Science Foundation, the National Institutes of Health, the Searle Scholars Program, the U.S. Department of Energy, and Linda and Fred Wudl.

    See the full article here.

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    UCLA Campus

    For nearly 100 years, UCLA has been a pioneer, persevering through impossibility, turning the futile into the attainable.

    We doubt the critics, reject the status quo and see opportunity in dissatisfaction. Our campus, faculty and students are driven by optimism. It is not naïve; it is essential. And it has fueled every accomplishment, allowing us to redefine what’s possible, time after time.

    This can-do perspective has brought us 12 Nobel Prizes, 12 Rhodes Scholarships, more NCAA titles than any university and more Olympic medals than most nations. Our faculty and alumni helped create the Internet and pioneered reverse osmosis. And more than 100 companies have been created based on technology developed at UCLA.

  • richardmitnick 2:17 pm on December 31, 2017 Permalink | Reply
    Tags: Genomics, HiCRep method to accurately assess the reproducibility of data from Hi-C experiments, New insights into how the genome works inside of a cell, “New statistical method for evaluating reproducibility in studies of genome organization” (2017), Quite often correlation is treated as a proxy of reproducibility in many scientific disciplines but they actually are not the same thing, With the massive amount of data that is being produced in whole-genome studies it is vital to ensure the quality of the data

    From Pennsylvania State University: “New statistical method for evaluating reproducibility in studies of genome organization” (2017) 

    Penn State Bloc

    Pennsylvania State University

    03 October 2017
    Qunhua Li:
    (814) 863-7395

    Barbara K. Kennedy:
    (814) 863-4682

    Sam Sholtis

    A new, statistical method to evaluate the reproducibility of data from Hi-C — a cutting-edge tool for studying how the genome works in three dimensions inside of a cell — will help ensure that the data in these “big data” studies is reliable.

    Schematic representation of the HiCRep method. HiCRep uses two steps to accurately assess the reproducibility of data from Hi-C experiments. Step 1: Data from Hi-C experiments (represented in triangle graphs) is first smoothed in order to allow researchers to see trends in the data more clearly. Step 2: The data is stratified based on distance to account for the overabundance of nearby interactions in Hi-C data. Credit: Li Laboratory, Penn State University

    “Hi-C captures the physical interactions among different regions of the genome,” said Qunhua Li, assistant professor of statistics at Penn State and lead author of the paper. “These interactions play a role in determining what makes a muscle cell a muscle cell instead of a nerve or cancer cell. However, standard measures to assess data reproducibility often cannot tell if two samples come from the same cell type or from completely unrelated cell types. This makes it difficult to judge if the data is reproducible. We have developed a novel method to accurately evaluate the reproducibility of Hi-C data, which will allow researchers to more confidently interpret the biology from the data.”

    The new method, called HiCRep, developed by a team of researchers at Penn State and the University of Washington, is the first to account for a unique feature of Hi-C data — interactions between regions of the genome that are close together are far more likely to happen by chance and therefore create spurious, or false, similarity between unrelated samples. A paper describing the new method appears in the journal Genome Research.

    “With the massive amount of data that is being produced in whole-genome studies, it is vital to ensure the quality of the data,” said Li. “With high-throughput technologies like Hi-C, we are in a position to gain new insight into how the genome works inside of a cell, but only if the data is reliable and reproducible.”

    Inside the nucleus of a cell there is a massive amount of genetic material in the form of chromosomes — extremely long molecules made of DNA and proteins. The chromosomes, which contain genes and the regulatory DNA sequences that control when and where the genes are used, are organized and packaged into a structure called chromatin. The cell’s fate, whether it becomes a muscle or nerve cell, for example, depends, at least in part, on which parts of the chromatin structure are accessible for genes to be expressed, which parts are closed, and how these regions interact. Hi-C identifies these interactions by locking the interacting regions of the genome together, isolating them, and then sequencing them to find out where they came from in the genome.

    The HiCRep method is able to accurately reconstruct the biological relationship between different cell types, where other methods fail. Credit: Li Laboratory, Penn State University

    “It’s kind of like a giant bowl of spaghetti in which every place the noodles touch could be a biologically important interaction,” said Li. “Hi-C finds all of these interactions, but the vast majority of them occur between regions of the genome that are very close to each other on the chromosomes and do not have specific biological functions. A consequence of this is that the strength of signals heavily depends on the distance between the interaction regions. This makes it extremely difficult for commonly-used reproducibility measures, such as correlation coefficients, to differentiate Hi-C data because this pattern can look very similar even between very different cell types. Our new method takes this feature of Hi-C into account and allows us to reliably distinguish different cell types.”

    “This reteaches us a basic statistical lesson that is often overlooked in the field,” said Li. “Quite often, correlation is treated as a proxy of reproducibility in many scientific disciplines, but they actually are not the same thing. Correlation is about how strongly two objects are related. Two irrelevant objects can have high correlation by being related to a common factor. This is the case here. Distance is the hidden common factor in the Hi-C data that drives the correlation, making the correlation fail to reflect the information of interest. Ironically, while this phenomenon, known as the confounding effect in statistical terms, is discussed in every elementary statistics course, it is still quite striking to see how often it is overlooked in practice, even among well-trained scientists.”
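    Li's point can be demonstrated numerically: two samples that share nothing but a distance-decay trend look highly correlated until that trend, the hidden common factor, is subtracted. The decay curve and noise levels below are arbitrary illustrations, not real Hi-C values.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    distance = np.arange(1, 501).astype(float)

    # Two unrelated samples whose contact counts both decay with distance.
    trend = 1000.0 / distance
    sample_a = trend + rng.normal(scale=2, size=500)
    sample_b = trend + rng.normal(scale=2, size=500)

    # Raw correlation is inflated by the shared distance trend...
    raw = np.corrcoef(sample_a, sample_b)[0, 1]

    # ...and collapses once the common factor is subtracted.
    adjusted = np.corrcoef(sample_a - trend, sample_b - trend)[0, 1]
    print(f"raw r = {raw:.2f}, distance-adjusted r = {adjusted:.2f}")
    ```

    The high raw correlation says nothing about reproducibility; it only reflects that both samples are related to the same confounder.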

    The researchers designed HiCRep to systematically account for this distance-dependent feature of Hi-C data. To accomplish this, they first smooth the data to see its trends more clearly. They then apply a new measure of similarity that more easily distinguishes data from different cell types by stratifying the interactions based on the distance between the two regions. “This is like studying the effect of a drug treatment on a population with very different ages. Stratifying by age helps us focus on the drug effect. For our case, stratifying by distance helps us focus on the true relationship between samples.”
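    The stratify-then-combine idea can be sketched as follows. This mimics only the structure of HiCRep, omitting its smoothing step and exact weighting, so it is not the published algorithm; the simulated contact maps and parameters are invented. On such maps, a stratified correlation separates true replicates from unrelated samples that share only the distance-decay background.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n_bins = 100
    distance = np.abs(np.subtract.outer(np.arange(n_bins), np.arange(n_bins)))

    def hic_map(shared_signal, noise_scale=0.5):
        """A toy contact map: distance-decay background + signal + noise."""
        background = 50.0 / (distance + 1)
        noise = rng.normal(scale=noise_scale, size=(n_bins, n_bins))
        return background + shared_signal + noise

    signal = rng.normal(size=(n_bins, n_bins))
    rep1 = hic_map(signal)
    rep2 = hic_map(signal)                           # a true replicate
    unrelated = hic_map(rng.normal(size=(n_bins, n_bins)))

    def stratified_corr(x, y):
        """Correlate entries within each distance stratum, then combine."""
        corrs, weights = [], []
        for d in range(1, n_bins // 2):
            idx = distance == d
            corrs.append(np.corrcoef(x[idx], y[idx])[0, 1])
            weights.append(int(idx.sum()))
        return np.average(corrs, weights=weights)

    print(f"replicates: {stratified_corr(rep1, rep2):.2f}")
    print(f"unrelated:  {stratified_corr(rep1, unrelated):.2f}")
    ```

    Because every comparison happens within a single distance stratum, the decay background is constant inside each stratum and can no longer inflate the score.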

    To test their method, the research team evaluated Hi-C data from several different cell types using HiCRep and two traditional methods. Where the traditional methods were tripped up by spurious correlations based on the excess of nearby interactions, HiCRep was able to reliably differentiate the cell types. Additionally, HiCRep could quantify the amount of difference between cell types and accurately reconstruct which cells were more closely related to one another.

    In addition to Li, the research team includes Tao Yang, Feipeng Zhang, Fan Song, Ross C. Hardison, and Feng Yue at Penn State; and Galip Gürkan Yardımcı and William Stafford Noble at the University of Washington. The research was supported by the U.S. National Institutes of Health, a Computation, Bioinformatics, and Statistics (CBIOS) training grant at Penn State, and the Huck Institutes of the Life Sciences at Penn State.

    See the full article here.

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Penn State Campus


    We teach students that the real measure of success is what you do to improve the lives of others, and they learn to be hard-working leaders with a global perspective. We conduct research to improve lives. We add millions to the economy through projects in our state and beyond. We help communities by sharing our faculty expertise and research.

    Penn State lives close by no matter where you are. Our campuses are located from one side of Pennsylvania to the other. Through Penn State World Campus, students can take courses and work toward degrees online from anywhere on the globe that has Internet service.

    We support students in many ways, including advising and counseling services for school and life; diversity and inclusion services; social media sites; safety services; and emergency assistance.

    Our network of more than a half-million alumni is accessible to students when they want advice and to learn about job networking and mentor opportunities as well as what to expect in the future. Through our alumni, Penn State lives all over the world.

    The best part of Penn State is our people. Our students, faculty, staff, alumni, and friends in communities near our campuses and across the globe are dedicated to education and fostering a diverse and inclusive environment.

  • richardmitnick 7:34 pm on November 24, 2017 Permalink | Reply
    Tags: , Genomics, ,   

    From Uncovering Genome Mysteries Project at WCG: “Analysis Underway on 30 Terabytes of Data” 

    New WCG Logo


    World Community Grid (WCG)

    24 Nov 2017

    The Uncovering Genome Mysteries data (all 30 terabytes) was transferred to the research teams in Brazil and Australia this year. Now, the researchers are analyzing this vast amount of data, and looking for ways to make it easy for other scientists and the public to understand.

    In this video, Dr. Torsten Thomas explains the primary goals of the Uncovering Genome Mysteries project.

    Last year, World Community Grid volunteers completed the calculations for the Uncovering Genome Mysteries project, which examined approximately 200 million genes from a wide variety of life forms to help discover new protein functions. The project’s main goals include:

    Discovering new protein functions and augmenting knowledge about biochemical processes in general
    Identifying how organisms interact with each other and the environment
    Documenting the current baseline microbial diversity, allowing a better understanding of how microorganisms change under environmental stresses, such as climate change
    Understanding and modeling complex microbial systems

    Transferring 30 Terabytes of Data

    The data generated by World Community Grid volunteers have been regrouped on the new bioinformatics server at the Oswaldo Cruz Foundation (Fiocruz), under the direction of Dr. Wim Degrave. Additionally, a full copy of all the data has been sent to co-investigator Dr. Torsten Thomas and his team at the Centre for Marine Bio-Innovation and the School of Biological, Earth and Environmental Sciences at the University of New South Wales in Sydney, Australia. There, the results of the protein comparisons will help interpret analyses of marine bacterial ecosystems, where microorganisms, coral reefs, sponges and many other intriguing creatures interact and form living communities. The dataset, more than 30 terabytes even in highly compressed form, took a few months to transfer from Brazil to Australia.
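    A back-of-the-envelope calculation gives a sense of why moving a dataset of this size takes so long. The sustained throughput below (10 megabytes per second) is an assumed figure for illustration, not a number reported by the project:

```python
# Back-of-the-envelope transfer time for a 30-terabyte dataset.
# The sustained throughput is an assumed figure, not a project-reported one.

dataset_bytes = 30e12           # 30 TB (decimal terabytes)
throughput_bytes_per_s = 10e6   # assumed sustained rate: 10 MB/s

seconds = dataset_bytes / throughput_bytes_per_s
days = seconds / 86400
print(f"{days:.0f} days of uninterrupted transfer")  # ~35 days
```

    Even at that optimistic steady rate the transfer takes over a month; with real-world intercontinental links, interruptions, and integrity checks, "a few months" is entirely plausible.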

    Data Processing and Analysis at Fiocruz

    The Fiocruz team has been busy with further processing of the project’s primary output. In the workflow, raw data are expanded and deciphered, associated with the correct inter-genome comparisons, checked for errors, tabulated, and linked with many different data objects to transform them into meaningful information.

    The team is dealing with the rapidly growing size of the database, and purchased and installed new hardware (600 TB) to help accommodate all the data. They also wish to build a database interface that appeals to the general public interested in biodiversity, and not only to scientists who specialize in the functional analysis of proteins encoded in the genomes of particular life forms.

    Some of the data are currently being used in projects such as vaccine and drug design against arboviruses such as Zika, dengue, and yellow fever, as well as for understanding how bacteria interact with their environment and how this is reflected in their metabolic pathways, for example when free-living bacteria are compared with their close relatives that are human pathogens, such as Mycobacterium tuberculosis versus environmental mycobacteria.

    Searching for Partnerships

    Fiocruz is looking for partnerships that would add extra data analytics and artificial intelligence to the project. The researchers would like to include visualizations of functional connections between organisms as well as particularities from a wide variety of organisms, including archaea from deep-sea thermal vents; bacteria and protists (any one-celled organism that is not an animal, plant or fungus) from soil, water, land, and sea, or important for human, animal, or plant health; and highly complex plant, animal, and human genomes.

    We thank everyone who participated in the World Community Grid portion of this project, and look forward to sharing more updates as we continue to analyze the data.

    See the full article here.



    World Community Grid (WCG) brings people together from across the globe to create the largest non-profit computing grid benefiting humanity. It does this by pooling surplus computer processing power. We believe that innovation combined with visionary scientific research and large-scale volunteerism can help make the planet smarter. Our success depends on like-minded individuals – like you.
    WCG projects run on BOINC software from UC Berkeley.

    BOINC is a leader in the fields of Distributed Computing, Grid Computing and Citizen Cyberscience. BOINC is more properly the Berkeley Open Infrastructure for Network Computing.



    “Download and install secure, free software that captures your computer’s spare power when it is on, but idle. You will then be a World Community Grid volunteer. It’s that simple!” You can download the software at either WCG or BOINC.

    Please visit the project pages-

    FightAIDS@home Phase II

    Rutgers Open Zika

    Help Stop TB

    Outsmart Ebola Together

    Mapping Cancer Markers

    Uncovering Genome Mysteries

    Say No to Schistosoma

    GO Fight Against Malaria

    Drug Search for Leishmaniasis

    Computing for Clean Water

    The Clean Energy Project

    Discovering Dengue Drugs – Together

    Help Cure Muscular Dystrophy

    Help Fight Childhood Cancer

    Help Conquer Cancer

    Human Proteome Folding




    World Community Grid is a social initiative of IBM Corporation.

  • richardmitnick 11:53 am on July 29, 2017 Permalink | Reply
    Tags: 3D structure of human chromatin, ChromEMT, Genomics

    From Salk: “Salk scientists solve longstanding biological mystery of DNA organization” 


    Salk Institute for Biological Studies

    July 27, 2017

    Stretched out, the DNA from all the cells in our body would reach Pluto. So how does each tiny cell pack a two-meter length of DNA into its nucleus, which is just one-thousandth of a millimeter across?
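    The arithmetic behind these comparisons is easy to check. A minimal sketch, where the cell count (~37 trillion) and the mean Sun-to-Pluto distance (~5.9 billion km) are assumed estimates not given in the article:

```python
# Rough check of the "DNA would reach Pluto" comparison.
# Cell count and Pluto distance are assumed estimates, not article figures.

cells = 3.7e13               # ~37 trillion cells in a human body (estimate)
dna_per_cell_m = 2.0         # ~2 m of DNA per cell nucleus
pluto_distance_m = 5.9e12    # ~5.9 billion km, mean Sun-Pluto distance (estimate)

total_dna_m = cells * dna_per_cell_m           # ~7.4e13 m
print(total_dna_m / pluto_distance_m)          # ~12: reaches Pluto many times over

# Per-cell compaction: 2 m of DNA in a nucleus "one-thousandth of a millimeter" across
nucleus_diameter_m = 1e-6
print(dna_per_cell_m / nucleus_diameter_m)     # a 2,000,000-fold length-to-width ratio
```
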

    The answer to this daunting biological riddle is central to understanding how the three-dimensional organization of DNA in the nucleus influences our biology, from how our genome orchestrates our cellular activity to how genes are passed from parents to children.

    Now, scientists at the Salk Institute and the University of California, San Diego, have for the first time provided an unprecedented view of the 3D structure of human chromatin—the combination of DNA and proteins—in the nucleus of living human cells.

    In the tour de force study, described in Science on July 27, 2017, the Salk researchers identified a novel DNA dye that, when paired with advanced microscopy in a combined technology called ChromEMT, allows highly detailed visualization of chromatin structure in cells in the resting and mitotic (dividing) stages. By revealing nuclear chromatin structure in living cells, the work may help rewrite the textbook model of DNA organization and even change how we approach treatments for disease.

    “One of the most intractable challenges in biology is to discover the higher-order structure of DNA in the nucleus and how this is linked to its functions in the genome,” says Salk Associate Professor Clodagh O’Shea, a Howard Hughes Medical Institute Faculty Scholar and senior author of the paper. “It is of eminent importance, for this is the biologically relevant structure of DNA that determines both gene function and activity.”

    A new technique enables 3D visualization of chromatin (DNA plus associated proteins) structure and organization within a cell nucleus (purple, bottom left) by painting the chromatin with a metal cast and imaging it with electron microscopy (EM). The middle block shows the captured EM image data, the front block illustrates the chromatin organization from the EM data, and the rear block shows the contour lines of chromatin density from sparse (cyan and green) to dense (orange and red). Credit: Salk Institute.

    Ever since Francis Crick and James Watson determined the primary structure of DNA to be a double helix, scientists have wondered how DNA is further organized to allow its entire length to pack into the nucleus such that the cell’s copying machinery can access it at different points in the cell’s cycle of activity. X-rays and microscopy showed that the primary level of chromatin organization involves 147 base pairs of DNA spooling around proteins to form particles approximately 11 nanometers (nm) in diameter called nucleosomes. These nucleosome “beads on a string” are then thought to fold into discrete fibers of increasing diameter (30, 120, 320 nm and so on), until they form chromosomes. The problem is, no one has seen chromatin in these discrete intermediate sizes in cells that have not been broken apart and had their DNA harshly processed, so the textbook model of chromatin’s hierarchical higher-order organization in intact cells has remained unverified.
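    The first level of that hierarchy can be put in numbers. A quick sketch, assuming the standard ~0.34 nm rise per base pair of B-form DNA (a textbook value not stated in the article):

```python
# DNA length wrapped in one nucleosome vs. the particle's own diameter.
# The 0.34 nm-per-base-pair rise is an assumed textbook value for B-form DNA.

bp_per_nucleosome = 147          # base pairs spooled around the histone core
rise_nm_per_bp = 0.34            # helical rise of B-form DNA (assumption)
nucleosome_diameter_nm = 11      # particle diameter given in the article

dna_length_nm = bp_per_nucleosome * rise_nm_per_bp    # ~50 nm of DNA per nucleosome
print(dna_length_nm / nucleosome_diameter_nm)         # ~4.5-fold linear compaction
```

    So even the first, well-established packing level squeezes roughly 50 nm of DNA into an 11 nm particle; the disputed question is what happens at the next levels up.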

    To overcome the problem of visualizing chromatin in an intact nucleus, O’Shea’s team screened a number of candidate dyes, eventually finding one that could be precisely manipulated with light to undergo a complex series of chemical reactions that would essentially “paint” the surface of DNA with a metal so that its local structure and 3D polymer organization could be imaged in a living cell. The team partnered with UC San Diego professor and microscopy expert Mark Ellisman, one of the paper’s coauthors, to exploit an advanced form of electron microscopy that tilts samples in an electron beam enabling their 3D structure to be reconstructed. By combining their chromatin dye with electron-microscope tomography, they created ChromEMT.

    The team used ChromEMT to image and measure chromatin in resting human cells and during cell division when DNA is compacted into its most dense form—the 23 pairs of mitotic chromosomes that are the iconic image of the human genome. Surprisingly, they did not see any of the higher-order structures of the textbook model anywhere.

    From left: Horng Ou and Clodagh O’Shea. Credit: Salk Institute.

    “The textbook model is a cartoon illustration for a reason,” says Horng Ou, a Salk research associate and the paper’s first author. “Chromatin that has been extracted from the nucleus and subjected to processing in vitro—in test tubes—may not look like chromatin in an intact cell, so it is tremendously important to be able to see it in vivo.”

    What O’Shea’s team saw, in both resting and dividing cells, was chromatin whose “beads on a string” did not form any of the theorized higher-order structures at 30, 120 or 320 nanometers. Instead, it formed a semi-flexible chain, which they painstakingly measured as varying continuously along its length between just 5 and 24 nanometers, bending and flexing to achieve different levels of compaction. This suggests that it is chromatin’s packing density, and not some higher-order structure, that determines which areas of the genome are active and which are suppressed.

    With their 3D microscopy reconstructions, the team was able to move through a 250 nm x 1000 nm x 1000 nm volume of chromatin’s twists and turns, and envision how a large molecule like RNA polymerase, which transcribes (copies) DNA, might be directed by chromatin’s variable packing density, like a video game aircraft flying through a series of canyons, to a particular spot in the genome. Besides potentially upending the textbook model of DNA organization, the team’s results suggest that controlling access to chromatin could be a useful approach to preventing, diagnosing and treating diseases such as cancer.

    “We show that chromatin does not need to form discrete higher-order structures to fit in the nucleus,” adds O’Shea. “It’s the packing density that could change and limit the accessibility of chromatin, providing a local and global structural basis through which different combinations of DNA sequences, nucleosome variations and modifications could be integrated in the nucleus to exquisitely fine-tune the functional activity and accessibility of our genomes.”

    Future work will examine whether chromatin’s structure is universal among cell types or even among organisms.

    Other authors included Sébastien Phan, Thomas Deerinck and Andrea Thor of UC San Diego.

    The work was largely funded by the W. M. Keck Foundation, the NIH 4D Nucleome Roadmap Initiative and the Howard Hughes Medical Institute, with additional support from the William Scandling Trust, the Price Family Foundation and the Leona M. and Harry B. Helmsley Charitable Trust.

    See the full article here.


    Salk Institute Campus

    Every cure has a starting point. Like Dr. Jonas Salk when he conquered polio, Salk scientists are dedicated to innovative biological research. Exploring the molecular basis of diseases makes curing them more likely. In an outstanding and unique environment we gather the foremost scientific minds in the world and give them the freedom to work collaboratively and think creatively. For over 50 years this wide-ranging scientific inquiry has yielded life-changing discoveries impacting human health. We are home to Nobel Laureates and members of the National Academy of Sciences who train and mentor the next generation of international scientists. We lead biological research. We prize discovery. Salk is where cures begin.

  • richardmitnick 10:26 am on July 20, 2017 Permalink | Reply
    Tags: Elizabeth Davis spent 21 years trying to receive a correct diagnosis from doctors about her condition which prevented her toes from uncurling causing her to walk with crutches for most of her life, Genomics, GTPCH1 impairs her ability to produce dopa, Mutagenesis, NCGENES project, They were able to treat it — with something as simple as a pill. A pill that has been on the market since 1988 used to treat patients with Parkinson’s disease

    From UNC: “The Cure Code” 

    University of North Carolina

    July 18th, 2017
    Alyssa LaFaro

    Davis can now walk fully unsupported and live a relatively normal life thanks to a correct diagnosis from UNC researchers within the NCGENES project. No image credit.

    “Consider this: In 1969, if a disease-linked gene was found in humans, scientists had no simple means to understand the nature of the mutation, no mechanism to compare the altered gene to normal form, and no obvious method to reconstruct the gene mutation in a different organism to study its function. By 1979, that same gene could be shuttled into bacteria, spliced into a viral vector, delivered into the genome of a mammalian cell, cloned, sequenced, and compared to the normal form.” —Siddhartha Mukherjee, “The Gene: An Intimate History”

    “I can move my toes,” Elizabeth Davis says.

    Her 9-year-old son looks at her in awe. The two stand, wide-eyed in the middle of a Verizon Wireless store in Goldsboro, North Carolina. Davis leans hard against her crutches, staring at her feet. She looks up and smiles.

    At age 37 — for the first time in 31 years — Davis can uncurl her toes from a locked position, the symptom of a condition gone misdiagnosed for just as long. Three months later, she sheds her crutches, walking fully unsupported — something she hasn’t done since she was 14 years old.

    In 1975, the same year Davis was born, UNC microbiologists Clyde Hutchison and Marshall Edgell experienced a different kind of life-changing event. They’d been working rigorously to isolate DNA within the smallest-known virus at the time, Phi-X174. More than anything, they wanted to understand how to read the genetic code. Then, later that year and across the pond at St. John’s College in Cambridge, Fred Sanger figured it out. The British biochemist became the first person to develop a relatively rapid method for sequencing DNA, a discovery that won him a Nobel Prize in Chemistry — for the second time.

    In response to Sanger’s discovery, Hutchison took a sabbatical and headed to England to work in his lab. During his first year there, he helped uncover the entire sequence of Phi-X174 — the first time this had been done for any organism. While there, he realized the new ability to read DNA could help him and Edgell solve a different problem they’d been having back in North Carolina: fusing two pieces of DNA code together to create an entirely different sequence.

    After returning to Chapel Hill, Hutchison continued his work with Edgell and also Michael Smith, a researcher at the University of British Columbia whom he had met while working in Sanger’s lab. Together, the trio successfully fused two differing DNA strands using a more flexible approach to site-directed mutagenesis — a technique that makes gene therapy possible today. They published their results in 1978. Smith would go on to receive the Nobel Prize for this work in 1993.


    The scientific breakthroughs of the 1970s changed the field of genetics forever. In 1980, Sanger received the Nobel Prize for Chemistry for his contributions, along with Walter Gilbert (Harvard), who discovered that individual modules from different genes mix and match to construct entirely new genes; and Paul Berg (Stanford), who developed a technique for splicing recombinant DNA.

    Meanwhile, researchers in Chapel Hill continued to chip away at the mysteries of the gene. Oliver Smithies, who came to UNC in 1987, would later win the Nobel Prize for his work in gene targeting using mouse models. That same year, UNC cancer geneticist Michael Swift and his team discovered the AT gene, which predisposes women to breast cancer, and George McCoy became the first clinical trial participant in the world to receive the genetically engineered Factor VIII gene to treat his hemophilia at the then-UNC Thrombosis and Hemostasis Center.

    Genetics was changing the world. And this was only the beginning.

    An unsolved mystery

    One year after Sanger won the Nobel Prize, Elizabeth Davis turned 6. She soon began walking on her toes, which had suddenly, one day, curled under in pain, making it nearly impossible for her to stride with feet flat on the ground. Her knees knocked together as she struggled to move with the swift pace characteristic of a child her age. Davis continued to walk on her toes for years.

    “I would even brace the school walls when walking down the hallway,” she says. Eventually, the pain became unbearable. By the time she was 12, she’d resigned herself to crutches.

    Doctors believed Davis’ condition could be treated with foot surgery, misdiagnosing her condition for years. By age 14, she had already undergone three procedures — two to lengthen her Achilles tendons and an experimental bone fusion. But each surgery offered little to no relief, and walking only grew more painful for Davis, both physically and emotionally. As her condition worsened, her classmates became cruel — so much so that she dropped out of high school when she was just 16.

    By age 20, Davis grew restless. “The pain was constant,” she remembers. “I could hardly move my legs — they just felt weak. I would drag them behind me as I used my crutches. I couldn’t even lift them.” Doctors suggested she undergo a third Achilles tendon lengthening surgery, the result of which minimally improved her condition.

    “By that age, I just wanted more,” Davis says. “I just wanted to do things, to go places. I wanted the surgery to work. But it didn’t. And the pain continued.”

    It would be another 17 years before doctors realized the problem was hidden in her genome.

    The birth of a department

    In 1990, the start of the Human Genome Project — an international research program to map the roughly 20,000 genes that define human beings — further fueled new discoveries in the field of genetics. So when Jeff Houpt, then-UNC School of Medicine dean, formed a research advisory committee in 1997 and asked his faculty which research program the university most needed to focus on, they responded: genetics and genome sciences.

    Great minds think alike. At the same time, the College of Arts and Sciences was also hosting its own committee that aimed to develop a genetics department. “At this point, I had a vision for a pan-university program,” Houpt shares. “This wasn’t just going to be a program of the medical school.”

    Along with the College, the schools of public health, dentistry, pharmacy, nursing, and information and library science all wanted in, offering financial assistance to the program. Then-Provost Robert Shelton and Chancellor James Moeser both signed off on it as well. “What we wanted from Shelton and Moeser was more money and more positions,” Houpt remembers. “And they agreed to that.”

    By 2000, a hiring committee was ready to interview candidates to chair the new department and genomics center. Terry Magnuson quickly emerged as the lead candidate. He and his team had spent the past 16 years researching developmental abnormalities using genetics and mouse models, successfully changing the genetic background of a mutated gene.

    “It was obvious he was going to have a following,” Houpt remembers. “People were going to listen to him because he’s a good scientist. But more than that, it was pretty clear that Terry was interested in building a program, and this university-wide effort appealed to him.”

    Unanswered pain

    By the time she reached her 30s, Davis’ condition had spread to her arms. She underwent multiple MRIs, nerve and muscle testing, and a spinal tap. She even endured a fifth, unsuccessful surgery on her feet. Physicians misdiagnosed her yet again. A few believed she suffered from hereditary spastic paraplegia, a genetic condition that causes weakness in the legs and hips. Another told her she had cerebral palsy. “But I didn’t want to believe him,” she says — and it’s a good thing she didn’t.

    As Davis continued her search for answers, walking grew more and more painful. “I was always in pain,” she admits. “But some weeks were really, really bad — to the point where I couldn’t even move.” She finally succumbed to the assistance of a wheelchair. “I hated it so much. I barely went anywhere.” And when she did, she needed help.

    Her mother assisted her regularly with everyday tasks like grocery shopping. Her youngest son, Alex, learned to expertly navigate her around high school gyms, baseball fields, and the local YMCA pool so she could watch her other son, Myles, compete in the plethora of sports he participated in.

    “Myles really experienced the worst of it,” Davis says. “I remember one time, in particular. I was taking a shower and knew I was about to fall. I called for him and he came running. He was always there to pick me back up.”

    Sequences and algorithms

    After the Human Genome Project published its results in 2004, genomic sequencing became an option for people with undiagnosed diseases. But analyzing and understanding the 3 billion base pairs that make up a person’s genetic identity was an expensive process. As time progressed and technology improved, though, the technique became more manageable for both physicians and patients.

    Using these new genomic technologies for outpatient care intrigued UNC geneticists James Evans and Jonathan Berg. In 2009, after gathering enough preliminary data, the NIH granted the team the funds to start the North Carolina Clinical Genomic Evaluation by NextGen Exome Sequencing (NCGENES), which uses whole exome sequencing (WES) to uncover the root causes of undiagnosed diseases. Using just two tablespoons of blood, WES reads the roughly 1 percent of the genome that codes for proteins — a feat that is both miraculous and controversial, creating a whole new wave of ethical questions.

    Simply put: “Some people want information that other people don’t,” Evans explains. Most people want to know about genetic disorders that have treatment options, but when it comes to those that don’t, they’d rather not hear it. “Navigating those different viewpoints can be a challenge,” he says. Privacy and confidentiality also present problems within the insurance world. Although protections exist in the realm of medical insurance, major genetic predispositions could have large implications for life, disability, and long-term care insurance.

    Today, upward of 50 researchers from across Carolina participate in NCGENES to study everything from the protection of data to the delivery of results. More than 750 people with undiagnosed diseases have undergone testing.

    NCGENES wouldn’t exist without the technical infrastructure that tracks, categorizes, and helps analyze genetic material as it makes its way through multiple laboratories — all of which is provided by UNC’s Renaissance Computing Institute (RENCI). A developer of data science cyberinfrastructure, RENCI provides the software programming that helps the team at NCGENES analyze genomes more effectively.

    “You need new computer algorithms to solve new science problems,” RENCI Director Stan Ahalt says. “It takes a multidisciplinary team to understand science problems like genetics — and computer code to make that process go fast.”

    A transformative experience

    By 2013, Davis was in desperate need of a new algorithm. Thankfully, that year, she was referred to Jane Fan, a pediatric neurologist at UNC. After studying Davis’ file, Fan felt sure that the doctors who tried to diagnose her condition failed, making her the perfect candidate for NCGENES.

    Four tubes of blood, 100,000 possible genetic locations, and just over six months later, Fan called Davis. A mutation in a single gene, GTPCH1, impairs her ability to produce dopa, an amino acid crucial for nervous system function. “I had to hear it in person before I believed it,” Davis admits. “I had been misdiagnosed many times before.”

    Not only were UNC geneticist James Evans and his NCGENES team finally able to accurately diagnose Davis, but they were able to treat it — with something as simple as a pill. A pill that has been on the market since 1988, used to treat patients with Parkinson’s disease.

    And just like that, Davis’ life was changed forever by genome sequencing.

    Three days after she took one-quarter of a pill, as she stood in the middle of that Verizon Wireless store in Goldsboro, movement returned to her toes. She began to cry.

    Top-five in the country

    UNC’s genetics department has ranked in the top-five programs for NIH funding across the nation every year since 2012 (and top-10 each year since 2006). “I think we’ve built one of the best genetics departments in the country,” Magnuson says. In 2016 alone, genetics department faculty brought $38 million to Carolina.

    Houpt agrees with Magnuson’s sentiment. “The genetics department is a great example of how universities should run,” he says. “People need to put aside their own interests and see what’s needed. Terry is a leader who’s made each school involved feel like it’s their program and not just a medical school program – which is why he’s now the vice chancellor for research.”

    Today, more than 80 faculty members from across campus conduct world-recognized genetics research in multiple disciplines.

    Ned Sharpless, for example, focuses on cancer. Most recently, the director of the UNC Lineberger Comprehensive Cancer Center led a study that paired UNCseq — a genetic sequencing protocol that produces volumes of genetic information from a patient’s tumor — with IBM Watson’s ability to quickly pull information from millions of medical papers. A procedure much too intense and time-consuming for the human mind, this data analysis can help physicians make more informed decisions about patient care.

    Another member of Carolina’s Cancer Genetics Program, Charles Perou uses genomics to characterize the diversity of breast cancer tumors — research that helps doctors guarantee patients more individualized care. In 2011, he cofounded GeneCentric, which uses personalized molecular diagnostic assays and targeted drug development to treat cancer.

    In 2015, geneticist Aravind Asokan started StrideBio with University of Florida biochemist Mavis Agbandje-McKenna. The gene therapy company develops novel adeno-associated viral (AAV) vector technologies for treating rare diseases. Although still in its infancy, the company has already partnered with CRISPR Therapeutics and received an initial investment from Hatteras Venture Partners. Asokan has spent nearly a decade studying AAV — and earlier helped cofound Bamboo Therapeutics, acquired by Pfizer for $645 million just last year.

    In 2016, current genetics department Chair Fernando Pardo-Manuel de Villena challenged both Darwin’s theory of natural selection and Mendel’s law of segregation through researching a mouse gene called R2d2. In doing so, he found that a selfish gene can become fixed in a population of organisms while, at the same time, being detrimental to “reproductive fitness” — a discovery that shows the swiftness with which the genome can change, creating implications for an array of fields from basic biology to agriculture and human health.

    A former student of Oliver Smithies, Beverly Koller uses gene targeting in mice to better understand diseases like cystic fibrosis, asthma, and arthritis — research that will ultimately lead to better treatments. Similarly, Mark Heise observes mice to study diseases caused by viruses, including infectious arthritis and encephalitis (inflammation of the brain). Both researchers are part of the Collaborative Cross project, a large panel of inbred mouse strains that helps map genetic traits — a resource that is UNC-led, according to Magnuson.

    Genetics research extends far beyond the UNC School of Medicine. In 2009, for example, chemist Kevin Weeks and his research team decoded the HIV genome, advancing the development of new therapies and treatments. UNC sociologist Gail Henderson runs the Center for Genomics and Society, which provides research and training on the ethical, legal, and social implications of genomic research. In 2015, UNC Eshelman School of Pharmacy Dean Bob Blouin helped the school become the first U.S. hub to join the international Structural Genomics Consortium — focused on discovering selective small molecules and protein kinases to help speed the creation of new medicines for patients.

    From crutches to a 5K

    After just three months of treatment, Davis walked fully unsupported for the first time since she was 6 years old. She’s since traversed Hershey Park in Pennsylvania, strolled around the World Trade Center in New York, and regularly participated in yoga and spin classes. This past May, she walked her first 5K. “I have crazy endurance,” she says. “When your body feels good, you just want to keep on going.”

    Perhaps more importantly, Davis is able to attend Alex’s sports games without assistance. “When I used to walk into the gym on crutches to watch my oldest son play basketball, everyone would look at my crutches and my legs,” she says. “Now, when I go watch my youngest son play, I have so much more confidence walking into the gym. People see me.”

    See the full article here.


    UNC campus

    Carolina’s vibrant people and programs attest to the University’s long-standing place among leaders in higher education since it was chartered in 1789 and opened its doors for students in 1795 as the nation’s first public university. Situated in the beautiful college town of Chapel Hill, N.C., UNC has earned a reputation as one of the best universities in the world. Carolina prides itself on a strong, diverse student body, academic opportunities not found anywhere else, and a value unmatched by any public university in the nation.

  • richardmitnick 8:47 am on July 17, 2017 Permalink | Reply
    Tags: , CRISPR-Cas3, , , Genomics, ,   

    From HMS: “Bringing CRISPR into Focus” 

    Harvard University

    Harvard Medical School

    June 29, 2017

    CRISPR-Cas3 is a subtype of the CRISPR-Cas system, a widely adopted molecular tool for precision gene editing in biomedical research. Aspects of its mechanism of action, however, particularly how it searches for its DNA targets, had remained unclear, and concerns about unintended off-target effects have raised questions about the safety of CRISPR-Cas for treating human diseases.

    Harvard Medical School and Cornell University scientists have now generated near-atomic resolution snapshots of CRISPR that reveal key steps in its mechanism of action. The findings, published in Cell on June 29, provide the structural data necessary for efforts to improve the efficiency and accuracy of CRISPR for biomedical applications.

    Through cryo-electron microscopy, the researchers describe for the first time the exact chain of events as the CRISPR complex loads target DNA and prepares it for cutting by the Cas3 enzyme. These structures reveal a process with multiple layers of error detection—a molecular redundancy that prevents unintended genomic damage, the researchers say.

    High-resolution details of these structures shed light on ways to ensure accuracy and avert off-target effects when using CRISPR for gene editing.

    “To solve problems of specificity, we need to understand every step of CRISPR complex formation,” said Maofu Liao, assistant professor of cell biology at Harvard Medical School and co-senior author of the study. “Our study now shows the precise mechanism for how invading DNA is captured by CRISPR, from initial recognition of target DNA and through a process of conformational changes that make DNA accessible for final cleavage by Cas3.”

    Target search

    Discovered less than a decade ago, CRISPR-Cas is an adaptive defense mechanism that bacteria use to fend off viral invaders. The process involves bacteria capturing snippets of viral DNA, which are then integrated into the bacterial genome, where they produce short RNA sequences known as crRNA (CRISPR RNA). These crRNA snippets are used to spot “enemy” presence.

    Acting like a barcode, crRNA is loaded onto members of the CRISPR family of enzymes, which act as sentries that roam the bacterium and monitor for foreign code. If these riboprotein complexes encounter genetic material that matches their crRNA, they chop up that DNA to render it harmless. CRISPR-Cas subtypes, notably Cas9, can be programmed with synthetic RNA in order to cut genomes at precise locations, allowing researchers to edit genes with unprecedented ease.
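    The “barcode” idea above can be sketched as a toy string search. This is an illustration only (invented sequences, and a plain exact-match comparison rather than real RNA–DNA pairing):

```python
# Toy illustration of the barcode comparison (invented sequences, not real
# biology): slide the crRNA-encoded target along the DNA and report matches.
def crrna_matches(dna: str, crrna: str) -> list[int]:
    """Start positions where the crRNA barcode matches the DNA exactly."""
    return [i for i in range(len(dna) - len(crrna) + 1)
            if dna[i:i + len(crrna)] == crrna]

print(crrna_matches("ATGGCGTACCGATTACGGCGTAC", "CGTAC"))  # [4, 18]
```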

    To better understand how CRISPR-Cas functions, Liao partnered with Ailong Ke of Cornell University. Their teams focused on type I CRISPR, the most common subtype in bacteria, which uses a riboprotein complex known as CRISPR Cascade for DNA capture and the enzyme Cas3 for cutting foreign DNA.

    Through a combination of biochemical techniques and cryo-electron microscopy, they reconstituted stable Cascade in different functional states, and further generated snapshots of Cascade as it captured and processed DNA at a resolution of up to 3.3 angstroms—or roughly three times the diameter of a carbon atom.

    A sample cryo-electron microscope image of CRISPR molecules (left). The research team combined hundreds of thousands of particles into 2D averages (right) before turning them into 3D projections. Image: Xiao et al.

    Seeing is believing

    In CRISPR-Cas3, crRNA is loaded onto CRISPR Cascade, which searches for a very short DNA sequence known as PAM that indicates the presence of foreign viral DNA.

    Liao, Ke and their colleagues discovered that as Cascade detects PAM, it bends DNA at a sharp angle, forcing a small portion of the DNA to unwind. This allows an 11-nucleotide stretch of crRNA to bind with one strand of target DNA, forming a “seed bubble.”

    The seed bubble acts as a fail-safe mechanism to check whether the target DNA matches the crRNA. If they match correctly, the bubble is enlarged and the remainder of the crRNA binds with its corresponding target DNA, forming what is known as an “R-loop” structure.

    Once the R-loop is completely formed, the CRISPR Cascade complex undergoes a conformational change that locks the DNA into place. It also creates a bulge in the second, non-target strand of DNA, which is run through a separate location on the Cascade complex.

    Only when a full R-loop state is formed does the Cas3 enzyme bind and cut the DNA at the bulge created in the non-target DNA strand.
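    The three checkpoints just described (PAM recognition, seed-bubble pairing, full R-loop formation) can be caricatured as a staged verification. This is a hedged sketch with invented sequences, a simplified PAM, and string comparisons standing in for the real biochemistry:

```python
# Hedged sketch of the layered error checking described above, reduced to
# string comparisons. The PAM sequence, seed length, and return messages are
# simplifications for illustration, not the real biochemistry.
SEED_LEN = 11  # length of the crRNA "seed" that must pair first

def cascade_cas3(target_dna: str, pam: str, crrna: str) -> str:
    # Checkpoint 1: Cascade only engages DNA that begins with a PAM.
    if not target_dna.startswith(pam):
        return "released: no PAM"
    protospacer = target_dna[len(pam):]
    # Checkpoint 2: the 11-nt seed bubble must pair with the crRNA.
    if protospacer[:SEED_LEN] != crrna[:SEED_LEN]:
        return "released: seed mismatch"
    # Checkpoint 3: the full R-loop must form before Cas3 is recruited.
    if protospacer[:len(crrna)] != crrna:
        return "released: incomplete R-loop"
    return "cut by Cas3"

crrna = "GATTACAGATTACAGATTAC"
print(cascade_cas3("TTC" + crrna + "GG", "TTC", crrna))  # cut by Cas3
print(cascade_cas3("AAA" + crrna, "TTC", crrna))         # released: no PAM
```

    Each early return models one of the fail-safes: DNA that fails any checkpoint is released unharmed, and only a complete match reaches the Cas3 cut.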

    The findings reveal an elaborate redundancy to ensure precision and avoid mistakenly chopping up the bacteria’s own DNA.

    CRISPR forms a “seed bubble” state, which acts as an initial fail-safe mechanism to ensure that CRISPR RNA matches its target DNA. Image: Liao Lab/HMS.

    “To apply CRISPR in human medicine, we must be sure the system is accurate and that it does not target the wrong genes,” said Ke, who is co-senior author of the study. “Our argument is that the CRISPR-Cas3 subtype has evolved to be a precise system that carries the potential to be a more accurate system to use for gene editing. If there is mistargeting, we know how to manipulate the system because we know the steps involved and where we might need to intervene.”

    Setting the sights

    Structures of CRISPR Cascade without target DNA and in its post-R-loop conformational states have been described, but this study is the first to reveal the full sequence of events from seed bubble formation to R-loop formation at high resolution.

    In contrast to the scalpel-like Cas9, CRISPR-Cas3 acts like a shredder that chews DNA up beyond repair. While CRISPR-Cas3 has, thus far, limited utility for precision gene editing, it is being developed as a tool to combat antibiotic-resistant strains of bacteria. A better understanding of its mechanisms may broaden the range of potential applications for CRISPR-Cas3.

    In addition, all CRISPR-Cas subtypes utilize some version of an R-loop formation to detect and prepare target DNA for cleavage. The improved structural understanding of this process can now enable researchers to work toward modifying multiple types of CRISPR-Cas systems to improve their accuracy and reduce the chance of off-target effects in biomedical applications.

    “Scientists hypothesized that these states existed but they were lacking the visual proof of their existence,” said co-first author Min Luo, postdoctoral fellow in the Liao lab at HMS. “The main obstacles came from stable biochemical reconstitution of these states and high-resolution structural visualization. Now, seeing really is believing.”

    “We’ve found that these steps must occur in a precise order,” Luo said. “Evolutionarily, this mechanism is very stringent and has triple redundancy, to ensure that this complex degrades only invading DNA.”

    Additional authors on the study include Yibei Xiao, Robert P. Hayes, Jonathan Kim, Sherwin Ng, and Fang Ding.

    This work is supported by National Institutes of Health grants GM 118174 and GM102543.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    HMS campus

    Established in 1782, Harvard Medical School began with a handful of students and a faculty of three. The first classes were held in Harvard Hall in Cambridge, long before the school’s iconic quadrangle was built in Boston. With each passing decade, the school’s faculty and trainees amassed knowledge and influence, shaping medicine in the United States and beyond. Some community members—and their accomplishments—have assumed the status of legend. We invite you to access the following resources to explore Harvard Medical School’s rich history.

    Harvard University campus

    Harvard is the oldest institution of higher education in the United States, established in 1636 by vote of the Great and General Court of the Massachusetts Bay Colony. It was named after the College’s first benefactor, the young minister John Harvard of Charlestown, who upon his death in 1638 left his library and half his estate to the institution. A statue of John Harvard stands today in front of University Hall in Harvard Yard, and is perhaps the University’s best known landmark.

    Harvard University has 12 degree-granting Schools in addition to the Radcliffe Institute for Advanced Study. The University has grown from nine students with a single master to an enrollment of more than 20,000 degree candidates including undergraduate, graduate, and professional students. There are more than 360,000 living alumni in the U.S. and over 190 other countries.

  • richardmitnick 1:21 pm on July 8, 2017 Permalink | Reply
    Tags: , , , , , , Genomics, , , UCSD Comet supercomputer   

    From Science Node: “Cracking the CRISPR clock” 

    Science Node bloc
    Science Node

    05 Jul, 2017
    Jan Zverina

    SDSC Dell Comet supercomputer

    Capturing the motion of gyrating proteins at time intervals up to one thousand times greater than previous efforts, a team led by University of California, San Diego (UCSD) researchers has identified the myriad structural changes that activate and drive CRISPR-Cas9, the innovative gene-splicing technology that’s transforming the field of genetic engineering.

    By shedding light on the biophysical details governing the mechanics of CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats) activity, the study provides a fundamental framework for designing a more efficient and accurate genome-splicing technology that avoids the ‘off-target’ DNA breaks currently frustrating the potential of the CRISPR-Cas9 system, particularly for clinical uses.

    Shake and bake. Gaussian accelerated molecular dynamics simulations and state-of-the-art supercomputing resources reveal the conformational change of the HNH domain (green) from its inactive to active state. Courtesy Giulia Palermo, McCammon Lab, UC San Diego.

    “Although the CRISPR-Cas9 system is rapidly revolutionizing life sciences toward a facile genome editing technology, structural and mechanistic details underlying its function have remained unknown,” says Giulia Palermo, a postdoctoral scholar with the UC San Diego Department of Pharmacology and lead author of the study [PNAS].

    See the full article here.

    Please help promote STEM in your local schools.
    STEM Icon

    Stem Education Coalition

    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

  • richardmitnick 2:36 pm on July 5, 2017 Permalink | Reply
    Tags: A Whole-Genome Sequenced Rice Mutant Resource for the Study of Biofuel Feedstocks, , Fast-neutron irradiation causes different types of mutations, , Genomics, Kitaake: a model rice variety with a short life cycle,   

    From LBNL: “A Whole-Genome Sequenced Rice Mutant Resource for the Study of Biofuel Feedstocks” 

    Berkeley Logo

    Berkeley Lab

    July 5, 2017
    Sarah Yang
    (510) 486-4575

    JBEI researchers create open-access web portal to accelerate functional genetic research in plants.

    Genome-wide distribution of fast-neutron-induced mutations in the Kitaake rice mutant population (green). Even distribution of mutations is important to achieve saturation of the genome. Colored lines (center) represent translocations of DNA fragments from one chromosome to another. (Credit: Guotian Li and Rashmi Jain/Berkeley Lab).

    Rice is a staple food for over half of the world’s population and a model for studies of candidate bioenergy grasses such as sorghum, switchgrass, and Miscanthus. To optimize crops for biofuel production, scientists are seeking to identify genes that control key traits such as yield, resistance to disease, and water use efficiency.

    Populations of mutant plants, each one having one or more genes altered, are an important tool for elucidating gene function. With whole-genome sequencing at the single nucleotide level, researchers can infer the functions of the genes by observing the gain or loss of particular traits. But the utility of existing rice mutant collections has been limited by several factors, including the cultivars’ relatively long six-month life cycle and the lack of sequence information for most of the mutant lines.

    In a paper published in The Plant Cell, a team led by Pamela Ronald, a professor in the Genome Center and the Department of Plant Pathology at UC Davis and director of Grass Genetics at the Department of Energy’s (DOE’s) Joint BioEnergy Institute (JBEI), with collaborators from UC Davis and the DOE Joint Genome Institute (JGI), reported the first whole-genome sequenced fast-neutron induced mutant population of Kitaake, a model rice variety with a short life cycle.

    Kitaake (Oryza sativa L. ssp. japonica) completes its life cycle in just nine weeks and is not sensitive to photoperiod changes. This novel collection will accelerate functional genetic research in rice and other monocots, a group of flowering plants that includes the grasses.

    “Some of the most popular rice varieties people use right now only have two generations per year. Kitaake has up to four, which really speeds up functional genomics work,” said Guotian Li, a project scientist at Lawrence Berkeley National Laboratory (Berkeley Lab) and deputy director of Grass Genetics at JBEI.

    In a previously published pilot study [Molecular Plant], Li, Mawsheng Chern, and Rashmi Jain, co-first authors on The Plant Cell paper, demonstrated that fast-neutron irradiation produced abundant and diverse mutations in Kitaake, including single base substitutions, deletions, insertions, inversions, translocations, and duplications. Other techniques that have been used to generate rice mutant populations, such as the insertion of gene and chromosome segments and the use of gene editing tools like CRISPR-Cas9, generally produce a single type of mutation, Li noted.

    “Fast-neutron irradiation causes different types of mutations and gives different alleles of genes so we really can get something that’s not achievable from other collections,” he said.
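    The simpler mutation categories listed above (substitutions, insertions, deletions) can be told apart by comparing reference and mutant alleles. This is a rough sketch only; inversions, translocations, and duplications require dedicated structural-variant calling that a plain allele comparison cannot capture:

```python
# Rough sketch: classifying simple variant types from ref/alt alleles.
# Structural changes (inversions, translocations, duplications) are not
# recoverable from a comparison like this and fall into "complex".
def classify(ref: str, alt: str) -> str:
    if len(ref) == len(alt) == 1:
        return "substitution"
    if len(ref) < len(alt) and alt.startswith(ref):
        return "insertion"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "deletion"
    return "complex"

print(classify("A", "G"))      # substitution
print(classify("A", "ATTG"))   # insertion
print(classify("ACGT", "A"))   # deletion
```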

    Whole-genome sequencing of this mutant population – 1,504 lines in total with 45-fold coverage – allowed the researchers to pinpoint each mutation at a single-nucleotide resolution. They identified 91,513 mutations affecting 32,307 genes, 58 percent of all genes in the roughly 389-megabase rice genome. A high proportion of these were loss-of-function mutations.
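    A quick back-of-the-envelope check of these figures (the implied total gene count below is derived from the reported 58 percent, not quoted from the paper):

```python
# Back-of-the-envelope check of the numbers above. The implied total gene
# count is derived from the reported 58 percent, not quoted from the paper.
mutations = 91_513
genes_hit = 32_307
total_genes = round(genes_hit / 0.58)  # ~55,700 genes implied
genome_mb = 389
print(f"{genes_hit / total_genes:.0%} of genes hit")
print(f"~{mutations / genome_mb:.0f} mutations per Mb across the 1,504 lines")
```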

    Using this mutant collection, the Grass Genetics group identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line with a population containing just 50 plants. In contrast, researchers needed more than 16,000 plants to identify the same gene using the conventional approach.

    “This comparison clearly demonstrates the power of the sequenced mutant population for rapid genetic analysis,” said Ronald.

    This high-density, high-resolution catalog of mutations, developed with JGI’s help, provides researchers opportunities to discover novel genes and functional elements controlling diverse biological pathways. To facilitate open access to this resource, the Grass Genetics group has established a web portal called KitBase, which allows users to find information related to the mutant collection, including sequence, mutation and phenotypic data for each rice line. Additional information about the database can be found through JGI.

    Additional Berkeley Lab scientists who contributed to this work include co-first authors Rashmi Jain and Mawsheng Chern; Tong Wei and Deling Ruan, both affiliated with JBEI’s Feedstocks Division and with Berkeley Lab’s Environmental Genomics and Systems Biology Division; Nikki Pham and Kyle Jones of JBEI’s Feedstocks Division; and Joel Martin, Wendy Schackwitz, Anna Lipzen, Diane Bauer, Yi Peng, and Kerrie Barry of the JGI.

    Support for the research at JBEI, a DOE Bioenergy Research Center, and JGI, a DOE Office of Science User Facility, was provided by DOE’s Office of Science. Additional support was provided by the National Institutes of Health and the National Science Foundation.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    A U.S. Department of Energy National Laboratory Operated by the University of California

    University of California Seal

    DOE Seal

  • richardmitnick 12:20 pm on June 5, 2017 Permalink | Reply
    Tags: and personalized therapies, , DiscovEHR, , Genomics, Harvard Graduate School of Arts and Sciences, Learning to read your DNA, Population sequencing lands a knockout punch, Precision medicine, Scaling up genetic sequencing studies, The future: population sequencing   

    From Harvard: “Strength in Numbers: genetic sequencing of large populations is shaping the future of medicine” 

    Harvard University

    June 5, 2017
    Ryan L. Collins
    Figures by Brad Wierbowski

    Thanks to modern genetics, “precision medicine” is slowly becoming a reality: doctors can perform genetic tests to determine your risk for dozens of diseases, like stroke or liver disease, and can prescribe treatments or therapies tailored to your individual genetic makeup. Yet before doctors can provide you with precision medicine in practice, they first need to understand the genes of tens of thousands of other people. Excitingly, recent breakthroughs in genetics research have made it possible to study the genes of whole populations at once, and the lessons we are learning from those studies are rapidly changing our approach to diagnosing and treating disease.

    Learning to read your DNA

    You can picture your genome as a book with 3.1 billion letters, known as nucleotides, that encode the list of ~20,000 different molecular parts, or proteins, that make up every cell in your body. A book with 3.1 billion letters isn’t exactly a light read: the first time a human genome was ever read (“sequenced”) in its entirety, it took the combined efforts of more than 200 scientists, a timespan of 12 years (finally finishing in 2003), and a staggering total cost of $2.7 billion.

    In the subsequent 14 years, massive advances in sequencing technologies [NIH] [National Human Genome Research Institute] have transformed scientists into genetic speed-readers, with cutting-edge sequencing methods able to process your entire genome in under two days for a mere ~$1,500. Additional technologies have improved our efficiency by allowing targeted sequencing of the bits of your genome that spell out the blueprints for all human proteins, known as the exome. Surprisingly, the exome takes up barely more than ~1% of all of the nucleotides in your whole genome, which means exome sequencing is both faster and cheaper than genome sequencing (see Figure 1 for a comparison of exome vs. genome sequencing). The other ~99% of your DNA performs various “helper” functions in your cells but does not code for proteins. Since proteins are the most important cellular building blocks, and thus the most important determinants of disease, exome sequencing is a great way to home in on the most critical sections of your DNA.
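    The arithmetic behind these comparisons is simple enough to sketch (the dollar figures are the article’s ballpark numbers, not current prices):

```python
# Simple arithmetic behind the comparisons above; dollar figures are the
# article's ballpark numbers, not current prices.
genome_bp = 3_100_000_000   # whole genome, ~3.1 billion nucleotides
exome_fraction = 0.01       # exome is roughly 1% of the genome
exome_bp = int(genome_bp * exome_fraction)
print(f"exome ~= {exome_bp / 1e6:.0f} Mb of a {genome_bp / 1e9:.1f} Gb genome")

first_genome_cost = 2_700_000_000   # first human genome, finished 2003
modern_cost = 1_500                 # per-genome cost quoted above
print(f"cost drop: roughly {first_genome_cost // modern_cost:,}x")
```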


    Scaling up genetic sequencing studies

    The development of these groundbreaking sequencing technologies has opened countless promising research avenues. Not least among these is an effort known as “population sequencing,” or the process of sequencing the exomes or genomes of entire human communities, illustrated in Figure 2. Early examples of population sequencing, including the 1000 Genomes Project or the Exome Sequencing Project, combined genetic data from thousands of volunteers into vast datasets bursting with new knowledge about human biology.

    Figure 2: Overview of population sequencing. Studying the genomes of many thousands of people at once, known as population sequencing, is an exciting area of research and is producing countless new insights into human biology and medicine. Typically, a population sequencing experiment will involve sequencing the exomes or genomes of large groups of individuals with and without a trait, such as a disease like cancer, then comparing the differences in the genes of the two groups. Once researchers identify DNA changes associated with the trait of interest, they can then use that information to answer many important questions that help guide drug development, clinical practice, and therapeutic selection.
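    A minimal sketch of the case/control comparison described in the caption, with invented patients and gene names, might look like:

```python
# Minimal sketch of the case/control comparison: how often is each gene
# mutated in one group versus the other? Patients and gene names invented.
cases    = [{"GENE_A", "GENE_B"}, {"GENE_B"}, {"GENE_B", "GENE_C"}]
controls = [{"GENE_C"}, set(), {"GENE_A"}]

def freq(cohort, gene):
    """Fraction of a cohort carrying a mutation in the given gene."""
    return sum(gene in person for person in cohort) / len(cohort)

for gene in ("GENE_A", "GENE_B", "GENE_C"):
    print(gene, freq(cases, gene), "vs", freq(controls, gene))
```

    Genes whose mutation frequency differs sharply between the two groups become candidates for follow-up; real studies use statistical tests and far larger cohorts.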

    The initial successes of these population sequencing projects triggered a tidal wave of similar studies. Dozens of research groups rushed to apply similar methods, and the results came pouring in. For example, one 2014 study [Nature Genetics] sequenced the genomes of several thousand people from Iceland, identifying specific genes that may predispose Icelanders to early-onset heart disease and liver disease. As a second example, multiple studies [Nature] have used exome and genome sequencing in children to pinpoint over 60 genes strongly linked to autism. The list of these sequencing success stories is already lengthy, and continues to grow every month. Even more importantly, sequencing studies like these produce the information doctors and researchers need to screen your genome and provide a more complete picture of how your genes influence your individual health.

    Data sharing: ExACtly what the doctor ordered

    As of 2017, well over one million human exomes and genomes have been sequenced worldwide. In the realm of human genetics, bigger datasets are almost always more informative, so analyzing all of these data together seems like an obvious choice. These population-scale studies depend on volunteers like you to contribute their DNA, but unfortunately this process isn’t always straightforward. Genetic data sharing, even if performed strictly in a research context where no personal health information is transferred, is still fraught with ethical, legal, practical, and bureaucratic hurdles.

    In 2014, a large international alliance of researchers, led by Daniel MacArthur at The Broad Institute of M.I.T. and Harvard, set out to tackle these obstacles. They formed a collaborative group known as the Exome Aggregation Consortium (abbreviated “ExAC”), and combined exome sequences from over 60,000 healthy individuals from more than two-dozen independent studies conducted around the world to build a dataset nearly ten times larger than any other ever assembled. Their results, reported in the journal Nature, outlined the most comprehensive atlas of human genetic diversity to date, including individuals from nearly all major global populations and uncovering nearly five-and-a-half million genetic changes, known as mutations, never seen in any previous studies. This detailed mutation map immediately changed the landscape of human genetics research: in the short span since the team publicly released a draft of their results in late 2015, hundreds of scientific groups around the world have used the ExAC dataset, with over 600 peer-reviewed scientific publications citing ExAC in the last two years alone.
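    The aggregation idea behind a resource like ExAC can be sketched as pooling per-study allele counts into one frequency estimate (the study names and counts below are invented):

```python
# Sketch of the pooling idea: combine per-study allele counts into a single
# frequency estimate. Study names and counts are invented.
studies = {
    "study_A": {"alt_alleles": 12, "total_alleles": 10_000},
    "study_B": {"alt_alleles": 3,  "total_alleles": 4_000},
    "study_C": {"alt_alleles": 0,  "total_alleles": 26_000},
}
alt = sum(s["alt_alleles"] for s in studies.values())
total = sum(s["total_alleles"] for s in studies.values())
print(f"pooled allele frequency: {alt / total:.6f}")  # 15 / 40,000
```

    Pooling is what makes very rare mutations visible: a variant seen a handful of times across many cohorts may never recur within any single study.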

    The effects of ExAC on human genetics research have been profound. For instance, specific mutations in important, disease-causing genes might not always result in disease for certain people; researchers have now used the ExAC dataset to decipher why this might happen for a peculiar gene, known as PRNP, that causes multiple neurological disorders, such as fatal familial insomnia. A different 2016 study [Nature] performed exome sequencing on 14,133 individuals from northern Europe to identify “ultra-rare” mutations—genetic changes never seen in any of the 60,708 ExAC participants—and showed the number of these ultra-rare mutations in genes important for brain development can partially predict how many years an otherwise healthy individual is likely to stay in school. These and similar discoveries are already having a palpable impact in translational research and clinical medicine, and these advances wouldn’t have been possible without population-scale resources like ExAC.

    Combining genetics with medical records to make DiscovEHRies

    Like ExAC, a recent collaboration between Regeneron Pharmaceuticals, Inc., and Geisinger Health System, dubbed DiscovEHR, combined the exome sequences and full medical records of over 50,000 volunteers recruited at one of Geisinger’s clinics in Pennsylvania. As depicted in Figure 3, this fusion of genetic and medical data proved to be an even more powerful approach than analyzing just the genetic data alone. The DiscovEHR study, published in 2016 in the journal Science, compared medical data between patients with and without mutations in certain genes, and found that patients with mutations that disabled a small group of specific genes had lower cholesterol levels, which lowered their risk of serious heart disease. Current pharmaceutical strategies involve identifying such genes as “targets” for drug development, aimed at recreating the lower cholesterol levels caused by the mutation that disabled the gene in patients, with the ultimate goal of providing those drugs to individuals at high risk for heart disease but who lack these rare, protective gene mutations.

    Figure 3: Combining population sequencing with health records can identify new drug targets. A recent sequencing study of over 50,000 individuals, known as DiscovEHR, demonstrated the ability of population sequencing to generate new medical knowledge that can directly inform drug development and patient treatment choices. For example, the DiscovEHR study found genes that caused individuals to have lower blood cholesterol levels when inactivated by rare mutations, resulting in a reduced risk for major heart disease.
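    The genotype-to-health-record comparison in Figure 3 can be sketched as grouping patients by carrier status and comparing a lab value (all values below are invented):

```python
from statistics import mean

# Sketch: group patients by carrier status for an inactivating mutation and
# compare a lab value from their records. All values are invented.
patients = [
    {"carrier": True,  "ldl": 88},  {"carrier": True,  "ldl": 95},
    {"carrier": False, "ldl": 131}, {"carrier": False, "ldl": 142},
    {"carrier": False, "ldl": 120},
]
carriers     = [p["ldl"] for p in patients if p["carrier"]]
non_carriers = [p["ldl"] for p in patients if not p["carrier"]]
print(mean(carriers), "vs", mean(non_carriers))  # carriers have lower LDL
```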

    Population sequencing lands a knockout punch

    Except for genes on the sex chromosomes (X & Y), there are two copies of every gene in your genome, one inherited from your father and one from your mother. Gene-inactivating mutations are generally rare events, so it is extremely uncommon for a single individual to inherit disabled copies of the same gene from both parents. When this does occur, it’s called a “gene knockout,” and means that individual lacks the ability to produce any of the protein encoded by that gene. Not surprisingly, gene knockouts are the known cause of hundreds (if not thousands) of rare diseases.
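    The knockout definition above reduces to a simple rule: both inherited copies must carry an inactivating (loss-of-function) mutation. A hedged sketch with invented gene names:

```python
# A gene counts as "knocked out" only when BOTH inherited copies carry a
# loss-of-function (LoF) mutation. Gene names are invented.
def is_knockout(maternal_lof: bool, paternal_lof: bool) -> bool:
    return maternal_lof and paternal_lof

genotypes = {
    "GENE1": (True, True),    # LoF on both copies: knockout
    "GENE2": (True, False),   # one working copy remains
    "GENE3": (False, False),  # both copies intact
}
knockouts = [g for g, (m, p) in genotypes.items() if is_knockout(m, p)]
print(knockouts)  # ['GENE1']
```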

    In some cultures, marriage between first cousins is commonplace. Since first cousins are closely related genetically, their children are at a much higher risk of inheriting gene knockouts, making those children ideal individuals in whom to study the effects of gene knockouts in humans (such studies are usually conducted in mice or other “model organisms”). Last month, a team of researchers led by Sekar Kathiresan at the Massachusetts General Hospital reported in Nature on a population sequencing study of over 10,000 individuals from Pakistan, known as the PROMIS study, where the rate of first-cousin marriages is particularly high. Remarkably, the team found that at least 7% of all known protein-coding genes were knocked out in at least one individual without resulting in any obvious medical issues, meaning these genes might represent safe drug targets with little side-effect risk, as shown in Figure 4. Conversely, the PROMIS study also reported on a subset of individuals who were knockouts for current drug target genes thought to protect against heart disease, yet those individuals developed heart disease at the same rate as the general population. This study drives home the point that population sequencing can inform—and, in some cases, correct—drug development and prescription of clinical treatments.

    Figure 4: Rare gene knockouts can teach us why some drugs work and others don’t. Gene knockouts, or the situation where an individual inherits inactivated copies of a gene from both parents, are the known cause of hundreds of diseases because that individual is unable to produce any of the protein encoded by the knocked-out gene. However, not all gene knockouts cause disease. By studying healthy individuals with gene knockouts, scientists can learn about genes that might be safely knocked out by drugs, allowing for development of new targeted therapies. Conversely, scientists can also find genes that are targets of existing drugs but whose knockout in individuals in the general population doesn’t result in the expected health benefit.

    The future: population sequencing, precision medicine, and personalized therapies

    Population sequencing is changing the way companies design drugs and doctors diagnose diseases and choose therapies. Landmark large-scale studies like ExAC and DiscovEHR have proven that pooling genetic data across tens of thousands of individuals dramatically improves researchers’ abilities to make new discoveries about the causes of disease. Cataloguing healthy individuals with rare gene knockouts, like the PROMIS study, can advance our understanding of human physiology, produce new drug targets, and shed new light onto why drugs might fail in certain patients. Yet numerous issues continue to hinder advances in medical genetics: we still know very little about the genetics of disease in people of African or Asian ancestry, genetic data remains difficult to share between research groups, and we have still only sequenced less than 0.01% of all people on earth.

    Even despite these challenges, our knowledge of the genome has already revolutionized medicine. Modern clinics can now perform dozens of genetic tests to evaluate your risk for cancer, Alzheimer’s disease, and heart attacks, or can tell you the odds your future children might have autism, epilepsy, or physical birth defects. For certain diseases, especially cancer, having your genome sequenced helps doctors select drugs designed specifically for your genetic makeup, making sure you get the most effective personalized treatment possible.

    Just like the famous Greek philosopher Socrates once quipped, “to know thyself is the beginning of wisdom.” Today’s geneticists and doctors would probably agree with him: genetic testing can teach you more about yourself, and in the process, help you live a longer, happier, healthier life.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Harvard University campus

    Harvard is the oldest institution of higher education in the United States, established in 1636 by vote of the Great and General Court of the Massachusetts Bay Colony. It was named after the College’s first benefactor, the young minister John Harvard of Charlestown, who upon his death in 1638 left his library and half his estate to the institution. A statue of John Harvard stands today in front of University Hall in Harvard Yard, and is perhaps the University’s best known landmark.

    Harvard University has 12 degree-granting Schools in addition to the Radcliffe Institute for Advanced Study. The University has grown from nine students with a single master to an enrollment of more than 20,000 degree candidates including undergraduate, graduate, and professional students. There are more than 360,000 living alumni in the U.S. and over 190 other countries.
