Tagged: NERSC – National Energy Research Scientific Computing Center

  • richardmitnick 11:43 am on May 1, 2019 Permalink | Reply
    Tags: "The ‘Little’ Computer Cluster That Could" The Parallel Distributed Systems Facility (PDSF) cluster, , , , NERSC - National Energy Research for Scientific Computing Center,   

    From Lawrence Berkeley National Lab: “The ‘Little’ Computer Cluster That Could” 

    Berkeley Logo

    From Lawrence Berkeley National Lab

    May 1, 2019
    Glenn Roberts Jr.
    geroberts@lbl.gov
    (510) 486-5582

    Decades before “big data” and “the cloud” were a part of our everyday lives and conversations, a custom computer cluster based at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) enabled physicists around the world to remotely and simultaneously analyze and visualize data.

    The PDSF computer cluster in 2003. (Credit: Berkeley Lab)

    The Parallel Distributed Systems Facility (PDSF) cluster, which had served as a steady workhorse in supporting groundbreaking and even Nobel-winning research around the world since the 1990s, switched off last month.

    NERSC PDSF

    During its lifetime the cluster and its dedicated support team racked up many computing achievements and innovations in support of large collaborative efforts in nuclear physics and high-energy physics. Some of these innovations have persevered and evolved in other systems.

    The cluster handled data for experiments that produce a primordial “soup” of subatomic particles to teach us about the makings of matter, search for intergalactic particle signals deep within Antarctic ice, and hunt for dark matter in a mile-deep tank of liquid xenon at a former mine site. It also handled data for a space observatory mapping the universe’s earliest light, and for Earth-based observations of supernovas.

    It supported research leading to the discoveries of the morphing abilities of ghostly particles called neutrinos, the existence of the Higgs boson and the related Higgs field that generates mass through particle interactions, and the accelerating expansion rate of the universe that is attributed to a mysterious force called dark energy.

    CERN CMS Higgs Event


    CERN ATLAS Higgs Event

    Lambda-Cold Dark Matter, Accelerated Expansion of the Universe, Big Bang-Inflation (timeline of the universe). Date: 2010. (Credit: Alex Mittelmann, Coldcreation)

    Dark Energy Camera Gives Astronomers a Glimpse at the Cosmic Dawn. (Credit: National Astronomical Observatory of Japan)

    Some of PDSF’s collaboration users have transitioned to the Cori supercomputer at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC), with other participants moving to other systems. The transition to Cori gives users access to more computing power in an era of increasingly hefty and complex datasets and demands.

    NERSC

    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    NERSC Hopper Cray XE6 supercomputer


    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    Future:

    Cray Shasta Perlmutter SC18 AMD Epyc Nvidia pre-exascale supercomputer

    “A lot of great physics and science was done at PDSF,” said Richard Shane Canon, a project engineer at NERSC who served as a system lead for PDSF from 2003-05. “We learned a lot of cool things from it, and some of those things even became part of how we run our supercomputers today. It was also a unique partnership between experiments and a supercomputing facility – it was the first of its kind.”

    PDSF was small when compared to its supercomputer counterparts that handle a heavier load of computer processors, data, and users, but it had developed a reputation for being responsive and adaptable, and its support crew over the years often included physicists who understood the science as well as the hardware and software capabilities and limitations.

    “It was ‘The Little Engine That Could,’” said Iwona Sakrejda, a nuclear physicist who supported PDSF and its users for over a decade in a variety of roles at NERSC and retired from Berkeley Lab in 2015. “It was the ‘boutique’ computer cluster.”

    PDSF, because it was small and flexible, offered an R&D environment that allowed researchers to test out new ideas for analyzing and visualizing data. Such an environment may have been harder to find on larger systems, she said. Its size also afforded a personal touch.

    “When things didn’t work, they had more handholding,” she added, recalling the numerous researchers that she guided through the PDSF system – including early career researchers working on their theses.

    “It was gratifying. I developed a really good relationship with the users,” Sakrejda said. “I understood what they were trying to do and how their programs worked, which was important in creating the right architecture for what they were trying to accomplish.”

    She noted that because the PDSF system was constantly refreshed, it sometimes led to an odd assortment of equipment put together from different generations of hardware, in sharp contrast to the largely homogenous architecture of today’s supercomputers.

    PDSF participants included collaborations for the Sudbury Neutrino Observatory (SNO) in Canada, the Solenoid Tracker at Brookhaven National Laboratory’s Relativistic Heavy Ion Collider (STAR), IceCube near the South Pole, Daya Bay in China, the Cryogenic Underground Observatory for Rare Events (CUORE) in Italy, the Large Underground Xenon (LUX), LUX-ZEPLIN (LZ), and MAJORANA experiments in South Dakota, the Collider Detector at Fermilab (CDF), and the ATLAS Experiment and A Large Ion Collider Experiment (ALICE) at Europe’s CERN laboratory, among others. The most data-intensive experiments use a distributed system of clusters like PDSF.

    SNOLAB, a Canadian underground physics laboratory at a depth of 2 km in Vale’s Creighton nickel mine in Sudbury, Ontario

    BNL/RHIC Star Detector

    U Wisconsin ICECUBE neutrino detector at the South Pole

    Daya Bay, approximately 52 kilometers northeast of Hong Kong and 45 kilometers east of Shenzhen, China

    CUORE experiment, at the Italian National Institute for Nuclear Physics’ (INFN’s) Gran Sasso National Laboratories (LNGS) in Italy, a search for neutrinoless double beta decay

    LBNL LZ project at SURF, Lead, SD, USA

    U Washington Majorana Demonstrator Experiment at SURF

    FNAL/Tevatron CDF detector

    CERN ATLAS detector. (Image: Claudia Marcelloni, ATLAS/CERN)

    CERN/ALICE Detector

    This chart shows the physics collaborations that used PDSF over the years, with the heaviest usage by the STAR and ALICE collaborations. (Credit: Berkeley Lab)

    The STAR collaboration was the original participant and had by far the highest overall use of PDSF, and the ALICE collaboration had grown to become one of the largest PDSF users by 2010. Both experiments have explored the formation and properties of an exotic superhot particle soup known as the quark-gluon plasma by colliding heavy particles.

    SNO researchers’ findings about neutrinos’ mass and ability to change into different forms or flavors led to the 2015 Nobel Prize in physics. And PDSF played a notable role in the early analyses of SNO data.

    Art McDonald, who shared that Nobel as director of the SNO Collaboration, said, “The PDSF computing facility was used extensively by the SNO Collaboration, including our collaborators at Berkeley Lab.”

    He added, “This resource was extremely valuable in simulations and data analysis over many years, leading to our breakthroughs in neutrino physics and resulting in the award of the 2015 Nobel Prize and the 2016 Breakthrough Prize in Fundamental Physics to the entire SNO Collaboration. We are very grateful for the scientific opportunities provided to us through access to the PDSF facility.”

    PDSF’s fast processing of data from the Daya Bay nuclear reactor-based experiment was also integral in precise measurements of neutrino properties.

    The cluster was a trendsetter for a so-called condo model in shared computing. This model allowed collaborations to buy a share of computing power and dedicated storage space that was customized for their own needs, and a participant’s allocated computer processors on the system could also be temporarily co-opted by other cluster participants when they were not active.

    In this condo analogy, “You could go use your neighbor’s house if your neighbor wasn’t using it,” said Canon, a former experimental physicist. “If everybody else was idle you could take advantage of the free capacity.” Canon noted that many universities have adopted this kind of model for their computer users.
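
    Purely as an illustration of this sharing rule (and not PDSF’s actual scheduler), here is a minimal sketch in Python of a condo-style allocator: each collaboration owns a slice of the cluster, and idle owned cores can be borrowed by a busier neighbor. The class, project names, and core counts are hypothetical.

        # Minimal sketch of a "condo" scheduling rule: each project owns a share
        # of cores, and idle owned cores may be borrowed by other projects.
        # Names, numbers, and policy details are illustrative only; a real condo
        # scheduler would also preempt borrowed cores when the owner returns.

        class CondoCluster:
            def __init__(self, shares):
                # shares: dict mapping project name -> number of cores it purchased
                self.shares = dict(shares)
                self.in_use = {p: 0 for p in shares}  # cores each project is running on

            def idle_cores(self):
                # Cores purchased somewhere in the cluster but not currently in use.
                return sum(self.shares.values()) - sum(self.in_use.values())

            def request(self, project, cores):
                """Grant up to `cores` to `project`: its own idle share first,
                then whatever its neighbors have left idle (backfill)."""
                own_idle = max(self.shares[project] - self.in_use[project], 0)
                borrowed_pool = self.idle_cores() - own_idle
                granted = min(cores, own_idle + borrowed_pool)
                self.in_use[project] += granted
                return granted

        cluster = CondoCluster({"STAR": 600, "ALICE": 500, "IceCube": 400})
        print(cluster.request("STAR", 900))   # 900: STAR borrows idle ALICE/IceCube cores
        print(cluster.request("ALICE", 400))  # 400: granted from the remaining idle cores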

    Importantly, the PDSF system was also designed to provide easy access and support for individual collaboration members rather than requiring access to be funneled through one account per project or experiment. “If everybody had to log in to submit their jobs, it just wouldn’t work in these big collaborations,” Canon said.

    The original PDSF cluster, called the Physics Detector Simulation Facility, was launched in March 1991 to support analyses and simulations for a planned U.S. particle collider project known as the Superconducting Super Collider. It was set up in Texas, the planned home for the collider, though the collider project was ultimately canceled in 1993.

    Superconducting Super Collider map, in the vicinity of Waxahachie, Texas; cancelled by the U.S. Congress in 1993 because it showed no “immediate economic benefit”

    A diagram showing the Phase 3 design of the original PDSF system. (Credit: “Superconducting Super Collider: A Retrospective Summary 1989-1993,” Superconducting Super Collider Laboratory, Dallas, Texas)

    A 1994 retrospective report on the collider project notes that the original PDSF had been built up to perform a then-impressive 7 billion instructions per second and that the science need for PDSF to simulate complex particle collisions had driven “substantial technological advances” in the nation’s computer industry.

    At the time, PDSF was “the world’s most powerful high-energy physics computing facility,” the report also noted, and was built using non-proprietary systems and equipment from different manufacturers “at a fraction of the cost” of supercomputers.

    Longtime Berkeley Lab physicist Stu Loken, who led the Lab’s Information and Computing Sciences Division from 1988 to 2000, played a pivotal role in PDSF’s development and in siting the cluster at Berkeley Lab.

    PDSF moved to Berkeley Lab’s Oakland Scientific Facility in 2000 before returning to the lab’s main site. (Credit: Berkeley Lab)

    PDSF moved to Berkeley Lab in 1996 with a new name and a new role. It was largely rebuilt with new hardware and was moved to a computer center in Oakland, Calif., in 2000 before returning once again to the Berkeley Lab site.

    “A lot of the tools that we deployed to facilitate the data processing on PDSF are now being used by data users at NERSC,” said Lisa Gerhardt, a big-data architect at NERSC who worked on the PDSF system. She previously had served as a neutrino astrophysicist for the IceCube experiment.

    Gerhardt noted that the cluster was nimble and responsive because of its focused user community. “Having a smaller and cohesive user pool made it easier to have direct relationships,” she said.

    And Jan Balewski, computing systems engineer at NERSC who worked to transition PDSF users to the new system, said the scientific background of PDSF staff through the years was beneficial for the cluster’s users.

    Balewski, a former experimental physicist, said, “Having our background, we were able to discuss with users what they really needed. And maybe, in some cases, what they were asking for was not what they really needed. We were able to help them find a solution.”

    R. Jefferson “Jeff” Porter, a computer systems engineer and physicist in Berkeley Lab’s Nuclear Science Division who began working with the PDSF cluster and users as a postdoctoral researcher at Berkeley Lab in the mid-1990s, said, “PDSF was a resource that dealt with big data – many years before big data became a big thing for the rest of the world.”

    It had always used off-the-shelf hardware and was steadily upgraded – typically twice a year. Even so, it was dwarfed by its supercomputer counterparts. About seven years ago the PDSF cluster had about 1,500 computer cores, compared to about 100,000 on a neighboring supercomputer at NERSC at the time. A core is the part of a computer processor that performs calculations.

    Porter was later hired by NERSC to support grid computing, a distributed form of computing in which computers in different locations can work together to perform larger tasks. He returned to the Nuclear Science Division to lead the ALICE USA computing project, which established PDSF as one of about 80 grid sites for CERN’s ALICE experiment. Use of PDSF by ALICE was an easy fit, since the PDSF community “was at the forefront of grid computing,” Porter said.

    In some cases, the unique demands of PDSF cluster users would also lead to the adoption of new tools at supercomputer systems. “Our community would push NERSC in ways they hadn’t been thinking,” he said. CERN developed a system to distribute software that was adopted by PDSF about five years ago, and that has also been adopted by many scientific collaborations. NERSC put in a big effort, Porter said, to integrate this system into larger machines: Cori and Edison.

    PDSF’s configuration in 2017. (Credit: Berkeley Lab)

    Supporting multiple projects on a single system was a challenge for PDSF since each project had unique software needs, so Canon led the development of a system known as Chroot OS (CHOS) to enable each project to have a custom computing environment.

    Porter explained that CHOS was an early form of “container computing” that has since enjoyed widespread adoption.
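
    CHOS itself lived in the cluster’s system software, but the underlying chroot idea is easy to sketch. The fragment below is an illustration only (assuming root privileges and a hypothetical per-project root filesystem under /chos/<project>), not CHOS’s actual implementation:

        # Illustration of the chroot idea behind per-project environments.
        # Assumes root privileges and a pre-built root filesystem per project
        # under /chos/<project> (hypothetical paths); this is not CHOS itself.
        import os

        def run_in_project_env(project, command):
            """Run `command` inside the project's private root filesystem."""
            root = "/chos/" + project          # e.g. /chos/star, /chos/alice
            pid = os.fork()
            if pid == 0:                       # child: switch roots, then exec
                os.chroot(root)                # the job now sees only this project's
                os.chdir("/")                  # libraries, compilers, and tools
                os.execvp(command[0], command)
            _, status = os.waitpid(pid, 0)     # parent: wait for the job to finish
            return status

        # run_in_project_env("star", ["root", "-b", "-q", "analysis.C"])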

    PDSF was run by a Berkeley Lab-based steering committee that typically had a member from each participating experiment and a member from NERSC, and Porter had served for about five years as the committee chair. He had been focused for the past year on how to transition users to the Cori supercomputer and other computing resources, as needed.

    Balewski said that the leap of users from PDSF to Cori brings them access to far greater computing power, and allows them to “ask questions they could never ask on a smaller system.”

    He added, “It’s like moving from a small town – where you know everyone but resources are limited – to a big city that is more crowded but also offers more opportunities.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Bringing Science Solutions to the World

    In the world of science, Lawrence Berkeley National Laboratory (Berkeley Lab) is synonymous with “excellence.” Thirteen Nobel prizes are associated with Berkeley Lab. Seventy Lab scientists are members of the National Academy of Sciences (NAS), one of the highest honors for a scientist in the United States. Thirteen of our scientists have won the National Medal of Science, our nation’s highest award for lifetime achievement in fields of scientific research. Eighteen of our engineers have been elected to the National Academy of Engineering, and three of our scientists have been elected into the Institute of Medicine. In addition, Berkeley Lab has trained thousands of university science and engineering students who are advancing technological innovations across the nation and around the world.

    Berkeley Lab is a member of the national laboratory system supported by the U.S. Department of Energy through its Office of Science. It is managed by the University of California (UC) and is charged with conducting unclassified research across a wide range of scientific disciplines. Located on a 202-acre site in the hills above the UC Berkeley campus that offers spectacular views of the San Francisco Bay, Berkeley Lab employs approximately 3,232 scientists, engineers and support staff. The Lab’s total costs for FY 2014 were $785 million. A recent study estimates the Laboratory’s overall economic impact through direct, indirect and induced spending on the nine counties that make up the San Francisco Bay Area to be nearly $700 million annually. The Lab was also responsible for creating 5,600 jobs locally and 12,000 nationally. The overall economic impact on the national economy is estimated at $1.6 billion a year. Technologies developed at Berkeley Lab have generated billions of dollars in revenues, and thousands of jobs. Savings as a result of Berkeley Lab developments in lighting and windows, and other energy-efficient technologies, have also been in the billions of dollars.

    Berkeley Lab was founded in 1931 by Ernest Orlando Lawrence, a UC Berkeley physicist who won the 1939 Nobel Prize in physics for his invention of the cyclotron, a circular particle accelerator that opened the door to high-energy physics. It was Lawrence’s belief that scientific research is best done through teams of individuals with different fields of expertise, working together. His teamwork concept is a Berkeley Lab legacy that continues today.

    A U.S. Department of Energy National Laboratory Operated by the University of California.

    University of California Seal

    DOE Seal

     
  • richardmitnick 10:52 am on March 25, 2019 Permalink | Reply
    Tags: ExaLearn, NERSC - National Energy Research Scientific Computing Center

    From insideHPC: “ExaLearn Project to bring Machine Learning to Exascale” 

    From insideHPC

    March 24, 2019

    As supercomputers become ever more capable in their march toward exascale levels of performance, scientists can run increasingly detailed and accurate simulations to study problems ranging from cleaner combustion to the nature of the universe. Enter ExaLearn, a new machine learning project supported by DOE’s Exascale Computing Project (ECP), which aims to develop new tools to help scientists overcome this challenge by applying machine learning to very large experimental datasets and simulations.

    The first research area for ExaLearn’s surrogate models will be in cosmology to support projects such as the LSST (Large Synoptic Survey Telescope) now under construction in Chile and shown here in an artist’s rendering. (Todd Mason, Mason Productions Inc. / LSST Corporation)

    The challenge is that these powerful simulations require lots of computer time. That is, they are “computationally expensive,” consuming 10 to 50 million CPU hours for a single simulation. For example, running a 50-million-hour simulation on all 658,784 compute cores of the Cori supercomputer at NERSC would take more than three days.

    NERSC

    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    NERSC Hopper Cray XE6 supercomputer


    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    Future:

    Cray Shasta Perlmutter SC18 AMD Epyc Nvidia pre-exascale supercomputer

    Running thousands of these simulations, which are needed to explore wide ranges in parameter space, would be intractable.

    One of the areas ExaLearn is focusing on is surrogate models. Surrogate models, often known as emulators, are built to provide rapid approximations of more expensive simulations. This allows a scientist to generate additional simulations more cheaply – running much faster on many fewer processors. To do this, the team will need to run thousands of computationally expensive simulations over a wide parameter space to train the computer to recognize patterns in the simulation data. This then allows the computer to create a computationally cheap model, easily interpolating between the parameters it was initially trained on to fill in the blanks between the results of the more expensive models.
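
    As a concrete illustration of the emulator idea (not ExaLearn’s actual code), the sketch below trains a Gaussian-process surrogate on a handful of “expensive” simulation runs, faked here by a cheap stand-in function, and then predicts the output, with an uncertainty estimate, at a parameter value that was never simulated:

        # Minimal surrogate-model (emulator) sketch: train on a few "expensive"
        # simulations, then interpolate cheaply between them. The simulator here
        # is a stand-in function; in practice each call would be a large HPC run.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        def expensive_simulation(parameter):
            # Placeholder for a costly simulation that returns a summary statistic.
            return np.sin(3.0 * parameter) + 0.1 * parameter

        # Run the "expensive" simulation on a coarse grid of training parameters.
        train_params = np.linspace(0.1, 0.9, 9).reshape(-1, 1)
        train_outputs = expensive_simulation(train_params).ravel()

        # Fit the emulator to the (parameter, output) pairs.
        emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), normalize_y=True)
        emulator.fit(train_params, train_outputs)

        # Query between training points, getting a prediction and an error bar
        # in milliseconds instead of millions of CPU hours.
        prediction, sigma = emulator.predict(np.array([[0.275]]), return_std=True)
        print(f"emulated output at 0.275: {prediction[0]:.3f} +/- {sigma[0]:.3f}")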

    “Training can also take a long time, but then we expect these models to generate new simulations in just seconds,” said Peter Nugent, deputy director for science engagement in the Computational Research Division at LBNL.

    From Cosmology to Combustion

    Nugent is leading the effort to develop the so-called surrogate models as part of ExaLearn. The first research area will be cosmology, followed by combustion. But the team expects the tools to benefit a wide range of disciplines.

    “Many DOE simulation efforts could benefit from having realistic surrogate models in place of computationally expensive simulations,” ExaLearn Principal Investigator Frank Alexander of Brookhaven National Lab said at the recent ECP Annual Meeting.

    “These can be used to quickly flesh out parameter space, help with real-time decision making and experimental design, and determine the best areas to perform additional simulations.”

    The surrogate models and related simulations will aid in cosmological analyses to reduce systematic uncertainties in observations by telescopes and satellites. Such observations generate massive datasets that are currently limited by systematic uncertainties. Since we only have a single universe to observe, the only way to address these uncertainties is through simulations, so creating cheap but realistic and unbiased simulations greatly speeds up the analysis of these observational datasets. A typical cosmology experiment now requires sub-percent level control of statistical and systematic uncertainties. This then requires the generation of thousands to hundreds of thousands of computationally expensive simulations to beat down the uncertainties.

    These parameters are critical in light of two upcoming programs:

    The Dark Energy Spectroscopic Instrument, or DESI, is an advanced instrument on a telescope located in Arizona that is expected to begin surveying the universe this year.

    LBNL/DESI Dark Energy Spectroscopic Instrument for the Nicholas U. Mayall 4-meter telescope at Kitt Peak National Observatory near Tucson, Ariz, USA


    NOAO/Mayall 4 m telescope at Kitt Peak, Arizona, USA, Altitude 2,120 m (6,960 ft)

    DESI seeks to map the large-scale structure of the universe over an enormous volume and a wide range of look-back times (based on “redshift,” or the shift in the light of distant objects toward redder wavelengths of light). Targeting about 30 million pre-selected galaxies across one-third of the night sky, scientists will use DESI’s redshift data to construct 3D maps of the universe. There will be about 10 terabytes (TB) of raw data per year transferred from the observatory to NERSC. After running the data through the pipelines at NERSC (using millions of CPU hours), about 100 TB per year of data products will be made available as data releases approximately once a year throughout DESI’s five years of operations.

    The Large Synoptic Survey Telescope, or LSST, is currently being built on a mountaintop in Chile.

    LSST


    LSST Camera, built at SLAC



    LSST telescope, currently under construction on the El Peñón peak of Cerro Pachón, a 2,682-meter-high mountain in the Coquimbo Region of northern Chile, alongside the existing Gemini South and Southern Astrophysical Research Telescopes.


    LSST Data Journey, Illustration by Sandbox Studio, Chicago with Ana Kova

    When completed in 2021, the LSST will take more than 800 panoramic images each night with its 3.2 billion-pixel camera, recording the entire visible sky twice each week. Each patch of sky it images will be visited 1,000 times during the survey, and each of its 30-second observations will be able to detect objects 10 million times fainter than visible with the human eye. A powerful data system will compare new with previous images to detect changes in brightness and position of objects as big as far-distant galaxy clusters and as small as nearby asteroids.

    For these programs, the ExaLearn team will first target large-scale structure simulations of the universe since the field is more developed than others and the scale of the problem size can easily be ramped up to an exascale machine learning challenge.

    As an example of how ExaLearn will advance the field, Nugent said a researcher could run a suite of simulations with the parameters of the universe consisting of 30 percent dark energy and 70 percent dark matter, then a second simulation with 25 percent and 75 percent, respectively. Each of these simulations generates three-dimensional maps of tens of billions of galaxies in the universe and how they cluster and spread apart as time goes by. Using a surrogate model trained on these simulations, the researcher could then quickly generate the output of a simulation in between these values, at 27.5 and 72.5 percent, without needing to run a new, costly simulation; that result, too, would show the evolution of the galaxies in the universe as a function of time. The goal of the ExaLearn software suite is that such results, and their uncertainties and biases, would be a byproduct of the training, so that one would know the generated models are consistent with a full simulation.

    Toward this end, Nugent’s team will build on two projects already underway at Berkeley Lab: CosmoFlow and CosmoGAN. CosmoFlow is a deep learning 3D convolutional neural network that can predict cosmological parameters with unprecedented accuracy using the Cori supercomputer at NERSC. CosmoGAN is exploring the use of generative adversarial networks to create cosmological weak lensing convergence maps — maps of the matter density of the universe as would be observed from Earth — at lower computational costs.
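
    CosmoFlow’s real architecture is described in the project’s own publications; purely to make the idea tangible, here is a toy 3D convolutional network, written with PyTorch as an assumed framework, that maps a small matter-density cube to two cosmological parameters. Layer counts and sizes are illustrative, not CosmoFlow’s.

        # Toy 3D CNN in the spirit of CosmoFlow: regress cosmological parameters
        # from a 3D matter-density cube. Sizes are illustrative only.
        import torch
        import torch.nn as nn

        class TinyCosmoNet(nn.Module):
            def __init__(self, n_params=2):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                    nn.MaxPool3d(2),
                    nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool3d(1),         # collapse the remaining volume
                )
                self.head = nn.Linear(32, n_params)  # e.g. (Omega_m, sigma_8)

            def forward(self, x):
                return self.head(self.features(x).flatten(1))

        # One training step on random data, just to show the shapes involved.
        model = TinyCosmoNet()
        cubes = torch.randn(4, 1, 32, 32, 32)        # batch of 32^3 density cubes
        targets = torch.rand(4, 2)                   # "true" parameters per cube
        loss = nn.functional.mse_loss(model(cubes), targets)
        loss.backward()
        print(float(loss))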

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 9:33 am on March 23, 2019 Permalink | Reply
    Tags: "NERSC taps NVIDIA compiler team for Perlmutter Supercomputer", Cray Shasta architecture, Dr. Saul Perlmutter UC Berkeley Nobel laureate, , NERSC - National Energy Research for Scientific Computing Center   

    From insideHPC: “NERSC taps NVIDIA compiler team for Perlmutter Supercomputer” 

    From insideHPC

    March 22, 2019

    Dr. Saul Perlmutter (left) holds an animated conversation with John Kirkley at SC13. Photo by Sharan Kalwani, Fermilab

    NERSC has signed a contract with NVIDIA to enhance GPU compiler capabilities for Berkeley Lab’s next-generation Perlmutter supercomputer.

    Cray Shasta Perlmutter SC18 AMD Epyc Nvidia pre-exascale supercomputer

    NERSC

    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    NERSC Hopper Cray XE6 supercomputer


    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    DOE and Cray announced on Oct. 30, 2018 that NERSC’s next supercomputer will be a Cray pre-exascale system to be delivered in 2020.

    To highlight NERSC’s commitment to advancing research, the new system will be named “Perlmutter” in honor of Saul Perlmutter, an astrophysicist at Berkeley Lab and a professor of physics at the University of California, Berkeley who shared the 2011 Nobel Prize in Physics for his contributions to research showing that the expansion of the universe is accelerating. Dr. Perlmutter is also director of the Berkeley Institute for Data Science and leads the international Supernova Cosmology Project. He has been a NERSC user for many years, and part of his Nobel Prize winning work was carried out on NERSC machines.

    Perlmutter, a Cray system code-named “Shasta”, will be a heterogeneous system comprising both CPU-only and GPU-accelerated nodes, with a performance of more than 3 times that of Cori, NERSC’s current platform. It will include a number of innovations designed to meet the diverse computational and data analysis needs of NERSC’s user base and speed their scientific productivity. The new system derives performance from advances in hardware and software, including a new Cray system interconnect, code-named Slingshot, which is designed for data-centric computing. Slingshot’s Ethernet compatibility, advanced adaptive routing, first-of-a-kind congestion control, and sophisticated quality of service capabilities improve system utilization and the performance and scalability of supercomputing and AI applications and workflows. The system will also feature NVIDIA GPUs with new Tensor Core technology and direct liquid cooling, and will be NERSC’s first supercomputer with an all-flash scratch filesystem. Developed by Cray to accelerate I/O, the 30-petabyte Lustre filesystem will move data at a rate of more than 4 terabytes/sec.

    “We are excited to work with NVIDIA to enable OpenMP GPU computing using their PGI compilers,” said Nick Wright, the Perlmutter chief architect. “Many NERSC users are already successfully using the OpenMP API to target the manycore architecture of the NERSC Cori supercomputer. This project provides a continuation of our support of OpenMP and offers an attractive method to use the GPUs in the Perlmutter supercomputer. We are confident that our investment in OpenMP will help NERSC users meet their application performance portability goals.”

    Under the new non-recurring engineering contract with NVIDIA, worth approximately $4 million, Berkeley Lab researchers will work with NVIDIA engineers to enhance NVIDIA’s PGI C, C++ and Fortran compilers to enable OpenMP applications to run on NVIDIA GPUs. This collaboration will help NERSC users, and the HPC community as a whole, efficiently port suitable applications to target GPU hardware in the Perlmutter system.

    Programming using compiler directives of any form is an important part of code portability and developer productivity. NERSC participation in both the OpenMP and OpenACC organizations helps advance the entire ecosystem of important tools and the specifications on which they rely.

    “Together with OpenACC, this OpenMP collaboration gives HPC developers more options for directives-based programming from a single compiler on GPUs and CPUs,” said Doug Miles, senior director of PGI compilers and tools at NVIDIA. “Our joint effort on programming tools for the Perlmutter supercomputer highlights how NERSC and NVIDIA are simplifying migration and development of science and engineering applications to pre-exascale systems and beyond.”

    The Perlmutter Supercomputer will be based on the Cray Shasta architecture.

    In addition, through this partnership, NERSC and NVIDIA will develop a set of GPU-based high performance data analytic tools using Python, the primary language used for data analytics at NERSC and a robust platform for machine learning and deep learning libraries. The new Python tools will allow NERSC to train staff and users through hack-a-thons where NERSC users will be able to work directly with NVIDIA personnel on their codes.
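
    The article does not name the libraries the partnership will build, so the fragment below is only a generic illustration of what GPU-backed analytics from Python looks like today, using the CuPy library as an assumed stand-in (it requires an NVIDIA GPU):

        # Generic illustration of GPU-accelerated analysis from Python with CuPy.
        # This is a stand-in example, not the tools NERSC and NVIDIA will develop.
        import numpy as np
        import cupy as cp

        # Pretend this is a large simulation or detector dataset on the host.
        host_data = np.random.standard_normal(10_000_000).astype(np.float32)

        gpu_data = cp.asarray(host_data)                 # copy to GPU memory
        hist, edges = cp.histogram(gpu_data, bins=256)   # histogram computed on the GPU
        print(f"mean={float(gpu_data.mean()):.4f}  std={float(gpu_data.std()):.4f}  "
              f"peak bin={int(hist.argmax())}")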

    “NERSC supports thousands of researchers in diverse sciences at universities, national laboratories, and in industry,” commented Data Architect Rollin Thomas, who is leading the partnership at NERSC. “Our users increasingly want productive high-performance tools for interacting with their data, whether it comes from a massively parallel simulation or an experimental or observational science facility like a particle accelerator, astronomical observatory, or genome sequencer. We look forward to working with NVIDIA to accelerate discovery across all these disciplines.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 11:22 am on January 31, 2019 Permalink | Reply
    Tags: Blandford-Znajek mechanism, NERSC - National Energy Research Scientific Computing Center, New Supercomputer Simulations Show How Plasma Jets Escape Black Holes, Penrose process

    From Motherboard: “New Supercomputer Simulations Show How Plasma Jets Escape Black Holes” 

    motherboard

    From Motherboard

    Jan 30 2019
    Daniel Oberhaus

    Black holes swallow everything that comes in contact with them, so how do plasma jets manage to escape their intense gravity?

    Visualization of a general-relativistic collisionless plasma simulation. Image: Parfrey/LBNL

    Researchers used one of the world’s most powerful supercomputers to better understand how jets of high-energy plasma escape the intense gravity of a black hole, which swallows everything else in its path—including light.

    Before stars and other matter cross a black hole’s point of no return—a boundary known as the “event horizon”—and get consumed by the black hole, they get swept up in the black hole’s rotation. A question that has vexed physicists for decades was how some energy managed to escape the process and get channeled into streams of plasma that travel through space near the speed of light.

    As detailed in a paper published last week in Physical Review Letters, researchers affiliated with the Department of Energy and the University of California, Berkeley used a supercomputer at the DOE’s Lawrence Berkeley National Laboratory to simulate the jets of plasma, an electrically charged gas-like substance.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    The simulations ultimately reconciled two decades-old theories that attempt to explain how energy can be extracted from a rotating black hole.

    The first theory describes how electric currents around a black hole twist its magnetic field to create a jet, which is known as the Blandford-Znajek mechanism. This theory posits that material caught in the gravity of a rotating black hole will become increasingly magnetized the closer it gets to the event horizon. The black hole acts like a massive conductor spinning in a huge magnetic field, which will cause an energy difference (voltage) between the poles of the black hole and its equator. This energy difference is then diffused as jets at the poles of the black hole.
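
    For readers who want the standard quantitative statement of this mechanism, the Blandford-Znajek jet power is usually written (in the general literature, not in the paper discussed here) in the approximate form below, where Φ_BH is the magnetic flux threading the event horizon, Ω_H is the horizon’s angular velocity, and the prefactor κ ≈ 0.05 depends weakly on the field geometry:

        P_{\mathrm{BZ}} \;\approx\; \frac{\kappa}{4\pi c}\, \Phi_{\mathrm{BH}}^{2}\, \Omega_{H}^{2}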

    “There is a region around a rotating black hole, called the ergosphere, inside of which all particles are forced to rotate in the same direction as the black hole,” Kyle Parfrey, the lead author of the paper and a theoretical astrophysicist at NASA, told me in an email. “In this region it’s possible for a particle to effectively have negative energy in some sense, if it tries to orbit against the hole’s rotation.”

    In other words, in the classical Penrose process a particle that enters the ergosphere splits in two. If one half is launched against the spin of the black hole, it will reduce the black hole’s angular momentum, or rotation. But that rotational energy has to go somewhere. In this case, it’s converted into energy that propels the other half of the particle away from the black hole.

    According to Parfrey, the Penrose process observed in their simulations was a bit different from the classical situation of a particle splitting that was described above, however. Rather than particles splitting, charged particles in the plasma are acted on by electromagnetic forces, some of which are propelled against the rotation of the black hole on a negative energy trajectory. It is in this sense, Parfrey told me, that they are still considered a type of Penrose process.

    The surprising part of the simulation, Parfrey told me, was that it appeared to establish a link between the Penrose process and the Blandford-Znajek mechanism, which had never been seen before.

    Creating the twisting magnetic fields that extract energy from the black hole in the Blandford-Znajek mechanism requires an electric current carried by particles inside the plasma, and a substantial number of these particles had the negative-energy property characteristic of the Penrose process.

    “So it appears that, at least in some cases, the two mechanisms are linked,” Parfrey said.

    Parfrey and his colleagues hope that their models will provide much-needed context for photos from the Event Horizon Telescope, an array of telescopes that aims to directly image the event horizon where these plasma jets form. Until that first image is produced, however, Parfrey said he and his colleagues want to refine these simulations so that they conform even better to existing observations.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    The future is wonderful, the future is terrifying. We should know, we live there. Whether on the ground or on the web, Motherboard travels the world to uncover the tech and science stories that define what’s coming next for this quickly-evolving planet of ours.

    Motherboard is a multi-platform, multimedia publication, relying on longform reporting, in-depth blogging, and video and film production to ensure every story is presented in its most gripping and relatable format. Beyond that, we are dedicated to bringing our audience honest portraits of the futures we face, so you can be better informed in your decision-making today.

     
  • richardmitnick 11:46 am on September 25, 2018 Permalink | Reply
    Tags: Coherence, Critical Decision 1 or CD-1, Lab founder Ernest Lawrence’s construction of the first cyclotron particle accelerator in 1930, NERSC - National Energy Research Scientific Computing Center, Smaller-scale explorations of magnetic properties in multilayer data-storage materials, The dozens of beamlines maintained and operated by Berkeley Lab staff and scientists at the ALS conduct experiments simultaneously at all hours, The upgrade project is dubbed ALS-U, Toward a New Light: Advanced Light Source Upgrade Project Moves Forward

    From Lawrence Berkeley National Lab: “Toward a New Light: Advanced Light Source Upgrade Project Moves Forward” 

    Berkeley Logo

    From Lawrence Berkeley National Lab

    September 25, 2018
    Glenn Roberts Jr.
    geroberts@lbl.gov
    (510) 486-5582


    VIDEO: Berkeley Lab’s Advanced Light Source takes a next step toward a major upgrade. (Credit: Berkeley Lab)

    The Advanced Light Source (ALS), a scientific user facility at the Department of Energy’s (DOE) Lawrence Berkeley National Laboratory (Berkeley Lab), has received federal approval to proceed with preliminary design, planning and R&D work for a major upgrade project that will boost the brightness of its X-ray beams at least a hundredfold.

    LBNL/ALS

    The upgrade will give the ALS, which this year celebrates its 25th anniversary, brighter beams with a more ordered structure – like evenly spaced ripples in a pond – that will better reveal nanoscale details in complex chemical reactions and in new materials, expanding the envelope for scientific exploration.

    “This upgrade will make it possible for Berkeley Lab to be the leader in soft X-ray research for another 25 years, and for the ALS to remain at the center of this Laboratory for that time,” said Berkeley Lab Director Mike Witherell.

    Steve Kevan, ALS Director, added, “The upgrade will transform the ALS. It will expand our scientific frontiers, enabling studies of materials and phenomena that are at the edge of our understanding today. And it will renew the ALS’s innovative spirit, attracting the best researchers from around the world to our facility to conduct their experiments in collaboration with our scientists.”

    This computer rendering provides a top view of the ALS and shows equipment that will be installed during the ALS-U project. (Credit: Berkeley Lab)

    The latest approval by the DOE, known as Critical Decision 1 or CD-1, authorizes the start of engineering and design work to increase the brightness and to more precisely focus the beams of light produced at the ALS that drive a broad range of science experiments. The upgrade project is dubbed ALS-U.

    The dozens of beamlines maintained and operated by Berkeley Lab staff and scientists at the ALS conduct experiments simultaneously at all hours, attracting more than 2,000 researchers each year from across the country and around the globe through its role in a network of DOE Office of Science User Facilities.

    This upgrade is intended to make the ALS the brightest storage ring-based source of soft X-rays in the world. Soft X-rays have an energy range that is especially useful for observing chemistry in action and for studying a material’s electronic and magnetic properties in microscopic detail.

    This slideshow chronicles the history of the Advanced Light Source and the building that houses it, which was formerly home to a 184-inch cyclotron – another type of particle accelerator. It also shows the science conducted at the ALS and includes computer renderings of new equipment that will be installed as a part of the ALS-U project. (Credit: Berkeley Lab)

    The planned upgrade will significantly increase the brightness of the ALS by focusing more light on a smaller spot. X-ray beams that today are about 100 microns (about four thousandths of an inch) across – smaller than the diameter of a human hair – will be squeezed down to just a few microns after the upgrade.

    “That’s very exciting for us,” said Elke Arenholz, a senior staff scientist at the ALS. The upgrade will imbue the X-rays with a property known as “coherence” that will allow scientists to explore more complex and disordered samples with high precision. The high coherence of the soft X-ray light generated by the ALS-U will approach a theoretical limit.

    “We can take materials that are more in their natural state, resolve any fluctuations, and look much more closely at the structure of materials, down to the nanoscale,” Arenholz said.

    Among the many applications of these more precise beams are smaller-scale explorations of magnetic properties in multilayer data-storage materials, she said, and new observations of battery chemistry and other reactions as they occur. The upgrade should also enable faster data collection, which can allow researchers to speed up their experiments, she noted.

    “We will have a lot of very interesting, new data that we couldn’t acquire before,” she said. Analyzing that data and feeding it back into new experiments will also draw upon other Berkeley Lab capabilities, including sample fabrication, complementary study techniques, and theory work at the Lab’s Molecular Foundry; as well as data processing, simulation and analysis work at the Lab’s National Energy Research Scientific Computing Center (NERSC).

    William Chueh, an assistant professor of materials science at Stanford University who also heads up the users’ association for researchers who use the ALS or are interested in using the ALS, said that the upgrade will aid his studies by improving the resolution in tracking how charged particles move through batteries and fuel cells, for example.

    “I am very excited by the science that the ALS-U project will enable. Such a tool will provide insights and design rules that help us to develop tomorrow’s materials,” Chueh said.

    The upgrade project is a massive undertaking that will draw upon most areas at the Lab, said ALS-U Project Director David Robin, requiring the expertise of accelerator physicists, mechanical and electrical engineers, computer scientists, beamline optics and controls specialists, and safety and project management personnel, among a long list.

    Berkeley Lab’s pioneering history of innovation and achievements in accelerator science, beginning with Lab founder Ernest Lawrence’s construction of the first cyclotron particle accelerator in 1930, has well prepared the Lab for this latest project, Robin said.

    He noted the historic contribution by the late Klaus Halbach, a Berkeley Lab scientist whose design of compact, powerful magnetic instruments known as permanent magnet insertion devices paved the way for the design of the current ALS and other so-called third-generation light sources of its kind.

    An interior view of the Advanced Light Source. (Credit: Berkeley Lab)

    The ALS-U project will remove more than 400 tons of equipment associated with the existing ALS storage ring, which is used to circulate electrons at nearly the speed of light to generate the synchrotron radiation that is ultimately emitted as X-rays and other forms of light.

    A new magnetic array known as a “multi-bend achromat lattice” will take its place, and a secondary, “accumulator” ring will be added that will enhance beam brightness. Also, several new ALS beamlines are already optimized for the high brightness and coherence of the ALS-U beams, and there are plans for additional beamline upgrades.

    This 1940s photograph shows the original building that housed a 184-inch cyclotron and that now contains the ALS. (Credit: Berkeley Lab)

    The iconic domed building that houses the ALS – which was designed in the 1930s by Arthur Brown Jr., the architect for San Francisco landmark Coit Tower – will be preserved in the upgrade project. The ALS dome originally housed an accelerator known as the 184-inch cyclotron.

    Robin credited the ALS-U project team, with support from all areas of the Lab, in the continuing progress toward the upgrade. “They have done a tremendous job in getting us to the point that we are at today,” he said.

    Witherell said, “The fact that we will have this upgraded Advanced Light Source is an enormous vote of confidence in us by the federal government and the taxpayers.”

    Berkeley Lab’s ALS, Molecular Foundry, and NERSC are all DOE Office of Science user facilities.

    More information:

    ALS-U Overview
    Transformational X-ray Project Takes a Step Forward, Oct. 3, 2016
    A Brief History of the ALS

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    A U.S. Department of Energy National Laboratory Operated by the University of California

    University of California Seal

    DOE Seal

     
  • richardmitnick 7:44 pm on July 3, 2018 Permalink | Reply
    Tags: NERSC - National Energy Research Scientific Computing Center

    From Fermilab: “Fermilab computing experts bolster NOvA evidence, 1 million cores consumed” 

    FNAL II photo

    FNAL Art Image
    FNAL Art Image by Angela Gonzales

    From Fermilab, an enduring source of strength for the US contribution to scientific research worldwide.

    July 3, 2018
    No writer credit found

    How do you arrive at the physical laws of the universe when you’re given experimental data on a renegade particle that interacts so rarely with matter, it can cruise through light-years of lead? You call on the power of advanced computing.

    The NOvA neutrino experiment, in collaboration with the Department of Energy’s Scientific Discovery through Advanced Computing (SciDAC-4) program and the HEPCloud program at DOE’s Fermi National Accelerator Laboratory, was able to perform the largest-scale analysis ever to support the recent evidence of antineutrino oscillation, a phenomenon that may hold clues to how our universe evolved.

    FNAL/NOvA experiment map


    FNAL NOvA detector in northern Minnesota


    Schematic of the NOvA far detector, a 15-metric-kiloton detector in Minnesota just south of the U.S.-Canada border


    NOvA Far Detector Block


    FNAL Near Detector

    Using Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), located at Lawrence Berkeley National Laboratory, NOvA used over 1 million computing cores, or CPUs, between May 14 and 15 and over a short timeframe one week later.

    The Cori supercomputer at NERSC was used to perform a complex computational analysis for NOvA. NOvA used over 1 million computing cores, the largest amount ever used concurrently in a 54-hour period. Photo: Roy Kaltschmidt, Lawrence Berkeley National Laboratory
    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    This is the largest number of CPUs ever used concurrently over this duration — about 54 hours — for a single high-energy physics experiment. This unprecedented amount of computing enabled scientists to carry out some of the most complicated techniques used in neutrino physics, allowing them to dig deeper into the seldom seen interactions of neutrinos. This Cori allocation was more than 400 times the amount of Fermilab computing allocated to the NOvA experiment and 50 times the total computing capacity at Fermilab allocated for all of its rare-physics experiments. A continuation of the analysis was performed on NERSC’s Cori and Edison supercomputers one week later.

    LBL NERSC Cray XC30 Edison supercomputer

    In total, nearly 35 million core-hours were consumed by NOvA in the 54-hour period. Executing the same analysis on a single desktop computer would take 4,000 years.
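
    Those figures are easy to sanity-check. The short calculation below uses only the numbers quoted above and reproduces both the implied average core count and the “4,000 years” single-machine estimate (which corresponds to one core running around the clock):

        # Sanity check of the figures quoted above: 35 million core-hours consumed
        # in a 54-hour window, and the "4,000 years on a single desktop" comparison.
        core_hours = 35_000_000
        window_hours = 54

        average_cores = core_hours / window_hours      # ~648,000 cores busy on average
        years_on_one_core = core_hours / (24 * 365)    # one core running nonstop

        print(f"average concurrent cores: {average_cores:,.0f}")       # ~648,148
        print(f"single-core runtime: {years_on_one_core:,.0f} years")  # ~3,995 years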

    “The special thing about NERSC is that it enabled NOvA to do the science at a new level of precision, a much finer resolution with greater statistical accuracy within a finite amount of time,” said Andrew Norman, NOvA physicist at Fermilab. “It facilitated doing analysis of real data coming off the detector at a rate 50 times faster than that achieved in the past. The first round of analysis was done within 16 hours. Experimenters were able to see what was coming out of the data, and in less than six hours everyone was looking at it. Without these types of resources, we, as a collaboration, could not have turned around results as quickly and understood what we were seeing.”

    The experiment presented the latest finding from the recently collected data at the Neutrino 2018 conference in Germany on June 4.

    “The speed with which NERSC allowed our analysis team to run sophisticated and intense calculations needed to produce our final results has been a game-changer,” said Fermilab scientist Peter Shanahan, NOvA co-spokesperson. “It accelerated our time-to-results on the last step in our analysis from weeks to days, and that has already had a huge impact on what we were able to show at Neutrino 2018.”

    In addition to the state-of-the-art NERSC facility, NOvA relied on work done within the SciDAC HEP Data Analytics on HPC (high-performance computers) project and the Fermilab HEPCloud facility. Both efforts are led by Fermilab scientific computing staff, and both worked together with researchers at NERSC to be able to support NOvA’s antineutrino oscillation evidence.

    The current standard practice for Fermilab experimenters is to perform similar analyses using less complex calculations through a combination of both traditional high-throughput computing and the distributed computing provided by Open Science Grid, a national partnership between laboratories and universities for data-intensive research. These are substantial resources, but they use a different model: Both use a large amount of computing resources over a long period of time. For example, some resources are offered only at a low priority, so their use may be preempted by higher-priority demands. But for complex, time-sensitive analyses such as NOvA’s, researchers need the faster processing enabled by modern, high-performance computing techniques.

    SciDAC-4 is a DOE Office of Science program that funds collaboration between experts in mathematics, physics and computer science to solve difficult problems. The HEP on HPC project was funded specifically to explore computational analysis techniques for doing large-scale data analysis on DOE-owned supercomputers. Running the NOvA analysis at NERSC, the mission supercomputing facility for the DOE Office of Science, was a task perfectly suited for this project. Fermilab’s Jim Kowalkowski is the principal investigator for HEP on HPC, which also has collaborators from DOE’s Argonne National Laboratory, Berkeley Lab, University of Cincinnati and Colorado State University.

    “This analysis forms a kind of baseline. We’re just ramping up, just starting to exploit the other capabilities of NERSC at an unprecedented scale,” Kowalkowski said.

    The project’s goal for its first year is to take compute-heavy analysis jobs like NOvA’s and enable it on supercomputers. That means not just running the analysis, but also changing how calculations are done and learning how to revamp the tools that manipulate the data, all in an effort to improve techniques used for doing these analyses and to leverage the full computational power and unique capabilities of modern high-performance computing facilities. In addition, the project seeks to consume all computing cores at once to shorten that timeline.

    The Fermilab HEPCloud facility provides cost-effective access to compute resources by optimizing usage across all available types and elastically expanding the resource pool on short notice by, for example, renting temporary resources on commercial clouds or using high-performance computers. HEPCloud enables NOvA and physicists from other experiments to use these compute resources in a transparent way.

    For this analysis, “NOvA experimenters didn’t have to change much in terms of business as usual,” said Burt Holzman, HEPCloud principal investigator. “With HEPCloud, we simply expanded our local on-site-at-Fermilab facilities to include Cori and Edison at NERSC.”

    At the Neutrino 2018 conference, Fermilab’s NOvA neutrino experiment announced that it had seen strong evidence of muon antineutrinos oscillating into electron antineutrinos over long distances. NOvA collaborated with the Department of Energy’s Scientific Discovery through Advanced Computing program and Fermilab’s HEPCloud program to perform the largest-scale analysis ever to support the recent evidence. Photo: Reidar Hahn

    Building on work it has been doing with researchers at NERSC to optimize high-throughput computing in general, the Fermilab HEPCloud team was able to leverage the facility to achieve the million-core milestone, a record for the most resources ever provisioned concurrently at a single facility to run experimental HEP workflows.

    “This is the culmination of more than a decade of R&D we have done at Fermilab under SciDAC and the first taste of things to come, using these capabilities and HEPCloud,” said Panagiotis Spentzouris, head of the Fermilab Scientific Computing Division and HEPCloud sponsor.

    “NOvA is an experimental facility located more than 2,000 miles away from Berkeley Lab, where NERSC is located. The fact that we can make our resources available to the experimental researchers near real-time to enable their time-sensitive science that could not be completed otherwise is very exciting,” said Wahid Bhimji, a NERSC data architect at Berkeley Lab who worked with the NOvA team. “Led by colleague Lisa Gerhardt, we’ve been working closely with the HEPCloud team over the last couple of years, also to support physics experiments at the Large Hadron Collider. The recent NOvA results are a great example of how the infrastructure and capabilities that we’ve built can benefit a wide range of high energy experiments.”

    Going forward, Kowalkowski, Holzman and their associated teams will continue building on this achievement.

    “We’re going to keep iterating,” Kowalkowski said. “The new facilities and procedures were enthusiastically received by the NOvA collaboration. We will accelerate other key analyses.”

    NERSC is a DOE Office of Science user facility.

    See the full article here .



    Please help promote STEM in your local schools.

    Stem Education Coalition

    FNAL Icon

    Fermi National Accelerator Laboratory (Fermilab), located just outside Batavia, Illinois, near Chicago, is a US Department of Energy national laboratory specializing in high-energy particle physics. Fermilab is America’s premier laboratory for particle physics and accelerator research, funded by the U.S. Department of Energy. Thousands of scientists from universities and laboratories around the world collaborate at Fermilab on experiments at the frontiers of discovery.


    FNAL/MINERvA

    FNAL DAMIC

    FNAL Muon g-2 studio

    FNAL Short-Baseline Near Detector under construction

    FNAL Mu2e solenoid

    Dark Energy Camera [DECam], built at FNAL

    FNAL DUNE Argon tank at SURF

    FNAL/MicrobooNE

    FNAL Don Lincoln

    FNAL/MINOS

    FNAL Cryomodule Testing Facility

    FNAL Minos Far Detector

    FNAL LBNF/DUNE from FNAL to SURF, Lead, South Dakota, USA

    FNAL/NOvA experiment map

    FNAL NOvA Near Detector

    FNAL ICARUS

    FNAL Holometer

     
  • richardmitnick 10:59 am on June 19, 2018 Permalink | Reply
    Tags: , , , , NERSC - National Energy Research for Scientific Computing Center, Searching Science Data   

    From Lawrence Berkeley National Lab: “Berkeley Lab Researchers Use Machine Learning to Search Science Data” 

    Berkeley Logo

    From Lawrence Berkeley National Lab

    1
    A screenshot of image-based results in the Science Search interface. In this case, the user performed an image search for nanoparticles. (Credit: Gonzalo Rodrigo/Berkeley Lab)

    As scientific datasets increase in both size and complexity, the ability to label, filter and search this deluge of information has become a laborious, time-consuming and sometimes impossible task without the help of automated tools.

    With this in mind, a team of researchers from the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) and UC Berkeley is developing innovative machine learning tools to pull contextual information from scientific datasets and automatically generate metadata tags for each file. Scientists can then search these files via a web-based search engine for scientific data, called Science Search, that the Berkeley team is building.

    As a proof-of-concept, the team is working with staff at Berkeley Lab’s Molecular Foundry to demonstrate the concepts of Science Search on the images captured by the facility’s instruments. A beta version of the platform has been made available to Foundry researchers.

    LBNL Molecular Foundry – No image credits found

    “A tool like Science Search has the potential to revolutionize our research,” says Colin Ophus, a Molecular Foundry research scientist within the National Center for Electron Microscopy (NCEM) and Science Search Collaborator. “We are a taxpayer-funded National User Facility, and we would like to make all of the data widely available, rather than the small number of images chosen for publication. However, today, most of the data that is collected here only really gets looked at by a handful of people—the data producers, including the PI (principal investigator), their postdocs or graduate students—because there is currently no easy way to sift through and share the data. By making this raw data easily searchable and shareable, via the Internet, Science Search could open this reservoir of ‘dark data’ to all scientists and maximize our facility’s scientific impact.”

    The Challenges of Searching Science Data

    2
    This screen capture of the Science Search interface shows how users can easily validate metadata tags that have been generated via machine learning, or add information that hasn’t already been captured. (Credit: Gonzalo Rodrigo/Berkeley Lab)

    Today, search engines are ubiquitously used to find information on the Internet, but searching science data presents a different set of challenges. For example, Google’s algorithm relies on more than 200 clues to achieve an effective search. These clues can come in the form of keywords on a webpage, metadata in images or audience feedback from billions of people when they click on the information they are looking for. In contrast, scientific data comes in many forms that are radically different from an average web page, requires context that is specific to the science, and often lacks the metadata that provides the context required for effective searches.

    At National User Facilities like the Molecular Foundry, researchers from all over the world apply for time and then travel to Berkeley to use extremely specialized instruments free of charge. Ophus notes that the current cameras on microscopes at the Foundry can collect up to a terabyte of data in under 10 minutes. Users then need to manually sift through this data to find quality images with “good resolution” and save that information on a secure shared file system, like Dropbox, or on an external hard drive that they eventually take home with them to analyze.

    Oftentimes, the researchers that come to the Molecular Foundry only have a couple of days to collect their data. Because it is very tedious and time consuming to manually add notes to terabytes of scientific data and there is no standard for doing it, most researchers just type shorthand descriptions in the filename. This might make sense to the person saving the file, but often doesn’t make much sense to anyone else.

    “The lack of real metadata labels eventually causes problems when the scientist tries to find the data later or attempts to share it with others,” says Lavanya Ramakrishnan, a staff scientist in Berkeley Lab’s Computational Research Division (CRD) and co-principal investigator of the Science Search project. “But with machine-learning techniques, we can have computers help with what is laborious for the users, including adding tags to the data. Then we can use those tags to effectively search the data.”

    3
    In addition to images, Science Search can also be used to look for proposals and papers. This is a screenshot of the paper search results. (Credit: Gonzalo Rodrigo/Berkeley Lab). [No hot links.]

    To address the metadata issue, the Berkeley Lab team uses machine-learning techniques to mine the “science ecosystem”—including instrument timestamps, facility user logs, scientific proposals, publications and file system structures—for contextual information. The collective information from these sources, such as the timestamp of the experiment, notes about the resolution and filter used, and the user’s request for time, provides critical context for the data. The Berkeley Lab team has put together an innovative software stack that uses machine-learning techniques, including natural language processing, to pull contextual keywords about the scientific experiment and automatically create metadata tags for the data.
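
    As a rough sketch of that tag-generation step (using off-the-shelf TF-IDF keyword scoring rather than the project’s actual software stack, and with invented text snippets), candidate tags can be pulled from ecosystem text like this:

        from sklearn.feature_extraction.text import TfidfVectorizer

        # Invented stand-ins for proposal text, user-log notes and related documents.
        ecosystem_text = [
            "STEM imaging of gold nanoparticles, 300 kV, high resolution mode",
            "user log: drift correction applied, objective aperture 40 um",
            "proposal: in situ study of nanoparticle growth on silicon substrates",
        ]

        vectorizer = TfidfVectorizer(stop_words="english")
        scores = vectorizer.fit_transform(ecosystem_text)
        terms = vectorizer.get_feature_names_out()

        # Keep the highest-scoring terms for each source as candidate metadata tags.
        for row in scores.toarray():
            tags = [terms[i] for i in row.argsort()[::-1][:3] if row[i] > 0]
            print(tags)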

    For the proof-of-concept, Ophus shared data from the Molecular Foundry’s TEAM 1 electron microscope at NCEM, recently collected by the facility staff, with the Science Search team.

    LBNL National Center for Electron Microscopy (NCEM)

    He also volunteered to label a few thousand images to give the machine-learning tools some labels from which to start learning. While this is a good start, Science Search co-principal investigator Gunther Weber notes that most successful machine-learning applications typically require significantly more data and feedback to deliver better results. In the case of search engines like Google, for example, training datasets are created and machine-learning techniques are validated every time billions of people around the world verify their identity by clicking on all the images with street signs or storefronts after typing in their passwords, or tag their friends in photos on Facebook.

    “In the case of science data only a handful of domain experts can create training sets and validate machine-learning techniques, so one of the big ongoing problems we face is an extremely small number of training sets,” says Weber, who is also a staff scientist in Berkeley Lab’s CRD.

    To overcome this challenge, the Berkeley Lab researchers used transfer learning to limit the degrees of freedom, or parameter counts, on their convolutional neural networks (CNNs). Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task, which allows the user to get more accurate results from a smaller training set. In the case of the TEAM I microscope, the data produced contains information about which operation mode the instrument was in at the time of collection. With that information, Weber was able to train the neural network on that classification so it could generate that mode-of-operation label automatically. He then froze the convolutional layers of the network, which meant he only had to retrain the densely connected layers. This approach effectively reduces the number of parameters on the CNN, allowing the team to get meaningful results from their limited training data.
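
    A minimal sketch of that style of transfer learning in PyTorch, assuming an ImageNet-pretrained backbone and a hypothetical count of instrument operation modes (the team’s actual network, labels and data differ):

        import torch
        import torch.nn as nn
        from torchvision import models

        NUM_MODES = 4  # hypothetical number of microscope operation modes

        # Start from a network pretrained on a generic image-classification task.
        model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

        # Freeze the convolutional feature extractor...
        for param in model.parameters():
            param.requires_grad = False

        # ...and replace only the densely connected head, which is what gets retrained.
        model.fc = nn.Linear(model.fc.in_features, NUM_MODES)

        optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
        criterion = nn.CrossEntropyLoss()

        # One illustrative training step on a fake batch of 224x224 images.
        images = torch.randn(8, 3, 224, 224)
        labels = torch.randint(0, NUM_MODES, (8,))
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()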

    Machine Learning to Mine the Scientific Ecosystem

    In addition to generating metadata tags through training datasets, the Berkeley Lab team also developed tools that use machine-learning techniques for mining the science ecosystem for data context. For example, the data ingest module can look at a multitude of information sources from the scientific ecosystem—including instrument timestamps, user logs, proposals and publications—and identify commonalities. Tools developed at Berkeley Lab that use natural language-processing methods can then identify and rank words that give context to the data and facilitate meaningful results for users later on. The user will see something similar to the results page of an Internet search, where content with the most text matching the user’s search words will appear higher on the page. The system also learns from user queries and the search results they click on.
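
    A toy version of that query-time ranking, with invented records and a crude boost from past clicks (not the production ranking code):

        # Each record: a file identifier, its machine-generated tags, and a count
        # of past clicks it received in search results (illustrative values).
        records = [
            ("img_001.tif", {"nanoparticle", "stem", "gold"}, 12),
            ("img_002.tif", {"diffraction", "silicon"}, 3),
            ("img_003.tif", {"nanoparticle", "tem"}, 0),
        ]

        def rank(query_terms, records):
            query = set(query_terms)
            scored = []
            for name, tags, clicks in records:
                overlap = len(query & tags)      # how many query words match the tags
                if overlap:
                    scored.append((overlap + 0.1 * clicks, name))  # small click boost
            return [name for _, name in sorted(scored, reverse=True)]

        print(rank(["nanoparticle", "gold"], records))  # ['img_001.tif', 'img_003.tif']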

    Because scientific instruments are generating an ever-growing body of data, all aspects of the Berkeley team’s science search engine needed to be scalable to keep pace with the rate and scale of the data volumes being produced. The team achieved this by setting up their system in a Spin instance on the Cori supercomputer at the National Energy Research Scientific Computing Center (NERSC).

    NERSC

    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    Spin is a Docker-based edge-services technology developed at NERSC that can access the facility’s high performance computing systems and storage on the back end.

    “One of the reasons it is possible for us to build a tool like Science Search is our access to resources at NERSC,” says Gonzalo Rodrigo, a Berkeley Lab postdoctoral researcher who is working on the natural language processing and infrastructure challenges in Science Search. “We have to store, analyze and retrieve really large datasets, and it is useful to have access to a supercomputing facility to do the heavy lifting for these tasks. NERSC’s Spin is a great platform to run our search engine that is a user-facing application that requires access to large datasets and analytical data that can only be stored on large supercomputing storage systems.”

    An Interface for Validating and Searching Data

    When the Berkeley Lab team developed the interface for users to interact with their system, they knew that it would have to accomplish a couple of objectives, including effective search and allowing human input to the machine learning models. Because the system relies on domain experts to help generate the training data and validate the machine-learning model output, the interface needed to facilitate that.

    “The tagging interface that we developed displays the original data and metadata available, as well as any machine-generated tags we have so far. Expert users then can browse the data and create new tags and review any machine-generated tags for accuracy,” says Matt Henderson, who is a Computer Systems Engineer in CRD and leads the user interface development effort.

    To facilitate an effective search for users based on available information, the team’s search interface provides a query mechanism for available files, proposals and papers that the Berkeley-developed machine-learning tools have parsed and extracted tags from. Each listed search result item represents a summary of that data, with a more detailed secondary view available, including information on tags that matched this item. The team is currently exploring how to best incorporate user feedback to improve the models and tags.

    “Having the ability to explore datasets is important for scientific breakthroughs, and this is the first time that anything like Science Search has been attempted,” says Ramakrishnan. “Our ultimate vision is to build the foundation that will eventually support a ‘Google’ for scientific data, where researchers can even search distributed datasets. Our current work provides the foundation needed to get to that ambitious vision.”

    “Berkeley Lab is really an ideal place to build a tool like Science Search because we have a number of user facilities, like the Molecular Foundry, that have decades worth of data that would provide even more value to the scientific community if the data could be searched and shared,” adds Katie Antypas, who is the principal investigator of Science Search and head of NERSC’s Data Department. “Plus we have great access to machine-learning expertise in the Berkeley Lab Computing Sciences Area as well as HPC resources at NERSC in order to build these capabilities.”

    In addition to Antypas, Ramakrishnan and Weber, UC Berkeley Computer Science Professor Joseph Hellerstein is also a principal investigator.

    This work was supported by the DOE Office of Advanced Scientific Computing Research (ASCR). Both the Molecular Foundry and NERSC are DOE Office of Science User Facilities located at Berkeley Lab.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    A U.S. Department of Energy National Laboratory Operated by the University of California

    University of California Seal

    DOE Seal

     
  • richardmitnick 1:48 pm on May 30, 2018 Permalink | Reply
    Tags: , , NERSC - National Energy Research for Scientific Computing Center, OLCF Titan supercomputer, , Supercomputers Provide New Window Into the Life and Death of a Neutron,   

    From Lawrence Berkeley National Lab: “Supercomputers Provide New Window Into the Life and Death of a Neutron” 

    Berkeley Logo

    From Lawrence Berkeley National Lab

    May 30, 2018
    Glenn Roberts Jr.
    geroberts@lbl.gov
    (510) 486-5582

    Berkeley Lab-led research team simulates sliver of the universe to tackle subatomic-scale physics problem.

    1
    In this illustration, the grid in the background represents the computational lattice that theoretical physicists used to calculate a particle property known as nucleon axial coupling. This property determines how a W boson (white wavy line) interacts with one of the quarks in a neutron (large transparent sphere in foreground), emitting an electron (large arrow) and antineutrino (dotted arrow) in a process called beta decay. This process transforms the neutron into a proton (distant transparent sphere). (Credit: Evan Berkowitz/Jülich Research Center, Lawrence Livermore National Laboratory)

    Experiments that measure the lifetime of neutrons reveal a perplexing and unresolved discrepancy. While this lifetime has been measured to a precision within 1 percent using different techniques, apparent conflicts in the measurements offer the exciting possibility of learning about as-yet undiscovered physics.

    Now, a team led by scientists in the Nuclear Science Division at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) has enlisted powerful supercomputers to calculate a quantity known as the “nucleon axial coupling,” or gA – which is central to our understanding of a neutron’s lifetime – with an unprecedented precision. Their method offers a clear path to further improvements that may help to resolve the experimental discrepancy.

    To achieve their results, the researchers created a microscopic slice of a simulated universe to provide a window into the subatomic world. Their study was published online May 30 in the journal Nature.

    The nucleon axial coupling is more exactly defined as the strength at which one component (known as the axial component) of the “weak current” of the Standard Model of particle physics couples to the neutron. The weak current is given by one of the four known fundamental forces of the universe and is responsible for radioactive beta decay – the process by which a neutron decays to a proton, an electron, and a neutrino.

    In addition to measurements of the neutron lifetime, precise measurements of neutron beta decay are also used to probe new physics beyond the Standard Model. Nuclear physicists seek to resolve the lifetime discrepancy and to complement these experimental results by determining gA more precisely.

    The researchers turned to quantum chromodynamics (QCD), a cornerstone of the Standard Model that describes how quarks and gluons interact with each other. Quarks and gluons are the fundamental building blocks for larger particles, such as neutrons and protons. The dynamics of these interactions determine the mass of the neutron and proton, and also the value of gA.

    But sorting through QCD’s inherent complexity to produce these quantities requires the aid of massive supercomputers. In the latest study, researchers applied a numeric simulation known as lattice QCD, which represents QCD on a finite grid.

    A type of mirror-flip symmetry in particle interactions called parity (like swapping your right and left hands) is respected by the interactions of QCD, but the axial component of the weak current flips parity, and parity is not respected by nature as a whole (analogously, most of us are right-handed). Because nature breaks this symmetry, the value of gA can only be determined through experimental measurements or through theoretical predictions with lattice QCD.

    The team’s new theoretical determination of gA is based on a simulation of a tiny piece of the universe – the size of a few neutrons in each direction. They simulated a neutron transitioning to a proton inside this tiny section of the universe, in order to predict what happens in nature.

    The model universe contains one neutron amid a sea of quark-antiquark pairs that are bustling under the surface of the apparent emptiness of free space.

    2
    André Walker-Loud, a staff scientist at Berkeley Lab, led the study that calculated a property central to understanding the lifetime of neutrons. (Credit: Marilyn Chung/Berkeley Lab)

    “Calculating gA was supposed to be one of the simple benchmark calculations that could be used to demonstrate that lattice QCD can be utilized for basic nuclear physics research, and for precision tests that look for new physics in nuclear physics backgrounds,” said André Walker-Loud, a staff scientist in Berkeley Lab’s Nuclear Science Division who led the new study. “It turned out to be an exceptionally difficult quantity to determine.”

    This is because lattice QCD calculations are complicated by exceptionally noisy statistical results that had thwarted major progress in reducing uncertainties in previous gA calculations. Some researchers had previously estimated that it would require the next generation of the nation’s most advanced supercomputers to achieve a 2 percent precision for gA by around 2020.

    The team participating in the latest study developed a way to improve their calculations of gA using an unconventional approach and supercomputers at Oak Ridge National Laboratory (Oak Ridge Lab) and Lawrence Livermore National Laboratory (Livermore Lab), including Livermore Lab’s Vulcan IBM Blue Gene/Q system.

    LLNL Vulcan IBM Blue GeneQ system supercomputer

    The study involved scientists from more than a dozen institutions, including researchers from UC Berkeley and several other Department of Energy national labs.

    Chia Cheng “Jason” Chang, the lead author of the publication and a postdoctoral researcher in Berkeley Lab’s Nuclear Science Division for the duration of this work, said, “Past calculations were all performed amidst this more noisy environment,” which clouded the results they were seeking. Chang has also joined the Interdisciplinary Theoretical and Mathematical Sciences Program at RIKEN in Japan as a research scientist.

    Walker-Loud added, “We found a way to extract gA earlier in time, before the noise ‘explodes’ in your face.”
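
    The statistical side of that problem can be shown with a toy model; this is not the collaboration’s actual analysis, and the real difficulty also involves controlling excited-state contamination at early times, but it illustrates why early times help: the noise in nucleon correlation functions grows roughly exponentially with time, so a fit restricted to early time slices pins down the plateau value far more tightly than a late-time fit.

        import numpy as np

        rng = np.random.default_rng(0)
        t = np.arange(1, 15)
        true_gA = 1.27
        # Toy data: a constant "plateau" at gA with noise that grows roughly
        # exponentially with time, as it does for nucleon correlators on the lattice.
        noise = 0.005 * np.exp(0.4 * t)
        data = true_gA + noise * rng.standard_normal(t.size)

        def plateau_fit(mask):
            # Inverse-variance weighted average over the chosen time window.
            w = 1.0 / noise[mask] ** 2
            return np.sum(w * data[mask]) / np.sum(w), 1.0 / np.sqrt(np.sum(w))

        print("early times (t<=6):", plateau_fit(t <= 6))   # small uncertainty
        print("late times  (t>=9):", plateau_fit(t >= 9))   # much larger uncertainty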

    Chang said, “We now have a purely theoretical prediction of the lifetime of the neutron, and it is the first time we can predict the lifetime of the neutron to be consistent with experiments.”

    “This was an intense 2 1/2-year project that only came together because of the great team of people working on it,” Walker-Loud said.

    This latest calculation also places tighter constraints on a branch of physics theories that stretch beyond the Standard Model – constraints that exceed those set by powerful particle collider experiments at CERN’s Large Hadron Collider. But the calculations aren’t yet precise enough to determine if new physics have been hiding in the gA and neutron lifetime measurements.

    Chang and Walker-Loud noted that the main limitation to improving upon the precision of their calculations is in supplying more computing power.

    “We don’t have to change the technique we’re using to get the precision necessary,” Walker-Loud said.

    The latest work builds upon decades of research and computational resources by the lattice QCD community. In particular, the research team relied upon QCD data generated by the MILC Collaboration; an open source software library for lattice QCD called Chroma, developed by the USQCD collaboration; and QUDA, a highly optimized open source software library for lattice QCD calculations.

    ORNL Cray Titan XK7 Supercomputer

    The team drew heavily upon the power of Titan, a supercomputer at Oak Ridge Lab equipped with graphics processing units, or GPUs, in addition to more conventional central processing units, or CPUs. GPUs have evolved from their early use in accelerating video game graphics to current applications in evaluating large arrays for tackling complicated algorithms pertinent to many fields of science.

    The axial coupling calculations used about 184 million “Titan hours” of computing power – it would take a single laptop computer with a large memory about 600,000 years to complete the same calculations.

    As the researchers worked through their analysis of this massive set of numerical data, they realized that more refinements were needed to reduce the uncertainty in their calculations.

    The team was assisted by the Oak Ridge Leadership Computing Facility staff to efficiently utilize their 64 million Titan-hour allocation, and they also turned to the Multiprogrammatic and Institutional Computing program at Livermore Lab, which gave them more computing time to resolve their calculations and reduce their uncertainty margin to just under 1 percent.

    “Establishing a new way to calculate gA has been a huge rollercoaster,” Walker-Loud said.

    With more statistics from more powerful supercomputers, the research team hopes to drive the uncertainty margin down to about 0.3 percent. “That’s where we can actually begin to discriminate between the results from the two different experimental methods of measuring the neutron lifetime,” Chang said. “That’s always the most exciting part: When the theory has something to say about the experiment.”

    He added, “With improvements, we hope that we can calculate things that are difficult or even impossible to measure in experiments.”

    Already, the team has applied for time on a next-generation supercomputer at Oak Ridge Lab called Summit, which would greatly speed up the calculations.

    ORNL IBM Summit supercomputer depiction

    In addition to researchers at Berkeley Lab and UC Berkeley, the science team also included researchers from University of North Carolina, RIKEN BNL Research Center at Brookhaven National Laboratory, Lawrence Livermore National Laboratory, the Jülich Research Center in Germany, the University of Liverpool in the U.K., the College of William & Mary, Rutgers University, the University of Washington, the University of Glasgow in the U.K., NVIDIA Corp., and Thomas Jefferson National Accelerator Facility.

    One of the study participants is a scientist at the National Energy Research Scientific Computing Center (NERSC).

    NERSC

    NERSC Cray XC40 Cori II supercomputer

    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    The Titan supercomputer is a part of the Oak Ridge Leadership Computing Facility (OLCF). NERSC and OLCF are DOE Office of Science User Facilities.

    The work was supported by Laboratory Directed Research and Development programs at Berkeley Lab, the U.S. Department of Energy’s Office of Science, the Nuclear Physics Double Beta Decay Topical Collaboration, the DOE Early Career Award Program, the NVIDIA Corporation, the Joint Sino-German Research Projects of the German Research Foundation and National Natural Science Foundation of China, RIKEN in Japan, the Leverhulme Trust, the National Science Foundation’s Kavli Institute for Theoretical Physics, DOE’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, and the Lawrence Livermore National Laboratory Multiprogrammatic and Institutional Computing program through a Tier 1 Grand Challenge award.

    See the full article here .



    stem

    Stem Education Coalition

    A U.S. Department of Energy National Laboratory Operated by the University of California

    University of California Seal

    DOE Seal

     
  • richardmitnick 12:33 pm on March 1, 2018 Permalink | Reply
    Tags: , , , Can Strongly Lensed Type Ia Supernovae Resolve One of Cosmology’s Biggest Controversies?, , , , , NERSC - National Energy Research for Scientific Computing Center   

    From LBNL: “Can Strongly Lensed Type Ia Supernovae Resolve One of Cosmology’s Biggest Controversies?” 

    Berkeley Logo

    Berkeley Lab

    March 1, 2018
    Linda Vu
    lvu@lbl.gov
    (510) 495-2402

    1
    This composite of two astrophysics simulations shows a Type Ia supernova (purple disc) expanding over different microlensing magnification patterns (colored fields). Because individual stars in the lensing galaxy can significantly change the brightness of a lensed event, regions of the supernova can experience varying amounts of brightening and dimming, which scientists believed would be a problem for cosmologists measuring time delays. Using detailed computer simulations at NERSC, astrophysicists showed that this would have a small effect on time-delay cosmology. (Credit: Danny Goldstein/UC Berkeley)

    Gravitational Lensing NASA/ESA

    NERSC Cray XC40 Cori II supercomputer

    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    In 1929 Edwin Hubble surprised many people – including Albert Einstein – when he showed that the universe is expanding. Another bombshell came in 1998 when two teams of astronomers proved that cosmic expansion is actually speeding up due to a mysterious property of space called dark energy. This discovery provided the first evidence of what is now the reigning model of the universe: “Lambda-CDM,” which says that the cosmos is approximately 70 percent dark energy, 25 percent dark matter and 5 percent “normal” matter (everything we’ve ever observed).

    Until 2016, Lambda-CDM agreed beautifully with decades of cosmological data. Then a research team used the Hubble Space Telescope to make an extremely precise measurement of the local cosmic expansion rate. The result was another surprise: the researchers found that the universe was expanding a little faster than Lambda-CDM and the Cosmic Microwave Background (CMB), relic radiation from the Big Bang, predicted. So it seems something’s amiss – could this discrepancy be a systematic error, or possibly new physics?

    Astrophysicists at Lawrence Berkeley National Laboratory (Berkeley Lab) and the Institute of Cosmology and Gravitation at the University of Portsmouth in the UK believe that strongly lensed Type Ia supernovae are the key to answering this question. And in a new paper in The Astrophysical Journal, they describe how to control “microlensing,” a physical effect that many scientists believed would be a major source of uncertainty facing these new cosmic probes. They also show how to identify and study these rare events in real time.

    “Ever since the CMB result came out and confirmed the accelerating universe and the existence of dark matter, cosmologists have been trying to make better and better measurements of the cosmological parameters, shrink the error bars,” says Peter Nugent, an astrophysicist in Berkeley Lab’s Computational Cosmology Center (C3) and co-author on the paper.

    CMB per ESA/Planck


    ESA/Planck

    “The error bars are now so small that we should be able to say ‘this and this agree,’ so the results presented in 2016 [ApJ] introduced a big tension in cosmology. Our paper presents a path forward for determining whether the current disagreement is real or whether it’s a mistake.”

    Better Distance Markers Shed Brighter Light on Cosmic History

    But last year an international team of researchers found an even more reliable distance marker – the first-ever strongly lensed Type Ia supernova [Science]. These events occur when the gravitational field of a massive object – like a galaxy – bends and refocuses passing light from a Type Ia event behind it. This “gravitational lensing” causes the supernova’s light to appear brighter and sometimes in multiple locations, if the light rays travel different paths around the massive object.

    Because some routes around the massive object are longer than others, light from different images of the same Type Ia event will arrive at different times. By tracking the time delays between the strongly lensed images, astrophysicists believe they can get a very precise measurement of the cosmic expansion rate.
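
    A toy illustration of the time-delay measurement (not the method used in the paper): generate two noisy copies of a supernova light curve offset in time, then scan trial delays for the shift that best aligns them.

        import numpy as np

        rng = np.random.default_rng(1)
        days = np.arange(0.0, 120.0, 1.0)

        def light_curve(t, t0):
            # Crude stand-in for a Type Ia light curve: fast rise, slow decay.
            x = np.clip(t - t0, 0, None)
            return x * np.exp(-x / 20.0)

        true_delay = 23.0  # days; image B arrives later than image A
        image_a = light_curve(days, 10.0) + 0.3 * rng.standard_normal(days.size)
        image_b = 0.7 * light_curve(days, 10.0 + true_delay) + 0.3 * rng.standard_normal(days.size)

        def overlap_corr(shift):
            # Correlate image A against image B shifted back by "shift" days.
            return np.corrcoef(image_a[: days.size - shift], image_b[shift:])[0, 1]

        trial = np.arange(0, 60)
        best = trial[np.argmax([overlap_corr(s) for s in trial])]
        print("recovered delay:", best, "days")  # should land near 23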

    “Strongly lensed supernovae are much rarer than conventional supernovae – they’re one in 50,000. Although this measurement was first proposed in the 1960’s, it has never been made because only two strongly lensed supernovae have been discovered to date, neither of which were amenable to time delay measurements,” says Danny Goldstein, a UC Berkeley graduate student and lead author on the new Astrophysical Journal paper.

    After running a number of computationally intensive simulations of supernova light at the National Energy Research Scientific Computing Center (NERSC), a Department of Energy Office of Science User Facility located at Berkeley Lab, Goldstein and Nugent suspect that they’ll be able to find about 1,000 of these strongly lensed Type Ia supernovae in data collected by the upcoming Large Synoptic Survey Telescope (LSST) – about 20 times more than previous expectations.

    LSST


    LSST Camera, built at SLAC



    LSST telescope, currently under construction on the El Peñón peak at Cerro Pachón Chile, a 2,682-meter-high mountain in Coquimbo Region, in northern Chile, alongside the existing Gemini South and Southern Astrophysical Research Telescopes.

    These results are the basis of their new paper in The Astrophysical Journal.

    “With three lensed quasars – cosmic beacons emanating from massive black holes in the centers of galaxies – collaborators and I measured the expansion rate to 3.8 percent precision. We got a value higher than the CMB measurement, but we need more systems to be really sure that something is amiss with the standard model of cosmology,” says Thomas Collett, an astrophysicist at the University of Portsmouth and a co-author on the new Astrophysical Journal paper. “It can take years to get a time delay measurement with quasars, but this work shows we can do it for supernovae in months. One thousand lensed supernovae will let us really nail down the cosmology.”

    In addition to identifying these events, the NERSC simulations also helped them prove that strongly lensed Type Ia supernovae can be very accurate cosmological probes.

    “When cosmologists try to measure time delays, the problem they often encounter is that individual stars in the lensing galaxy can distort the light curves of the different images of the event, making it harder to match them up,” says Goldstein. “This effect, known as ‘microlensing,’ makes it harder to measure accurate time delays, which are essential for cosmology.”

    But after running their simulations, Goldstein and Nugent found that microlensing did not change the colors of strongly lensed Type Ia supernovae in their early phases. So researchers can subtract the unwanted effects of microlensing by working with colors instead of light curves.

    Gravitational microlensing, S. Liebes, Physical Review B, 133 (1964): 835

    Once these undesirable effects are subtracted, scientists will be able to easily match the light curves and make accurate cosmological measurements.
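
    The key point, that a wavelength-independent magnification cancels out of a color, can be seen in a few lines with made-up fluxes:

        import numpy as np

        # Made-up fluxes of an early-phase supernova in two bands (arbitrary units).
        flux_g, flux_r = 120.0, 80.0

        # Microlensing multiplies both bands by the same (unknown) magnification.
        for magnification in (0.5, 1.0, 3.7):
            g = -2.5 * np.log10(magnification * flux_g)
            r = -2.5 * np.log10(magnification * flux_r)
            print(f"magnification {magnification}: g-r color = {g - r:.3f} mag")

        # The g-r color comes out the same for every magnification, so early-time
        # colors are immune to achromatic microlensing even though the individual
        # light curves are not.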

    They came to this conclusion by modeling the supernovae using the SEDONA code, which was developed with funding from two DOE Scientific Discovery through Advanced Computing (SciDAC) Institutes to calculate light curves, spectra and polarization of aspherical supernova models.

    “In the early 2000s DOE funded two SciDAC projects to study supernova explosions, we basically took the output of those models and passed them through a lensing system to prove that the effects are achromatic,” says Nugent.

    “The simulations give us a dazzling picture of the inner workings of a supernova, with a level of detail that we could never know otherwise,” says Daniel Kasen, an astrophysicist in Berkeley Lab’s Nuclear Science Division, and a co-author on the paper. “Advances in high performance computing are finally allowing us to understand the explosive death of stars, and this study shows that such models are needed to figure out new ways to measure dark energy.”

    Taking Supernova Hunting to the Extreme

    When LSST begins full survey operations in 2023, it will be able to scan the entire sky in only three nights from its perch on the Cerro Pachón ridge in north-central Chile. Over its 10-year mission, LSST is expected to deliver over 200 petabytes of data. As part of the LSST Dark Energy Science Collaboration, Nugent and Goldstein hope that they can run some of this data through a novel supernova-detection pipeline, based at NERSC.

    For more than a decade, Nugent’s Real-Time Transient Detection pipeline running at NERSC has been using machine learning algorithms to scour observations collected by the Palomar Transient Factory (PTF) and then the Intermediate Palomar Transient Factory (iPTF) – searching every night for “transient” objects that change in brightness or position by comparing the new observations with all of the data collected from previous nights. Within minutes after an interesting event is discovered, machines at NERSC then trigger telescopes around the globe to collect follow-up observations. In fact, it was this pipeline that revealed the first-ever strongly lensed Type Ia supernova earlier this year.

    “What we hope to do for the LSST is similar to what we did for Palomar, but times 100,” says Nugent. “There’s going to be a flood of information every night from LSST. We want to take that data and ask what do we know about this part of the sky, what’s happened there before and is this something we’re interested in for cosmology?”

    He adds that once researchers identify the first light of a strongly lensed supernova event, computational modeling could also be used to precisely predict when the next image of that light will appear. Astronomers can use this information to trigger ground- and space-based telescopes to follow up and catch this light, essentially allowing them to observe a supernova seconds after it goes off.

    “I came to Berkeley Lab 21 years ago to work on supernova radiative-transfer modeling and now for the first time we’ve used these theoretical models to prove that we can do cosmology better,” says Nugent. “It’s exciting to see DOE reap the benefits of investments in computational cosmology that they started making decades ago.”

    The SciDAC partnership project – Computational Astrophysics Consortium: Supernovae, Gamma-Ray Bursts, and Nucleosynthesis – funded by the DOE Office of Science and the National Nuclear Security Administration was led by Stan Woosley of UC Santa Cruz, and supported both Nugent and Kasen of Berkeley Lab.

    NERSC is a DOE Office of Science User Facility.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    A U.S. Department of Energy National Laboratory Operated by the University of California

    University of California Seal

    DOE Seal

     
  • richardmitnick 2:28 pm on February 1, 2018 Permalink | Reply
    Tags: , , NERSC - National Energy Research for Scientific Computing Center, , ,   

    From SLAC: “Q&A: Alan Heirich and Elliott Slaughter Take On SLAC’s Big Data Challenges” 


    SLAC Lab

    January 9, 2018
    Manuel Gnida

    1
    Members of SLAC’s Computer Science Division. From left: Alex Aiken, Elliott Slaughter and Alan Heirich. (Dawn Harmer/SLAC National Accelerator Laboratory)

    As the Department of Energy’s SLAC National Accelerator Laboratory builds the next generation of powerful instruments for groundbreaking research in X-ray science, astronomy and other fields, its Computer Science Division is preparing for the onslaught of data these instruments will produce.

    The division’s initial focus is on LCLS-II, an upgrade to the Linac Coherent Light Source (LCLS) X-ray laser that will fire 8,000 times faster than the current version. LCLS-II promises to provide completely new views of the atomic world and its fundamental processes. However, the jump in firing rate goes hand in hand with an explosion of scientific data that would overwhelm today’s computing architectures.

    SLAC/LCLS

    SLAC/LCLS II projected view

    In this Q&A, SLAC computer scientists Alan Heirich and Elliott Slaughter talk about their efforts to develop new computing capabilities that will help the lab cope with the coming data challenges.

    Heirich, who joined the lab last April, earned a PhD from the California Institute of Technology and has many years of experience working in industry and academia. Slaughter joined last June; he’s a recent PhD graduate from Stanford University, where he worked under the guidance of Alex Aiken, professor of computer science at Stanford and director of SLAC’s Computer Science Division.

    What are the computing challenges you’re trying to solve?

    Heirich: The major challenge we’re looking at now is that LCLS-II will produce so much more data than the current X-ray laser. Data rates will increase 10,000 times, from about 100 megabytes per second today to a terabyte per second in a few years. We need to think about the computing tools and infrastructure necessary to take control over that enormous future data stream.

    Slaughter: Our development of new computing architectures is aimed at analyzing LCLS-II data on the fly, providing initial results within a minute or two. This allows researchers to evaluate the quality of their data quickly, make adjustments and collect data in the most efficient way. However, real-time data analysis is quite challenging if you collect data with an X-ray laser that fires a million pulses per second.

    How can real-time analysis be achieved?

    Slaughter: We won’t be able to do all this with just the computing capabilities we have on site. The plan is to send some of the most challenging LCLS-II data analyses to the National Energy Research Scientific Computing Center (NERSC) at DOE’s Lawrence Berkeley National Laboratory, where extremely fast supercomputers will analyze the data and send the results back to us within minutes.

    Our team has joined forces with Amedeo Perazzo, who leads the LCLS Controls and Data Systems Division, to develop the system that will run the analysis. Scientists doing experiments at LCLS will be able to define the details of that analysis, depending on what their scientific questions are.

    Our goal is to be able to do the analysis in a very flexible way using all kinds of high-performance computers that have completely different hardware and architectures. In the future, these will also include exascale supercomputers that perform more than a billion billion calculations per second – up to a hundred times more than today’s most powerful machines.

    Is it difficult to build such a flexible computing system?

    Heirich: Yes. Supercomputers are very complex with millions of processors running in parallel, and we need to figure out how to make use of their individual architectures most efficiently. At Stanford, we’re therefore developing a programming system, called Legion, that allows people to write programs that are portable across very different high-performance computer architectures.

    Traditionally, if you want to run a program with the best possible performance on a new computer system, you may need to rewrite significant parts of the program so that it matches the new architecture. That’s very labor and cost intensive. Legion, on the other hand, is specifically designed to be used on diverse architectures and requires only relatively small tweaks when moving from one system to another. This approach prepares us for whatever the future of computing looks like. At SLAC, we’re now starting to adapt Legion to the needs of LCLS-II.
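
    The portability idea can be caricatured in a few lines of Python. This is not Legion’s actual interface (Legion is a C++/Regent system), but it shows the separation Heirich describes: the science code is written as tasks over data partitions, while a separate mapping choice decides how those tasks run on the available hardware.

        from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

        def task(chunk):
            # A unit of work over one partition of the data.
            return sum(x * x for x in chunk)

        def run(chunks, executor_cls, workers):
            # The mapping decision (which executor, how many workers) lives here,
            # not in the science code above, so the same tasks can be retargeted.
            with executor_cls(max_workers=workers) as ex:
                return sum(ex.map(task, chunks))

        if __name__ == "__main__":
            data = [list(range(i, i + 1000)) for i in range(0, 10000, 1000)]
            print(run(data, ThreadPoolExecutor, 4))     # e.g. a laptop
            print(run(data, ProcessPoolExecutor, 32))   # e.g. a many-core node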

    We’re also looking into how we can visualize the scientific data after they are analyzed at NERSC.

    NERSC Cray XC40 Cori II supercomputer

    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    The analysis will be done on thousands of processors, and it’s challenging to orchestrate this process and put it together into one coherent visual picture. We just presented one way to approach this problem at the supercomputing conference SC17 in November.

    What’s the goal for the coming year?

    Slaughter: We’re working with the LCLS team on building an initial data analysis prototype. One goal is to get a first test case running on the new system. This will be done with X-ray crystallography data from LCLS, which are used to reconstruct the 3-D atomic structure of important biomolecules, such as proteins. The new system will be much more responsive than the old one. It’ll be able to read and analyze data at the same time, whereas the old system can only do one or the other at any given moment.
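
    The read-while-analyzing idea can be sketched with a producer thread and a consumer sharing a queue (invented timings, not the actual LCLS software):

        import queue
        import threading
        import time

        chunks = queue.Queue(maxsize=4)

        def reader():
            for i in range(10):
                time.sleep(0.05)          # pretend to read a chunk from the detector
                chunks.put(f"chunk-{i}")
            chunks.put(None)              # signal the end of the stream

        def analyzer():
            while True:
                item = chunks.get()
                if item is None:
                    break
                time.sleep(0.05)          # analyze while the next chunk is being read
                print("analyzed", item)

        t = threading.Thread(target=reader)
        t.start()
        analyzer()
        t.join()
        # Because reading and analysis overlap, the total time is roughly half of
        # doing the two steps in sequence.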

    Will other research areas besides X-ray science profit from your work?

    Slaughter: Yes. Alex is working on growing our division, identifying potential projects across the lab and expanding our research portfolio. Although we’re concentrating on LCLS-II right now, we’re interested in joining other projects, such as the Large Synoptic Survey Telescope (LSST). SLAC is building the LSST camera, a 3.2-gigapixel digital camera that will capture unprecedented images of the night sky. But it will also produce enormous piles of data – millions of gigabytes per year. Progress in computer science is needed to efficiently handle these data volumes.

    Heirich: SLAC and its close partnership with Stanford Computer Science make for a great research environment. There is also a lot of interest in machine learning. In this form of artificial intelligence, computer programs get better and more efficient over time by learning from the tasks they performed in the past. It’s a very active research field that has seen a lot of growth over the past five years, and machine learning has become remarkably effective in solving complex problems that previously needed to be done by human beings.

    Many groups at SLAC and Stanford are exploring how they can exploit machine learning, including teams working in X-ray science, particle physics, astrophysics, accelerator research and more. But there are very fundamental computer science problems to solve. As machine learning replaces some conventional analysis methods, one big question is, for example, whether the solutions it generates are as reliable as those obtained in the conventional way.

    LCLS and NERSC are DOE Office of Science user facilities. Legion is being developed at Stanford with funding from DOE’s ExaCT Combustion Co-Design Center, Scientific Data Management, Analysis and Visualization program and Exascale Computing Project (ECP) as well as other contributions. SLAC’s Computer Science Division receives funding from the ECP.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    SLAC Campus
    SLAC is a multi-program laboratory exploring frontier questions in photon science, astrophysics, particle physics and accelerator research. Located in Menlo Park, California, SLAC is operated by Stanford University for the DOE’s Office of Science.

     