From Argonne Leadership Computing Facility: “Scientists spearhead convergence of AI and HPC for cosmology”

Argonne Lab
News from Argonne National Laboratory

From Argonne Leadership Computing Facility

Using real images of galaxies from astronomical surveys of our Universe, a neural network can be trained to classify these with impressive accuracy. This image is a snapshot from a visualization that shows the output of the penultimate layer of a deep neural network during training as it is learning to classify two types of galaxies: spirals and ellipticals. View the full visualization here. Credit: Janet Knowles, Joseph A. Insley and Silvio Rizzi, Argonne National Laboratory

In 2007, the Sloan Digital Sky Survey (SDSS) launched a citizen science campaign called Galaxy Zoo to enlist the public’s help in classifying the hundreds of thousands of galaxy images captured by an optical telescope. Through this highly successful crowdsourcing effort, volunteers reviewed the images online to help determine whether each galaxy had a spiral or elliptical structure.

Leveraging data generated by the Galaxy Zoo project, a team of scientists is now applying the power of artificial intelligence (AI) and high-performance supercomputers to accelerate efforts to analyze the increasingly massive datasets produced by ongoing and future cosmological surveys.

In a new study, researchers from the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign and the Argonne Leadership Computing Facility (ALCF) at the U.S. Department of Energy’s (DOE) Argonne National Laboratory have developed a novel combination of deep learning methods to provide a highly accurate approach to classifying hundreds of millions of unlabeled galaxies. The team’s findings were published in Physics Letters B.

“The NCSA Gravity Group initiated, and continues to spearhead, the use of deep learning at scale for gravitational wave astrophysics. We have expanded our research portfolio to address a computational grand challenge in cosmology, innovating the use of several deep learning methods in combination with high-performance computing (HPC),” said Eliu Huerta, NCSA Gravity Group Lead. “Our work also showcases how the interoperability of NSF and DOE supercomputing resources can be used to accelerate science.”

“Deep learning research has rapidly become a booming enterprise across multiple disciplines. Our findings show that the convergence of deep learning and HPC can address big-data challenges of large-scale electromagnetic surveys. This research is part of a multidisciplinary program at NCSA to push the boundaries of AI and HPC in scientific research,” added Asad Khan, a graduate student at the NCSA Gravity Group and lead author of this study.

Supported by an ALCF Data Science Program award, the team used the SDSS datasets produced by the Galaxy Zoo campaign to train neural network models to classify galaxies in the Dark Energy Survey (DES) that lie within the footprint shared by both surveys. The method identified spiral and elliptical galaxies with 99.6 percent accuracy.

“Using the millions of classifications carried out by the public in the Galaxy Zoo project to train a neural network is an inspiring use of the citizen science program,” said Elise Jennings, ALCF computer scientist. “This exciting research also sheds light on the inner workings of the neural network, which clearly learns two distinct feature clusters to identify spiral and elliptical galaxies.”

The team’s innovative framework lays the foundations to exploit deep transfer learning at scale, data clustering and recursive training to produce large-scale galaxy catalogs in the Large Synoptic Survey Telescope (LSST) era.

“We’re excited to work with the team at NCSA and Argonne as well as the researchers who drove the original Galaxy Zoo effort to pursue this important area of scientific discovery,” said Tom Gibbs, manager of developer relations at NVIDIA. “Using these new methods, we’re taking an important step to understanding the mystery of dark energy.”

Highlights of this study include:

The first application of deep transfer learning using disparate datasets for galaxy classification. The team used deep transfer learning to transfer knowledge from Xception, a state-of-the-art neural network model for image classification trained with the ImageNet dataset, to classify SDSS galaxy images. Transfer learning between similar datasets, such as images of human faces, has traditionally been used in the computer science literature. In stark contrast, this study takes a model pre-trained for real-world object recognition and transfers its knowledge to classify galaxies (a hedged code sketch of this idea appears after this list).
The researchers developed open-source software stacks to extract galaxy images from the SDSS and DES surveys at scale using NCSA’s Blue Waters supercomputer. Deep learning algorithms were prototyped and trained using NVIDIA GPUs in the Bridges supercomputer at the Pittsburgh Supercomputing Center through the National Science Foundation’s Extreme Science and Engineering Discovery Environment (XSEDE). Finally, deep transfer learning was combined with distributed training to reduce the training stage of the Xception model with galaxy image datasets from five hours to just eight minutes using ALCF supercomputing resources (see the distributed-training sketch after this list).
The researchers used deep neural network classifiers to label over 10,000 unlabeled DES galaxies that have not been observed in previous surveys. The neural network models are then turned into feature extractors to show that these unlabeled datasets can be clustered according to their morphology, forming two distinct datasets.
ALCF researchers created a visualization to show the output of the penultimate layer of a deep neural network during training as it is learning to classify galaxies as spiral or elliptical. View the visualization here.
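
To make the transfer-learning highlight above concrete, here is a minimal sketch, in Keras, of the general recipe: start from Xception with ImageNet weights, swap in a two-class (spiral versus elliptical) head, fine-tune on labeled galaxy images, and then reuse the penultimate layer as a feature extractor. This is an illustration only, not the authors’ code; the directory layout, image size, and training settings are assumptions.

```python
# Minimal transfer-learning sketch (not the authors' code): fine-tune an
# ImageNet-pretrained Xception network to separate spiral from elliptical
# galaxies, then reuse its penultimate layer as a feature extractor.
# Directory layout, image size, and training settings are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (299, 299)          # Xception's native input resolution
BATCH = 64

# Hypothetical folders of labeled SDSS cutouts: spiral/ and elliptical/.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "sdss_galaxies/train", image_size=IMG_SIZE, batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "sdss_galaxies/val", image_size=IMG_SIZE, batch_size=BATCH)

# Start from Xception trained on ImageNet, dropping its 1000-class head.
base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, pooling="avg")
base.trainable = False         # first stage: train only the new head

inputs = layers.Input(shape=IMG_SIZE + (3,))
x = layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)  # Xception expects [-1, 1]
features = base(x)             # 2048-dimensional penultimate representation
outputs = layers.Dense(2, activation="softmax")(features)
model = models.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# Second stage: unfreeze the backbone and fine-tune at a much lower rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# Reuse the trained network as a feature extractor: embeddings from the
# penultimate layer can be clustered (e.g., with k-means) to check that
# unlabeled DES images separate into two morphology groups, as described
# in the highlights above.
feature_extractor = models.Model(inputs, features)
```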
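
The article does not name the framework behind the distributed-training speedup (five hours down to about eight minutes), so the following is a hedged sketch of one common approach, Horovod-style data parallelism for Keras; the framework choice, the synthetic stand-in data, and all settings are assumptions, not the team’s actual setup.

```python
# Hedged sketch of data-parallel distributed training with Horovod; the
# article does not name the framework used at the ALCF, so this is only
# one common way to spread Keras training over many workers. Launch with,
# e.g., `horovodrun -np 8 python train.py` (one process per rank).
import numpy as np
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()

# Pin each process to a single GPU when GPUs are present.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

# Stand-in model: ImageNet-pretrained Xception with a two-class head,
# as in the transfer-learning sketch above.
base = tf.keras.applications.Xception(weights="imagenet",
                                      include_top=False, pooling="avg")
model = tf.keras.Sequential(
    [base, tf.keras.layers.Dense(2, activation="softmax")])

# Scale the learning rate with the worker count and wrap the optimizer so
# gradients are averaged across all processes at every step.
opt = tf.keras.optimizers.Adam(1e-3 * hvd.size())
model.compile(optimizer=hvd.DistributedOptimizer(opt),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    # Ensure every worker starts from identical initial weights.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# Tiny synthetic stand-in for this worker's shard of the galaxy images;
# a real run would shard the labeled SDSS/DES dataset across ranks.
x = np.random.rand(64, 299, 299, 3).astype("float32")
y = np.random.randint(0, 2, size=64)
model.fit(x, y, batch_size=16, epochs=1, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```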

See the full article here.


Please help promote STEM in your local schools.

Stem Education Coalition

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science. For more visit www.anl.gov.

About ALCF
The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community.

We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and expertise.

ALCF projects cover many scientific disciplines, ranging from chemistry and biology to physics and materials science. Examples include modeling and simulation efforts to:

Discover new materials for batteries
Predict the impacts of global climate change
Unravel the origins of the universe
Develop renewable energy technologies

Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science

Argonne Lab Campus

#scientists-spearhead-convergence-of-ai-and-hpc-for-cosmology, #alcf, #applied-research-technology, #basic-research

From ALCF: “Supercomputing Award of 5.95 Billion Hours to 55 Computational Research Projects”

Argonne Lab
News from Argonne National Laboratory

ALCF

November 13, 2017
Katie Bethea

The projects will share 5.95 billion core-hours on three of America’s most powerful supercomputers dedicated to open science and support a broad range of large-scale research campaigns from infectious disease treatment to next-generation materials development. Researchers from the University of Washington’s Baker Lab will use their hours to develop approaches to harness the power of ALCF computing resources to efficiently sample the peptide macrocycle “conformational landscape”, and to identify low-energy states in which a peptide macrocycle can be rigidly locked.

The U.S. Department of Energy’s Office of Science announced 55 projects with high potential for accelerating discovery through its Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. The projects will share 5.95 billion core-hours on three of America’s most powerful supercomputers dedicated to open science and support a broad range of large-scale research campaigns from infectious disease treatment to next-generation materials development.

These awards allocate the multi-petascale computing resources of the two DOE Leadership Computing Facilities at Argonne and Oak Ridge National Laboratories. The two centers jointly manage the INCITE program, which is the primary means of accessing their resources. INCITE proposals are awarded on a competitive basis to researchers from academia, government research facilities, and industry. The average award is more than 108.1 million core-hours—with some awards of up to several hundred million core-hours—on systems capable of quadrillions of calculations per second.

“DOE’s INCITE program gives researchers access to computational resources to ambitiously address some of the world’s most formidable scientific research problems,” said James Hack, director of the National Center for Computational Sciences (NCCS), home to the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility in Oak Ridge, Tennessee. “These scientific projects would not be possible without access to DOE leadership computing resources, and we’re gratified to be able to partner with these outstanding research teams to enable fundamental advances in our understanding of the amazing world in which we live.”

Domain scientists and computational scientists at the leadership centers partner with each INCITE project, aiding in code and methods development, optimization, workflow streamlining, troubleshooting of unforeseen problems, and data analysis and visualization.

“Researchers seek out the INCITE program to support their research through access to three of the world’s fastest supercomputers, but our facilities also provide significant staff expertise and support to help them achieve their goals,” said Michael E. Papka, director of the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility just outside Chicago. “This support helps ensure investigators can maximize their time on our leading-edge systems.”

The ALCF’s Mira supercomputer is a 10-petaflops IBM Blue Gene/Q system with 49,152 compute nodes and a power-efficient architecture. ALCF’s Theta is a 9.65-petaflops Cray XC40 system based on the second-generation Intel Xeon Phi processor. The OLCF’s Titan supercomputer is a 27-petaflops Cray XK7 hybrid system employing both CPUs and energy-efficient, high-performance GPUs in its 18,688 compute nodes.

Despite continued upgrades, expansions, and advances in computing power, demand for leadership-class resources such as Mira, Theta and Titan continues to exceed availability, and, once again, more applications for time were made to INCITE than were awarded.

For a complete list of 2018 INCITE awards, please visit:

http://www.doeleadershipcomputing.org/awards/2018INCITEFactSheets.pdf

Highlights of the 2018 allocations include:

Brant Robertson from the University of California Santa Cruz received 46 million core-hours to understand the role galactic-scale winds play in the formation and evolution of galaxies.
Rommie Amaro from the University of California San Diego received 80 million core-hours to investigate the druggability and transmissibility of pandemic and seasonal influenza. These simulations, the largest of their kind, will unlock new clues to flu infection and potential treatment.
Michael Sprague from DOE’s National Renewable Energy Laboratory received 115 million core-hours to create new predictive simulation capabilities that will lower the cost of wind energy by providing new understanding and new pathways to optimized wind farms.
Vasily Bulatov from DOE’s Lawrence Livermore National Laboratory received 110 million core-hours for large-scale molecular dynamics simulations that will increase understanding of the strength of construction materials by understanding both the microscopic origin of strain hardening and the nature and geometric character of dislocation patterns.
Sean Dettrick from TAE Technologies received 31 million core-hours for simulations in understanding the magnetic confinement of plasmas with the goal of creating a clean, commercially viable, fusion-based electricity generator.
Konstantinos Orginos from the College of William & Mary received 155 million core-hours to study the structure of hadrons that will allow, for the first time, a complete 3D image of the hadron using the fundamental theory known as Quantum Chromodynamics.

The INCITE program promotes transformational advances in science and technology through large allocations of time on state-of-the-art supercomputers. For more information, please visit: http://www.doeleadershipcomputing.org/incite-program/.

ANL ALCF Cetus IBM supercomputer

ANL ALCF Theta Cray supercomputer

ANL ALCF Cray Aurora supercomputer

ANL ALCF MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

See the full article here.


#alcf, #the-u-s-department-of-energys-office-of-science-announced-55-projects-with-high-potential-for-accelerating-discovery-through-its-innovative-and-novel-computational-impact-on-theory-and-exper

From ALCF: “Argonne’s data science program adds new projects, doubles in size”

Argonne Lab
News from Argonne National Laboratory

ALCF

ANL ALCF Cetus IBM supercomputer

ANL ALCF Theta Cray supercomputer

ANL ALCF Cray Aurora supercomputer

ANL ALCF MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

October 2, 2017
ALCF Staff

Isosurface of the Laplacian of the electron density of MK44, an organic dye, attached to the surface of a nanocluster of titania. Credit: Álvaro Vázquez Mayagoitia, Argonne National Laboratory; Jacqueline M. Cole, University of Cambridge

In 2016, the Argonne Leadership Computing Facility (ALCF) started a forward-looking allocation program to support two new types of projects: efforts focused on the extraction of science from various data sources, and ones focused on scaling the underlying data science technology to make use of Leadership Computing resources. This program is known as the ALCF Data Science Program (ADSP). This year, Argonne has awarded computing time to four new projects, bringing the total number of ADSP projects for 2017-2018 to eight. All four of the program’s inaugural projects were also renewed.

The new project award recipients include an industry-based deep learning project; a national laboratory-based cosmology workflow project; and two university-based projects: one that uses machine learning for materials discovery, and a deep learning computer science project.

The current projects have made great progress in just the past year alone. On the science side, Jacqueline M. Cole (University of Cambridge) has used ADSP resources to build a large database of organic molecule candidates with desirable structural and electronic qualities by data mining materials’ physical and chemical properties from 300,000 published research articles as part of her Data-Driven Molecular Engineering of Solar-Powered Windows project. Doga Gursoy (Argonne National Laboratory), with his Large-Scale Computing and Visualization on the Connectomes of the Brain project, has streamlined the reconstruction of mouse brains, utilizing novel imaging and analytical tools to image at the level of individual cells and blood vessels. On the technology side, Fabien Delalondre (École polytechnique fédérale de Lausanne) has been focused on code development and infrastructure setup for his Leveraging Non-volatile Memory, Big Data, and Distributed Workflow Technology to Leap Forward Brain Modeling project. Finally, Taylor Childers’s (Argonne National Laboratory) Advancing the Scalability of LHC Workflows to Enable Discoveries at the Energy Frontier project has deployed Athena, the ATLAS simulation and analysis framework, on ALCF’s Theta supercomputer, and is working to deploy HTCondor-CE, a “gateway” software tool developed by the Open Science Grid to authorize remote users and provide a resource provisioning service.

New ADSP Projects

Enabling Multiscale Physics for Industrial Design Using Deep Learning Networks
Rathakrishnan Bhaskaran, GE Global Research

This project will leverage machine learning and large datasets generated by wall-resolved large-eddy simulations to develop data-driven turbulence models with improved predictive accuracy. The team will apply this approach to turbomachinery such as a wind turbine airfoil, demonstrating the impact that deep learning can have on industrial design processes for applications in power generation, aerospace, and other fields.

Realistic Simulations of the LSST Survey at Scale
Katrin Heitmann, Argonne National Laboratory

To help prepare for the massive amount of data that will be generated by the Large Synoptic Survey Telescope (LSST), this project aims to develop a comprehensive workflow starting from a structure formation simulation to the creation of sky maps with realistic galaxies. The team’s work will lead to one of the largest, most detailed synthetic sky maps ever created, as well as an end-to-end pipeline for LSST data processing and analysis on ALCF supercomputers.

LSST


LSST Camera, built at SLAC



LSST telescope, currently under construction on Cerro Pachón, a 2,682-meter-high mountain in the Coquimbo Region of northern Chile, alongside the existing Gemini South and Southern Astrophysical Research Telescopes.

Massive Hyperparameter Searches on Deep Neural Networks Using Leadership Systems
Pierre Baldi, University of California, Irvine

This project will run massive hyperparameter searches on deep neural networks to investigate the fundamentals of deep learning algorithms, helping to improve the use of deep learning on leadership computing systems. The research team also seeks to advance high-energy physics research by applying deep learning methods to improve the detection of exotic particles at CERN’s Large Hadron Collider.
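
As a rough illustration of what such a search loops over, the toy sketch below randomly samples the depth, width, and learning rate of a small network and keeps the best validation score; the data, ranges, and trial budget are assumptions, and a leadership-scale search like the one described above would distribute many thousands of such trials across nodes rather than run them in a single loop.

```python
# Toy random hyperparameter search over a small dense network. Real searches
# of the kind described above run thousands of trials in parallel, typically
# one per node or job. Data, parameter ranges, and budgets are assumptions.
import random
import numpy as np
import tensorflow as tf

rng = random.Random(0)
X = np.random.rand(2000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("int32")      # synthetic binary labels

def build(depth, width, lr):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(20,)))
    for _ in range(depth):
        model.add(tf.keras.layers.Dense(width, activation="relu"))
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

best = None
for trial in range(20):                        # a real search: thousands of trials
    params = {"depth": rng.randint(1, 6),
              "width": rng.choice([32, 64, 128, 256]),
              "lr": 10 ** rng.uniform(-4, -2)}
    model = build(**params)
    hist = model.fit(X, y, validation_split=0.2, epochs=5, verbose=0)
    acc = max(hist.history["val_accuracy"])
    if best is None or acc > best[0]:
        best = (acc, params)

print("best validation accuracy and settings:", best)
```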

Constructing and Navigating Polymorphic Landscapes of Molecular Crystals
Alexandre Tkatchenko, University of Luxembourg

This project seeks to combine state-of-the-art atomistic quantum simulations and data science methods to enable accurate predictions of novel molecular crystals for alternative energy materials, disease-curing pharmaceuticals, and molecular electronics. The generated data (structures, energies, and properties) and the associated big-data analytics tools developed during this project have the potential to enable major breakthroughs in computational materials discovery.

See the full article here.


#alcf, #applied-research-technology, #argonnes-data-science-program-adds-new-projects-doubles-in-size, #astronomy, #basic-research, #biology, #chemistry, #physics

From ALCF: “Cartography of the cosmos”

Argonne Lab
News from Argonne National Laboratory

ALCF

September 27, 2017
John Spizzirri

Argonne’s Salman Habib leads the ExaSky project, which takes on the biggest questions, mysteries, and challenges currently confounding cosmologists.


There are hundreds of billions of stars in our own Milky Way galaxy.

Milky Way. NASA/JPL-Caltech/ESO/R. Hurt

Estimates indicate a similar number of galaxies in the observable universe, each with its own large assemblage of stars, many with their own planetary systems. Beyond and between these stars and galaxies is all manner of matter in various phases, such as gas and dust. Another form of matter, dark matter, exists in a very different and mysterious form, announcing its presence indirectly only through its gravitational effects.

This is the universe Salman Habib is trying to reconstruct, structure by structure, using precise observations from telescope surveys combined with next-generation data analysis and simulation techniques currently being primed for exascale computing.

“We’re simulating all the processes in the structure and formation of the universe. It’s like solving a very large physics puzzle,” said Habib, a senior physicist and computational scientist with the High Energy Physics and Mathematics and Computer Science divisions of the U.S. Department of Energy’s (DOE) Argonne National Laboratory.

Habib leads the “Computing the Sky at Extreme Scales” project or “ExaSky,” one of the first projects funded by the recently established Exascale Computing Project (ECP), a collaborative effort between DOE’s Office of Science and its National Nuclear Security Administration.

From determining the initial cause of primordial fluctuations to measuring the sum of all neutrino masses, this project’s science objectives represent a laundry list of the biggest questions, mysteries, and challenges currently confounding cosmologists.

One such question is dark energy, the potential cause of the accelerated expansion of the universe; another is the nature and distribution of dark matter in the universe.

Dark Energy Survey


Dark Energy Camera [DECam], built at FNAL


NOAO/CTIO Victor M. Blanco 4m Telescope, which houses DECam at Cerro Tololo, Chile, at an altitude of 7,200 feet

Dark Matter Research

Universe map Sloan Digital Sky Survey (SDSS) 2dF Galaxy Redshift Survey

Scientists studying the cosmic microwave background hope to learn about more than just how the universe grew—it could also offer insight into dark matter, dark energy and the mass of the neutrino.

Dark matter cosmic web and the large-scale structure it forms. The Millennium Simulation, V. Springel et al.

Dark Matter Particle Explorer China

DEAP-3600 Dark Matter detector, suspended in SNOLAB, deep in Sudbury’s Creighton Mine

LUX Dark matter Experiment at SURF, Lead, SD, USA

ADMX Axion Dark Matter Experiment, U Washington

These are immense questions that demand equally expansive computational power to answer. The ECP is readying science codes for exascale systems, the new workhorses of computational and big data science.

Initiated to drive the development of an “exascale ecosystem” of cutting-edge, high-performance architectures, codes and frameworks, the ECP will allow researchers to tackle data and computationally intensive challenges such as the ExaSky simulations of the known universe.

In addition to the magnitude of their computational demands, ECP projects are selected based on whether they address specific strategic areas, ranging from energy and economic security to scientific discovery and healthcare.

“Salman’s research certainly looks at important and fundamental scientific questions, but it has societal benefits, too,” said Paul Messina, Argonne Distinguished Fellow. “Human beings tend to wonder where they came from, and that curiosity is very deep.”

HACC’ing the night sky

For Habib, the ECP presents a two-fold challenge — how do you conduct cutting-edge science on cutting-edge machines?

The cross-divisional Argonne team has been working on the science through a multi-year effort at the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility. The team is running cosmological simulations for large-scale sky surveys on the facility’s 10-petaflop high-performance computer, Mira. The simulations are designed to work with observational data collected from specialized survey telescopes, like the forthcoming Dark Energy Spectroscopic Instrument (DESI) and the Large Synoptic Survey Telescope (LSST).

LBNL/DESI Dark Energy Spectroscopic Instrument for the Nicholas U. Mayall 4-meter telescope at Kitt Peak National Observatory near Tucson, Ariz, USA

LSST


LSST Camera, built at SLAC



LSST telescope, currently under construction on Cerro Pachón, a 2,682-meter-high mountain in the Coquimbo Region of northern Chile, alongside the existing Gemini South and Southern Astrophysical Research Telescopes.

Survey telescopes look at much larger areas of the sky — up to half the sky, at any point — than does the Hubble Space Telescope, for instance, which focuses more on individual objects.

NASA/ESA Hubble Telescope

One night concentrating on one patch, the next night another, survey instruments systematically examine the sky to develop a cartographic record of the cosmos, as Habib describes it.

Working in partnership with Los Alamos and Lawrence Berkeley National Laboratories, the Argonne team is readying itself to chart the rest of the course.

Their primary code, which Habib helped develop, is already among the fastest science production codes in use. Called HACC (Hardware/Hybrid Accelerated Cosmology Code), this particle-based cosmology framework supports a variety of programming models and algorithms.

Unique among codes used in other exascale computing projects, it can run on all current and prototype architectures, from the basic x86 chips used in most home PCs, to graphics processing units, to the newest Knights Landing chip found in Theta, the ALCF’s latest supercomputing system.
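
For readers unfamiliar with the term, “particle-based” means the simulation state is a large set of particles carrying positions and velocities that are advanced under their mutual gravity. The toy sketch below, a direct-summation N-body step with a leapfrog integrator, illustrates only that basic idea; it is not HACC, which relies on far more sophisticated particle-mesh and tree algorithms, periodic boxes, and cosmic expansion, at trillions of particles.

```python
# Toy direct-summation N-body step, only to illustrate what "particle-based"
# means: particles carry positions and velocities and are advanced under
# their mutual gravity. This is NOT HACC; production codes use particle-mesh
# and tree methods, periodic boxes, and cosmic expansion, and scale to
# trillions of particles. All units and constants here are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
N = 1024                              # toy particle count
G, softening, dt = 1.0, 1e-2, 1e-3    # arbitrary toy values

pos = rng.random((N, 3))              # positions in a unit box
vel = np.zeros((N, 3))
mass = np.full(N, 1.0 / N)

def accelerations(pos):
    # Pairwise separations, softened to avoid the r -> 0 singularity.
    dx = pos[None, :, :] - pos[:, None, :]          # dx[i, j] = pos[j] - pos[i]
    r2 = (dx ** 2).sum(axis=-1) + softening ** 2
    inv_r3 = r2 ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                   # no self-force
    return G * (dx * inv_r3[:, :, None] * mass[None, :, None]).sum(axis=1)

# Leapfrog (kick-drift-kick) integration for a handful of steps.
acc = accelerations(pos)
for step in range(10):
    vel += 0.5 * dt * acc
    pos += dt * vel
    acc = accelerations(pos)
    vel += 0.5 * dt * acc
```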

As robust as the code is already, the HACC team continues to develop it further, adding significant new capabilities, such as hydrodynamics and associated subgrid models.

“When you run very large simulations of the universe, you can’t possibly do everything, because it’s just too detailed,” Habib explained. “For example, if we’re running a simulation where we literally have tens to hundreds of billions of galaxies, we cannot follow each galaxy in full detail. So we come up with approximate approaches, referred to as subgrid models.”

Even with these improvements and its successes, the HACC code still will need to increase its performance and memory to be able to work in an exascale framework. In addition to HACC, the ExaSky project employs the adaptive mesh refinement code Nyx, developed at Lawrence Berkeley. HACC and Nyx complement each other with different areas of specialization. The synergy between the two is an important element of the ExaSky team’s approach.

A cosmological simulation campaign that melds these approaches allows the verification of difficult-to-resolve cosmological processes involving gravitational evolution, gas dynamics and astrophysical effects at very high dynamic ranges. New computational methods like machine learning will help scientists to quickly and systematically recognize features in both the observational and simulation data that represent unique events.

A trillion particles of light

The work produced under the ECP will serve several purposes, benefitting both the future of cosmological modeling and the development of successful exascale platforms.

On the modeling end, the computer can generate many universes with different parameters, allowing researchers to compare their models with observations to determine which models fit the data most accurately. Alternatively, the models can make predictions for observations yet to be made.

Models also can produce extremely realistic pictures of the sky, which is essential when planning large observational campaigns, such as those by DESI and LSST.

“Before you spend the money to build a telescope, it’s important to also produce extremely good simulated data so that people can optimize observational campaigns to meet their data challenges,” said Habib.

But realism is expensive. Simulations can reach into the trillion-particle realm and produce several petabytes — quadrillions of bytes — of data in a single run. As exascale becomes prevalent, these simulations will produce 10 to 100 times as much data.
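
A quick back-of-the-envelope calculation shows why trillion-particle runs land in petabyte territory; the bytes per particle and the number of stored outputs below are illustrative assumptions, not figures from the article.

```python
# Back-of-the-envelope check on the data volumes quoted above; the bytes
# per particle and the number of stored outputs are illustrative assumptions.
particles = 1.0e12                # "trillion-particle realm"
bytes_per_particle = 32           # assume 3 float32 positions + 3 float32
                                  # velocities + an 8-byte particle ID

snapshot_bytes = particles * bytes_per_particle
print(f"one full snapshot ~ {snapshot_bytes / 1e12:.0f} TB")    # ~32 TB

outputs_per_run = 100             # assumption: ~100 stored outputs per run
run_bytes = snapshot_bytes * outputs_per_run
print(f"one run ~ {run_bytes / 1e15:.1f} PB")                   # a few PB
```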

The work that the ExaSky team is doing, along with that of the other ECP research teams, will help address these challenges and those faced by computer manufacturers and software developers as they create coherent, functional exascale platforms to meet the needs of large-scale science. By working with their own codes on pre-exascale machines, the ECP research teams can help guide vendors in chip design, I/O bandwidth, memory requirements and other features.

“All of these things can help the ECP community optimize their systems,” noted Habib. “That’s the fundamental reason why the ECP science teams were chosen. We will take the lessons we learn in dealing with this architecture back to the rest of the science community and say, ‘We have found a solution.’”

The Exascale Computing Project is a collaborative effort of two DOE organizations — the Office of Science and the National Nuclear Security Administration. As part of President Obama’s National Strategic Computing initiative, ECP was established to develop a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures and workforce development to meet the scientific and national security mission needs of DOE in the mid-2020s timeframe.

ANL ALCF Cetus IBM supercomputer

ANL ALCF Theta Cray supercomputer

ANL ALCF Cray Aurora supercomputer

ANL ALCF MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

See the full article here.


#alcf, #exasky-computing-the-sky-at-extreme-scales-project-or, #cartography-of-the-cosmos, #des-dark-energy-survey, #ecp-exascale-computing-project, #lbnldesi-dark-energy-spectroscopic-instrument, #lsst-large-synoptic-survey-telescope, #salman-habib, #supercomputing, #the-computer-can-generate-many-universes-with-different-parameters, #there-are-hundreds-of-billions-of-stars-in-our-own-milky-way-galaxy

From ANL: “Big Bang – The Movie”

Argonne Lab
News from Argonne National Laboratory

August 24, 2017
Jared Sagoff
Austin Keating

If you have ever had to wait those agonizing minutes in front of a computer for a movie or large file to load, you’ll likely sympathize with the plight of cosmologists at the U.S. Department of Energy’s (DOE) Argonne National Laboratory. But instead of watching TV dramas, they are trying to transfer, as fast and as accurately as possible, the huge amounts of data that make up movies of the universe – computationally demanding and highly intricate simulations of how our cosmos evolved after the Big Bang.

In a new approach to enable scientific breakthroughs, researchers linked together supercomputers at the Argonne Leadership Computing Facility (ALCF) and at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UI). This link enabled scientists to transfer massive amounts of data and to run two different types of demanding computations in a coordinated fashion – referred to technically as a workflow.

What distinguishes the new work from typical workflows is the scale of the computation, the associated data generation and transfer and the scale and complexity of the final analysis. Researchers also tapped the unique capabilities of each supercomputer: They performed cosmological simulations on the ALCF’s Mira supercomputer, and then sent huge quantities of data to UI’s Blue Waters, which is better suited to perform the required data analysis tasks because of its processing power and memory balance.

ANL ALCF MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

U Illinois Blue Waters Cray supercomputer

For cosmology, observations of the sky and computational simulations go hand in hand, as each informs the other. Cosmological surveys are becoming ever more complex as telescopes reach deeper into space and time, mapping out the distributions of galaxies at farther and farther distances, at earlier epochs of the evolution of the universe.

The very nature of cosmology precludes carrying out controlled lab experiments, so scientists rely instead on simulations to provide a unique way to create a virtual cosmological laboratory. “The simulations that we run are a backbone for the different kinds of science that can be done experimentally, such as the large-scale experiments at different telescope facilities around the world,” said Argonne cosmologist Katrin Heitmann. “We talk about building the ‘universe in the lab,’ and simulations are a huge component of that.”

Not just any computer is up to the immense challenge of generating and dealing with datasets that can exceed many petabytes a day, according to Heitmann. “You really need high-performance supercomputers that are capable of not only capturing the dynamics of trillions of different particles, but also doing exhaustive analysis on the simulated data,” she said. “And sometimes, it’s advantageous to run the simulation and do the analysis on different machines.”

Typically, cosmological simulations can only output a fraction of the frames of the computational movie as it is running because of data storage restrictions. In this case, Argonne sent every data frame to NCSA as soon as it was generated, allowing Heitmann and her team to greatly reduce the storage demands on the ALCF file system. “You want to keep as much data around as possible,” Heitmann said. “In order to do that, you need a whole computational ecosystem to come together: the fast data transfer, having a good place to ultimately store that data and being able to automate the whole process.”

In particular, Argonne transferred the data produced immediately to Blue Waters for analysis. The first challenge was to set up the transfer to sustain the bandwidth of one petabyte per day.
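
For a sense of scale, sustaining one petabyte per day works out to roughly 11.6 gigabytes, or about 93 gigabits, every second, as the short calculation below shows.

```python
# What sustaining "one petabyte per day" means as a transfer rate.
petabyte = 1e15                     # bytes (decimal definition)
seconds_per_day = 24 * 60 * 60      # 86,400 seconds

rate = petabyte / seconds_per_day   # bytes per second
print(f"{rate / 1e9:.1f} GB/s")     # ~11.6 GB/s
print(f"{rate * 8 / 1e9:.1f} Gb/s") # ~92.6 Gb/s, most of a 100 Gb/s link
```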

Once Blue Waters performed the first pass of data analysis, it reduced the raw data – with high fidelity – into a manageable size. At that point, researchers sent the data to a distributed repository at Argonne, the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory and the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. Cosmologists can access and further analyze the data through a system built by researchers in Argonne’s Mathematics and Computer Science Division in collaboration with Argonne’s High Energy Physics Division.

Argonne and the University of Illinois built one such central repository on the Supercomputing ’16 conference exhibition floor in November 2016, with memory units supplied by DDN Storage. The data moved over 1,400 miles to the conference’s SciNet network. The link between the computers used high-speed networking through the Department of Energy’s Energy Science Network (ESnet). Researchers sought, in part, to take full advantage of the fast SciNet infrastructure to do real science; typically it is used for demonstrations of technology rather than for solving real scientific problems.

“External data movement at high speeds significantly impacts a supercomputer’s performance,” said Brandon George, systems engineer at DDN Storage. “Our solution addresses that issue by building a self-contained data transfer node with its own high-performance storage that takes in a supercomputer’s results and the responsibility for subsequent data transfers of said results, leaving supercomputer resources free to do their work more efficiently.”

The full experiment ran successfully for 24 hours without interruption and led to a valuable new cosmological data set that Heitmann and other researchers started to analyze on the SC16 show floor.

Argonne senior computer scientist Franck Cappello, who led the effort, likened the software workflow that the team developed to accomplish these goals to an orchestra. In this “orchestra,” Cappello said, the software connects individual sections, or computational resources, to make a richer, more complex sound.

He added that his collaborators hope to improve the performance of the software to make the production and analysis of extreme-scale scientific data more accessible. “The SWIFT workflow environment and the Globus file transfer service were critical technologies to provide the effective and reliable orchestration and the communication performance that were required by the experiment,” Cappello said.
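
For readers curious how such transfers are scripted, the sketch below drives a bulk transfer with the Globus Python SDK (globus_sdk); the endpoint IDs, paths, and access token are placeholders, exact authentication steps and constructor details vary across SDK versions, and this is not the team’s actual workflow code.

```python
# Hedged sketch of scripting a bulk transfer with the Globus Python SDK.
# Endpoint UUIDs, paths, and the access token are placeholders; this is an
# illustration of the service named in the quote, not the team's workflow.
import globus_sdk

TRANSFER_TOKEN = "..."                      # obtained via a Globus OAuth2 flow
SRC_ENDPOINT = "source-endpoint-uuid"       # e.g., the simulation machine's DTN
DST_ENDPOINT = "destination-endpoint-uuid"  # e.g., the analysis machine's DTN

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(TRANSFER_TOKEN))

tdata = globus_sdk.TransferData(
    tc, SRC_ENDPOINT, DST_ENDPOINT,
    label="simulation frames", sync_level="checksum")
tdata.add_item("/sim/output/frame_0001/", "/ingest/frame_0001/", recursive=True)

task = tc.submit_transfer(tdata)
print("submitted Globus task:", task["task_id"])
```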

“The idea is to have data centers like we have for the commercial cloud. They will hold scientific data and will allow many more people to access and analyze this data, and develop a better understanding of what they’re investigating,” said Cappello, who also holds an affiliate position at NCSA and serves as director of the international Joint Laboratory on Extreme Scale Computing, based in Illinois. “In this case, the focus was cosmology and the universe. But this approach can aid scientists in other fields in reaching their data just as well.”

Argonne computer scientist Rajkumar Kettimuthu and David Wheeler, lead network engineer at NCSA, were instrumental in establishing the configuration that actually reached this performance. Maxine Brown from the University of Illinois provided the Sage environment to display the analysis result at extreme resolution. Justin Wozniak from Argonne developed the whole workflow environment using SWIFT to orchestrate and perform all operations.

The Argonne Leadership Computing Facility, the Oak Ridge Leadership Computing Facility, the Energy Science Network and the National Energy Research Scientific Computing Center are DOE Office of Science User Facilities. Blue Waters is the largest leadership-class supercomputer funded by the National Science Foundation. Part of this work was funded by DOE’s Office of Science.

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.

See the full article here.


#alcf, #anl, #cosmology, #dealing-with-massive-data, #supercomputing, #u-illinois