From Argonne Leadership Computing Facility: “Scientists spearhead convergence of AI and HPC for cosmology”

Using real images of galaxies from astronomical surveys of our Universe, a neural network can be trained to classify them with impressive accuracy. This image is a snapshot from a visualization that shows the output of the penultimate layer of a deep neural network during training as it is learning to classify two types of galaxies: spirals and ellipticals. View the full visualization here. Credit: Janet Knowles, Joseph A. Insley and Silvio Rizzi, Argonne National Laboratory

In 2007, the Sloan Digital Sky Survey (SDSS) launched a citizen science campaign called Galaxy Zoo to enlist the public’s help in classifying the hundreds of thousands of galaxy images captured by an optical telescope. Through this highly successful crowdsourcing effort, volunteers reviewed the images online to help determine whether each galaxy had a spiral or elliptical structure.

Leveraging data generated by the Galaxy Zoo project, a team of scientists is now applying the power of artificial intelligence (AI) and high-performance supercomputers to accelerate efforts to analyze the increasingly massive datasets produced by ongoing and future cosmological surveys.

In a new study, researchers from the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign and the Argonne Leadership Computing Facility (ALCF) at the U.S. Department of Energy’s (DOE) Argonne National Laboratory have developed a novel combination of deep learning methods to provide a highly accurate approach to classifying hundreds of millions of unlabeled galaxies. The team’s findings were published in Physics Letters B.

“The NCSA Gravity Group initiated, and continues to spearhead, the use of deep learning at scale for gravitational wave astrophysics. We have expanded our research portfolio to address a computational grand challenge in cosmology, innovating the use of several deep learning methods in combination with high-performance computing (HPC),” said Eliu Huerta, NCSA Gravity Group Lead. “Our work also showcases how the interoperability of NSF and DOE supercomputing resources can be used to accelerate science.”

“Deep learning research has rapidly become a booming enterprise across multiple disciplines. Our findings show that the convergence of deep learning and HPC can address big-data challenges of large-scale electromagnetic surveys. This research is part of a multidisciplinary program at NCSA to push the boundaries of AI and HPC in scientific research,” added Asad Khan, a graduate student at the NCSA Gravity Group and lead author of this study.

Supported by an ALCF Data Science Program award, the team used the SDSS datasets produced by the Galaxy Zoo campaign to train neural network models to classify galaxies in the Dark Energy Survey (DES) that lie in the region of sky covered by both surveys. The method’s ability to identify spiral and elliptical galaxies was found to be 99.6 percent accurate.

“Using the millions of classifications carried out by the public in the Galaxy Zoo project to train a neural network is an inspiring use of the citizen science program,” said Elise Jennings, ALCF computer scientist. “This exciting research also sheds light on the inner workings of the neural network, which clearly learns two distinct feature clusters to identify spiral and elliptical galaxies.”

The team’s innovative framework lays the foundations to exploit deep transfer learning at scale, data clustering and recursive training to produce large-scale galaxy catalogs in the Large Synoptic Survey Telescope (LSST) era.
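To make the data clustering idea concrete, here is a minimal sketch of grouping feature vectors into two clusters. It assumes k-means with k=2, which is a stand-in chosen for illustration; the article does not name the clustering algorithm the team used. The feature vectors are synthetic three-number lists rather than real network outputs, and the names `kmeans` and `squared_distance` are hypothetical.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=2, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns (centers, cluster label per point)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Synthetic stand-ins for feature vectors of two galaxy morphologies
# (assumption: real features would come from a trained network).
rng = random.Random(1)
features = (
    [[rng.gauss(0.0, 0.3) for _ in range(3)] for _ in range(20)]
    + [[rng.gauss(5.0, 0.3) for _ in range(3)] for _ in range(20)]
)

centers, labels = kmeans(features)
print(labels)
```

On well-separated toy data like this, the first 20 points should share one label and the last 20 the other. Real feature vectors are much higher-dimensional and may not separate this cleanly.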

“We’re excited to work with the team at NCSA and Argonne as well as the researchers who drove the original Galaxy Zoo effort to pursue this important area of scientific discovery,” said Tom Gibbs, manager of developer relations at NVIDIA. “Using these new methods, we’re taking an important step to understanding the mystery of dark energy.”

Highlights of this study include:

The first application of deep transfer learning using disparate datasets for galaxy classification. The team used deep transfer learning to transfer knowledge from Xception, a state-of-the-art neural network model for image classification trained with the ImageNet dataset, to classify SDSS galaxy images. Transfer learning between similar datasets, such as images of human faces, has traditionally been used in the computer science literature. In stark contrast, this study uses a pre-trained model for real-world object recognition and then transfers its knowledge to classify galaxies.
The researchers developed open-source software stacks to extract galaxy images from the SDSS and DES surveys at scale using the NCSA’s Blue Waters supercomputer. Deep learning algorithms were prototyped and trained using NVIDIA GPUs in the Bridges supercomputer at the Pittsburgh Supercomputing Center through the National Science Foundation’s Extreme Science and Engineering Discovery Environment (XSEDE). Finally, deep transfer learning was combined with distributed training to reduce the training stage of the Xception model with galaxy image datasets from five hours to just eight minutes using ALCF supercomputing resources.
The researchers used deep neural network classifiers to label over 10,000 unlabeled DES galaxies that have not been observed in previous surveys. The neural network models are then turned into feature extractors to show that these unlabeled datasets can be clustered according to their morphology, forming two distinct datasets.
ALCF researchers created a visualization to show the output of the penultimate layer of a deep neural network during training as it is learning to classify galaxies as spiral or elliptical. View the visualization here.
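The transfer learning idea in the highlights can be illustrated with a toy sketch: reuse a fixed feature extractor and train only a small new classifier on top of it. Everything here is an assumption made for illustration. A frozen random projection stands in for Xception's ImageNet-trained layers, four-number lists stand in for galaxy images, and the article does not say which layers the team froze or fine-tuned. All function names are hypothetical.

```python
import math
import random


def make_frozen_extractor(n_in, n_out, rng):
    """Stand-in for a pre-trained network: fixed weights that are never updated."""
    weights = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]

    def extract(x):
        # tanh of a fixed linear projection; these parameters stay frozen.
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return extract


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=100, lr=0.1):
    """Fit only a new logistic-regression head on top of the frozen features."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            grad = p - y  # derivative of the log loss with respect to the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b


def predict(w, b, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b >= 0 else 0


def toy_galaxies(n, rng):
    """Synthetic 4-number 'images': label 1 = spiral, 0 = elliptical (made up)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        data.append(([center + rng.gauss(0, 0.2) for _ in range(4)], label))
    return data


rng = random.Random(0)
extract = make_frozen_extractor(n_in=4, n_out=8, rng=rng)
train, test = toy_galaxies(40, rng), toy_galaxies(40, rng)

# Only the head's weights (w, b) are learned; the extractor is reused as-is.
w, b = train_head([extract(x) for x, _ in train], [y for _, y in train])
correct = sum(predict(w, b, extract(x)) == y for x, y in test)
accuracy = correct / len(test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The design point is that the expensive part (the extractor) is inherited rather than trained, so only a small number of parameters need fitting on the new dataset. The same `extract` output could also be fed to a clustering step, which is roughly what "turning the model into a feature extractor" means.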

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

About ALCF
The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community.

We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and expertise.

ALCF projects cover many scientific disciplines, ranging from chemistry and biology to physics and materials science. Examples include modeling and simulation efforts to:

Discover new materials for batteries
Predict the impacts of global climate change
Unravel the origins of the universe
Develop renewable energy technologies
