Tagged: Exascale Computing Project

  • richardmitnick 12:35 pm on February 12, 2019
    Tags: CANDLE (CANcer Distributed Learning Environment) framework, Exascale Computing Project

    From insideHPC: “Argonne ALCF Looks to Singularity for HPC Code Portability” 


    February 10, 2019

    Over at Argonne, Nils Heinonen writes that researchers are using the open-source Singularity framework as a kind of Rosetta Stone for running supercomputing code almost anywhere.

    Scaling code for massively parallel architectures is a common challenge the scientific community faces. When moving from a system used for development—a personal laptop, for instance, or even a university’s computing cluster—to a large-scale supercomputer like those housed at the Argonne Leadership Computing Facility [see below], researchers traditionally would only migrate the target application: the underlying software stack would be left behind.

    To help alleviate this problem, the ALCF has deployed the service Singularity. Singularity, an open-source framework originally developed by Lawrence Berkeley National Laboratory (LBNL) and now supported by Sylabs Inc., is a tool for creating and running containers (platforms designed to package code and its dependencies so as to facilitate fast and reliable switching between computing environments)—albeit one intended specifically for scientific workflows and high-performance computing resources.

    “There is a definite need for increased reproducibility and flexibility when a user is getting started here, and containers can be tremendously valuable in that regard,” said Katherine Riley, Director of Science at the ALCF. “Supporting emerging technologies like Singularity is part of a broader strategy to provide users with services and tools that help advance science by eliminating barriers to productive use of our supercomputers.”

    This plot shows the number of ATLAS events simulated (solid lines) with and without containerization. Linear scaling is shown (dotted lines) for reference.

    The demand for such services has grown at the ALCF as a direct result of the HPC community’s diversification.

    When the ALCF first opened, it was catering to a smaller user base representative of the handful of domains conventionally associated with scientific computing (high energy physics and astrophysics, for example).

    [Image captions: ALCF supercomputers at Argonne — Cetus (IBM), Theta (Cray), Aurora (Cray), and Mira (IBM Blue Gene/Q)]

    HPC is now a principal research tool in new fields such as genomics, which perhaps lack some of the computing culture ingrained in certain older disciplines. Moreover, researchers tackling problems in machine learning, for example, constitute a new community. This creates a strong incentive to make HPC more immediately approachable to users so as to reduce the amount of time spent preparing code and establishing migration protocols, and thus hasten the start of research.

    Singularity, to this end, promotes strong mobility of compute and reproducibility due to the framework’s employment of a distributable image format. This image format incorporates the entire software stack and runtime environment of the application into a single monolithic file. Users thereby gain the ability to define, create, and maintain an application on different hosts and operating environments. Once a containerized workflow is defined, its image can be snapshotted, archived, and preserved for future use. The snapshot itself represents a boon for scientific provenance by detailing the exact conditions under which given data were generated: in theory, by providing the machine, the software stack, and the parameters, one’s work can be completely reproduced. Because reproducibility is so crucial to the scientific process, this capability can be seen as one of the primary assets of container technology.
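    To make the container workflow concrete, here is what a minimal Singularity definition file might look like (an illustrative sketch only; the base image and package choices are assumptions, not an actual ALCF recipe):

    ```
    Bootstrap: docker
    From: python:3.10-slim

    %post
        # Install the application's dependencies into the image so the
        # entire software stack travels with the container.
        pip install --no-cache-dir numpy

    %runscript
        # Entry point invoked by `singularity run <image> <args>`.
        exec python "$@"
    ```

    Building with `singularity build app.sif app.def` yields a single `.sif` image file that can be archived for provenance and later executed unchanged on another host, e.g. `singularity run app.sif analysis.py`.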

    ALCF users have already begun to take advantage of the service. Argonne computational scientist Taylor Childers (in collaboration with a team of researchers from Brookhaven National Laboratory, LBNL, and the Large Hadron Collider’s ATLAS experiment) led ASCR Leadership Computing Challenge and ALCF Data Science Program projects to improve the performance of ATLAS software and workflows on DOE supercomputers.

    CERN/ATLAS detector

    Every year ATLAS generates petabytes of raw data, the interpretation of which requires even larger simulated datasets, making recourse to leadership-scale computing resources an attractive option. The ATLAS software itself—a complex collection of algorithms with many different authors—is terabytes in size and features manifold dependencies, making manual installation a cumbersome task.

    The researchers were able to run the ATLAS software on Theta inside a Singularity container via Yoda, an MPI-enabled Python application the team developed to communicate between CERN and ALCF systems and ensure all nodes in the latter are supplied with work throughout execution. The use of Singularity resulted in linear scaling on up to 1024 of Theta’s nodes, with event processing improved by a factor of four.
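    The scheduling pattern described here — a master keeping every node supplied with work until the event budget is exhausted — can be sketched in plain Python (an illustrative stand-in, not Yoda's actual MPI code; all names are invented):

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def simulate_chunk(n_events):
        """Stand-in for an ATLAS simulation payload; returns events completed.
        A real worker would launch the containerized simulation here."""
        return n_events

    def run_campaign(total_events, chunk_size, n_workers):
        """Master loop in the spirit of Yoda: hand out fixed-size chunks of
        work so no worker ("node") sits idle until the event budget is spent."""
        chunks = [chunk_size] * (total_events // chunk_size)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            return sum(pool.map(simulate_chunk, chunks))

    print(run_campaign(8000, 1000, 4))  # → 8000
    ```

    In the real system the workers are Theta compute nodes and each chunk is a batch of proton-collision events, but the dispatch logic is the same: chunks are handed to whichever worker frees up first.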

    “All told, with this setup we were able to deliver to ATLAS 65 million proton collisions simulated on Theta using 50 million core-hours,” said John Taylor Childers from ALCF.

    Containerization also effectively circumvented the software’s relative “unfriendliness” toward distributed shared file systems by accelerating metadata access calls; tests performed without the ATLAS software suggested that containerization could speed up such access calls by a factor of seven.

    While Singularity can present a tradeoff between immediacy and computational performance (because the containerized software stacks, generally speaking, are not written to exploit massively parallel architectures), the data-intensive ATLAS project demonstrates the potential value in such a compromise for some scenarios, given the impracticality of retooling the code at its center.

    Because containers afford users the ability to switch between software versions without risking incompatibility, the service has also been a mechanism to expand research and try out new computing environments. Rick Stevens—Argonne’s Associate Laboratory Director for Computing, Environment, and Life Sciences (CELS)—leads the Aurora Early Science Program project Virtual Drug Response Prediction. The machine learning-centric project, whose workflow is built from the CANDLE (CANcer Distributed Learning Environment) framework, enables billions of virtual drugs to be screened singly and in numerous combinations while predicting their effects on tumor cells. With their distribution made possible by Singularity containerization, CANDLE workflows are shared among a multitude of users whose interests span basic cancer research, deep learning, and exascale computing. Accordingly, different subsets of CANDLE users are concerned with experimental alterations to different components of the software stack.

    “CANDLE users at health institutes, for instance, may have no need for exotic code alterations intended to harness the bleeding-edge capabilities of new systems, instead requiring production-ready workflows primed to address realistic problems,” explained Tom Brettin, Strategic Program Manager for CELS and a co-principal investigator on the project. Meanwhile, through the support of DOE’s Exascale Computing Project, CANDLE is being prepared for exascale deployment.

    Containers are relatively new technology for HPC, and their role may well continue to grow. “I don’t expect this to be a passing fad,” said Riley. “It’s functionality that, within five years, will likely be utilized in ways we can’t even anticipate yet.”

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

  • richardmitnick 11:17 am on April 13, 2017
    Tags: Exascale Computing Project, Exascale Computing Project (ECP) Co-Design Center

    From PNNL: “PNNL-Led Co-Design Center to Enhance Graph Algorithms for Exascale Computing Project” 


    March 2017

    Recently, a PNNL-led proposal, “ExaGraph: Combinatorial Methods for Enabling Exascale Applications,” was selected as the fifth Exascale Computing Project (ECP) Co-Design Center. The center will focus on graph analytics, primarily combinatorial (graph) kernels. These kernels can access computing system resources to enhance data analytic computing applications but are among the most difficult to implement on parallel systems. Mahantesh Halappanavar, the Analytics and Algorithms Team Lead with ACMD Division’s Data Sciences group, will lead the center with Aydin Buluç, from Lawrence Berkeley National Laboratory; Erik Boman, of Sandia National Laboratories; and Alex Pothen, from Purdue University, serving as co-principal investigators.

    PNNL is leading the fifth Exascale Computing Project Co-Design Center, which according to ECP leadership puts the program in “a better position to ready current and evolving data analytic computing applications for efficient use of capable exascale platforms.”

    According to Halappanavar, the center will tackle developing key combinatorial algorithms arising from several exascale application domains, such as the power grid, computational chemistry, computational biology, and climate science. These applications, with their growing data volumes, create an unprecedented need for larger computational resources. This complexity will drive the selection of kernels and their integration among software tools. To start, the intent is to work with the scientists involved in related ECP projects, such as NWChemEx, to fine-tune software tools that will perform on current and future extreme-scale systems, as well as enhance scientific discovery by providing more computation and flexibility for large volumes of data.

    “In the end, the applications will need to benefit from the tools that incorporate the algorithms targeted for exascale architectures,” Halappanavar explained.

    As part of the four-year project, the ExaGraph Co-Design Center will investigate a diversity of data analytic computational motifs, including graph traversals, graph matching and coloring, graph clustering and partitioning, parallel mixed-integer programs, and ordering and scheduling algorithms.
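    One of the motifs listed above, graph coloring, has a simple serial form that illustrates why these kernels are hard to parallelize: each vertex's color depends on the colors already assigned to its neighbors. A minimal greedy sketch (illustrative only; ExaGraph's actual kernels are parallel and far more elaborate):

    ```python
    def greedy_coloring(adj):
        """Assign each vertex the smallest color not used by any neighbor.
        `adj` maps each vertex to an iterable of neighboring vertices."""
        colors = {}
        for v in adj:  # visit order affects how many colors are used
            used = {colors[u] for u in adj[v] if u in colors}
            c = 0
            while c in used:
                c += 1
            colors[v] = c
        return colors

    # A 4-cycle is 2-colorable:
    square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    print(greedy_coloring(square))  # → {0: 0, 1: 1, 2: 0, 3: 1}
    ```

    The sequential dependence from vertex to vertex is exactly what parallel coloring algorithms must break, typically by coloring speculatively and then resolving conflicts.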

    “The ExaGraph Co-Design Center’s aim is to highlight the value of graph kernels via co-design of key algorithmic motifs and science applications along with the classical hardware-software co-design of algorithmic kernels,” Halappanavar said. “These graph algorithms will augment how data analytics are performed for applications and scientific computing.”

    Beyond its initial launch, Halappanavar noted the ExaGraph Co-Design Center aims to deliver a software library. The library will feature a set of frameworks that implement combinatorial kernels that can communicate with each other to enable scientific computing, which further empowers basic science research.

    In addition, Adolfy Hoisie, PNNL’s Chief Scientist for Computing and Laboratory Fellow, explained that having a PNNL-led ECP Co-Design Center that takes advantage of Halappanavar’s considerable expertise and unites some key collaborators is a welcome and synergistic addition to the laboratory’s research landscape and capabilities.

    “The ExaGraph Co-Design Center is technically important to ECP and will provide significant contributions that benefit its overall exascale program in a way that can be accessible and useful across many scientific application areas,” Hoisie said. “I look forward to seeing this center grow.”

    About ECP
    The U.S. Department of Energy’s Exascale Computing Project is responsible for developing the strategy, aligning the resources, and conducting the R&D necessary to achieve the nation’s imperative of delivering exascale computing by 2021. ECP’s mission is to ensure all the necessary pieces are in place for the first exascale systems—an ecosystem that includes mission critical applications, software stack, hardware architecture, advanced system engineering and hardware components to enable fully functional, capable exascale computing environments critical to national security, scientific discovery, and a strong U.S. economy.

    The ECP is a collaborative project of two U.S. Department of Energy organizations, the Office of Science and the National Nuclear Security Administration.

    See the full article here.


    Pacific Northwest National Laboratory (PNNL) is one of the United States Department of Energy National Laboratories, managed by the Department of Energy’s Office of Science. The main campus of the laboratory is in Richland, Washington.

    PNNL scientists conduct basic and applied research and development to strengthen U.S. scientific foundations for fundamental research and innovation; prevent and counter acts of terrorism through applied research in information analysis, cyber security, and the nonproliferation of weapons of mass destruction; increase the U.S. energy capacity and reduce dependence on imported oil; and reduce the effects of human activity on the environment. PNNL has been operated by Battelle Memorial Institute since 1965.


  • richardmitnick 11:32 am on June 16, 2016
    Tags: Exascale Computing Project, Paul Messina

    From ANL: “Messina discusses rewards, challenges for new exascale project” 

    News from Argonne National Laboratory

    Argonne Distinguished Fellow Paul Messina has been tapped to lead the DOE and NNSA’s Exascale Computing Project with the goal of paving the way toward exascale supercomputing.

    June 8, 2016
    Louise Lerner

    The exascale initiative has an ambitious goal: to develop supercomputers a hundred times more powerful than today’s systems.

    That’s the kind of speed that can help scientists make serious breakthroughs in solar and sustainable energy technology, weather forecasting, batteries and more.

    Last year, President Obama announced a unified National Strategic Computing Initiative to support U.S. leadership in high-performance computing; one key objective is to pave the road toward an exascale computing system.

    The U.S. Department of Energy (DOE) has been charged with carrying out that role in an initiative called the Exascale Computing Project.

    Argonne National Laboratory Distinguished Fellow Paul Messina has been tapped to lead the project, heading a team with representation from the six major participating DOE national laboratories: Argonne, Los Alamos, Lawrence Berkeley, Lawrence Livermore, Oak Ridge and Sandia. The project program office is located at Oak Ridge.

    Messina, who has made fundamental contributions to modern scientific computing and networking and previously served as the Director of Science for the Argonne Leadership Computing Facility, a DOE Office of Science User Facility, for eight years, will now help usher in a new generation of supercomputers with the capabilities to change our everyday lives.

    Exascale-level computing could have an impact on almost everything, Messina said. It can help increase the efficiency of wind farms by determining the best locations and arrangements of turbines, as well as optimizing the design of the turbines themselves. It can also help severe weather forecasters make their models more accurate and could boost research in solar energy, nuclear energy, biofuels and combustion, among many other fields.

    “For example, it’s clear from some of our pilot projects that exascale computing power could help us make real progress on batteries,” Messina said.

    Brute computing force is not sufficient, however, Messina said; “We also need mathematical models that better represent phenomena and algorithms that can efficiently implement those models on the new computer architectures.”

    Given those advances, researchers will be able to sort through the massive number of chemical combinations and reactions to identify good candidates for new batteries.

    “Computing can help us optimize. For example, let’s say that we know we want a manganese cathode with this electrolyte; with these new supercomputers, we can more easily find the optimal chemical compositions and proportions for each,” he said.

    Exascale computing will help researchers get a handle on what’s happening inside systems where the chemistry and physics are extremely complex. To stick with the battery example: the behavior of liquids and components within a working battery is intricate and constantly changing as the battery ages.

    “We use approximations in many of our calculations to make the computational load lighter,” Messina said, “but what if we could afford to use the more accurate — but more computationally expensive — methods?”

    In addition, Messina said that one of the project’s goals is to boost U.S. industry, so the Exascale Computing Project will be working with companies to make sure the project is in step with their goals and needs.

    Messina spoke further on the four areas where the project will focus its efforts.

    Applications

    The applications software to tackle these larger computing challenges will often evolve from current codes, but will need substantial work, Messina said.

    First, simulating more challenging problems will require some brand-new methods and algorithms. Second, the architectures of these new computers will be different from the ones we have today, so to be able to use existing codes effectively, the codes will have to be modified. This is a daunting task for many of the teams that use scientific supercomputers today.

    “These are huge, complex applications, often with literally millions of lines of code,” Messina said. “Maybe they took the team 500 person-years to write, and now you need to modify it to take advantage of new architectures, or even translate it into a different programming language.”

    The project will support teams that can provide the people-power to tackle a number of applications of interest, he said. For example, data-intensive calculations are expected to be increasingly important and will require new software and hardware features.

    The goal is to have “mission-critical” applications to be ready when the first exascale systems are deployed, Messina said.

    The teams will also identify both what new supporting software is needed and ways that the hardware design could be improved to work with that software before the computers themselves are ever built. This “co-design” element is central to reaching the full potential of exascale, he said.

    Software stack

    “The software ecosystem will need to evolve both to support new functionality demanded by applications and to use new hardware features efficiently,” Messina said.

    The project will enhance the software stack that DOE Office of Science and NNSA applications rely on and evolve it for exascale, as well as conduct R&D on tools and methods to boost productivity and portability between systems.

    For example, many tasks are the same from scientific application to application and are embodied as elements of software libraries. Teams writing new code use the libraries for efficiency — “so you don’t have to be an expert in every single thing,” Messina explained.

    “Thus, improving libraries that do numerical tasks or visualizations, data analytics and program languages, for example, would benefit many different users,” he said.

    Teams working on these components will work closely with the applications taskforce, he said. “We’ll need good communication between these teams so everyone knows what’s needed and how to use the tools provided.”

    In addition, as researchers are able to get more and more data from experiments, they’ll need software infrastructure to deal with that data more effectively.

    Hardware technology

    While the computers themselves are massive, they aren’t a big part of the commercial market.

    “Scientific computers are a niche market, so we make our own specs to get the best results for computational science applications,” Messina said. “That’s what we do with most of our scientific supercomputers, including here at Argonne when we collaborated with IBM and Lawrence Livermore National Laboratory on the design of Mira, and we believe it really paid off.”

    For example, companies are used to building huge banks of servers for business computing applications, for which it’s not usually important for one cabinet’s worth of chips to be able to talk to another one; “For us, it matters a lot,” he said.

    This segment will work with computer vendors and hardware technology providers to accelerate the development of particular features for scientific and engineering applications — not just those DOE is interested in, but also priorities for other federal agencies, academia and industry, Messina said.

    Prepping exascale sites

    Supercomputers need very special accommodations — you can’t stick one just anywhere. They need a good deal of electricity and cooling infrastructure; they take up a fair amount of square footage, and all of the flooring needs to be reinforced. This effort will work to develop sites for computers with this kind of footprint.

    The Exascale Computing Project is a complex project with many stakeholders and moving parts, Messina said. “The challenge will be to effectively coordinate activities in many different sites in a relatively short time frame — but the rewards are clear.”

    The project will be jointly funded by the U.S. Department of Energy’s Office of Science and the National Nuclear Security Administration’s Office of Defense Programs.

    See the full article here.

    Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science. For more visit http://www.anl.gov.

    The Advanced Photon Source at Argonne National Laboratory is one of five national synchrotron radiation light sources supported by the U.S. Department of Energy’s Office of Science to carry out applied and basic research to understand, predict, and ultimately control matter and energy at the electronic, atomic, and molecular levels, provide the foundations for new energy technologies, and support DOE missions in energy, environment, and national security. To learn more about the Office of Science X-ray user facilities, visit http://science.energy.gov/user-facilities/basic-energy-sciences/.

    Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science

    Argonne Lab Campus
