Tagged: Exascale computing

  • richardmitnick 7:58 pm on June 30, 2022 Permalink | Reply
Tags: "ExaSMR Models Small Modular Reactors Throughout Their Operational Lifetime", Exascale computing, The DOE’s Exascale Computing Project

    From The DOE’s Exascale Computing Project: “ExaSMR Models Small Modular Reactors Throughout Their Operational Lifetime” 

    From The DOE’s Exascale Computing Project

    June 8, 2022 [Just now in social media.]
    Rob Farber

    Technical Introduction

    Small modular reactors (SMRs) are advanced nuclear reactors that can be incrementally added to a power grid to provide carbon-free energy generation to match increasing energy demand.[1],[2] Their small size and modular design make them a more affordable option because they can be factory assembled and transported to an installation site as prefabricated units.

Compared to existing nuclear reactors, proposed SMR designs are generally simpler and require no human intervention, external power, or application of external force to shut down. SMRs are designed to rely on passive systems that utilize physical phenomena, such as natural circulation, convection, gravity, and self-pressurization, to eliminate or significantly lower the potential for unsafe releases of radioactivity in case of an accident.[3] Computer models are used to ensure that the SMR passive systems can safely operate the reactor regardless of the reactor’s operational mode—be it at idle, during startup, or running at full power.

    Current advanced reactor design approaches leverage decades of experimental and operational experience with the US nuclear fleet and are informed by calibrated numerical models of reactor phenomena. The exascale SMR (ExaSMR) project generates datasets of virtual reactor design simulations based on high-fidelity, coupled physics models for reactor phenomena that are truly predictive and reflect as much ground truth as experimental and operational reactor data.[4]

    An Integrated Toolkit

The Exascale Computing Project’s (ECP’s) ExaSMR team is working to build a highly accurate, exascale-capable integrated toolkit that couples high-fidelity neutronics and computational fluid dynamics (CFD) codes to model the operational behavior of SMRs over the complete reactor lifetime. This includes accurately modeling the full-core multiphase thermal hydraulics and the fuel depletion. Even with exascale performance, reduced-order mesh numerical methodologies are required to achieve sufficient accuracy within reasonable runtimes and make these simulations tractable.

    According to Steven Hamilton (Figure 2), a senior researcher at The DOE’s Oak Ridge National Laboratory (ORNL) and PI of the project, ExaSMR integrates the most reliable and high-confidence numerical methods for modeling operational reactors.

    Specifically, ExaSMR is designed to leverage exascale systems to accurately and efficiently model the reactor’s neutron state with Monte Carlo (MC) neutronics and the reactor’s thermal fluid heat transfer efficiency with high-resolution CFD.[5] The ExaSMR team’s goal is to achieve very high spatial accuracy using models that contain 40 million spatial elements and exhibit 22 billion degrees of freedom.[6]
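The Monte Carlo approach estimates reactor quantities by simulating many independent random particle histories. As an illustrative sketch only—not ExaSMR’s production neutronics code—here is a toy 1D slab-transmission model; the function name and parameters are hypothetical:

```python
import math
import random

def mc_transmission(sigma_t, sigma_a, thickness, n_particles, seed=1):
    """Toy 1D Monte Carlo: fraction of neutrons crossing a slab.

    sigma_t: total macroscopic cross section (1/cm)
    sigma_a: absorption cross section (1/cm); the remainder scatters
    """
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_particles):
        x, direction = 0.0, 1.0              # enter at left face, moving right
        while True:
            # distance to the next collision is exponentially distributed
            x += direction * (-math.log(rng.random()) / sigma_t)
            if x >= thickness:
                transmitted += 1             # escaped through the far face
                break
            if x < 0.0:
                break                        # leaked back out the near face
            if rng.random() < sigma_a / sigma_t:
                break                        # absorbed at the collision site
            direction = rng.choice((-1.0, 1.0))  # isotropic scatter (1D)
    return transmitted / n_particles
```

For a purely absorbing slab the estimate converges to exp(-sigma_t × thickness), which gives a quick sanity check. Production reactor Monte Carlo codes track energy-dependent cross sections over tens of millions of spatial elements, which is what drives the exascale requirement.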

    Hamilton notes that high-resolution models are essential because they are used to reflect the presence of spacer grids and the complex mixing promoted by mixing vanes (or the equivalent) in the reactor. The complex fluid flows around these regions in the reactor (Figure 1) require high spatial resolution so engineers can understand the neutron distribution and the reactor’s thermal heat transfer efficiency. Of particular interest is the behavior of the reactor during low-power conditions as well as the initiation of coolant flow circulation through the SMR reactor core and its primary heat exchanger during startup.

    Figure 1. Complex fluid flows and momentum cause swirling.

To make the simulations run in reasonable times even when using an exascale supercomputer, the results of the high-accuracy model are adapted so they can be utilized in a reduced-order methodology. This methodology is based on momentum sources that can mimic the mixing caused by the vanes in the reactor.[7] Hamilton notes, “Essentially, we use the full core simulation on a small model that is replicated over the reactor by mapping to a coarser mesh. This coarser mesh eliminates the time-consuming complexity of the mixing vane calculations while still providing an accurate-enough representation for the overall model.” The data from the resulting virtual reactor simulations are used to fill in critical gaps in experimental and operational reactor data. These results give engineers the ability to accelerate the currently cumbersome advanced reactor concept-to-design-to-build cycle that has constrained the nuclear energy industry for decades. ExaSMR can also provide an avenue for validating existing industry design and regulatory tools.[8]
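The mapping Hamilton describes can be pictured as a simple restriction operator: fine-mesh values are averaged into coarse-mesh cells. This is only a schematic illustration of the idea, not ExaSMR’s actual mapping routine; the function below is hypothetical:

```python
def restrict_to_coarse(fine_field, factor):
    """Block-average a 1D fine-mesh field onto a mesh `factor` times coarser.

    Each coarse cell takes the mean of the `factor` fine cells it covers,
    discarding sub-cell detail (e.g., resolved mixing-vane structure) while
    preserving cell-averaged quantities.
    """
    if len(fine_field) % factor != 0:
        raise ValueError("fine mesh size must be a multiple of the factor")
    return [
        sum(fine_field[i:i + factor]) / factor
        for i in range(0, len(fine_field), factor)
    ]
```

For example, restricting the field [1, 1, 3, 3] by a factor of 2 yields [1.0, 3.0]: the coarse mesh keeps the cell-averaged behavior while the expensive sub-cell structure drops out.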

    Figure 2. Steven Hamilton, PI of the ExaSMR project and Senior researcher at ORNL.

    “The importance,” Hamilton states, “is that many different designs are being studied for next-generation reactors. Investing in computer design capability means we can better evaluate and refine the designs to come up with the most efficacious solutions. Exascale supercomputers give us a tool to model SMRs with higher resolution than possible on smaller supercomputers. These resolution improvements make our simulations more predictive of the phenomena we are modeling. We are already seeing significant improvements now on pre-exascale systems and expect a similar jump in performance once we are running on the actual exascale hardware.” He concludes by noting, “Many scientists believe that nuclear is the only carbon-free energy source that is suitable for bulk deployment to meet primary energy needs with a climate-friendly technology.”

    The First Full-Core, Pin-Resolved CFD Simulations

    To achieve their goal of generating high-fidelity, coupled-physics models for truly predictive reactor models, the team must overcome limitations in computing power that have constrained past efforts to modeling only specific regions of a reactor core.[9] To this end, the ExaSMR team has adapted their algorithms and code to run on GPUs to realize an orders-of-magnitude increase in performance when running a challenge problem on the pre-exascale Summit supercomputer.

    Hamilton explains, “We were able to perform the simulations between 170× and 200× faster on the Summit supercomputer compared to the previous Titan ORNL supercomputer.

    Much of this is owed to ECP’s investment in the ExaSMR project and the Center for Efficient Exascale Discretizations (CEED) along with larger, higher performance GPU hardware. The CEED project has been instrumental for improving the algorithms we used in this simulation.”

    In demonstrating this new high watermark in performance, the team also performed (to their knowledge) the first ever full-core, pin-resolved CFD simulation that modeled coolant flow around the fuel pins in a light water reactor core. These fluid flows play a critical role in determining the reactor’s safety and performance. Hamilton notes, “This full core spacer grids and the mixing vanes (SGMV) simulation provides a high degree of spatial resolution that allows simultaneous capture of local and global effects. Capturing the effect of mixing vanes on flow and heat transfer is vital to predictive simulations.”

    The complexity of these flows can be seen in streamlines in Figure 1. Note the transition from parallel to rotating flows caused by simulation of the CFD momentum sources.

    A Two-Step Approach to Large-Scale Simulations

    A two-step approach was taken to implement a GPU-oriented CFD code using Reynolds-Averaged Navier-Stokes (RANS) equations to model the behavior in this SGMV challenge problem.

1. Small simulations are performed using the more accurate yet computationally expensive large eddy simulation (LES) code. Hamilton notes these are comparatively small and do not need to be performed on the supercomputer.
2. The accurate LES results are then imposed on a coarser mesh, which is used for modeling the turbulent flow at scale on the supercomputer’s GPUs. The RANS approach is needed because the Reynolds number in the core is expected to be high.[10]

    Jun Fang, an author of the study in which these results were published, reflects on the importance of these pre-exascale results by observing, “As we advance toward exascale computing, we will see more opportunities to reveal large-scale dynamics of these complex structures in regimes that were previously inaccessible, thereby giving us real information that can reshape how we approach the challenges in reactor designs.”[11]

The basis for this optimism is reflected in the strong scaling behavior of NekRS, a GPU-enabled branch of the Nek5000 CFD code contributed by the ExaSMR team.[12] NekRS utilizes optimized finite-element flow solver kernels from the libParanumal library developed by CEED. The ExaSMR code is portable owing in part to the team’s use of the ECP-supported exascale-capable OCCA performance portability library. The OCCA library provides programmers with the ability to write portable kernels that can run on a variety of hardware platforms or be translated to backend-specific code such as OpenCL and CUDA.

    Figure 3. NekRS strong scaling on Summit.

    Development of Novel Momentum Sources to Model Auxiliary Structures in the Core

    Even with the considerable computational capability of exascale hardware, the team was forced to develop a reduced-order methodology that mimics the mixing of the vanes to make the full core simulation tractable. “This methodology,” Hamilton notes, “allows the impact of mixing vanes on flow to be captured without requiring an explicit model of vanes. The objective is to model the fluid flow without the need of an expensive body-fitted mesh.” Instead, as noted in the paper, “The effects of spacer grid, mixing vanes, springs, dimples, and guidance/maintaining vanes are taken into account in the form of momentum sources and pressure drop.”[13]
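The momentum-source idea can be illustrated on a drastically simplified 1D momentum equation: instead of meshing the vane geometry, a volumetric source term S(x) is added where the vanes would be. This is a hypothetical sketch of the concept, not the NekRS implementation:

```python
def step_velocity(u, dt, nu, dx, source):
    """One explicit step of du/dt = nu * d2u/dx2 + S(x) on a periodic 1D grid.

    The `source` array stands in for unresolved geometry (spacer grids,
    mixing vanes): it injects momentum without a body-fitted mesh.
    """
    n = len(u)
    return [
        u[i] + dt * (nu * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2
                     + source[i])
        for i in range(n)
    ]
```

With a zero source a uniform flow is unchanged, while a nonzero source accelerates the flow exactly where the vanes would act—the calibration step described below then tunes that source against detailed LES data.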

    Validation of the Challenge Results

To ensure adequate accuracy of the reduced-order methodology, the momentum sources are carefully calibrated by the team against detailed LES of spacer grids performed with Nek5000.[14] Nek5000 was chosen because it is a trusted reference in the literature.

    “The combination of RANS (full core) and LES,” the team wrote in their paper, “forms a flexible strategy that balances both efficiency and the accuracy.” Furthermore, “Continuous validation and verification studies have been conducted over years for Nek5000 for various geometries of interest to nuclear engineers, including the rod bundles with spacer grid and mixing vanes.”[15]

    Expanding on the text in the paper, Hamilton points out that “the momentum source method (MSM) was implemented in NekRS using the same approach developed in Nek5000, thereby leveraging as much as possible the same routines.”

    Validation of the simulation results includes the demonstration of the momentum sources shown in Figure 1 as well as validation of the pressure drop. Both are discussed in detail in the team’s peer-reviewed paper, which includes a numerical quantification of results by various figures of merit. Based on the success reflected in the validation metrics, the team concludes that they “clearly demonstrated that the RANS momentum sources developed can successfully reproduce the time-averaged macroscale flow physics revealed by the high-fidelity LES reference.”[16]

The Groundwork Has Been Laid to Expand the Computational Domain

Improved software, GPU acceleration, and reduced-order mesh numerical methodologies have laid the groundwork for further development of the integrated ExaSMR toolkit. In combination with operational exascale hardware, the ExaSMR team can expand their capabilities to simulate and study the system-level neutronics and thermal–hydraulics behavior of these small reactors.

The implications are significant because the passive design and ease of installation mean that SMRs offer a way for the United States and the world to meet essential carbon-neutral climate goals while also addressing the need to augment existing electricity generation capacity.

    This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the US Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

    [1] https://www.iaea.org/newscenter/news/what-are-small-modular-reactors-smrs

    [2] https://www.energy.gov/ne/articles/4-key-benefits-advanced-small-modular-reactors

    [3] https://www.iaea.org/newscenter/news/what-are-small-modular-reactors-smrs

    [4] https://www.ornl.gov/project/exasmr-coupled-monte-carlo-neutronics-and-fluid-flow-simulation-small-modular-reactors

    [5] https://www.ornl.gov/project/exasmr-coupled-monte-carlo-neutronics-and-fluid-flow-simulation-small-modular-reactors

    [6] https://www.exascaleproject.org/research-project/exasmr/

    [7] https://www.sciencedirect.com/science/article/abs/pii/S0029549321000959?via%3Dihub

    [8] https://www.exascaleproject.org/research-project/exasmr/

    [9] https://www.ans.org/news/article-2968/argonneled-team-models-fluid-dynamics-of-entire-smr-core/

    [10] https://www.sciencedirect.com/science/article/abs/pii/S0029549321000959?via%3Dihub

    [11] https://www.ans.org/news/article-2968/argonneled-team-models-fluid-dynamics-of-entire-smr-core/

    [12] https://www.exascaleproject.org/research-project/exasmr/

    [13] https://www.sciencedirect.com/science/article/abs/pii/S0029549321000959?via%3Dihub

    [14] https://www.sciencedirect.com/science/article/abs/pii/S0029549321000959?via%3Dihub

    [15] https://www.sciencedirect.com/science/article/abs/pii/S0029549321000959?via%3Dihub

    [16] https://www.osti.gov/biblio/1837194-feasibility-full-core-pin-resolved-cfd-simulations-small-modular-reactor-momentum-sources

    See the full article here.


    Please help promote STEM in your local schools.

STEM Education Coalition

    About The DOE’s Exascale Computing Project
The ECP is a collaborative effort of two DOE organizations – the DOE’s Office of Science and the DOE’s National Nuclear Security Administration. As part of the National Strategic Computing Initiative, ECP was established to accelerate delivery of a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures, and workforce development to meet the scientific and national security mission needs of DOE in the early-2020s time frame.

    About the Office of Science

    The DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit https://science.energy.gov/.

    About The NNSA

    Established by Congress in 2000, NNSA is a semi-autonomous agency within the DOE responsible for enhancing national security through the military application of nuclear science. NNSA maintains and enhances the safety, security, and effectiveness of the U.S. nuclear weapons stockpile without nuclear explosive testing; works to reduce the global danger from weapons of mass destruction; provides the U.S. Navy with safe and effective nuclear propulsion; and responds to nuclear and radiological emergencies in the United States and abroad. https://nnsa.energy.gov

    The Goal of ECP’s Application Development focus area is to deliver a broad array of comprehensive science-based computational applications that effectively utilize exascale HPC technology to provide breakthrough simulation and data analytic solutions for scientific discovery, energy assurance, economic competitiveness, health enhancement, and national security.

Awareness of ECP and its mission is growing and resonating—and for good reason. ECP is an incredible effort focused on advancing areas of key importance to our country: economic competitiveness, breakthrough science and technology, and national security. And, fortunately, ECP has a foundation that bodes extremely well for the prospects of its success, with the demonstrably strong commitment of the US Department of Energy (DOE) and the talent of some of America’s best and brightest researchers.

ECP is composed of about 100 small teams of domain, computer, and computational scientists, and mathematicians from DOE labs, universities, and industry. We are tasked with building applications that will execute well on exascale systems, enabled by a robust exascale software stack, and supporting necessary vendor R&D to ensure the compute nodes and hardware infrastructure are adept and able to do the science that needs to be done with the first exascale platforms.

  • richardmitnick 4:10 pm on May 30, 2022 Permalink | Reply
Tags: "Frontier supercomputer debuts as world’s fastest-breaking exascale barrier", Exascale computing

    From The DOE’s Oak Ridge National Laboratory: “Frontier supercomputer debuts as world’s fastest-breaking exascale barrier” 

    From The DOE’s Oak Ridge National Laboratory

    May 30, 2022

    Media Contacts:

    Sara Shoemaker

    Secondary Media Contact
    Katie Bethea
    Oak Ridge Leadership Computing Facility

    Frontier: The World’s First Exascale Supercomputer Has Arrived

    The Frontier supercomputer [below] at the Department of Energy’s Oak Ridge National Laboratory earned the top ranking today as the world’s fastest on the 59th TOP500 list, with 1.1 exaflops of performance. The system is the first to achieve an unprecedented level of computing performance known as exascale, a threshold of a quintillion calculations per second.

    Frontier features a theoretical peak performance of 2 exaflops, or two quintillion calculations per second, making it ten times more powerful than ORNL’s Summit system [below]. The system leverages ORNL’s extensive expertise in accelerated computing and will enable scientists to develop critically needed technologies for the country’s energy, economic and national security, helping researchers address problems of national importance that were impossible to solve just five years ago.

    “Frontier is ushering in a new era of exascale computing to solve the world’s biggest scientific challenges,” ORNL Director Thomas Zacharia said. “This milestone offers just a preview of Frontier’s unmatched capability as a tool for scientific discovery. It is the result of more than a decade of collaboration among the national laboratories, academia and private industry, including DOE’s Exascale Computing Project, which is deploying the applications, software technologies, hardware and integration necessary to ensure impact at the exascale.”

    Rankings were announced at the International Supercomputing Conference 2022 in Hamburg, Germany, which gathers leaders from around the world in the field of high-performance computing, or HPC. Frontier’s speeds surpassed those of any other supercomputer in the world, including ORNL’s Summit, which is also housed at ORNL’s Oak Ridge Leadership Computing Facility, a DOE Office of Science user facility.

Frontier, an HPE Cray EX supercomputer, also claimed the number one spot on the Green500 list, which rates energy use and efficiency by commercially available supercomputing systems, with 62.68 gigaflops of performance per watt. Frontier rounded out the twice-yearly rankings with the top spot in a newer category, mixed-precision computing, that rates performance in formats commonly used for artificial intelligence, with a performance of 6.88 exaflops.
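Taken together, the TOP500 score and the Green500 efficiency pin down the machine’s power draw during the benchmark; the back-of-envelope conversion is simple arithmetic:

```python
def hpl_power_megawatts(rmax_exaflops, gflops_per_watt):
    """Average power implied by an HPL score and a Green500 efficiency."""
    watts = (rmax_exaflops * 1e18) / (gflops_per_watt * 1e9)
    return watts / 1e6

# Frontier's debut figures: 1.1 exaflops at 62.68 gigaflops per watt
frontier_mw = hpl_power_megawatts(1.1, 62.68)   # roughly 17.5 MW
```

In other words, the Green500-winning efficiency implies Frontier averaged on the order of 17–18 megawatts during its record HPL run.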

    The work to deliver, install and test Frontier began during the COVID-19 pandemic, as shutdowns around the world strained international supply chains. More than 100 members of a public-private team worked around the clock, from sourcing millions of components to ensuring deliveries of system parts on deadline to carefully installing and testing 74 HPE Cray EX supercomputer cabinets, which include more than 9,400 AMD-powered nodes and 90 miles of networking cables.

    “When researchers gain access to the fully operational Frontier system later this year, it will mark the culmination of work that began over three years ago involving hundreds of talented people across the Department of Energy and our industry partners at HPE and AMD,” ORNL Associate Lab Director for computing and computational sciences Jeff Nichols said. “Scientists and engineers from around the world will put these extraordinary computing speeds to work to solve some of the most challenging questions of our era, and many will begin their exploration on Day One.”


    Frontier’s overall performance of 1.1 exaflops translates to more than one quintillion floating point operations per second, or flops, as measured by the High-Performance Linpack Benchmark test. Each flop represents a possible calculation, such as addition, subtraction, multiplication or division.
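A quintillion calculations per second is hard to picture; one common way to ground the number (the eight-billion population figure is an assumption for illustration):

```python
EXA = 10**18                     # 1 exaflop = 10^18 calculations per second

def years_of_human_effort(people=8_000_000_000, calcs_per_person_per_sec=1.0):
    """Years everyone on Earth would need, at one calculation per second
    each, to match what a 1-exaflop machine does in a single second."""
    seconds_needed = EXA / (people * calcs_per_person_per_sec)
    return seconds_needed / (365.25 * 24 * 3600)
```

With roughly eight billion people each computing one sum per second, matching a single machine-second of Frontier’s measured performance would take about four years of round-the-clock effort.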

    Frontier’s early performance on the Linpack benchmark amounts to more than seven times that of Summit at 148.6 petaflops. Summit continues as an impressive, highly ranked workhorse machine for open science, listed at number four on the TOP500.

    Frontier’s mixed-precision computing performance clocked in at roughly 6.88 exaflops, or more than 6.8 quintillion flops per second, as measured by the High-Performance Linpack-Accelerator Introspection, or HPL-AI, test. The HPL-AI test measures calculation speeds in the computing formats typically used by the machine-learning methods that drive advances in artificial intelligence.

    Detailed simulations relied on by traditional HPC users to model such phenomena as cancer cells, supernovas, the coronavirus or the atomic structure of elements require 64-bit precision, a computationally demanding form of computing accuracy. Machine-learning algorithms typically require much less precision — sometimes as little as 32-, 24- or 16-bit accuracy — and can take advantage of special hardware in the graphic processing units, or GPUs, relied on by machines like Frontier to reach even faster speeds.
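The precision trade-off is easy to demonstrate with Python’s struct module, which can round-trip a value through 64-, 32-, and 16-bit floating-point encodings:

```python
import struct

def round_trip(value, fmt):
    """Store a float in a packed binary format and read it back:
    'd' = 64-bit double, 'f' = 32-bit single, 'e' = 16-bit half."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

pi = 3.141592653589793
pi64 = round_trip(pi, "d")   # exact: Python floats are already 64-bit
pi32 = round_trip(pi, "f")   # about 7 significant digits survive
pi16 = round_trip(pi, "e")   # about 3 significant digits survive
```

Lower-precision values need less memory traffic and map onto the GPUs’ fast matrix units, which is why HPL-AI scores run so far ahead of the classic 64-bit HPL result.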

ORNL and its partners continue to execute the bring-up of Frontier on schedule. Next steps include continued testing and validation of the system, which remains on track for final acceptance and early science access later in 2022 and will open for full science at the beginning of 2023.

    Credit: Laddy Fields/ORNL, U.S. Dept. of Energy.


    The Frontier supercomputer’s exascale performance is enabled by some of the world’s most advanced pieces of technology from HPE and AMD:

Frontier has 74 HPE Cray EX supercomputer cabinets, purpose-built to support next-generation supercomputing performance and scale.

    Each node contains one optimized EPYC™ processor and four AMD Instinct™ accelerators, for a total of more than 9,400 CPUs and more than 37,000 GPUs in the entire system. These nodes provide developers with easier capabilities to program their applications, due to the coherency enabled by the EPYC processors and Instinct accelerators.

HPE Slingshot, the world’s only high-performance Ethernet fabric designed for next-generation HPC and AI solutions, handles larger, data-intensive workloads and addresses demands for higher speed and congestion control so that applications run smoothly and performance improves.

An I/O subsystem from HPE will come online this year to support Frontier and the OLCF. The I/O subsystem features an in-system storage layer and Orion, a Lustre-based enhanced center-wide file system that is also the world’s largest and fastest single parallel file system, based on the Cray ClusterStor E1000 storage system. The in-system storage layer will employ compute-node local storage devices connected via PCIe Gen4 links to provide peak read speeds of more than 75 terabytes per second, peak write speeds of more than 35 terabytes per second, and more than 15 billion random-read input/output operations per second. The Orion center-wide file system will provide around 700 petabytes of storage capacity and peak write speeds of 5 terabytes per second.

    As a next-generation supercomputing system and the world’s fastest for open science, Frontier is also energy-efficient, due to its liquid-cooled capabilities. This cooling system promotes a quieter data center by removing the need for a noisier, air-cooled system.

See the full article here.



    Established in 1942, The DOE’s Oak Ridge National Laboratory is the largest science and energy national laboratory in the Department of Energy system (by size) and third largest by annual budget. It is located in the Roane County section of Oak Ridge, Tennessee. Its scientific programs focus on materials, neutron science, energy, high-performance computing, systems biology and national security, sometimes in partnership with the state of Tennessee, universities and other industries.

ORNL has several of the world’s top supercomputers, including Summit [below], which the TOP500 ranks among Earth’s most powerful.

ORNL OLCF IBM AC922 Summit supercomputer, formerly No. 1 on the TOP500.

    The lab is a leading neutron and nuclear power research facility that includes the Spallation Neutron Source and High Flux Isotope Reactor.

    ORNL Spallation Neutron Source annotated.

    It hosts the Center for Nanophase Materials Sciences, the BioEnergy Science Center, and the Consortium for Advanced Simulation of Light Water Nuclear Reactors.

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.

    Areas of research

    ORNL conducts research and development activities that span a wide range of scientific disciplines. Many research areas have a significant overlap with each other; researchers often work in two or more of the fields listed here. The laboratory’s major research areas are described briefly below.

    Chemical sciences – ORNL conducts both fundamental and applied research in a number of areas, including catalysis, surface science and interfacial chemistry; molecular transformations and fuel chemistry; heavy element chemistry and radioactive materials characterization; aqueous solution chemistry and geochemistry; mass spectrometry and laser spectroscopy; separations chemistry; materials chemistry including synthesis and characterization of polymers and other soft materials; chemical biosciences; and neutron science.
    Electron microscopy – ORNL’s electron microscopy program investigates key issues in condensed matter, materials, chemical and nanosciences.
    Nuclear medicine – The laboratory’s nuclear medicine research is focused on the development of improved reactor production and processing methods to provide medical radioisotopes, the development of new radionuclide generator systems, the design and evaluation of new radiopharmaceuticals for applications in nuclear medicine and oncology.
    Physics – Physics research at ORNL is focused primarily on studies of the fundamental properties of matter at the atomic, nuclear, and subnuclear levels and the development of experimental devices in support of these studies.
Population – ORNL provides federal, state and international organizations with a gridded population database, called LandScan, for estimating ambient population. LandScan is a raster image, or grid, of population counts that provides human population estimates every 30 x 30 arc seconds, which translates roughly to population estimates for 1-kilometer-square windows, or grid cells, at the equator, with cell width decreasing at higher latitudes. Though many population datasets exist, LandScan is the best spatial population dataset that also covers the globe. Updated annually (although data releases are generally one year behind the current year), it offers continuous, updated values of population based on the most recent information. LandScan data are accessible through GIS applications and a USAID public domain application called Population Explorer.
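The latitude dependence of the grid-cell size follows from spherical geometry; a quick sketch, using the standard approximation that one degree of longitude spans about 111.32 km at the equator:

```python
import math

def cell_width_km(latitude_deg, arc_seconds=30.0):
    """Approximate east-west width of a 30-arc-second grid cell.

    One degree of longitude spans about 111.32 km at the equator and
    shrinks with the cosine of latitude (spherical-Earth approximation).
    """
    degrees = arc_seconds / 3600.0
    return 111.32 * degrees * math.cos(math.radians(latitude_deg))
```

At the equator a cell is about 0.93 km wide (hence the “roughly 1 kilometer square” description), while at 60° latitude it narrows to about 0.46 km.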

  • richardmitnick 2:10 pm on September 28, 2021 Permalink | Reply
    Tags: "The co-evolution of particle physics and computing", , , Exascale computing, , , ,   

    From Symmetry: “The co-evolution of particle physics and computing” 

    Symmetry Mag

    From Symmetry

    Stephanie Melchor

    Illustration by Sandbox Studio, Chicago with Ariel Davis.

    Over time, particle physics and astrophysics and computing have built upon one another’s successes. That co-evolution continues today.

    In the mid-twentieth century, particle physicists were peering deeper into the history and makeup of the universe than ever before. Over time, their calculations became too complex to fit on a blackboard—or to farm out to armies of human “computers” doing calculations by hand.

    To deal with this, they developed some of the world’s earliest electronic computers.

    Physics has played an important role in the history of computing. The transistor—the switch that controls the flow of electrical signal within a computer—was invented by a group of physicists at Bell Labs. The incredible computational demands of particle physics and astrophysics experiments have consistently pushed the boundaries of what is possible. They have encouraged the development of new technologies to handle tasks from dealing with avalanches of data to simulating interactions on the scales of both the cosmos and the quantum realm.

    But this influence doesn’t just go one way. Computing plays an essential role in particle physics and astrophysics as well. As computing has grown increasingly more sophisticated, its own progress has enabled new scientific discoveries and breakthroughs.

    Illustration by Sandbox Studio, Chicago with Ariel Davis.

    Managing an onslaught of data

    In 1973, scientists at DOE’s Fermi National Accelerator Laboratory (US) in Illinois got their first big mainframe computer: a 7-year-old hand-me-down from DOE’s Lawrence Berkeley National Laboratory (US). Called the CDC 6600, it weighed about 6 tons. Over the next five years, Fermilab added five more large mainframe computers to its collection.

    Then came the completion of the Tevatron—at the time, the world’s highest-energy particle accelerator—which would provide the particle beams for numerous experiments at the lab.


    FNAL/Tevatron map

    Tevatron Accelerator


    FNAL/Tevatron CDF detector

    FNAL/Tevatron DØ detector


    By the mid-1990s, two four-story particle detectors would begin selecting, storing and analyzing data from millions of particle collisions at the Tevatron per second. Called the Collider Detector at Fermilab and the DØ detector, these new experiments threatened to overpower the lab’s computational abilities.

    In December of 1983, a committee of physicists and computer scientists released a 103-page report highlighting the “urgent need for an upgrading of the laboratory’s computer facilities.” The report said the lab “should continue the process of catching up” in terms of computing ability, and that “this should remain the laboratory’s top computing priority for the next few years.”

    Instead of simply buying more large computers (which were incredibly expensive), the committee suggested a new approach: They recommended increasing computational power by distributing the burden over clusters or “farms” of hundreds of smaller computers.

    Thanks to Intel’s 1971 development of a new commercially available microprocessor the size of a domino, computers were shrinking. Fermilab was one of the first national labs to try the concept of clustering these smaller computers together, treating each particle collision as a computationally independent event that could be analyzed on its own processor.
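    In modern terms, the committee’s insight is that collision events are “embarrassingly parallel”: no event depends on any other, so each can be handed to a separate worker with no coordination. A minimal Python sketch of the idea (the event format and the analysis function are invented for illustration):

```python
from multiprocessing import Pool

def analyze(event):
    # Hypothetical per-event analysis: total the energy deposits
    # recorded for one collision. Each event is independent, so no
    # worker ever needs to talk to another.
    return sum(event["hits"])

if __name__ == "__main__":
    # Toy stand-in for a batch of recorded collision events.
    events = [{"hits": [float(i), float(i) + 0.5]} for i in range(1000)]
    with Pool(4) as pool:  # a small "farm" of worker processes
        totals = pool.map(analyze, events)
    print(len(totals))  # one result per event
```

    The same pattern, scaled from four processes on one machine to hundreds of networked computers, is essentially what the Fermilab farms did.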

    Like many new ideas in science, it wasn’t accepted without some pushback.

    Joel Butler, a physicist at Fermilab who was on the computing committee, recalls, “There was a big fight about whether this was a good idea or a bad idea.”

    A lot of people were enchanted with the big computers, he says. They were impressive-looking and reliable, and people knew how to use them. And then along came “this swarm of little tiny devices, packaged in breadbox-sized enclosures.”

    The computers were unfamiliar, and the companies building them weren’t well-established. On top of that, it wasn’t clear how well the clustering strategy would work.

    As for Butler? “I raised my hand [at a meeting] and said, ‘Good idea’—and suddenly my entire career shifted from building detectors and beamlines to doing computing,” he chuckles.

    Not long afterward, innovation that sparked for the benefit of particle physics enabled another leap in computing. In 1989, Tim Berners-Lee, a computer scientist at European Organization for Nuclear Research [Organisation européenne pour la recherche nucléaire] [Europäische Organisation für Kernforschung](CH) [CERN], launched the World Wide Web to help CERN physicists share data with research collaborators all over the world.

    To be clear, Berners-Lee didn’t create the internet—that was already underway in the form of the ARPANET, developed by the US Department of Defense.


    But the ARPANET connected only a few hundred computers, and it was difficult to share information across machines with different operating systems.

    The web Berners-Lee created was an application that ran on the internet, like email, and started as a collection of documents connected by hyperlinks. To get around the problem of accessing files between different types of computers, he developed HTML (HyperText Markup Language), a markup language that formatted and displayed files in a web browser independent of the local computer’s operating system.

    Berners-Lee also developed the first web browser, allowing users to access files stored on the first web server (Berners-Lee’s computer at CERN).

    NCSA MOSAIC Browser


    He implemented the concept of a URL (Uniform Resource Locator), specifying how and where to access desired web pages.

    What started out as an internal project to help particle physicists share data within their institution fundamentally changed not just computing, but how most people experience the digital world today.

    Back at Fermilab, cluster computing wound up working well for handling the Tevatron data. Eventually, it became industry standard for tech giants like Google and Amazon.

    Over the next decade, other US national laboratories adopted the idea, too. DOE’s SLAC National Accelerator Laboratory (US)—then called Stanford Linear Accelerator Center—transitioned from big mainframes to clusters of smaller computers to prepare for its own extremely data-hungry experiment, BaBar.

    SLAC National Accelerator Laboratory(US) BaBar

    Both SLAC and Fermilab also were early adopters of Berners-Lee’s web server. The labs set up the first two websites in the United States, paving the way for this innovation to spread across the continent.

    In 1989, in recognition of the growing importance of computing in physics, Fermilab Director John Peoples elevated the computing department to a full-fledged division. The head of a division reports directly to the lab director, making it easier to get resources and set priorities. Physicist Tom Nash formed the new Computing Division, along with Butler and two other scientists, Irwin Gaines and Victoria White. Butler led the division from 1994 to 1998.

    High-performance computing in particle physics and astrophysics

    These computational systems worked well for particle physicists for a long time, says Berkeley Lab astrophysicist Peter Nugent. That is, until Moore’s Law started grinding to a halt.

    Moore’s Law is the idea that the number of transistors in a circuit will double, making computers faster and cheaper, every two years. The term was first coined in the mid-1970s, and the trend reliably proceeded for decades. But now, computer manufacturers are starting to hit the physical limit of how many tiny transistors they can cram onto a single microchip.
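    The doubling trend compounds quickly, which is why its slowdown matters so much. A quick back-of-the-envelope calculation:

```python
def transistor_growth(years, doubling_period=2.0):
    """Relative transistor count after `years`, doubling every `doubling_period` years."""
    return 2 ** (years / doubling_period)

# Two decades of Moore's Law is ten doublings: a factor of 1024.
print(transistor_growth(20))  # 1024.0
```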

    Because of this, says Nugent, particle physicists have been looking to take advantage of high-performance computing instead.

    Nugent says high-performance computing is “something more than a cluster, or a cloud-computing environment that you could get from Google or AWS, or at your local university.”

    What it typically means, he says, is that you have high-speed networking between computational nodes, allowing them to share information with each other very, very quickly. When you are computing on up to hundreds of thousands of nodes simultaneously, it massively speeds up the process.

    On a single traditional computer, he says, 100 million CPU hours translates to more than 11,000 years of continuous calculations. But for scientists using a high-performance computing facility at Berkeley Lab, DOE’s Argonne National Laboratory (US) or DOE’s Oak Ridge National Laboratory (US), 100 million hours is a typical, large allocation for one year.
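    Nugent’s arithmetic is easy to check, and it also shows why spreading the work across many nodes matters:

```python
cpu_hours = 100_000_000

# On a single core, running around the clock:
years_on_one_core = cpu_hours / (24 * 365)
print(round(years_on_one_core))  # 11416 -- "more than 11,000 years"

# Spread (ideally) across 100,000 nodes, the same allocation
# is about six weeks of wall-clock time.
days_on_cluster = cpu_hours / 100_000 / 24
print(round(days_on_cluster, 1))  # 41.7
```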

    Although astrophysicists have always relied on high-performance computing for simulating the birth of stars or modeling the evolution of the cosmos, Nugent says they are now using it for their data analysis as well.

    This includes rapid image-processing computations that have enabled the observations of several supernovae, including SN 2011fe, captured just after it began. “We found it just a few hours after it exploded, all because we were able to run these pipelines so efficiently and quickly,” Nugent says.

    According to Berkeley Lab physicist Paolo Calafiura, particle physicists also use high-performance computing for simulations—for modeling not the evolution of the cosmos, but rather what happens inside a particle detector. “Detector simulation is significantly the most computing-intensive problem that we have,” he says.

    Scientists need to evaluate multiple possibilities for what can happen when particles collide. To properly correct for detector effects when analyzing particle detector experiments, they need to simulate more data than they collect. “If you collect 1 billion collision events a year,” Calafiura says, “you want to simulate 10 billion collision events.”

    Calafiura says that right now, he’s more worried about finding a way to store all of the simulated and actual detector data than he is about producing it, but he knows that won’t last.

    “When does physics push computing?” he says. “When computing is not good enough… We see that in five years, computers will not be powerful enough for our problems, so we are pushing hard with some radically new ideas, and lots of detailed optimization work.”

    That’s why The Department of Energy’s Exascale Computing Project aims to build, in the next few years, computers capable of performing a quintillion (that is, a billion billion) operations per second. The new computers will be 1000 times faster than the current fastest computers.
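    The “quintillion” figure is just powers of ten. As a rough comparison (assuming, purely for illustration, one operation per clock cycle on an ordinary 3 GHz core):

```python
exaflop = 10**18   # one quintillion operations per second
petaflop = 10**15  # the scale of the previous generation of machines
print(exaflop // petaflop)  # 1000

# A single 3 GHz core doing one operation per cycle would need
# about a decade to match one second of exascale computing.
seconds_for_one_core = exaflop / 3e9
print(seconds_for_one_core / (3600 * 24 * 365))  # roughly 10.6 years
```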

    Depiction of ANL ALCF Cray Intel SC18 Shasta Aurora exascale supercomputer, to be built at DOE’s Argonne National Laboratory.

    The exascale computers will also be used for other applications ranging from precision medicine to climate modeling to national security.

    Machine learning and quantum computing

    Innovations in computer hardware have enabled astrophysicists to push the kinds of simulations and analyses they can do. For example, Nugent says, the introduction of graphics processing units [GPUs] has sped up astrophysicists’ ability to do calculations used in machine learning, leading to an explosive growth of machine learning in astrophysics.

    With machine learning, which uses algorithms and statistics to identify patterns in data, astrophysicists can simulate entire universes in microseconds.

    Machine learning has been important in particle physics as well, says Fermilab scientist Nhan Tran. “[Physicists] have very high-dimensional data, very complex data,” he says. “Machine learning is an optimal way to find interesting structures in that data.”

    The same way a computer can be trained to tell the difference between cats and dogs in pictures, it can learn how to identify particles from physics datasets, distinguishing between things like pions and photons.

    Tran says using computation this way can accelerate discovery. “As physicists, we’ve been able to learn a lot about particle physics and nature using non-machine-learning algorithms,” he says. “But machine learning can drastically accelerate and augment that process—and potentially provide deeper insight into the data.”
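    As a toy illustration of the classification idea (not any experiment’s actual method: the two particle classes, the two detector features, and all the values below are entirely made up), even a simple nearest-centroid classifier captures the “cats vs. dogs” flavor:

```python
import random

random.seed(0)

# Two invented detector features per particle (e.g. shower width, depth),
# drawn around made-up class centers.
def sample(label):
    center = (1.0, 2.0) if label == "photon" else (3.0, 1.0)
    return [c + random.gauss(0, 0.3) for c in center], label

train = [sample(lab) for lab in ("photon", "pion") * 100]

def centroid(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

centroids = {
    lab: centroid([x for x, l in train if l == lab])
    for lab in ("photon", "pion")
}

def classify(x):
    # Assign to the nearest class centroid (squared Euclidean distance).
    return min(centroids, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(x, centroids[lab])))

held_out = [sample(lab) for lab in ("photon", "pion") * 50]
accuracy = sum(classify(x) == lab for x, lab in held_out) / len(held_out)
print(accuracy)  # high for these well-separated toy classes
```

    Real analyses use far higher-dimensional inputs and deep neural networks, but the principle of learning class boundaries from labeled examples is the same.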

    And while teams of researchers are busy building exascale computers, others are hard at work trying to build another type of supercomputer: the quantum computer.

    Remember Moore’s Law? Previously, engineers were able to make computer chips faster by shrinking the size of electrical circuits, reducing the amount of time it takes for electrical signals to travel. “Now our technology is so good that literally the distance between transistors is the size of an atom,” Tran says. “So we can’t keep scaling down the technology and expect the same gains we’ve seen in the past.”

    To get around this, some researchers are redefining how computation works at a fundamental level—like, really fundamental.

    The basic unit of data in a classical computer is called a bit, which can hold one of two values: 1, if it has an electrical signal, or 0, if it has none. But in quantum computing, data is stored in quantum systems—things like electrons, which have either up or down spins, or photons, which are polarized either vertically or horizontally. These data units are called “qubits.”

    Here’s where it gets weird. Through a quantum property called superposition, qubits have more than just two possible states. An electron can be up, down, or in a variety of stages in between.

    What does this mean for computing? A collection of three classical bits can exist in only one of eight possible configurations: 000, 001, 010, 100, 011, 110, 101 or 111. But through superposition, three qubits can be in all eight of these configurations at once. A quantum computer can use that information to tackle problems that are impossible to solve with a classical computer.
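    The counting is easy to verify, and the superposition claim can be sketched numerically: a three-qubit state assigns an amplitude to every configuration at once (the equal superposition below is just the simplest example):

```python
from itertools import product

# The eight configurations of three classical bits:
configs = ["".join(bits) for bits in product("01", repeat=3)]
print(configs)  # ['000', '001', '010', '011', '100', '101', '110', '111']

# An equal superposition gives each configuration amplitude 1/sqrt(8),
# so each is observed with probability 1/8 -- and the probabilities sum to 1.
amplitude = 1 / 8 ** 0.5
probabilities = {c: amplitude ** 2 for c in configs}
print(sum(probabilities.values()))  # 1.0 (up to floating point)
```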

    Fermilab scientist Aaron Chou likens quantum problem-solving to throwing a pebble into a pond. The ripples move through the water in every possible direction, “simultaneously exploring all of the possible things that it might encounter.”

    In contrast, a classical computer can only move in one direction at a time.

    But this makes quantum computers faster than classical computers only when it comes to solving certain types of problems. “It’s not like you can take any classical algorithm and put it on a quantum computer and make it better,” says University of California, Santa Barbara physicist John Martinis, who helped build Google’s quantum computer.

    Although quantum computers work in a fundamentally different way than classical computers, designing and building them wouldn’t be possible without traditional computing laying the foundation, Martinis says. “We’re really piggybacking on a lot of the technology of the last 50 years or more.”

    The kinds of problems that are well suited to quantum computing are intrinsically quantum mechanical in nature, says Chou.

    For instance, Martinis says, consider quantum chemistry. Solving quantum chemistry problems with classical computers is so difficult, he says, that 10 to 15% of the world’s supercomputer usage is currently dedicated to the task. “Quantum chemistry problems are hard for the very reason why a quantum computer is powerful”—because to complete them, you have to consider all the different quantum-mechanical states of all the individual atoms involved.

    Because making better quantum computers would be so useful in physics research, and because building them requires skills and knowledge that physicists possess, physicists are ramping up their quantum efforts. In the United States, the National Quantum Initiative Act of 2018 called for The National Institute of Standards and Technology (US), The National Science Foundation (US) and The Department of Energy (US) to support programs, centers and consortia devoted to quantum information science.

    Coevolution requires cooperation

    In the early days of computational physics, the line between who was a particle physicist and who was a computer scientist could be fuzzy. Physicists used commercially available microprocessors to build custom computers for experiments. They also wrote much of their own software—ranging from printer drivers to the software that coordinated the analysis between the clustered computers.

    Nowadays, roles have somewhat shifted. Most physicists use commercially available devices and software, allowing them to focus more on the physics, Butler says. But some people, like Anshu Dubey, work right at the intersection of the two fields. Dubey is a computational scientist at DOE’s Argonne National Laboratory (US) who works with computational physicists.

    When a physicist needs to computationally interpret or model a phenomenon, sometimes they will sign up a student or postdoc in their research group for a programming course or two and then ask them to write the code to do the job. Although these codes are mathematically complex, Dubey says, they aren’t logically complex, making them relatively easy to write.

    A simulation of a single physical phenomenon can be neatly packaged within fairly straightforward code. “But the real world doesn’t want to cooperate with you in terms of its modularity and encapsularity,” she says.

    Multiple forces are always at play, so to accurately model real-world complexity, you have to use more complex software—ideally software that doesn’t become impossible to maintain as it gets updated over time. “All of a sudden,” says Dubey, “you start to require people who are creative in their own right—in terms of being able to architect software.”

    That’s where people like Dubey come in. At Argonne, Dubey develops software that researchers use to model complex multi-physics systems—incorporating processes like fluid dynamics, radiation transfer and nuclear burning.

    Hiring computer scientists for research projects in physics and other fields of science can be a challenge, Dubey says. Most funding agencies specify that research money can be used for hiring students and postdocs, but not paying for software development or hiring dedicated engineers. “There is no viable career path in academia for people whose careers are like mine,” she says.

    In an ideal world, universities would establish endowed positions for a team of research software engineers in physics departments with a nontrivial amount of computational research, Dubey says. These engineers would write reliable, well-architected code, and their institutional knowledge would stay with a team.

    Physics and computing have been closely intertwined for decades. However the two develop—toward new analyses using artificial intelligence, for example, or toward the creation of better and better quantum computers—it seems they will remain on this path together.

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Symmetry is a joint Fermilab/SLAC publication.

  • richardmitnick 11:57 am on August 26, 2021 Permalink | Reply
    Tags: "Motion detectors", , , , , Earthquake Simulation (EQSIM) project, Earthquake simulators angle to use exascale computers to detail site-specific ground movement., Exascale computing, Geotechnical Engineering, , Structural engineering, The San Francisco Bay area serves as EQSIM’s subject for testing computational models of the Hayward fault., The University of Nevada-Reno (US)   

    From DOE’s ASCR Discovery (US) : “Motion detectors” 

    From DOE’s ASCR Discovery (US)

    DOE’s Lawrence Berkeley National Laboratory (US)-led earthquake simulators angle to use exascale computers to detail site-specific ground movement.

    Models can now couple ground-shaking duration and intensity along the Hayward Fault with damage potential to skyscrapers and smaller residential and commercial buildings (red = most damaging, green = least). Image courtesy of David McCallen/Berkeley Lab.

    This research team wants to make literal earthshaking discoveries every day.

    “Earthquakes are a tremendous societal problem,” says David McCallen, a senior scientist at the U.S. Department of Energy’s Lawrence Berkeley National Laboratory who heads the Earthquake Simulation (EQSIM) project. “Whether it’s the Pacific Northwest or the Los Angeles Basin or San Francisco or the New Madrid Zone in the Midwest, they’re going to happen.”

    A part of the DOE’s Exascale Computing Project, the EQSIM collaboration comprises researchers from Berkeley Lab, DOE’s Lawrence Livermore National Laboratory and The University of Nevada-Reno (US).

    The San Francisco Bay area serves as EQSIM’s subject for testing computational models of the Hayward fault. Considered a major threat, the steadily creeping fault runs throughout the East Bay area.

    “If you go to Hayward and look at the sidewalks and the curbs, you see little offsets because the earth is creeping,” McCallen says. As the earth moves it stores strain energy in the rocks below. When that energy releases, seismic waves radiate from the fault, shaking the ground. “That’s what you feel when you feel an earthquake.”

    The Hayward fault ruptures every 140 or 150 years, on average. The last rupture came in 1868 – 153 years ago.

    Historically speaking, the Bay Area may be due for a major earthquake along the Hayward Fault. Image courtesy of Geological Survey (US).

    “Needless to say, we didn’t have modern seismic instruments measuring that rupture,” McCallen notes. “It’s a challenge having no data to try to predict what the motions will be for the next earthquake.”

    That data dearth led earth scientists to try a work-around. They assumed that data taken from earthquakes elsewhere around the world would apply to the Hayward fault.

    That helps to an extent, McCallen says. “But it’s well-recognized that earthquake motions tend to be very specific in a region and at any specific site as a result of the geologic setting.” That has prompted researchers to take a new approach: focusing on data most relevant to a specific fault like Hayward.

    “If you have no data, that’s hard to do,” McCallen says. “That’s the promise of advanced simulations: to understand the site-specific character of those motions.”

    Part of the project has advanced earthquake models’ computational workflow from start to finish. This includes syncing regional-scale models with structural ones to refine earthquake wave forms’ three-dimensional complexity as they strike buildings and infrastructure.

    “We’re coupling multiple codes to be able to do that efficiently,” McCallen says. “We’re at the phase now where those advanced algorithm developments are being finished.”

    Developing the workflow presents many challenges to ensure that every step is efficient and effective. The software tools that DOE is developing for exascale platforms have helped optimize EQSIM’s ability to store and retrieve massive datasets.

    The process includes creating a computational representation of Earth that may contain 200 billion grid points. (If those grid points were seconds, that would equal 6,400 years.) With simulations this size, McCallen says, inefficiencies become obvious immediately. “You really want to make sure that the way you set up that grid is optimized and matched closely to the natural variation of the Earth’s geologic properties.”
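    The parenthetical comparison is easy to verify:

```python
grid_points = 200 * 10**9
seconds_per_year = 3600 * 24 * 365
print(grid_points / seconds_per_year)  # about 6,342 -- roughly the 6,400 years quoted
```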

    The project’s earthquake simulations cut across three disciplines. The process starts with seismology. That covers the rupture of an earthquake fault and seismic wave propagation through highly varied rock layers. Next, the waves arrive at a building. “That tends to transition into being both a geotechnical and a structural-engineering problem,” McCallen notes. Geotechnical engineers can analyze quake-affected soils’ complex behavior near the surface. Finally, seismic waves impinge upon a building and the soil island that supports it. That’s the structural engineer’s domain.

    EQSIM researchers have already improved their geophysics code’s performance to simulate Bay Area ground motions at a regional scale. “We’re trying to get to what we refer to as higher-frequency resolution. We want to generate the ground motions that have the dynamics in them relevant to engineered structures.”

    Early simulations at 1 or 2 hertz – vibration cycles per second – couldn’t approximate the ground motions at 5 to 10 hertz that rock buildings and bridges. Using the DOE’s Oak Ridge National Laboratory’s Summit supercomputer, EQSIM has now surpassed 5 hertz for the entire Bay Area. More work remains to be done at the exascale, however, to simulate the area’s geologic structure at the 10-hertz upper end.

    Livermore’s SW4 code for 3-D seismic modeling served as EQSIM’s foundation. The team boosted the code’s speed and efficiency to optimize performance on massively parallel machines, which deploy many processors to perform multiple calculations simultaneously. Even so, an earthquake simulation can take 20 to 30 hours to complete, but the team hopes to reduce that time by harnessing the full power of exascale platforms – performing a quintillion operations a second – that DOE is completing this year at its leadership computing facilities. The first exascale systems will operate at 5 to 10 times the capability of today’s most powerful petascale systems.

    The potential payoff, McCallen says: saved lives and reduced economic loss. “We’ve been fortunate in this country in that we haven’t had a really large earthquake in a long time, but we know they’re coming. It’s inevitable.”

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    ASCR Discovery is a publication of The U.S. Department of Energy

    The United States Department of Energy (DOE)(US) is a cabinet-level department of the United States Government concerned with the United States’ policies regarding energy and safety in handling nuclear material. Its responsibilities include the nation’s nuclear weapons program; nuclear reactor production for the United States Navy; energy conservation; energy-related research; radioactive waste disposal; and domestic energy production. It also directs research in genomics; the Human Genome Project originated in a DOE initiative. DOE sponsors more research in the physical sciences than any other U.S. federal agency, the majority of which is conducted through its system of National Laboratories. The agency is led by the United States Secretary of Energy, and its headquarters are located in Southwest Washington, D.C., on Independence Avenue in the James V. Forrestal Building, named for James Forrestal, as well as in Germantown, Maryland.

    Formation and consolidation

    In 1942, during World War II, the United States started the Manhattan Project, a project to develop the atomic bomb, under the eye of the U.S. Army Corps of Engineers. After the war, in 1946, the Atomic Energy Commission (AEC) was created to control the future of the project. The Atomic Energy Act of 1946 also created the framework for the first National Laboratories. Among other nuclear projects, the AEC produced fabricated uranium fuel cores at locations such as Fernald Feed Materials Production Center in Cincinnati, Ohio. In 1974, the AEC gave way to the Nuclear Regulatory Commission, which was tasked with regulating the nuclear power industry, and the Energy Research and Development Administration, which was tasked with managing the nuclear weapons, naval reactor, and energy development programs.

    The 1973 oil crisis called attention to the need to consolidate energy policy. On August 4, 1977, President Jimmy Carter signed into law The Department of Energy Organization Act of 1977 (Pub.L. 95–91, 91 Stat. 565, enacted August 4, 1977), which created the Department of Energy (US). The new agency, which began operations on October 1, 1977, consolidated the Federal Energy Administration; the Energy Research and Development Administration; the Federal Power Commission; and programs of various other agencies. Former Secretary of Defense James Schlesinger, who served under Presidents Nixon and Ford during the Vietnam War, was appointed as the first secretary.

    President Carter created the Department of Energy with the goal of promoting energy conservation and developing alternative sources of energy. He wanted to reduce dependence on foreign oil and the use of fossil fuels. With international energy’s future uncertain for America, Carter acted quickly to have the department come into action during the first year of his presidency. This was an extremely important issue of the time, as the oil crisis was causing shortages and inflation. After the Three Mile Island accident, Carter was able to intervene with the help of the department, making changes within the Nuclear Regulatory Commission to fix its management and procedures. This was possible because nuclear energy and weapons are the responsibility of the Department of Energy.


    On March 28, 2017, a supervisor in the Office of International Climate and Clean Energy asked staff to avoid the phrases “climate change,” “emissions reduction,” or “Paris Agreement” in written memos, briefings or other written communication. A DOE spokesperson denied that phrases had been banned.

    In a May 2019 press release concerning natural gas exports from a Texas facility, the DOE used the term ‘freedom gas’ to refer to natural gas. The phrase originated from a speech made by Secretary Rick Perry in Brussels earlier that month. Washington Governor Jay Inslee decried the term as “a joke”.


    The Department of Energy operates a system of national laboratories and technical facilities for research and development, as follows:

    Ames Laboratory
    Argonne National Laboratory
    Brookhaven National Laboratory
    Fermi National Accelerator Laboratory
    Idaho National Laboratory
    Lawrence Berkeley National Laboratory
    Lawrence Livermore National Laboratory
    Los Alamos National Laboratory
    National Energy Technology Laboratory
    National Renewable Energy Laboratory
    Oak Ridge National Laboratory
    Pacific Northwest National Laboratory
    Princeton Plasma Physics Laboratory
    Sandia National Laboratories
    Savannah River National Laboratory
    SLAC National Accelerator Laboratory
    Thomas Jefferson National Accelerator Facility

    Other major DOE facilities include:
    Albany Research Center
    Bannister Federal Complex
    Bettis Atomic Power Laboratory – focuses on the design and development of nuclear power for the U.S. Navy
    Kansas City Plant
    Knolls Atomic Power Laboratory – operates for Naval Reactors Program Research under the DOE (not a National Laboratory)
    National Petroleum Technology Office
    Nevada Test Site
    New Brunswick Laboratory
    Office of Fossil Energy
    Office of River Protection
    Radiological and Environmental Sciences Laboratory
    Y-12 National Security Complex
    Yucca Mountain nuclear waste repository

    Pahute Mesa Airstrip – Nye County, Nevada, in supporting Nevada National Security Site

  • richardmitnick 2:00 pm on June 15, 2021 Permalink | Reply
    Tags: "Forthcoming revolution will unveil the secrets of matter", , , , European High Performance Computer Joint Undertaking (EU), Exaflop computers, Exascale computing, ,   

    From CNRS-The National Center for Scientific Research [Centre national de la recherche scientifique] (FR) : “Forthcoming revolution will unveil the secrets of matter” 

    From CNRS-The National Center for Scientific Research [Centre national de la recherche scientifique] (FR)

    Martin Koppe

    ©Sikov /Stock.Adobe.com

    Provided adapted software can be developed, exascale computing, a new generation of supercomputers, will offer massive power to model the properties of molecules and materials, while taking into account their fundamental interactions and quantum mechanics. The TREX-Targeting Real Chemical accuracy at the EXascale (EU) project is set to meet the challenge.

    One quintillion operations per second. Exaflop computers – from the prefix exa-, or 10^18, and flops, the number of floating-point operations that a computer can perform in one second – will offer this colossal computing power, as long as specifically designed programs and codes are available. An international race is thus underway to produce these impressive machines, and to take full advantage of their capacities. The European Commission is financing ambitious projects that are preparing the way for exascale, which is to say any form of high-performance computing that reaches an exaflop. The Targeting Real chemical precision at the EXascale (TREX)[1] programme focuses on highly precise computing methods in the fields of chemistry and materials physics.

    Compute nodes of the Jean Zay supercomputer, the first French converged supercomputer between intensive calculations and artificial intelligence. After its extension in the summer of 2020, it attained 28 petaflops, or 28 quintillion operations per second, thanks to its 86,344 cores supported by 2,696 GPU accelerators.
    © Cyril FRESILLON / IDRIS / CNRS Photothèque.

    Officially inaugurated in October 2020, TREX is part of the broader European High Performance Computing Joint Undertaking (EU), whose goal is to ensure Europe is a player alongside the United States and China in exascale computing. “The Japanese have already achieved exascale by lowering computational precision,” enthuses Anthony Scemama, a researcher at the LCPQ-Laboratoire de Chimie et Physique Quantiques (FR),[2] and one of the two CNRS coordinators of TREX. “A great deal of work remains to be done on codes if we want to take full advantage of these future machines.”

    Exascale computing will probably use GPUs as well as traditional processors, or CPUs. These graphics processors were originally developed for video games, but they have enjoyed increasing success in data-intensive computing applications. Here again, their use will entail rewriting programs to fully harness their power for those applications that will need it.

    “Chemistry researchers already have various computing techniques for producing simulations, such as modelling the interaction of light with a molecule,” Scemama explains. “TREX focuses on cases where the computing methods for a realistic and predictive description of the physical phenomena controlling chemical reactions are too costly.”

    “TREX is an interdisciplinary project that also includes physicists,” stresses CNRS researcher and project coordinator Michele Casula, at the Institute of Mineralogy, Material Physics and Cosmochemistry [Institut de minéralogie, de physique des matériaux et de cosmochimie] (FR).[3] “Our two communities need computing methods that are powerful enough to accurately predict the behaviour of matter, which often requires far too much computation time for conventional computers.”
    The TREX team has identified several areas for applications. First of all, and surprising though it may seem, the physicochemical properties of water have not been sufficiently modelled. The best ab initio simulations – those based on fundamental interactions – are wrong by a few degrees when trying to estimate its boiling point.

    Improved water models will enable us to more effectively simulate the behaviour of proteins, which continually evolve in aqueous environments. The applications being developed in connection with the TREX project could have a significant impact on research in biology and pharmacy. For example, nitrogenases, which make essential contributions to life, transform nitrogen gas into ammonia, a form that can be used by organisms. However, the theoretical description of the physicochemical mechanisms used by this enzyme is not accurate enough under current models. Exascale computing should also improve experts’ understanding of highly correlated materials such as superconductors, which are characterised by the substantial interactions between the electrons they are made of.

    “The microscopic understanding of their functioning remains an unresolved issue, one that has nagged scientists ever since the 1980s,” Casula points out. “It is one of the major open problems in condensed matter physics. When mastered, these materials will, among other things, be able to transport electricity with no loss of energy.” 2D materials are also involved, especially those used in solar panels to convert light into power.

    “To model matter using quantum mechanics means relying on equations that become exponentially more complex, such as the Schrödinger equation, whose number of coordinates increases with the system,” Casula adds. “In order to solve them in simulations, we either have to use quantum computers, or further explore the power of silicon analogue chips with exascale computing, along with suitable algorithms.”
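
    Casula’s point about exponential complexity is easy to make concrete with a toy count of quantum basis states. The sketch below is illustrative Python, not TREX code; it assumes a simple model of n particles with two local states each (e.g. spins):

```python
def n_basis_states(n_particles, states_per_particle=2):
    """Size of the many-body state space for n distinguishable
    particles, each with a fixed number of local states."""
    return states_per_particle ** n_particles

# The state space doubles with every particle added: by ~50
# particles an exact enumeration is already out of reach.
for n in (10, 50, 100):
    print(f"{n:>3} particles -> {n_basis_states(n):.3e} basis states")
```

    This exponential wall is exactly why exact deterministic solutions give way to the statistical sampling methods discussed next.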

    To achieve this, TREX members are counting on Quantum Monte Carlo (QMC), and are developing libraries to integrate it into existing codes. “We are fortunate to have a method that perfectly matches exascale machines,” Scemama exclaims. QMC is particularly effective at numerically calculating observables, the quantum equivalents of classical physical quantities, in systems where quantum interactions among many particles come into play.

    Modelling of electron trajectories in an aggregate of water, created by the QMC programme developed at the LCPQ in Toulouse (southwestern France). © Anthony Scemama / Laboratoire de Chimie et Physique Quantiques.

    “The full computation of these observables is too complex,” Casula stresses. “Accurately estimating them using deterministic methods could take more time than the age of the Universe. Simply put, QMC will not solve everything, but instead provides a statistical sampling of results. Exaflop computers could draw millions of samples per second, and thanks to statistical tools such as the central limit theorem, the more of these values we have, the closer we get to the actual result. We can thus obtain an approximation that is accurate enough to help researchers, all within an acceptable amount of time.”
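
    Casula’s description of QMC, statistical sampling whose error shrinks as more samples are drawn, can be illustrated with a deliberately simple (and entirely classical) Monte Carlo estimate. This is plain Python for illustration only, not the TREX QMC libraries:

```python
import random

def mc_estimate(n_samples, seed=0):
    """Estimate the mean of f(x) = x**2 for x uniform on [0, 1]
    (exact value 1/3) by averaging random samples."""
    rng = random.Random(seed)
    return sum(rng.random() ** 2 for _ in range(n_samples)) / n_samples

# Per the central limit theorem, the statistical error of the mean
# shrinks like 1/sqrt(n): more samples, a tighter estimate.
for n in (100, 10_000, 1_000_000):
    est = mc_estimate(n)
    print(f"n={n:>9}: estimate={est:.5f}  |error|={abs(est - 1/3):.5f}")
```

    A real QMC run samples many-electron configurations rather than a one-dimensional variable, but the statistical machinery, and its appetite for the independent parallel work exascale machines provide, is the same.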

    With regard to the study of matter, an exascale machine can provide a good description of the electron cloud and its interaction with nuclei. That is not the only advantage. “When configured properly, these machines may use thirty times more energy than classical supercomputers, but in return will produce a thousand times more computing power,” Scemama believes. “Researchers could launch very costly calculations, and use the results to build simpler models for future use.”

    The TREX team nevertheless insists that above all else, it creates technical and predictive tools for other researchers, who will then seek to develop concrete applications. Ongoing exchanges have made it possible to share best practices and feedback among processor manufacturers, physicists, chemists, researchers in high-performance computing, and TREX’s two computing centres.


    In addition to the CNRS, the project includes the University of Versailles Saint-Quentin-en-Yvelines [Université de Versailles Saint-Quentin-en-Yvelines – UVSQ] (FR), the University of Twente [Universiteit Twente] (NL), the University of Vienna [Universität Wien] (AT), Lodz University of Technology [Politechnika Łódzka] (PL), the International School for Advanced Studies [Scuola Internazionale Superiore di Studi Avanzati] (IT), the MPG Institutes (DE), and the Slovak University of Technology in Bratislava [Slovenská technická univerzita v Bratislave] (SK), as well as the Cineca (IT) and Jülich Supercomputing Centre [Forschungszentrum Jülich] (DE) supercomputing centres and the companies MEGWARE Computer HPC Systems & Solutions (DE) and Trust-IT Services | Phidias (FR).
    Laboratoire de chimie et physique quantiques (CNRS / Université Toulouse III – Paul Sabatier).
    CNRS / National Museum of Natural History [Muséum National d’Histoire Naturelle] (MNHN) (FR) / Sorbonne University [Sorbonne Université] (FR).

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    CNRS-The National Center for Scientific Research [Centre national de la recherche scientifique](FR) is the French state research organisation and is the largest fundamental science agency in Europe.

    In 2016, it employed 31,637 staff, including 11,137 tenured researchers, 13,415 engineers and technical staff, and 7,085 contractual workers. It is headquartered in Paris and has administrative offices in Brussels; Beijing; Tokyo; Singapore; Washington D.C.; Bonn; Moscow; Tunis; Johannesburg; Santiago de Chile; Israel; and New Delhi.

    The CNRS was ranked No. 3 in 2015 and No. 4 in 2017 by the Nature Index, which measures the largest contributors to papers published in 82 leading journals.

    The CNRS operates on the basis of research units, which are of two kinds: “proper units” (UPRs) are operated solely by the CNRS, and “joint units” (UMRs – French: Unité mixte de recherche)[9] are run in association with other institutions, such as universities or INSERM. Members of joint research units may be either CNRS researchers or university employees (maîtres de conférences or professeurs). Each research unit has a numeric code attached and is typically headed by a university professor or a CNRS research director. A research unit may be subdivided into research groups (“équipes”). The CNRS also has support units, which may, for instance, supply administrative, computing, library, or engineering services.

    In 2016, the CNRS had 952 joint research units, 32 proper research units, 135 service units, and 36 international units.

    The CNRS is divided into 10 national institutes:

    Institute of Chemistry (INC)
    Institute of Ecology and Environment (INEE)
    Institute of Physics (INP)
    Institute of Nuclear and Particle Physics (IN2P3)
    Institute of Biological Sciences (INSB)
    Institute for Humanities and Social Sciences (INSHS)
    Institute for Computer Sciences (INS2I)
    Institute for Engineering and Systems Sciences (INSIS)
    Institute for Mathematical Sciences (INSMI)
    Institute for Earth Sciences and Astronomy (INSU)

    The National Committee for Scientific Research, which is in charge of the recruitment and evaluation of researchers, is divided into 47 sections (e.g. section 41 is mathematics, section 7 is computer science and control, and so on). Research groups are affiliated with one primary institute and an optional secondary institute; the researchers themselves belong to one section. For administrative purposes, the CNRS is divided into 18 regional divisions (including four for the Paris region).

    Some selected CNRS laboratories

    APC laboratory
    Centre d’Immunologie de Marseille-Luminy
    Centre d’Etude Spatiale des Rayonnements
    Centre européen de calcul atomique et moléculaire
    Centre de Recherche et de Documentation sur l’Océanie
    CINTRA (joint research lab)
    Institut de l’information scientifique et technique
    Institut de recherche en informatique et systèmes aléatoires
    Institut d’astrophysique de Paris
    Institut de biologie moléculaire et cellulaire
    Institut Jean Nicod
    Laboratoire de Phonétique et Phonologie
    Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
    Laboratory for Analysis and Architecture of Systems
    Laboratoire d’Informatique de Paris 6
    Laboratoire d’informatique pour la mécanique et les sciences de l’ingénieur
    Observatoire océanologique de Banyuls-sur-Mer

  • richardmitnick 7:51 am on September 26, 2020 Permalink | Reply
    Tags: "Solving Big Questions Takes Big Computing Power—and a Robust Software Development Tools Portfolio", Exascale computing

    From Exascale Computing Project: “Solving Big Questions Takes Big Computing Power—and a Robust Software Development Tools Portfolio” 

    From Exascale Computing Project

    Matt Lakin, Oak Ridge National Laboratory

    When the U.S. Department of Energy presses the start button on the world’s first generation of exascale supercomputers, scientists won’t want to wait to take full advantage of the unprecedented power behind the screens. The inaugural pair of machines, Aurora at Argonne National Laboratory near Chicago and Frontier at Oak Ridge National Laboratory (ORNL) in Tennessee, are set to run at speeds that top 1.5 exaflops apiece.

    Depiction of ANL ALCF Cray Intel SC18 Shasta Aurora exascale supercomputer.

    ORNL Cray Frontier Shasta based Exascale supercomputer with Slingshot interconnect featuring high-performance AMD EPYC CPU and AMD Radeon Instinct GPU technology.

    For the scorekeepers, that’s 1.5 quintillion (10^18) calculations per second, or 15 million times the average power of the human brain and its 100 billion neurons. Summit, the world’s #2 supercomputing champion at ORNL, clocks in at 200 petaflops—200×10^15 calculations per second, or 200 quadrillion.

    ORNL IBM AC922 SUMMIT supercomputer, was No.1 on the TOP500. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy.

    Frontier and Aurora promise to run nearly eight times faster.
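
    The “nearly eight times” figure follows directly from the peak rates quoted above; a two-line check:

```python
# Peak rates quoted above, in floating-point operations per second.
exascale_pair = 1.5e18   # Frontier or Aurora: 1.5 exaflops
summit = 200e15          # Summit: 200 petaflops

speedup = exascale_pair / summit
print(f"speedup: {speedup:.1f}x")  # 7.5x, i.e. nearly eight times faster
```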

    That scale of processing power could help resolve fundamental questions of modern science, from designing cleaner-burning combustion engines to identifying new treatments for cancer. More than 50 teams of researchers will be waiting to test out applications aimed at tackling such questions. Making the most of exascale’s power means building a software stack that exploits every advancement in parallel computing and heterogeneous architecture to leave no FLOP behind, said Doug Kothe, director of the DOE’s Exascale Computing Project (ECP).

    “We’re building what I expect to be the tools of the trade for decades to come, and they need to be ready to go on Day One,” Kothe said. “At the base of this whole pyramid is the software stack. One node on the exascale system will be the same or greater power as the highest-grade consumer processor. These nodes are going to be complicated beasts, and they’re going to be very challenging to program efficiently. But if we don’t find a way to do that, we’re going to leave a lot of computing power on the floor, and that’s not acceptable.”

    Software engineers and computer scientists with the ECP have spent the past five years laboring to build a pyramid that will support exascale’s full processing power, the software half of the DOE Exascale Computing Project’s twin push, alongside new hardware, toward massive leaps in computing speed.

    Unlike the average developer, these teams don’t have the luxury of testing out approaches on a finished product with trial audiences over time.

    “The more traditional approach would be to build every application and install them one at a time,” said Mike Heroux, a senior scientist at Sandia National Laboratories and director of software technology for the ECP.

    “We want to be able to take this software and install it as the operating system on any device from a laptop to the largest supercomputer and have it up and running immediately. This is a first-of-its-kind effort, because if it were available before, we would have done it before.

    “A lot of this hardware and software is all brand new, and it takes a lot of time to just debug something. Not only does the software need to run quickly, it needs to run smoothly, and it needs to run consistently across all platforms. It’s like writing software for both iPhone and Android while the phones are still in production.”

    Achieving those levels of speed and consistency requires rethinking the classical software architecture of processing and memory. To run efficiently at exascale speed, the new supercomputing giants will need to balance parallel operations running on parallel processors at an unprecedented scale.

    “Because any single processing element runs on the order of a few gigahertz, you can do at most a billion operations per second per processor,” Heroux said. “The transistors aren’t getting any faster. So the only way you can do a billion billion operations is to have a billion of these processors doing a billion operations per second, all at once. It scales to human endeavor as well.

    “Say you have a printing shop. You have a person who can print 100 pages per minute. If you want a hundred hundred pages—which is 10,000—then you have to have 100 printers running in parallel. That doesn’t make the first page come out any faster. It’s still going to take just as long to get a single page out, but if you’ve got 10,000 pages to do, you can get those done in one minute instead of 100 minutes, thanks to that concurrent assembly line. It’s all about getting more work done at the same time. In our case, we have quintillions of operations to do. We can’t do it consecutively. So we need algorithms that say, ‘Let’s do these 1 billion things at once, these next 1 billion things at once,’ and so on. The reality is even more complicated in the sense that most operations on a processing element take many clock cycles to complete, so we need to start the first billion and then start the next billion, and so on—before the first billion are even done!”
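
    Heroux’s printing-shop analogy can be captured in a few lines. The numbers below are his (100 pages per minute, 10,000 pages); the function itself is just an illustrative sketch:

```python
import math

def job_minutes(pages, printers, pages_per_minute=100):
    """Wall-clock time to finish a print job when pages are split
    evenly across printers working concurrently."""
    pages_per_printer = math.ceil(pages / printers)
    return pages_per_printer / pages_per_minute

# Latency for one page is unchanged by adding printers...
print(job_minutes(1, 1), job_minutes(1, 100))            # 0.01 0.01
# ...but throughput scales: 10,000 pages drop from 100 minutes to 1.
print(job_minutes(10_000, 1), job_minutes(10_000, 100))  # 100.0 1.0
```

    Parallelism improves throughput, not single-task latency, which is exactly why exascale algorithms must expose a billion independent pieces of work at once.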

    The hardware foundation for Frontier and Aurora, like Summit’s, will rest on graphics processing units (GPUs), which have proven ideal for splitting up and juggling the computation tasks necessary for high-performance computing. That means software libraries for applications originally designed to run on more traditional central processing units (CPUs) must be translated, sharpened, and brought up to date.

    “We’re rethinking the architecture in the sense that the broader community is mapping their applications to GPUs like they haven’t been forced to do in the past,” said Jeffrey Vetter, an ORNL corporate fellow and the lead for ECP’s Development Tools group.

    “Software vendors have largely been able to either choose to not participate or just run on platforms without GPUs. Now, the next-generation supercomputing platforms for DOE are all GPU-based, so if you want to make use of them, you’re going to have to reprogram your application to make use of GPUs. We’re working on development tools for writing software easily and compilers for translating codes on these new heterogeneous systems.”

    Writing that software requires creating new languages and building out existing ones to make the new systems run effectively. Most of the solutions Vetter and his team have developed rely on LLVM (originally “Low-Level Virtual Machine”), an open-source compiler infrastructure that translates widely used computer languages such as Fortran and C++ into machine-specific code for processors made by major vendors such as Intel, IBM, NVIDIA, and AMD.

    “The biggest challenge is developing a programming system that is simultaneously portable and efficient on all these systems,” Vetter said. “There’s not really one programming model that runs across all of them. So our approach is layered. The higher-level programming models rely on the features of lower-level systems, such as the compiler and runtime system provided by the target architecture. But we’ve got to have that higher-level programming model to abstract enough detail so applications can be portable across systems.

    “There’s no one silver bullet. With so much software out there already, most of the work is improving and enhancing existing compilers, so it’s more evolutionary than revolutionary. LLVM is a compiler used by virtually everybody in the industry—Google, Facebook, IBM, they all converge on LLVM. Any time we add something to LLVM, all those companies can benefit from it and vice versa. In the end, we’re making LLVM that’s better for all users.”

    Besides LLVM, development teams plan to use OpenMP, a multi-threaded parallel programming model based on compiler directives, to express computations for the nodes on the forthcoming exascale systems. That means enhancing OpenMP’s existing features, working with software vendors to add new features, and making sure all the pieces fit together.

    “It was sort of like jumping onto a moving train at first,” said Barbara Chapman, chair of computer science and mathematics at Brookhaven National Laboratory, who’s leading the effort to scale up OpenMP and LLVM for exascale.

    “OpenMP emerged as the favorite approach for exploiting parallelism across computing cores. Our goal was to add in all the new features that the applications need, especially for GPUs, and the OpenMP standards committee was already working on their next release. With help from the application teams, we were able to convince the vendors to adopt these features in a very short time.

    “Since then we’ve had to focus on clarifications, improvements, all the little things you haven’t thought about when you try to specify features, such as some details of how two of them interact. We have to encourage vendors to quickly implement the extensions that we need for exascale, and we have to work directly on the open-source LLVM compiler to make it ready to provide the OpenMP features and performance we need. We’ve had some very encouraging results, especially in the last six months. What we’re going to be moving onto is the phase of getting more vendor compilers that meet our needs and gaining experience with them.”

    The only challenge equal to ramping up performance for a machine that hasn’t been built yet might be building and calibrating tools to measure that performance. Jack Dongarra—a supercomputing veteran and fellow of the Institute of Electrical and Electronics Engineers, the National Academy of Engineering, and the Royal Society of London—hasn’t blinked.

    “It’s always challenging whenever you face building software for a new architecture,” said Dongarra, a distinguished researcher at ORNL and professor of computer science and electrical engineering at the University of Tennessee. “You end up having to constantly retool the apps, but that’s normal in the face of a new system. It’s a challenge we can deal with.”


    Dongarra leads a group at the university to develop the Exascale Performance Application Programming Interface (ExaPAPI), a diagnostic tool to measure exascale output and efficiency.

    “Think of it as a dashboard for what’s going on inside the machine,” he said. “This way you can see how efficiently it’s running, how much is coming in vs. how much is going out, how much energy is being used. We want to make sure all these apps will perform at peak levels, and to make the apps more efficient, we need that feedback. We already have the basic performance counters: How many flops, how much memory is in use, how much power is running. But without the other tools, the user is faced with this black box and unable to understand what’s going on inside. ExaPAPI is what’s going to let us really get a view of what’s going on inside that black box.”
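
    The dashboard idea Dongarra describes, turning raw counts into achieved rates, can be sketched with ordinary wall-clock timing. This generic Python stands in for hardware counters; it is not the ExaPAPI API:

```python
import time

def measure_flop_rate(n=1_000_000):
    """Time a simple loop and derive an achieved floating-point rate.
    Each iteration performs roughly two floating-point operations
    (one multiply, one add)."""
    acc = 0.0
    start = time.perf_counter()
    for i in range(n):
        acc += i * 0.5
    elapsed = time.perf_counter() - start
    return (2 * n) / elapsed, acc

rate, _ = measure_flop_rate()
print(f"achieved rate: {rate:.3e} FLOP/s")
```

    Real counter libraries read the hardware’s own event registers rather than inferring work from the source code, which is what lets them also report cache misses, memory traffic, and power alongside flops.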

    ExaPAPI provides a detailed assessment of exascale performance. HPCToolkit, another diagnostic tool for exascale, acts as the zoom lens, intended to pinpoint opportunities for optimization in application codes.

    “What ExaPAPI does is great, but if your piece of code is 100,000 lines, you need to know where to fix it if there’s a problem,” said John Mellor-Crummey, a professor of computer science and electrical and computer engineering at Rice University, who’s leading development of HPCToolkit.

    “It’s not necessarily enough just to know how long it takes to run the program. What we acquire is not only the place where a metric was measured, but we find out exactly where we are in the context of program execution when the problem was encountered. That way we can track down where you’re spending your time, attribute this to individual source lines in the program, and tell you: Here’s where it’s working effectively, here’s where it’s wasting time, here’s where it’s spending time waiting for synchronization.”

    The vast number of parallel operations required for exascale presents a particular challenge for measurement and analysis.

    “We’re building tools that are very different from what’s been built in the past,” Mellor-Crummey said. “It’s a billion threads, each doing a billion things a second, and we’ve got to measure it and figure out what’s going wrong. How can we analyze all the measurements we collect and do it fast enough? We’ve got to build our own parallel applications to process all this performance data and our own visualizations as well. But I’m confident in our approach.”

    The ECP scientists ultimately envision a rich, diverse ecosystem of software for exascale, where advanced libraries for math, visualization, and data analytics build upon these foundational programming models and tools to provide the versatile capabilities needed by scientific applications teams.

    “The developers of reusable software libraries are pushing new frontiers of research in algorithms and data structures to exploit emerging architectural features, while relying on these advances in compilers and performance tools,” said Lois Curfman McInnes, a senior computational scientist at Argonne and deputy director of software technology for the ECP.

    “Developers of applications and libraries will leverage these new features in programming models, compilers, and development tools in order to advance their software on emerging architectures, while also providing important feedback about their wish lists for future functionality.”

    That work won’t end when Frontier and Aurora or the third planned exascale system, El Capitan at Lawrence Livermore National Laboratory in California, are stood up. The next generation of exascale will need fine-tuning, and so will whatever comes next—whether zettascale, quantum, or neuromorphic computing.

    HPE Cray Shasta El Capitan supercomputer at LLNL.

    “The goal has always been that what we’re building would still translate onto the newer systems,” said Vetter, the software technology tools manager. “It’s hard to understand if what we’re doing now would apply as far out as quantum, but for some of the next generation systems, we can anticipate what the software architectures need to look like, and we’ve got people doing active research trying to find solutions. In some cases, we’re not even looking at the same questions. There are too many fun things to do, too many possibilities to ever think about stopping.”

    The problems to be solved won’t stop, either. The scientists wouldn’t have it any other way.

    “This is the nature of high-performance computing,” said Heroux, the ECP’s software technology director. “We’re always trying to get more performance and innovate new ways to get that, always bringing in new technology. It’s like building racecars: Racing is grabbing onto the edge of disaster and not letting go, progressively reaching further as we try to go faster. We’re always on the edge. That’s where we want to be.”

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    About ECP
    The ECP is a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration. As part of the National Strategic Computing initiative, ECP was established to accelerate delivery of a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures, and workforce development to meet the scientific and national security mission needs of DOE in the early-2020s time frame.

    About the Office of Science

    DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit https://science.energy.gov/.

    About NNSA

    Established by Congress in 2000, NNSA is a semi-autonomous agency within the DOE responsible for enhancing national security through the military application of nuclear science. NNSA maintains and enhances the safety, security, and effectiveness of the U.S. nuclear weapons stockpile without nuclear explosive testing; works to reduce the global danger from weapons of mass destruction; provides the U.S. Navy with safe and effective nuclear propulsion; and responds to nuclear and radiological emergencies in the United States and abroad. https://nnsa.energy.gov

    The Goal of ECP’s Application Development focus area is to deliver a broad array of comprehensive science-based computational applications that effectively utilize exascale HPC technology to provide breakthrough simulation and data analytic solutions for scientific discovery, energy assurance, economic competitiveness, health enhancement, and national security.

    Awareness of ECP and its mission is growing and resonating—and for good reason. ECP is an incredible effort focused on advancing areas of key importance to our country: economic competitiveness, breakthrough science and technology, and national security. And, fortunately, ECP has a foundation that bodes extremely well for the prospects of its success, with the demonstrably strong commitment of the US Department of Energy (DOE) and the talent of some of America’s best and brightest researchers.

    ECP is composed of about 100 small teams of domain, computer, and computational scientists and mathematicians from DOE labs, universities, and industry. We are tasked with building applications that will execute well on exascale systems, enabled by a robust exascale software stack, and with supporting the necessary vendor R&D to ensure the compute nodes and hardware infrastructure are ready and able to do the science that needs to be done with the first exascale platforms.

  • richardmitnick 10:47 am on March 28, 2020 Permalink | Reply
    Tags: “Today’s news provides a prime example of how government and industry can work together for the benefit of the entire nation.”, Ensuring the National Nuclear Security Administration — LLNL, Sandia National Laboratories and Los Alamos National Laboratory — keeping the nation’s nuclear stockpile safe., Exascale computing, HPE Cray Shasta El Capitan supercomputer at LLNL, HPE/Cray

    From Lawrence Livermore National Laboratory: “LLNL and HPE to partner with AMD on El Capitan, projected as world’s fastest supercomputer” 

    From Lawrence Livermore National Laboratory


    Jeremy Thomas

    Lawrence Livermore National Laboratory (LLNL), Hewlett Packard Enterprise (HPE) and Advanced Micro Devices Inc. (AMD) today announced the selection of AMD as the node supplier for El Capitan, projected to be the world’s most powerful supercomputer when it is fully deployed in 2023.

    HPE Cray Shasta El Capitan supercomputer at LLNL

    With its advanced central processing and graphics processing units (CPUs/GPUs), El Capitan’s peak performance is expected to exceed 2 exaFLOPS, ensuring the National Nuclear Security Administration (NNSA) laboratories — LLNL, Sandia National Laboratories and Los Alamos National Laboratory — can meet their primary mission of keeping the nation’s nuclear stockpile safe, secure and reliable. (An exaFLOP is one quintillion floating-point operations per second.)

    Funded by the Advanced Simulation and Computing (ASC) program at the Department of Energy’s (DOE) NNSA, El Capitan will perform complex and increasingly predictive modeling and simulation for NNSA’s vital Life Extension Programs (LEPs), which address weapons aging and emergent threat issues in the absence of underground nuclear testing.

    “This unprecedented computing capability, powered by advanced CPU and GPU technology from AMD, will sustain America’s position on the global stage in high-performance computing and provide an observable example of the commitment of the country to maintaining an unparalleled nuclear deterrent,” said LLNL Director Bill Goldstein. “Today’s news provides a prime example of how government and industry can work together for the benefit of the entire nation.”

    El Capitan will be powered by next-generation AMD EPYC processors, code-named “Genoa” and featuring the “Zen 4” processor core; next-generation AMD Radeon Instinct GPUs based on a new compute-optimized architecture for workloads including HPC and AI; and the AMD Radeon Open Compute platform (ROCm) heterogeneous computing software. The nodes will support simulations used by the NNSA labs to address the demands of the LEPs, whose computational requirements are growing due to the ramping up of stockpile modernization efforts and in response to rapidly evolving threats from America’s adversaries.

    Delivering enormous computational capability for the energy used, the GPUs will provide the majority of El Capitan’s peak floating-point performance. This will enable LLNL scientists to run high-resolution 3D models more quickly, as well as to increase the fidelity and repeatability of calculations, making those simulations truer to life.

    “We have been pursuing a balanced investment effort at NNSA in advancing our codes, our platforms and our facilities in an integrated and focused way,” said Michel McCoy, Weapon Simulation and Computing Program Director at LLNL. “And our teams and industrial partners will deliver this capability as planned to the nation. Naturally, this has required an intimate, sustained partnership with our industry technology partners and between the tri-labs to be successful.”

    Anticipated to be one of the most capable supercomputers in the world, El Capitan will have a significantly greater per-node capability than any current system, LLNL researchers said. El Capitan’s graphics processors will be amenable to AI and machine learning-assisted data analysis, further propelling LLNL’s sizable investment in AI-driven scientific workloads. These workloads will supplement scientific models that researchers hope will be faster, more accurate and intrinsically capable of quantifying uncertainty in their predictions, and will be increasingly used for stockpile stewardship applications. The use of AMD’s GPUs also is anticipated to dramatically increase El Capitan’s energy efficiency compared to systems using today’s graphics processors.

    “El Capitan will drive unprecedented advancements in HPC and AI, powered by the next-generation AMD EPYC CPUs and Radeon Instinct GPUs,” said Forrest Norrod, senior vice president and general manager, Datacenter and Embedded Systems Group, AMD. “Building on our strong foundation in high-performance computing and adding transformative coherency capabilities, AMD is enabling the NNSA Tri-Lab community — LLNL, Los Alamos and Sandia national laboratories — to achieve their mission-critical objectives and contribute new AI advancements to the industry. We are extremely proud to continue our exascale work with HPE and NNSA and look forward to the delivery of the most powerful supercomputer in the world, expected in early 2023.”

    El Capitan also will integrate many advanced features that are not yet widely deployed, including HPE’s advanced Cray Slingshot interconnect network, which will enable large calculations across many nodes, an essential requirement for the NNSA laboratories’ simulation workloads. In addition to the capabilities that Cray Slingshot provides, HPE and LLNL are partnering to actively explore new HPE optics technologies that integrate electrical-to-optical interfaces that could deliver higher data transmission at faster speeds with improved power efficiency and reliability. El Capitan also will feature the new Cray Shasta software platform, which will have a new container-based architecture to enable administrators and developers to be more productive, and to orchestrate LLNL’s complex new converged HPC and AI workflows at scale.

    “As an industry and as a nation, we have achieved a major milestone in computing. HPE is honored to support DOE, NNSA and Lawrence Livermore National Laboratory in a critical strategic mission to advance the United States’ position in security and defense,” said Peter Ungaro, senior vice president and general manager, HPC and Mission Critical Systems (MCS), at HPE. “The computing power and capabilities of this system represent a new era of innovation that will unlock solutions to society’s most complex issues and answer questions we never thought were possible.”

    The exascale ecosystem being developed through the sustained efforts of DOE’s Exascale Computing Initiative will further ensure El Capitan has formidable capabilities from day one. Through funding from NNSA’s ASC program, in collaboration with the DOE Office of Science’s Advanced Scientific Computing Research program, ongoing investments in hardware and software technology will assure highly functional hardware and tools to meet DOE’s needs in the next decade. The El Capitan system also will benefit from a partnership with Oak Ridge National Laboratory, which is taking delivery of a similar system from HPE about one year earlier than El Capitan.

    El Capitan would not have been possible without the investments made by DOE’s Exascale PathForward program, which provided funding for American companies including HPE/Cray and AMD to accelerate the technologies necessary to maximize energy efficiency and performance of exascale supercomputers.

    Besides supporting the nuclear stockpile, El Capitan will perform secondary national security missions, including nuclear nonproliferation and counterterrorism. NNSA laboratories are building machine learning and AI into computational techniques and analysis that will benefit NNSA’s primary missions and unclassified projects such as climate modeling and cancer research for DOE.

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Operated by Lawrence Livermore National Security, LLC, for the Department of Energy’s National Nuclear Security Administration
    Lawrence Livermore National Laboratory (LLNL) is an American federal research facility in Livermore, California, United States, founded by the University of California, Berkeley in 1952. A Federally Funded Research and Development Center (FFRDC), it is primarily funded by the U.S. Department of Energy (DOE) and managed and operated by Lawrence Livermore National Security, LLC (LLNS), a partnership of the University of California, Bechtel, BWX Technologies, AECOM, and Battelle Memorial Institute in affiliation with the Texas A&M University System. In 2012, the laboratory had the synthetic chemical element livermorium named after it.
    LLNL is self-described as “a premier research and development institution for science and technology applied to national security.” Its principal responsibility is ensuring the safety, security and reliability of the nation’s nuclear weapons through the application of advanced science, engineering and technology. The Laboratory also applies its special expertise and multidisciplinary capabilities to preventing the proliferation and use of weapons of mass destruction, bolstering homeland security and solving other nationally important problems, including energy and environmental security, basic science and economic competitiveness.

    The Laboratory is located on a one-square-mile (2.6 km2) site at the eastern edge of Livermore. It also operates a 7,000 acres (28 km2) remote experimental test site, called Site 300, situated about 15 miles (24 km) southeast of the main lab site. LLNL has an annual budget of about $1.5 billion and a staff of roughly 5,800 employees.

    LLNL was established in 1952 as the University of California Radiation Laboratory at Livermore, an offshoot of the existing UC Radiation Laboratory at Berkeley. It was intended to spur innovation and provide competition to the nuclear weapon design laboratory at Los Alamos in New Mexico, home of the Manhattan Project that developed the first atomic weapons. Edward Teller and Ernest Lawrence,[2] director of the Radiation Laboratory at Berkeley, are regarded as the co-founders of the Livermore facility.

    The new laboratory was sited at a former naval air station of World War II. It was already home to several UC Radiation Laboratory projects that were too large for its location in the Berkeley Hills above the UC campus, including one of the first experiments in the magnetic approach to confined thermonuclear reactions (i.e. fusion). About half an hour southeast of Berkeley, the Livermore site provided much greater security for classified projects than an urban university campus.

    Lawrence tapped 32-year-old Herbert York, a former graduate student of his, to run Livermore. Under York, the Lab had four main programs: Project Sherwood (the magnetic-fusion program), Project Whitney (the weapons-design program), diagnostic weapon experiments (both for the Los Alamos and Livermore laboratories), and a basic physics program. York and the new lab embraced the Lawrence “big science” approach, tackling challenging projects with physicists, chemists, engineers, and computational scientists working together in multidisciplinary teams. Lawrence died in August 1958 and shortly after, the university’s board of regents named both laboratories for him, as the Lawrence Radiation Laboratory.

    Historically, the Berkeley and Livermore laboratories have had very close relationships on research projects, business operations, and staff. The Livermore Lab was established initially as a branch of the Berkeley laboratory. The Livermore lab was not officially severed administratively from the Berkeley lab until 1971. To this day, in official planning documents and records, Lawrence Berkeley National Laboratory is designated as Site 100, Lawrence Livermore National Lab as Site 200, and LLNL’s remote test location as Site 300.[3]

    The laboratory was renamed Lawrence Livermore Laboratory (LLL) in 1971. On October 1, 2007, LLNS assumed management of LLNL from the University of California, which had exclusively managed and operated the Laboratory since its inception 55 years before. The LLNS takeover of the laboratory has been controversial. In May 2013, an Alameda County jury awarded over $2.7 million to five former laboratory employees who were among 430 employees LLNS laid off during 2008.[4] The jury found that LLNS breached a contractual obligation to terminate the employees only for “reasonable cause.”[5] The five plaintiffs also have pending age discrimination claims against LLNS, which will be heard by a different jury in a separate trial.[6] There are 125 co-plaintiffs awaiting trial on similar claims against LLNS.[7] The May 2008 layoff was the first layoff at the laboratory in nearly 40 years.[6]

    On March 14, 2011, the City of Livermore officially expanded the city’s boundaries to annex LLNL and move it within the city limits. The unanimous vote by the Livermore city council expanded Livermore’s southeastern boundaries to cover 15 land parcels covering 1,057 acres (4.28 km2) that comprise the LLNL site. The site was formerly an unincorporated area of Alameda County. The LLNL campus continues to be owned by the federal government.


    DOE Seal

  • richardmitnick 11:00 am on November 23, 2019 Permalink | Reply
    Tags: , Argonne Leadership Computing Facility, , , Cray Intel SC18 Shasta Aurora exascale supercomputer, Exascale computing,   

    From Argonne Leadership Computing Facility: “Argonne teams up with Altair to manage use of upcoming Aurora supercomputer” 

    Argonne Lab
    News from Argonne National Laboratory

    From Argonne Leadership Computing Facility

    November 19, 2019
    Jo Napolitano

    Depiction of ANL ALCF Cray Intel SC18 Shasta Aurora exascale supercomputer

    The U.S. Department of Energy’s (DOE) Argonne National Laboratory has teamed up with the global technology company Altair to implement a new scheduling system that will be employed on the Aurora supercomputer, slated for delivery in 2021.

    Aurora will be one of the nation’s first exascale systems, capable of performing a billion billion – that’s a quintillion – calculations per second. It will be nearly 100 times faster than Argonne’s current supercomputer, Theta, which went online just two years ago.
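That claim is easy to sanity-check with arithmetic: dividing one exaflop by Theta’s publicly stated peak of roughly 11.7 petaflops gives a speedup in the ballpark the article quotes.

```python
# Sanity-check the "nearly 100 times faster than Theta" claim.
# Theta's theoretical peak (~11.7 petaflops) is public; an exaflop is 10**18 flops.

EXAFLOP = 1e18          # one quintillion floating-point operations per second
THETA_PEAK = 11.7e15    # Theta's theoretical peak, in flops

speedup = EXAFLOP / THETA_PEAK
print(f"1 exaflop is about {speedup:.0f}x Theta's peak")
```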

    Aurora will be in high demand from researchers around the world and, as a result, will need a sophisticated workload manager to sort and prioritize requested jobs.

    It found a natural partner in Altair to meet that need. Founded in 1985 and headquartered in Troy, Michigan, the company provides software and cloud solutions in the areas of product development, high-performance computing (HPC) and data analytics.

    Argonne was initially planning an update to its own workload manager, COBALT (Component-Based Lightweight Toolkit), which was developed 20 years ago within the lab’s Mathematics and Computer Science Division.

    COBALT has served the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility, for years, but after careful consideration of several factors, including cost and efficiency, the laboratory determined that a collaboration with Altair on the PBS Professional™ open source solution was the best path forward.

    “When we went to talk to Altair, we were looking for a resource manager (one of the components in a workload manager) we could use,” said Bill Allcock, manager of the Advanced Integration Group at the ALCF. ​“We decided to collaborate on the entire workload manager rather than just the resource manager because our future roadmaps were well aligned.”
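To make Allcock’s distinction concrete: the resource manager tracks which nodes are free, while the workload manager layers queueing and priority policy on top of that state. The toy sketch below shows the idea only; it is not how PBS Professional or COBALT is implemented, and real workload managers add fairshare, backfill, reservations, and much more.

```python
import heapq

class ToyWorkloadManager:
    """Minimal priority-queue scheduler: highest-priority job that fits runs first.
    Illustrative only -- real managers like PBS Pro are far more sophisticated."""

    def __init__(self, free_nodes):
        self.free_nodes = free_nodes   # resource-manager state: idle node count
        self.queue = []                # heap of (negated priority, order, name, need)
        self._order = 0                # tie-breaker: first-come, first-served

    def submit(self, name, nodes_needed, priority):
        heapq.heappush(self.queue, (-priority, self._order, name, nodes_needed))
        self._order += 1

    def dispatch(self):
        """Start every queued job that currently fits, best priority first."""
        started, deferred = [], []
        while self.queue:
            prio, order, name, need = heapq.heappop(self.queue)
            if need <= self.free_nodes:
                self.free_nodes -= need
                started.append(name)
            else:
                deferred.append((prio, order, name, need))
        for item in deferred:          # jobs that didn't fit stay queued
            heapq.heappush(self.queue, item)
        return started

wm = ToyWorkloadManager(free_nodes=100)
wm.submit("climate", nodes_needed=80, priority=5)
wm.submit("qcd", nodes_needed=40, priority=9)
wm.submit("small", nodes_needed=10, priority=1)
print(wm.dispatch())   # qcd (priority 9) runs first, small fits too; climate waits
```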

    Altair was already working on a couple of important features that the laboratory wanted to employ with Aurora, Allcock said.

    And most importantly, the teams meshed well together.

    “Exascale will be a huge milestone in HPC — to make better products, to make better decisions, to make the world a better place,” said Bill Nitzberg, chief technology officer of Altair PBS Works™. ​“Getting to exascale requires innovation, especially in systems software, like job scheduling. The partnership between Altair and Argonne will enable effective exascale scheduling, not only for Aurora, but also for the wider HPC world. This is a real 1+1=3 partnership.”

    Aurora is expected to have a significant impact on nearly every field of scientific endeavor, including artificial intelligence. It will improve extreme weather forecasting, accelerate medical treatments, help map the human brain, develop new materials and further our understanding of the universe.

    It will also play a pivotal role in national security and human health.

    “We want to enable researchers to conduct the most important science possible, projects that cannot be done anywhere else in the world because they demand a machine of this size, and this partnership will help us reach this goal,” said Allcock.

    See the full article here.



    Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science. For more visit http://www.anl.gov.

    About ALCF
    The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community.

    We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and expertise.

    ALCF projects cover many scientific disciplines, ranging from chemistry and biology to physics and materials science. Examples include modeling and simulation efforts to:

    Discover new materials for batteries
    Predict the impacts of global climate change
    Unravel the origins of the universe
    Develop renewable energy technologies

    Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science

    Argonne Lab Campus

  • richardmitnick 10:34 am on November 29, 2018 Permalink | Reply
    Tags: , , Exascale computing, ,   

    From Science Node: “The race to exascale” 

    Science Node bloc
    From Science Node

    30 Jan, 2018
    Alisa Alering

    Who will get the first exascale machine – a supercomputer capable of 10^18 floating point operations per second? Will it be China, Japan, or the US?

    When it comes to computing power, you can never have enough. In the last sixty years, processing power has increased more than a trillionfold.

    Researchers around the world are excited because these new, ultra-fast computers represent a 50- to 100-fold increase in speed over today’s supercomputers and promise significant breakthroughs in many areas. That exascale supercomputers are coming is pretty clear. We can even predict the date, most likely in the mid-2020s. But the question remains as to what kind of software will run on these machines.

    Exascale computing heralds an era of ubiquitous massive parallelism, in which processors perform coordinated computations simultaneously. But the number of processors will be so high that computer scientists will have to constantly cope with failing components.

    The high number of processors will also likely slow programs tremendously. The consequence is that beyond the exascale hardware, we will also need exascale brains to develop new algorithms and implement them in exascale software.
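One common software-level answer to failing components is to detect a failed piece of work and simply re-execute it. The sketch below is a deliberately simplified stand-in for that idea, with a simulated 25 percent transient failure rate; real exascale resilience relies on checkpoint/restart and far more sophisticated machinery.

```python
import random

def run_with_retries(task, attempts=3, rng=None):
    """Re-execute a task that may fail transiently; give up after `attempts` tries.
    A toy stand-in for the fault-tolerance logic exascale software will need."""
    rng = rng or random.Random()
    for attempt in range(1, attempts + 1):
        if rng.random() > 0.25:        # simulated 25% transient failure rate
            return task, attempt       # success: report which attempt worked
    raise RuntimeError(f"{task} failed after {attempts} attempts")

# Seeded generator so the simulated failures are reproducible.
rng = random.Random(42)
results = [run_with_retries(f"task-{i}", rng=rng) for i in range(5)]
print(results)
```

With the seed above, some tasks succeed on the first try and others need a retry, which is exactly the behavior large systems must absorb routinely.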

    In 2011, the German Research Foundation established a priority program, “Software for Exascale Computing” (SPPEXA), to address fundamental research on various aspects of high-performance computing (HPC) software, making the program the first of its kind in Germany.

    SPPEXA connects relevant sub-fields of computer science with the needs of computational science, engineering, and HPC. The program provides the framework for closer cooperation and a co-design-driven approach. This is a shift from the current service-driven model of collaboration, in which groups focusing on fundamental HPC methodology (computer science or mathematics) provide support to those working on science applications and maintaining the large codes (science and engineering).

    Despite exascale computing still being several years away, SPPEXA scientists are well ahead of the game, developing scalable and efficient algorithms that will make the best use of resources when the new machines finally arrive. SPPEXA drives research towards extreme-scale computing in six areas: computational algorithms, system software, application software, data management and exploration, programming, and software tools.

    Some major projects include research on alternative sources of clean energy; stronger, lighter weight steel manufacturing; and unprecedented simulations of the earth’s convective processes:

    EXAHD supports Germany’s long-standing research into the use of plasma fusion as a clean, safe, and sustainable carbon-free energy source. One of the main goals of the EXAHD project is to develop scalable and efficient algorithms to run on distributed systems, with the aim of facilitating the progress of plasma fusion research.

    EXASTEEL is a massively parallel simulation environment for computational material science. Bringing together experts from mathematics, material and computer sciences, and engineering, EXASTEEL will serve as a virtual laboratory for testing new forms of steel with greater strength and lower weight.

    TerraNeo addresses the challenges of understanding the convection of Earth’s mantle – the cause of most of our planet’s geological activity, from plate tectonics to volcanoes and earthquakes. Due to the sheer scale and complexity of the models, the advent of exascale computing offers a tremendous opportunity for greater understanding. But in order to take full advantage of the coming resources, TerraNeo is working to design new software with optimal algorithms that permit a scalable implementation.

    Exascale hardware is expected to have less consistent performance than current supercomputers due to fabrication, power, and heat issues. Their sheer size and unprecedented number of components will likely increase fault rates. Fast and Fault-Tolerant Microkernel-based Operating System for Exascale Computing (FFMK) aims to address these challenges through a coordinated approach that connects system software, computational algorithms, and application software.

    Mastering the various challenges related to the paradigm shift from moderately to massively parallel processing will be the key to any future capability-computing application at exascale. It will also be crucial for learning how to deal effectively and efficiently with smaller-scale capacity-computing tasks on near-future commodity systems. No matter who puts the first machine online, exascale supercomputing is coming. SPPEXA is making sure we are prepared to take full advantage of it.

    See the full article here.


    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

  • richardmitnick 5:57 pm on September 5, 2018 Permalink | Reply
    Tags: , , , Exascale computing, ,   

    From PPPL and ALCF: “Artificial intelligence project to help bring the power of the sun to Earth is picked for first U.S. exascale system” 

    From PPPL


    Argonne Lab

    Argonne National Laboratory ALCF

    August 27, 2018
    John Greenwald

    Deep Learning Leader William Tang. (Photo by Elle Starkman/Office of Communications.)

    To capture and control the process of fusion that powers the sun and stars in facilities on Earth called tokamaks, scientists must confront disruptions that can halt the reactions and damage the doughnut-shaped devices.


    Now an artificial intelligence system under development at the U.S. Department of Energy’s (DOE) Princeton Plasma Physics Laboratory (PPPL) and Princeton University to predict and tame such disruptions has been selected as an Aurora Early Science project by the Argonne Leadership Computing Facility, a DOE Office of Science User Facility.

    Depiction of ANL ALCF Cray Shasta Aurora supercomputer

    The project, titled “Accelerated Deep Learning Discovery in Fusion Energy Science” is one of 10 Early Science Projects on data science and machine learning for the Aurora supercomputer, which is set to become the first U.S. exascale system upon its expected arrival at Argonne in 2021. The system will be capable of performing a quintillion (10^18) calculations per second — 50-to-100 times faster than the most powerful supercomputers today.

    Fusion combines light elements

    Fusion combines light elements in the form of plasma — the hot, charged state of matter composed of free electrons and atomic nuclei — in reactions that generate massive amounts of energy. Scientists aim to replicate the process for a virtually inexhaustible supply of power to generate electricity.

    The goal of the PPPL/Princeton University project is to develop a method that can be experimentally validated for predicting and controlling disruptions in burning plasma fusion systems such as ITER — the international tokamak under construction in France to demonstrate the practicality of fusion energy. “Burning plasma” refers to self-sustaining fusion reactions that will be essential for producing continuous fusion energy.

    Heading the project will be William Tang, a principal research physicist at PPPL and a lecturer with the rank and title of professor in the Department of Astrophysical Sciences at Princeton University. “Our research will utilize capabilities to accelerate progress that can only come from the deep learning form of artificial intelligence,” Tang said.

    Networks analogous to a brain

    Deep learning, unlike other computational approaches, can be trained to solve highly complex problems with accuracy and speed, including problems that require realistic image resolution. The associated software consists of multiple layers of interconnected neural networks that are analogous to simple neurons in a brain. Each node in a network identifies a basic aspect of the data that is fed into the system and passes the results along to other nodes, which identify increasingly complex aspects of the data. The process continues until the desired output is achieved in a timely way.
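The layer-by-layer passing of results described above is, in code, just a forward pass. The sketch below shows it in miniature with two dense layers and made-up weights; a real deep-learning network has many more layers, and its weights are learned from data rather than written by hand.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: each output node weighs every input."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x):
    # Layer 1: two nodes each pick out a basic aspect of the raw input.
    h = dense(x, weights=[[0.5, -0.2], [0.1, 0.9]], biases=[0.0, 0.1])
    # Layer 2: one node combines those results into a more complex feature.
    out = dense(h, weights=[[1.2, -0.7]], biases=[0.05])
    return out[0]

print(forward([0.3, 0.8]))   # a single activation in (-1, 1)
```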

    The PPPL/Princeton deep-learning software is called the “Fusion Recurrent Neural Network (FRNN),” composed of convolutional and recurrent neural nets that allow a user to train a computer to detect items or events of interest. The software seeks to speedily predict when disruptions will break out in large-scale tokamak plasmas, and to do so in time for effective control methods to be deployed.

    The project has greatly benefited from access to the huge disruption-relevant database of the Joint European Torus (JET) in the United Kingdom, the largest and most powerful tokamak in the world today.

    Joint European Torus, at the Culham Centre for Fusion Energy in the United Kingdom

    The FRNN software has advanced from smaller computer clusters to supercomputing systems that can deal with such vast amounts of complex disruption-relevant data. Running the data aims to identify key pre-disruption conditions, guided by insights from first principles-based theoretical simulations, to enable the “supervised machine learning” capability of deep learning to produce accurate predictions with sufficient warning time.

    Access to Tiger computer cluster

    The project has gained from access to Tiger, a high-performance Princeton University cluster equipped with advanced image-resolution GPUs that have enabled the deep learning software to advance to the Titan supercomputer at Oak Ridge National Laboratory and to powerful international systems such as the Tsubame 3.0 supercomputer in Tokyo, Japan.

    Tiger supercomputer at Princeton University

    ORNL Cray XK7 Titan Supercomputer

    Tsubame 3.0 supercomputer in Tokyo, Japan

    The overall goal is to achieve the challenging requirements for ITER, which will need predictions that are 95 percent accurate, with less than 5 percent false alarms, at least 30 milliseconds before disruptions occur.
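Those ITER targets boil down to two rates that can be computed from a log of shots: the fraction of true disruptions caught with at least 30 ms of warning, and the false-alarm rate on shots that never disrupt. The sketch below evaluates both on a small, entirely hypothetical shot log (the numbers are invented for illustration).

```python
def alarm_metrics(events):
    """events: list of (disrupted, alarm_raised, warning_ms) tuples, one per shot.
    Returns (fraction of disruptions caught >= 30 ms early, false-alarm rate)."""
    disruptions = [e for e in events if e[0]]
    quiet = [e for e in events if not e[0]]
    caught = sum(1 for _, alarm, ms in disruptions if alarm and ms >= 30)
    false_alarms = sum(1 for _, alarm, _ in quiet if alarm)
    return caught / len(disruptions), false_alarms / len(quiet)

# Hypothetical shot log: (actually disrupted?, alarm raised?, warning time in ms)
shots = [
    (True, True, 45), (True, True, 31), (True, True, 12),    # last one: too late
    (False, False, 0), (False, False, 0), (False, True, 0),  # one false alarm
    (False, False, 0), (False, False, 0),
]
recall, far = alarm_metrics(shots)
print(f"caught in time: {recall:.0%}, false alarms: {far:.0%}")
```

On this toy log the predictor catches only two of three disruptions in time and raises one false alarm in five quiet shots, well short of the 95 percent / 5 percent ITER bar, which illustrates how demanding that requirement is.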

    ITER Tokamak in Saint-Paul-lès-Durance, which is in southern France

    The team will continue to build on advances that are currently supported by the DOE while preparing the FRNN software for Aurora exascale computing. The researchers will also move forward with related developments on the SUMMIT supercomputer at Oak Ridge.

    ORNL IBM AC922 SUMMIT supercomputer. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy

    Members of the team include Julian Kates-Harbeck, a graduate student at Harvard University and a DOE Office of Science Computational Science Graduate Fellow (CSGF) who is the chief architect of the FRNN. Researchers include Alexey Svyatkovskiy, a big-data, machine learning expert who will continue to collaborate after moving from Princeton University to Microsoft; Eliot Feibush, a big data analyst and computational scientist at PPPL and Princeton, and Kyle Felker, a CSGF member who will soon graduate from Princeton University and rejoin the FRNN team as a post-doctoral research fellow at Argonne National Laboratory.

    See the full article here.



    PPPL campus

    Princeton Plasma Physics Laboratory is a U.S. Department of Energy national laboratory managed by Princeton University. PPPL, on Princeton University’s Forrestal Campus in Plainsboro, N.J., is devoted to creating new knowledge about the physics of plasmas — ultra-hot, charged gases — and to developing practical solutions for the creation of fusion energy. Results of PPPL research have ranged from a portable nuclear materials detector for anti-terrorist use to universally employed computer codes for analyzing and predicting the outcome of fusion experiments. The Laboratory is managed by the University for the U.S. Department of Energy’s Office of Science, which is the largest single supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.
