Tagged: OLCF

  • richardmitnick 12:23 pm on April 12, 2019 Permalink | Reply
    Tags: "Scaling Deep Learning for Scientific Workloads on the #1 Summit Supercomputer", , OLCF, , ORNL Cray XK7 Titan Supercomputer once the fastest in the world now No.9 on the TOP500, ORNL IBM AC922 SUMMIT supercomputer No.1 on the TOP500   

    From insideHPC: “Scaling Deep Learning for Scientific Workloads on the #1 Summit Supercomputer” 

    From insideHPC

    April 11, 2019
    Rich Brueckner


    In this video from GTC 2018, Jack Wells from ORNL presents: Scaling Deep Learning for Scientific Workloads on Summit.

    Jack Wells is the Director of Science for the Oak Ridge Leadership Computing Facility (OLCF).

    “HPC centers have been traditionally configured for simulation workloads, but deep learning has been increasingly applied alongside simulation on scientific datasets. These frameworks do not always fit well with job schedulers, large parallel file systems, and MPI backends. We’ll discuss examples of how deep learning workflows are being deployed on next-generation systems at the Oak Ridge Leadership Computing Facility. We’ll share benchmarks of natively compiled code versus containers on Power systems like Summit, as well as best practices for deploying deep learning models on HPC resources for scientific workflows.”

    The biggest problems in science require supercomputers of unprecedented capability. That’s why the US Department of Energy’s Oak Ridge National Laboratory (ORNL) launched Summit, a system 8 times more powerful than ORNL’s previous top-ranked system, Titan. Summit is providing scientists with incredible computing power to solve challenges in energy, artificial intelligence, human health, and other research areas that were simply out of reach until now. These discoveries will help shape our understanding of the universe, bolster US economic competitiveness, and contribute to a better future.

    ORNL IBM AC922 SUMMIT supercomputer, No.1 on the TOP500. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy

    Summit Specifications:
    Application Performance: 200 PF (currently #1 on the TOP500)
    Number of Nodes: 4,608
    Node performance: 42 TF
    Memory per Node: 512 GB DDR4 + 96 GB HBM2
    NV memory per Node: 1600 GB
    Total System Memory: >10 PB DDR4 + HBM2 + Non-volatile
    Processors per node: 2 IBM POWER9 CPUs, 6 NVIDIA Volta GPUs
    Processors system-wide: 9,216 POWER9 CPUs, 27,648 Volta GPUs

    File System: 250 PB, 2.5 TB/s, GPFS
    Power Consumption: 13 MW
    Interconnect: Mellanox EDR 100G InfiniBand
    Operating System: Red Hat Enterprise Linux (RHEL) version 7.4

    Jack Wells is the Director of Science for the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science national user facility located at Oak Ridge National Laboratory (ORNL) and home to the Titan supercomputer.

    ORNL Cray XK7 Titan Supercomputer, once the fastest in the world, now No.9 on the TOP500.

    Wells is responsible for the scientific outcomes of the OLCF’s user programs. Wells has previously led both ORNL’s Computational Materials Sciences group in the Computer Science and Mathematics Division and the Nanomaterials Theory Institute in the Center for Nanophase Materials Sciences. Prior to joining ORNL as a Wigner Fellow in 1997, Wells was a postdoctoral fellow within the Institute for Theoretical Atomic and Molecular Physics at the Harvard-Smithsonian Center for Astrophysics. Wells has a Ph.D. in physics from Vanderbilt University, and has authored or co-authored over 100 scientific papers and edited one book, spanning nanoscience, materials science and engineering, nuclear and atomic physics, computational science, applied mathematics, and novel analytics measuring the impact of scientific publications.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 10:08 am on April 26, 2017 Permalink | Reply
    Tags: Building the Bridge to Exascale, OLCF

    From OLCF at ORNL: “Building the Bridge to Exascale” 


    Oak Ridge National Laboratory

    OLCF

    April 18, 2017 [Where was this hiding?]
    Katie Elyce Jones

    Building an exascale computer—a machine that could solve complex science problems at least 50 times faster than today’s leading supercomputers—is a national effort.

    To oversee the rapid research and development (R&D) of an exascale system by 2023, the US Department of Energy (DOE) created the Exascale Computing Project (ECP) last year. The project brings together experts in high-performance computing from six DOE laboratories with the nation’s most powerful supercomputers—including Oak Ridge, Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, and Sandia—and project members work closely with computing facility staff from the member laboratories.

    ORNL IBM Summit supercomputer depiction.

    At the Exascale Computing Project’s (ECP’s) annual meeting in February 2017, Oak Ridge Leadership Computing Facility (OLCF) staff discussed OLCF resources that could be leveraged for ECP research and development, including the facility’s next flagship supercomputer, Summit, expected to go online in 2018.

    At the first ECP annual meeting, held January 29–February 3 in Knoxville, Tennessee, about 450 project members convened to discuss collaboration in breakout sessions focused on project organization and upcoming R&D milestones for applications, software, hardware, and exascale systems focus areas. During facility-focused sessions, senior staff from the Oak Ridge Leadership Computing Facility (OLCF) met with ECP members to discuss opportunities for the project to use current petascale supercomputers, test beds, prototypes, and other facility resources for exascale R&D. The OLCF is a DOE Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL).

    “The ECP’s fundamental responsibilities are to provide R&D to build exascale machines more efficiently and to prepare the applications and software that will run on them,” said OLCF Deputy Project Director Justin Whitt. “The facilities’ responsibilities are to acquire, deploy, and operate the machines. We are currently putting advanced test beds and prototypes in place to evaluate technologies and enable R&D efforts like those in the ECP.”

    ORNL has a unique connection to the ECP. The Tennessee-based laboratory is the location of the project office that manages collaboration within the ECP and among its facility partners. ORNL’s Laboratory Director Thom Mason delivered the opening talk at the conference, highlighting the need for coordination in a project of this scope.

    On behalf of facility staff, Mark Fahey, director of operations at the Argonne Leadership Computing Facility, presented the latest delivery and deployment plans for upcoming computing resources during a plenary session. From the OLCF, Project Director Buddy Bland and Director of Science Jack Wells provided a timeline for the availability of Summit, OLCF’s next petascale supercomputer, which is expected to go online in 2018; it will be at least 5 times more powerful than the OLCF’s 27-petaflop Titan supercomputer.

    ORNL Cray XK7 Titan Supercomputer.

    “Exascale hardware won’t be around for several more years,” Wells said. “The ECP will need access to Titan, Summit, and other leadership computers to do the work that gets us to exascale.”

    Wells said he was able to highlight the spring 2017 call for Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, proposals, which will give 2-year projects the first opportunity for computing time on Summit. OLCF staff also introduced a handful of computing architecture test beds—including the developmental environment for Summit known as Summitdev, NVIDIA’s deep learning and accelerated analytics system DGX-1, an experimental cluster of ARM 64-bit compute nodes, and a Cray XC40 cluster of 168 nodes known as Percival—that are now available for OLCF users.

    In addition to leveraging facility resources for R&D, the ECP must understand the future needs of facilities to design an exascale system that is ready for rigorous computational science simulations. Facilities staff can offer insight about the level of performance researchers will expect from science applications on exascale systems and estimate the amount of space and electrical power that will be available in the 2023 timeframe.

    “Getting to capable exascale systems will require careful coordination between the ECP and the user facilities,” Whitt said.

    One important collaboration so far was the development of a request for information, or RFI, for exascale R&D that the ECP released in February to industry vendors. The RFI enables the ECP to evaluate potential software and hardware technologies for exascale systems—a step in the R&D process that facilities often undertake. Facilities will later release requests for proposals when they are ready to begin building exascale systems.

    See the full article here .

    Please help promote STEM in your local schools.


    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.


    The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.

    ORNL’s supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. On the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

    The OLCF delivered on that original promise in 2008, when its Cray XT “Jaguar” system ran the first scientific applications to exceed 1,000 trillion calculations a second (1 petaflop). Since then, the OLCF has continued to expand the limits of computing power, unveiling Titan in 2013, which is capable of 27 petaflops.


    ORNL Cray XK7 Titan Supercomputer

    Titan is one of the first hybrid architecture systems—a combination of graphics processing units (GPUs) and the more conventional central processing units (CPUs) that have served as number crunchers in computers for decades. The parallel structure of GPUs makes them uniquely suited to process an enormous number of simple computations quickly, while CPUs are capable of tackling more sophisticated computational algorithms. The complementary combination of CPUs and GPUs allows Titan to reach its peak performance.

    The OLCF gives the world’s most advanced computational researchers an opportunity to tackle problems that would be unthinkable on other systems. The facility welcomes investigators from universities, government agencies, and industry who are prepared to perform breakthrough research in climate, materials, alternative energy sources and energy storage, chemistry, nuclear physics, astrophysics, quantum mechanics, and the gamut of scientific inquiry. Because it is a unique resource, the OLCF focuses on the most ambitious research projects—projects that provide important new knowledge or enable important new technologies.

     
  • richardmitnick 1:13 pm on March 29, 2017 Permalink | Reply
    Tags: OLCF, What's next for Titan?

    From OLCF via TheNextPlatform: “Scaling Deep Learning on an 18,000 GPU Supercomputer” 


    Oak Ridge National Laboratory

    OLCF


    TheNextPlatform

    March 28, 2017
    Nicole Hemsoth


    ORNL Cray XK7 Titan Supercomputer

    It is one thing to scale a neural network on a single GPU or even a single system with four or eight GPUs. But it is another thing entirely to push it across thousands of nodes. Most centers doing deep learning have relatively small GPU clusters for training and certainly nothing on the order of the Titan supercomputer at Oak Ridge National Laboratory.

    In the past, the emphasis on machine learning scalability has often focused on node counts for single-model runs. This is useful for some applications, but as neural networks become more integrated into existing workflows, including those in HPC, there is another way to consider scalability. Interestingly, the lesson comes from an HPC application area like weather modeling where, instead of one monolithic model to predict climate, an ensemble of forecasts run in parallel on a massive supercomputer is meshed together for the best result. Using this ensemble method on deep neural networks allows for scalability across thousands of nodes, with the end result derived from an average of the ensemble, something that is acceptable in an area that does not require the kind of precision (in more ways than one) that some HPC calculations do.
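
    To make the ensemble averaging concrete, here is a minimal sketch in C with MPI, assuming each rank has already trained its own network and produced a prediction vector. The output size, the run_local_network() stub, and its placeholder values are hypothetical; only the MPI_Allreduce pattern reflects the “average of the ensemble” idea described above.

```c
/* Hedged sketch: average per-rank predictions from an ensemble of
 * independently trained networks. Build with an MPI compiler, e.g.
 * mpicc ensemble_avg.c -o ensemble_avg */
#include <mpi.h>
#include <stdio.h>

#define NUM_OUTPUTS 10   /* hypothetical size of each network's output */

/* Stand-in for evaluating this node's trained network on a test batch. */
static void run_local_network(double *out, int n, int rank)
{
    for (int i = 0; i < n; i++)
        out[i] = (double)rank / (i + 1);   /* placeholder values */
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local[NUM_OUTPUTS], mean[NUM_OUTPUTS];
    run_local_network(local, NUM_OUTPUTS, rank);

    /* Sum every rank's prediction, then divide by the ensemble size:
     * the "meshing together" of ensemble members. */
    MPI_Allreduce(local, mean, NUM_OUTPUTS, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);
    for (int i = 0; i < NUM_OUTPUTS; i++)
        mean[i] /= size;

    if (rank == 0)
        printf("ensemble average of output 0: %f\n", mean[0]);

    MPI_Finalize();
    return 0;
}
```

    A single collective call like this generally scales to thousands of ranks far better than gathering every prediction to one node.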

    This approach has been used on the Titan supercomputer at Oak Ridge, which is a powerhouse for deep learning training given its high GPU counts. Titan’s 18,688 Tesla K20X GPUs have proven useful for a large number of scientific simulations and are now pulling double-duty on deep learning frameworks, including Caffe, to boost the capabilities of HPC simulations (classification, filtering of noise, etc.). The next generation supercomputer at the lab, the future “Summit” machine (expected to be operational at the end of 2017) will provide even more GPU power with the “Volta” generation Tesla graphics coprocessors from Nvidia, high-bandwidth memory, NVLink for faster data movement, and IBM Power9 CPUs.


    ORNL IBM Summit supercomputer depiction

    ORNL researchers used this ensemble approach to neural networks and were able to stretch these across all of the GPUs in the machine. This is a notable feat, even for the types of large simulations that are built to run on big supercomputers. What is interesting is that while the frameworks might come from the deep learning world (Caffe in ORNL’s case), the node-to-node communication is rooted in HPC. As we have described before, MPI is still the best method out there for fast communication across InfiniBand-connected nodes, and, like researchers elsewhere, ORNL has adapted it to deep learning at scale.

    Right now, the team is using each individual node to train an individual deep learning network, but all of those different networks need to have the same data if training from the same set. The question is how to feed that same data to over 18,000 different GPUs at almost the same time, on a system that wasn’t designed with that in mind. The answer is a custom MPI-based layer that can divvy up the data and distribute it. With the coming Summit supercomputer—the successor to Titan, which will sport six Volta GPUs per node—the other problem is multi-GPU scaling, something application teams across HPC are tackling as well.
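
    The custom MPI-based layer itself is not shown in the article; the sketch below only illustrates the basic pattern it implies, with rank 0 reading (or here, fabricating) a training batch and broadcasting it so every node trains on identical data. The batch size and the placeholder fill are assumptions for illustration.

```c
/* Hedged sketch: rank 0 prepares a training batch and broadcasts it so
 * every node trains its own network on the same data. */
#include <mpi.h>
#include <stdlib.h>

#define BATCH_BYTES (1 << 20)   /* hypothetical 1 MB batch */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    unsigned char *batch = malloc(BATCH_BYTES);
    if (rank == 0) {
        /* In a real workflow this would come from the parallel file
         * system; here it is filled with placeholder bytes. */
        for (int i = 0; i < BATCH_BYTES; i++) batch[i] = i & 0xff;
    }

    /* One collective call pushes the same batch to every rank,
     * avoiding thousands of independent reads from disk. */
    MPI_Bcast(batch, BATCH_BYTES, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    /* ...each rank would now copy the batch to its GPU and train its
     * own ensemble member on it... */

    free(batch);
    MPI_Finalize();
    return 0;
}
```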

    Ultimately, the success of MPI for deep learning at such scale will depend on how many messages the system and MPI can handle, since results must pass between nodes in addition to thousands of synchronous updates for training iterations. Each iteration will cause a number of neurons within the network to be updated, so if the network is spread across multiple nodes, all of that will have to be communicated. That is a large enough task on its own—but also consider the delay of the data that needs to be transferred to and from disk (although a burst buffer can be of use here). “There are also new ways of looking at MPI’s guarantees for robustness, which limit certain communication patterns. HPC needs this, but neural networks are more fault-tolerant than many HPC applications,” Patton says. “Going forward, the same I/O is being used to communicate between the nodes and from disk, so when the datasets are large enough the bandwidth could quickly dwindle.”

    In addition to their work scaling deep neural networks across Titan, the team has also developed a method of automatically designing neural networks for use across multiple datasets. Before, a network designed for image recognition could not be reused for speech, but their own auto-designing code has scaled beyond 5,000 (single GPU) nodes on Titan with up to 80 percent accuracy.

    “The algorithm is evolutionary, so it can take design parameters of a deep learning network and evolve those automatically,” Robert Patton, a computational analytics scientist at Oak Ridge, tells The Next Platform. “We can take a dataset that no one has looked at before and automatically generate a network that works well on that dataset.”
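
    The article gives no implementation details for the auto-designing code, but the loop Patton describes (take design parameters, mutate them, keep whatever evaluates best) can be sketched as a toy evolutionary search. The Design fields, mutation ranges, and the evaluate() stub below are invented; in the real workflow each candidate evaluation would be a full training run on a GPU node, with thousands of candidates evaluated in parallel.

```c
/* Hedged sketch of an evolutionary search over network design
 * parameters. The fitness function is a stand-in for training and
 * validating a real network. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

typedef struct {
    int    num_layers;      /* hypothetical design parameters */
    int    filters;
    double learning_rate;
} Design;

/* Placeholder fitness: in a real code this would be validation
 * accuracy after training the candidate network on a dataset. */
static double evaluate(const Design *d)
{
    double dl = fabs((double)(d->num_layers - 12));
    double df = fabs((double)(d->filters - 64)) / 16.0;
    double dr = (d->learning_rate - 0.01) * (d->learning_rate - 0.01) * 1.0e4;
    return 1.0 / (1.0 + dl + df + dr);
}

/* Randomly perturb the parent's design parameters. */
static Design mutate(Design parent)
{
    Design child = parent;
    child.num_layers    += (rand() % 3) - 1;              /* -1, 0, +1 */
    child.filters       += ((rand() % 3) - 1) * 16;
    child.learning_rate *= 0.5 + (rand() % 100) / 100.0;  /* 0.5x .. 1.5x */
    if (child.num_layers < 1) child.num_layers = 1;
    if (child.filters    < 8) child.filters    = 8;
    return child;
}

int main(void)
{
    srand(42);
    Design best = { 4, 32, 0.1 };
    double best_fit = evaluate(&best);

    /* Each generation, spawn mutated candidates and keep the fittest.
     * At scale, the candidates would be farmed out to GPU nodes. */
    for (int gen = 0; gen < 50; gen++) {
        for (int i = 0; i < 8; i++) {
            Design cand = mutate(best);
            double fit = evaluate(&cand);
            if (fit > best_fit) { best = cand; best_fit = fit; }
        }
    }
    printf("best design: %d layers, %d filters, lr=%.4f (fitness %.3f)\n",
           best.num_layers, best.filters, best.learning_rate, best_fit);
    return 0;
}
```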

    Since developing the auto-generating neural networks, Oak Ridge researchers have been working with key application groups that can benefit from the noise filtering and data classification that large-scale neural nets can provide. These include high-energy particle physics, where they are working with Fermi National Lab to classify neutrinos and subatomic particles. “Simulations produce so much data and it’s too hard to go through it all or even keep it all on disk,” says Patton. “We want to identify things that are interesting in data in real time in a simulation so we can snapshot parts of the data in high resolution and go back later.”

    It is with an eye on “Summit” and the challenges to programming the system that teams at Oak Ridge are swiftly figuring out where deep learning fits into existing HPC workflows and how to maximize the hardware they’ll have on hand.

    “We started taking notice of deep learning in 2012 and things really took off then, in large part because of the move of those algorithms to the GPU, which allowed researchers to speed the development process,” Patton explains. “There has since been a lot of progress made toward tackling some of the hardest problems and by 2014, we started seeing that if one GPU is good for deep learning, what could we do with 18,000 of them on the Titan supercomputer.”

    While large supercomputers like Titan have the hybrid GPU/CPU horsepower for deep learning at scale, they are not built for these kinds of workloads. Some hardware changes in Summit will go a long way toward speeding through some bottlenecks, but the right combination of hardware might include some non-standard accelerators like neuromorphic devices and other chips to bolster training or inference. “Right now, if we were to use machine learning in real-time for HPC applications, we still have the problem of training. We are loading the data from disk and the processing can’t continue until the data comes off disk, so we are excited for Summit, which will give us the ability to get the data off disk faster in the nodes, which will be thicker, denser and have more memory and storage,” Patton says.

    “It takes a lot of computation on expensive HPC systems to find the distinguishing features in all the noise,” says Patton. “The problem is, we are throwing away a lot of good data. For a field like materials science, for instance, it’s not unlikely for them to pitch more than 90 percent of their data because it’s so noisy and they lack the tools to deal with it.” He says this is also why his teams are looking at integrating novel architectures to offload to, including neuromorphic and quantum computers—something we will talk about more later this week in an interview with ORNL collaborator, Thomas Potok.

    [I WANT SOMEONE TO TELL ME WHAT HAPPENS TO TITAN NEXT.]

    See the full article here .

    Please help promote STEM in your local schools.


    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.

    i2

    The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.

    ORNL’s supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. On the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

    The OLCF delivered on that original promise in 2008, when its Cray XT “Jaguar” system ran the first scientific applications to exceed 1,000 trillion calculations a second (1 petaflop). Since then, the OLCF has continued to expand the limits of computing power, unveiling Titan in 2013, which is capable of 27 petaflops.


    ORNL Cray XK7 Titan Supercomputer

    Titan is one of the first hybrid architecture systems—a combination of graphics processing units (GPUs) and the more conventional central processing units (CPUs) that have served as number crunchers in computers for decades. The parallel structure of GPUs makes them uniquely suited to process an enormous number of simple computations quickly, while CPUs are capable of tackling more sophisticated computational algorithms. The complementary combination of CPUs and GPUs allows Titan to reach its peak performance.

    The OLCF gives the world’s most advanced computational researchers an opportunity to tackle problems that would be unthinkable on other systems. The facility welcomes investigators from universities, government agencies, and industry who are prepared to perform breakthrough research in climate, materials, alternative energy sources and energy storage, chemistry, nuclear physics, astrophysics, quantum mechanics, and the gamut of scientific inquiry. Because it is a unique resource, the OLCF focuses on the most ambitious research projects—projects that provide important new knowledge or enable important new technologies.

     
  • richardmitnick 3:07 pm on March 15, 2017 Permalink | Reply
    Tags: Coding a Starkiller, OLCF

    From OLCF via ASCR and DOE: “Coding a Starkiller” 


    Oak Ridge National Laboratory

    OLCF

    ASCR

    March 2017

    The Titan supercomputer and a tool called Starkiller help Stony Brook University-led team simulate key moments in exploding stars.

    A volume rendering of the density after 0.6 and 0.9 solar mass white dwarfs merge. The image is derived from a calculation performed on the Oak Ridge Leadership Computing facility’s Titan supercomputer. The model used Castro, an adaptive mesh astrophysical radiation hydrodynamics simulation code. Image courtesy of Stony Brook University / Max Katz et al.

    The spectacular Supernova 1987A, whose light reached Earth on Feb. 23 of the year it’s named for, captured the public’s fancy. It’s located in the Large Magellanic Cloud, a dwarf satellite galaxy just outside the Milky Way. It had been four centuries since earthlings had witnessed light from a star exploding in our galaxy.

    Image credit: NASA.

    A supernova’s awesome light show heralds a giant star’s death, and the next supernova’s post-mortem will generate reams of data, compared to the paltry dozen or so neutrinos and X-rays harvested from the 1987 event.

    Astrophysicists Michael Zingale and Bronson Messer aren’t waiting. They’re aggressively anticipating the next supernova by leading teams in high-performance computer simulations of explosive stellar events, including different supernova types and their accompanying X-ray bursts. Zingale, of Stony Brook University, and Messer, of the Department of Energy’s Oak Ridge National Laboratory (ORNL), are in the midst of an award from the DOE Office of Science’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. It provides an allocation of 45 million processor hours of computer time on Titan, a Cray XK7 that’s one of the world’s most powerful supercomputers, at the Oak Ridge Leadership Computing Facility, or OLCF – a DOE Office of Science user facility.

    The simulations run on workhorse codes developed by the INCITE collaborators and at the DOE’s Lawrence Berkeley National Laboratory – codes that “are often modified toward specific problems,” Zingale says. “And the common problem we share with ORNL is that we have to put more and more of our algorithms on the Titan graphics processor units (GPUs),” specialized computer chips that accelerate calculations. While the phenomena they’re modeling “are really far away and on scales that are hard to imagine,” the codes have other applications closer to home: “terrestrial phenomena, like terrestrial combustion.” The team’s codes – Maestro, Castro, Chimera and FLASH – are freely available to other modelers through the online code repository GitHub.

    With a previous INCITE award, the researchers realized the possibility of attacking the GPU problem together. They envisioned codes composed of multiphysics modules that compute common pieces of most kinds of explosive activities, Messer says. They dubbed the growing collection of GPU-enabled modules Starkiller.

    “Starkiller ties this INCITE project together,” he says. “We realized we didn’t want to reinvent the wheel with each new simulation.” For example, a module that tracks nuclear burning helps the researchers create larger networks for nucleosynthesis, a supernova process in which elements form in the turbulent flow on the stellar surface.

    “In the past, we were able to do only a little more than a dozen different elements, and now we’re routinely doing 150,” Messer says. “We can make the GPU run so much faster. That’s part of Titan’s advantage to us.”
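
    As a rough illustration of what evolving a reaction network involves (this is not Starkiller code), the toy sketch below advances three hypothetical species with made-up rates using simple explicit time steps. The production modules track on the order of 150 species and use far more robust integrators running on the GPU.

```c
/* Toy sketch of a tiny reaction network: three hypothetical species
 * A -> B -> C with invented rates, advanced with explicit Euler steps.
 * The sum of abundances should stay fixed, a basic sanity check. */
#include <stdio.h>

#define NSPEC 3

int main(void)
{
    double Y[NSPEC] = { 1.0, 0.0, 0.0 };   /* abundances of A, B, C */
    const double k_ab = 2.0, k_bc = 0.5;   /* invented reaction rates */
    const double dt = 1.0e-3;

    for (int step = 0; step < 5000; step++) {
        double dYdt[NSPEC];
        dYdt[0] = -k_ab * Y[0];                /* A is destroyed        */
        dYdt[1] =  k_ab * Y[0] - k_bc * Y[1];  /* B is created, burned  */
        dYdt[2] =  k_bc * Y[1];                /* C accumulates         */
        for (int s = 0; s < NSPEC; s++)
            Y[s] += dt * dYdt[s];
    }
    printf("final abundances: A=%.4f B=%.4f C=%.4f (sum=%.4f)\n",
           Y[0], Y[1], Y[2], Y[0] + Y[1] + Y[2]);
    return 0;
}
```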

    Supernova 1987A, a type II supernova, arose from the gravitational collapse of a stellar core, the consistent fate of massive stars. Type Ia supernovae follow from intense thermonuclear activities that eventually drive the explosion of a white dwarf – a star that has used up all its hydrogen. Zingale’s group is focused on type Ia, Messer’s on type II. A type II leaves a remnant star; a type Ia does not.

    Stars like the sun burn hydrogen into helium and, over enormous stretches of time, burn the helium into carbon. Once our sun starts burning carbon, it will gradually peter out, Messer says, because it’s not massive enough to turn the carbon into something heavier.

    “A star begins life as a big ball of hydrogen, and its whole life is this fight between gravity trying to suck it into the middle and thermonuclear reactions keeping it supported against its own gravity,” he adds. “Once it gets to the point where it’s burning some carbon, the sun will just give up. It will blow a big smoke ring into space and become a planetary nebula, and at the center it will become a white dwarf.”

    Zingale is modeling two distinct thermonuclear modes. One is for a white dwarf in a binary system – two stars orbiting one another – that consumes additional material from its partner. As the white dwarf grows in mass, it gets hotter and denser in the center, creating conditions that drive thermonuclear reactions.

    “This star is made mostly of carbon and oxygen,” Zingale says. “When you get up to a few hundred million K, you have densities of a few billion grams per cubic centimeter. Carbon nuclei get fused and make things like neon and sodium and magnesium, and the star gets energy out in that process. We are modeling the star’s convection, the creation of a rippling burning front that converts the carbon and oxygen into heavier elements such as iron and nickel. This creates such an enormous amount of energy that it overcomes the force of gravity that’s holding the star together, and the whole thing blows apart.”

    The other mode is being modeled with former Stony Brook graduate student and INCITE co-principal investigator Max Katz, who wants to understand whether merging stars can create a burning point that leads to a supernova, as some observations suggest. His simulations feature two white dwarfs so close that they emit gravitational radiation, robbing energy from the system and causing the stars to spiral inward. Eventually, they get so close that the more massive one rips the less massive one apart through tidal forces.

    Zingale’s group also continues to model the convective burning on neutron star surfaces known as X-ray bursts, providing a springboard to more in-depth studies. He says they’re the first to simulate these bursts in three dimensions. That work and additional supernova studies were supported by the DOE Office of Science and performed at the OLCF and the National Energy Research Scientific Computing Center, a DOE Office of Science user facility at Lawrence Berkeley National Laboratory.

    See the full article here .

    Please help promote STEM in your local schools.


    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.


    The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.

    ORNL’s supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. On the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

    The OLCF delivered on that original promise in 2008, when its Cray XT “Jaguar” system ran the first scientific applications to exceed 1,000 trillion calculations a second (1 petaflop). Since then, the OLCF has continued to expand the limits of computing power, unveiling Titan in 2013, which is capable of 27 petaflops.


    ORNL Cray XK7 Titan Supercomputer

    Titan is one of the first hybrid architecture systems—a combination of graphics processing units (GPUs) and the more conventional central processing units (CPUs) that have served as number crunchers in computers for decades. The parallel structure of GPUs makes them uniquely suited to process an enormous number of simple computations quickly, while CPUs are capable of tackling more sophisticated computational algorithms. The complementary combination of CPUs and GPUs allows Titan to reach its peak performance.

    The OLCF gives the world’s most advanced computational researchers an opportunity to tackle problems that would be unthinkable on other systems. The facility welcomes investigators from universities, government agencies, and industry who are prepared to perform breakthrough research in climate, materials, alternative energy sources and energy storage, chemistry, nuclear physics, astrophysics, quantum mechanics, and the gamut of scientific inquiry. Because it is a unique resource, the OLCF focuses on the most ambitious research projects—projects that provide important new knowledge or enable important new technologies.

     
  • richardmitnick 9:10 am on September 30, 2016 Permalink | Reply
    Tags: MAESTRO code for supercomputing, OLCF, OLCF Team Resolves Performance Bottleneck in OpenACC Code

    From ORNL: “OLCF Team Resolves Performance Bottleneck in OpenACC Code” 


    Oak Ridge National Laboratory


    Oak Ridge Leadership Computing Facility

    September 28, 2016
    Elizabeth Rosenthal

    By improving its MAESTRO code, a team led by Michael Zingale of Stony Brook University is modeling astrophysical phenomena with improved fidelity. Pictured above: a three-dimensional simulation of Type I X-ray bursts, a recurring explosive event triggered by the buildup of hydrogen and helium on the surface of a neutron star.

    For any high-performance computing code, the best performance is both effective and efficient, using little power while producing high-quality results. However, performance bottlenecks can arise within these codes, hindering projects and requiring researchers to search for the underlying problem.

    A team at the Oak Ridge Leadership Computing Facility (OLCF), a US Department of Energy (DOE) Office of Science User Facility located at DOE’s Oak Ridge National Laboratory, recently addressed a performance bottleneck in one portion of an OLCF user’s application. Because of its efforts, the user’s team saw a sixfold performance improvement in the code. Team members for this project include Frank Winkler (OLCF), Oscar Hernandez (OLCF), Adam Jacobs (Stony Brook University), Jeff Larkin (NVIDIA), and Robert Dietrich (Dresden University of Technology).

    “If the code runs faster, then you need less power. Everything is better, more efficient,” said Winkler, performance tools specialist at the OLCF. “That’s why we have performance analysis tools.”

    Known as MAESTRO, the astrophysics code in question models the burning of exploding stars and other stellar phenomena. Its GPU acceleration comes from OpenACC, a directive-based approach meant to simplify programming for hybrid CPU/GPU systems. The OLCF team worked specifically with the piece of the algorithm that models the physics of nuclear burning.

    Initially, that portion of MAESTRO did not perform as well as expected because the GPUs could not quickly access the data. To remedy the situation, the team used diagnostic analysis tools to discover the reason for the delay. Winkler explained that Score-P, a performance measurement tool, traces the application, whereas VAMPIR, a performance visualization tool, renders the trace file as a timeline of activity within the code.

    “When you trace the code, you record each significant event in sequence,” Winkler said.
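
    As a hedged illustration of what tracing looks like from the application side, the sketch below marks a hypothetical routine with Score-P’s user-region macros so it appears as a named region in the trace that VAMPIR later displays on a timeline. The routine name is invented, and the build command in the comment is only indicative; the exact wrapper invocation depends on the local Score-P installation.

```c
/* Hedged sketch: marking a code region so it shows up in a Score-P
 * trace that VAMPIR can display on a timeline.
 * Typical build (installation-dependent): scorep --user cc trace_demo.c */
#include <scorep/SCOREP_User.h>

static void nuclear_burning_step(void)   /* hypothetical hot routine */
{
    SCOREP_USER_REGION_DEFINE(burn_region)
    SCOREP_USER_REGION_BEGIN(burn_region, "nuclear_burning_step",
                             SCOREP_USER_REGION_TYPE_COMMON)

    /* ... the work to be timed would go here ... */

    SCOREP_USER_REGION_END(burn_region)
}

int main(void)
{
    for (int i = 0; i < 10; i++)
        nuclear_burning_step();
    return 0;
}
```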

    By analyzing the results, the team found that although data moving from CPUs to GPUs performed adequately, the code was significantly slower when sending data from GPUs to CPUs. Larkin, an NVIDIA software engineer, suggested a compiler flag—an option that changes how the compiler builds the code—to place the data in a location more convenient for the GPUs, which resulted in the code’s dramatic speedup.
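
    The article does not name the specific flag Larkin suggested, so rather than guess at it, the sketch below illustrates the underlying idea with standard OpenACC data directives in C: keep the arrays resident on the GPU across kernel launches so data is not shuttled back to the CPU after every step. The array names, sizes, and the placeholder “burning” update are assumptions for illustration (MAESTRO itself is written in Fortran, but the directives are analogous).

```c
/* Hedged sketch: an OpenACC structured data region that keeps arrays
 * on the GPU for the duration of the loop, so results come back to
 * the host once instead of after every kernel launch. */
#include <stdio.h>

#define N 100000

int main(void)
{
    static double state[N], rates[N];

    for (int i = 0; i < N; i++) { state[i] = 1.0; rates[i] = 0.0; }

    /* copy(state): send the initial state once and retrieve it once at
     * the end; create(rates): scratch array that lives only on the GPU.
     * Without the data region, each parallel loop below could trigger
     * its own host<->device transfers. */
    #pragma acc data copy(state) create(rates)
    {
        for (int step = 0; step < 100; step++) {
            #pragma acc parallel loop present(state, rates)
            for (int i = 0; i < N; i++)
                rates[i] = 0.1 * state[i];   /* placeholder "burning" rate */

            #pragma acc parallel loop present(state, rates)
            for (int i = 0; i < N; i++)
                state[i] += rates[i];        /* update the state on the GPU */
        }
    }

    printf("state[0] after integration: %f\n", state[0]);
    return 0;
}
```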

    Jacobs, an astrophysicist working on a PhD at Stony Brook, brought the OpenACC code to the OLCF in June to get expert assistance. Jacobs is a member of a research group led by Michael Zingale, also of Stony Brook.

    During the week Jacobs spent at the OLCF, the team ran MAESTRO on the Titan supercomputer, the OLCF’s flagship hybrid system.

    ORNL Cray Titan Supercomputer

    By leveraging tools like Score-P and VAMPIR on this system, the team employed problem-solving skills and computational analysis to resolve the bottleneck—and did so after just a week of working with the code. Both Winkler and Jacobs stressed that their rapid success depended on collaboration; the individuals involved, as well as the OLCF, provided the necessary knowledge and resources to reach a mutually beneficial outcome.

    “We are working with technology in a way that was not possible a year ago,” Jacobs said. “I am so grateful that the OLCF hosted me and gave me their time and experience.”

    Because of these improvements, the MAESTRO code can run the latest nuclear burning models faster and perform higher-level physics than before—capabilities that are vital to computational astrophysicists’ investigation of astronomical events like supernovas and x-ray bursts.

    “There are two main benefits to this performance improvement,” Jacobs said. “First, your code is now getting to a solution faster, and second, you can now spend a similar amount of time working on something much more complicated.”

    See the full article here .

    Please help promote STEM in your local schools.


    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.
