Tagged: Supercomputing

  • richardmitnick 12:13 pm on May 19, 2019 Permalink | Reply
    Tags: "CosmoGAN Neural Network to Study Dark Matter", New deep learning network, Supercomputing

    From insideHPC: “CosmoGAN Neural Network to Study Dark Matter” 

    From insideHPC

    May 18, 2019
    Rich Brueckner

    As cosmologists and astrophysicists delve deeper into the darkest recesses of the universe, their need for increasingly powerful observational and computational tools has expanded exponentially. From facilities such as the Dark Energy Spectroscopic Instrument to supercomputers like Lawrence Berkeley National Laboratory’s Cori system at NERSC, they are on a quest to collect, simulate, and analyze increasing amounts of data that can help explain the nature of things we can’t see, as well as those we can.

    Why opt for GANs instead of other types of generative models? Performance and precision, according to Mustafa.

    “From a deep learning perspective, there are other ways to learn how to generate convergence maps from images, but when we started this project GANs seemed to produce very high-resolution images compared to competing methods, while still being computationally and neural network size efficient,” he said.

    “We were looking for two things: to be accurate and to be fast,” added co-author Zarija Lukić, a research scientist in the Computational Cosmology Center at Berkeley Lab. “GANs offer hope of being nearly as accurate compared to full physics simulations.”

    The research team is particularly interested in constructing a surrogate model that would reduce the computational cost of running these simulations. In the Computational Astrophysics and Cosmology paper, they outline a number of advantages of GANs in the study of large physics simulations.

    “GANs are known to be very unstable during training, especially when you reach the very end of the training and the images start to look nice – that’s when the updates to the network can be really chaotic,” Mustafa said. “But because we have the summary statistics that we use in cosmology, we were able to evaluate the GANs at every step of the training, which helped us determine the generator we thought was the best. This procedure is not usually used in training GANs.”
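
    To make that evaluation procedure concrete, here is a minimal sketch (in Python with NumPy, not the CosmoGAN code itself) of how a summary statistic such as a radially averaged power spectrum could be used to score generator checkpoints against fully simulated validation maps. The function names and the simple relative-error score are illustrative assumptions.

    import numpy as np

    def power_spectrum(map2d):
        """Radially averaged power spectrum of a square 2D convergence map."""
        n = map2d.shape[0]
        fk = np.fft.fftshift(np.fft.fft2(map2d))
        power = np.abs(fk) ** 2
        ky, kx = np.indices((n, n)) - n // 2
        k = np.hypot(kx, ky).astype(int).ravel()
        # Average the power in annuli of integer radius k
        sums = np.bincount(k, weights=power.ravel(), minlength=n)
        counts = np.maximum(np.bincount(k, minlength=n), 1)
        return (sums / counts)[: n // 2]

    def score_checkpoint(generated_maps, validation_maps):
        """Mean relative difference between average spectra; lower is better."""
        gen = np.mean([power_spectrum(m) for m in generated_maps], axis=0)
        val = np.mean([power_spectrum(m) for m in validation_maps], axis=0)
        return np.mean(np.abs(gen - val) / (np.abs(val) + 1e-12))

    # During training, one would call score_checkpoint() every few epochs and
    # keep the generator whose maps best reproduce the validation statistics.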

    Using the CosmoGAN generator network, the team has been able to produce convergence maps that are described, with high statistical confidence, by the same summary statistics as the fully simulated maps. Producing maps that are statistically indistinguishable from those generated by physics-based generative models is an important step toward building emulators out of deep neural networks.

    Weak lensing convergence maps for the ΛCDM cosmological model. Randomly selected maps from validation dataset (top) and GAN-generated examples (bottom).

    Weak gravitational lensing NASA/ESA Hubble

    Lambda-Cold Dark Matter, Accelerated Expansion of the Universe, Big Bang-Inflation (timeline of the universe). Date: 2010. Credit: Alex Mittelmann, Coldcreation


    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    LBNL/DESI spectroscopic instrument on the Mayall 4-meter telescope at Kitt Peak National Observatory starting in 2018

    NOAO/Mayall 4 m telescope at Kitt Peak, Arizona, USA, Altitude 2,120 m (6,960 ft)

    Toward this end, gravitational lensing is one of the most promising tools scientists have to extract this information by giving them the ability to probe both the geometry of the universe and the growth of cosmic structure.

    Gravitational Lensing NASA/ESA

    Gravitational lensing distorts images of distant galaxies in a way that is determined by the amount of matter in the line of sight in a certain direction, and it provides a way of looking at a two-dimensional map of dark matter, according to Deborah Bard, Group Lead for the Data Science Engagement Group at NERSC.

    “Gravitational lensing is one of the best ways we have to study dark matter, which is important because it tells us a lot about the structure of the universe,” she said. “The majority of matter in the universe is dark matter, which we can’t see directly, so we have to use indirect methods to study how it is distributed.”

    But as experimental and theoretical datasets grow, along with the simulations needed to image and analyze this data, a new challenge has emerged: these simulations are increasingly – even prohibitively – computationally expensive. So computational cosmologists often resort to computationally cheaper surrogate models, which emulate expensive simulations. More recently, however, “advances in deep generative models based on neural networks opened the possibility of constructing more robust and less hand-engineered surrogate models for many types of simulators, including those in cosmology,” said Mustafa Mustafa, a machine learning engineer at NERSC and lead author on a new study that describes one such approach developed by a collaboration involving Berkeley Lab, Google Research, and the University of KwaZulu-Natal.

    A variety of deep generative models are being investigated for science applications, but the Berkeley Lab-led team is taking a unique tack: generative adversarial networks (GANs). In a paper published May 6, 2019 in Computational Astrophysics and Cosmology, they discuss their new deep learning network, dubbed CosmoGAN, and its ability to create high-fidelity, weak gravitational lensing convergence maps.

    “A convergence map is effectively a 2D map of the gravitational lensing that we see in the sky along the line of sight,” said Bard, a co-author on the Computational Astrophysics and Cosmology paper. “If you have a peak in a convergence map that corresponds to a peak in a large amount of matter along the line of sight, that means there is a huge amount of dark matter in that direction.”
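
    For readers who want the quantity behind that description: in a standard textbook form (not taken from the paper), and assuming a flat universe with a single source plane at comoving distance \chi_s, the convergence is a weighted line-of-sight integral of the matter overdensity \delta,

    \kappa(\boldsymbol{\theta}) \;=\; \frac{3 H_0^2 \Omega_m}{2 c^2} \int_0^{\chi_s} \frac{\chi\,(\chi_s - \chi)}{\chi_s}\, \frac{\delta\!\left(\chi\boldsymbol{\theta},\, \chi\right)}{a(\chi)}\, d\chi ,

    so peaks in \kappa(\boldsymbol{\theta}) trace directions with large amounts of intervening matter, exactly as the quote describes.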

    The Advantages of GANs

    “The huge advantage here was that the problem we were tackling was a physics problem that had associated metrics,” Bard said. “But with our approach, there are actual metrics that allow you to quantify how accurate your GAN is. To me that is what is really exciting about this – how these kinds of physics problems can influence machine learning methods.”

    Ultimately such approaches could transform science that currently relies on detailed physics simulations that require billions of compute hours and occupy petabytes of disk space – but there is considerable work still to be done. Cosmology data (and scientific data in general) can require very high-resolution measurements, such as full-sky telescope images.

    “The 2D images considered for this project are valuable, but the actual physics simulations are 3D and can be time-varying and irregular, producing a rich, web-like structure of features,” said Wahid Bhimji, a big data architect in the Data and Analytics Services group at NERSC and a co-author on the Computational Astrophysics and Cosmology paper. “In addition, the approach needs to be extended to explore new virtual universes rather than ones that have already been simulated – ultimately building a controllable CosmoGAN.”

    “The idea of doing controllable GANs is essentially the Holy Grail of the whole problem that we are working on: to be able to truly emulate the physical simulators we need to build surrogate models based on controllable GANs,” Mustafa added. “Right now we are trying to understand how to stabilize the training dynamics, given all the advances in the field that have happened in the last couple of years. Stabilizing the training is extremely important to actually be able to do what we want to do next.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 11:21 am on May 11, 2019 Permalink | Reply
    Tags: "NVIDIA Builds First AI Platform for NHS Hospitals in the U.K.", NVIDIA DGX-2 systems, Supercomputing

    From insideHPC: “NVIDIA Builds First AI Platform for NHS Hospitals in the U.K.” 

    From insideHPC

    May 10, 2019
    Rich Brueckner

    NVIDIA DGX-2

    Today NVIDIA and King’s College London announced they are partnering to build an AI platform designed to allow specialists in the U.K.’s National Health Service (NHS) to train computers to automate the most time-consuming part of radiology interpretation.

    “This centre marks a significant chapter in the future of AI-enabled NHS hospitals, and the infrastructure is an essential part of building new AI tools which will benefit patients and the healthcare system as a whole,” said Professor Ourselin. “The NVIDIA DGX-2 AI system’s large memory and massive computing power make it possible for us to tackle training of large, 3D datasets in minutes instead of days while keeping the data secure on the premises of the hospital.”

    The collaboration is part of King’s London Medical Imaging & AI Centre for Value-Based Healthcare, an ongoing project intended to transform 12 clinical pathways in oncology, cardiology and neurology, as well as improve diagnoses and patient care in the NHS. The work could lead to breakthroughs in classifying stroke and neurological impairments, determining the underlying causes of cancers and recommending the best treatments for patients.

    NVIDIA DGX-2 AI Systems Power First Point-of-Care Platform

    King’s is implementing NVIDIA DGX-2 systems, which are 2-petaflops GPU-powered supercomputers for AI research, as part of the first phase of the project. It will also use the NVIDIA Clara AI toolkit with its own imaging technologies, for example NiftyNet, as well as those from partners such as Kheiron Medical, Mirada and Scan.

    The NVIDIA Clara AI toolkit is a key part of the NVIDIA Clara developer platform, on which intelligent workflows can be built. NVIDIA Clara consists of libraries for data and image processing, AI model processing, and visualization.

    Researchers and engineers from NVIDIA and King’s will also join clinicians from major London hospitals onsite at King’s College Hospital, Guy’s and St Thomas’, and South London and Maudsley. This combination of research, technology and clinicians will accelerate the discovery of data strategies, resolve targeted AI problems and speed up deployment in clinics.

    Federated Learning Supports Data Privacy

    For the first time in the NHS, federated learning will be applied to algorithm development, ensuring the privacy of patient data. Federated learning allows AI algorithms to be developed at multiple sites, using data from each individual hospital, without the need for data to travel outside of its own domain.

    This approach is crucial for the development of AI in clinical environments, where the security and governance of data is of the highest importance. AI models will be developed in different NHS trusts across the U.K., built on data from different patient demographics and clinical attributes.

    With models developed at individual NHS trusts, the data will give more accurate and representative insight into patients from that particular area. The NHS will also be able to combine these trust-specific models to build a larger, demographically richer overall model.
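
    The article does not spell out the aggregation scheme, but the most common pattern, federated averaging (FedAvg), can be sketched in a few lines of Python with NumPy. The linear model, the plain SGD loop, and the three hypothetical sites below are illustrative assumptions, not NVIDIA Clara's actual implementation; the point is that only weight arrays ever leave a site, never patient records.

    # Illustrative federated-averaging (FedAvg) sketch. Each site trains on its
    # own data; only weights leave the site, and the server combines them
    # weighted by local dataset size.
    import numpy as np

    def local_update(weights, local_data, local_labels, lr=0.01, epochs=1):
        """One round of plain SGD on a linear model, run entirely on-site."""
        w = weights.copy()
        for _ in range(epochs):
            preds = local_data @ w
            grad = local_data.T @ (preds - local_labels) / len(local_labels)
            w -= lr * grad
        return w

    def federated_average(site_weights, site_sizes):
        """Server-side step: dataset-size-weighted average of site models."""
        total = sum(site_sizes)
        return sum(w * (n / total) for w, n in zip(site_weights, site_sizes))

    # One communication round across three hypothetical sites:
    rng = np.random.default_rng(0)
    global_w = np.zeros(5)
    sites = [(rng.normal(size=(100, 5)), rng.normal(size=100)) for _ in range(3)]
    updates = [local_update(global_w, X, y) for X, y in sites]
    global_w = federated_average(updates, [len(y) for _, y in sites])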

    By bringing together a critical mass of industry and university partners, the London Medical Imaging & AI Centre for Value-Based Healthcare will allow the NHS to share and analyze data on a scale that has not previously been possible, according to Professor Sebastien Ourselin, head of the School of Biomedical Engineering & Imaging Sciences at King’s College London.

    “Together with King’s College London, we’re working to push the envelope in AI for healthcare,” said Jaap Zuiderveld, vice president for EMEA at NVIDIA. “DGX-2 systems with the NVIDIA Clara platform will enable the project to scale and drive breakthroughs in radiology that ultimately help improve patient outcomes within the NHS.”

    The collaboration between NVIDIA and King’s College London is part of the UKRI program for Radiology and Pathology, an innovation fund that has supported the growing community looking to integrate AI workflows into the NHS.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 10:03 am on May 9, 2019 Permalink | Reply
    Tags: A big universe needs big computing-Sijacki accessed HPC resources through XSEDE in the US and PRACE in Europe, Debora Sijacki, She now uses the UK’s National Computing Service DiRAC in combination with PRACE, Sijacki wants to understand the role supermassive black holes (SMBH) play in galaxy formation, Supercomputing

    From Science Node: Women in STEM- “Shining a light on cosmic darkness” Debora Sijacki 

    Science Node bloc
    From Science Node

    08 May, 2019
    Alisa Alering

    Debora Sijacki. Courtesy David Orr.

    Award-winning astrophysicist Debora Sijacki wants to understand how galaxies form.

    Carl Sagan once described the Earth as a “pale blue dot, a lonely speck in the great enveloping cosmic dark.”

    The need to shine a light into that cosmic darkness has long inspired astronomers to investigate the wonders that lie beyond our lonely planet. For Debora Sijacki, a reader in astrophysics and cosmology at the University of Cambridge, that curiosity takes the form of simulating galaxies in order to understand their origins.

    A supermassive black hole at the center of a young, star-rich galaxy. SMBHs distort space and light around them, as illustrated by the warped stars behind the black hole. Courtesy NASA/JPL-Caltech.

    “We human beings are a part of our Universe and we ultimately want to understand where we came from,” says Sijacki. “We want to know what is this bigger picture that we are taking part in.”

    Sijacki is the winner of the 2019 PRACE Ada Lovelace Award for HPC for outstanding contributions to and impact on high-performance computing (HPC). Initiated in 2016, the award recognizes female scientists working in Europe who have an outstanding impact on HPC research and who provide a role model for other women.

    Specifically, Sijacki wants to understand the role supermassive black holes (SMBH) play in galaxy formation. These astronomical objects are so immense that they contain mass on the order of hundreds of thousands to even billions of times the mass of the Sun. At the same time they are so compact that, if the Earth were a black hole, it would fit inside a penny.
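
    That comparison follows from the Schwarzschild radius, r_s = 2GM/c^2. A quick back-of-envelope check in Python (not part of the article) shows that an Earth-mass black hole would have a radius of roughly 9 millimeters, comfortably smaller than a penny:

    # Schwarzschild radius of an Earth-mass black hole, r_s = 2GM/c^2
    G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
    c = 2.998e8          # speed of light, m/s
    M_earth = 5.972e24   # mass of the Earth, kg

    r_s = 2 * G * M_earth / c ** 2
    print(f"Schwarzschild radius: {r_s * 1e3:.1f} mm")  # about 8.9 mm
    # A US penny is roughly 19 mm across, so the comparison holds.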

    The first image of a black hole, in Messier 87. Credit: Event Horizon Telescope Collaboration, via NSF and ERC, 4.10.19

    SMBHs are at the center of many massive galaxies—there’s even one at the center of our own galaxy, The Milky Way. Astronomers theorize that these SMBHs are important not just in their own right but because they affect the properties of the galaxies themselves.

    Sgr A* from ESO VLT


    Sgr A*, the supermassive black hole at the center of the Milky Way. NASA’s Chandra X-Ray Observatory

    “What we think happens is that when gas accretes very efficiently and draws close to the SMBH it eventually falls into the SMBH,” says Sijacki. “The SMBH then grows in mass, but at the same time this accretion process is related to an enormous release of energy that can actually change the properties of galaxies themselves.”

    A big universe needs big computing

    To investigate the interplay of these astronomical phenomena, Sijacki and her team create simulations where they can zoom into details of SMBHs while at the same time viewing a large patch of the Universe. This allows them to focus on the physics of how black holes influence galaxies and even larger environments.

    Dark matter density (l) transitioning to gas density (r). Large-scale projection through the Illustris volume at z=0, centered on the most massive galaxy cluster of the Illustris cosmological simulation. Courtesy Illustris Simulation.

    But in order to study something as big as the Universe, you need a big computer. Or several. As a Hubble Fellow at Harvard University, Sijacki accessed HPC resources through XSEDE in the US and PRACE in Europe. She now uses the UK’s National Computing Service DiRAC in combination with PRACE.


    DiRAC is the UK’s integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology.

    PRACE supercomputing resources

    Hazel Hen, GCS@HLRS, Cray XC40 supercomputer Germany

    JOLIOT CURIE of GENCI Atos BULL Sequana system X1000 supercomputer France

    JUWELS, GCS@FZJ, Atos supercomputer Germany

    MARCONI, CINECA, Lenovo NeXtScale supercomputer Italy

    MareNostrum Lenovo supercomputer of the National Supercomputing Center in Barcelona

    Cray Piz Daint Cray XC50/XC40 supercomputer of the Swiss National Supercomputing Center (CSCS)

    SuperMUC-NG, GCS@LRZ, Lenovo supercomputer Germany

    According to Sijacki, in the 70s, 80s, and 90s, astrophysicists laid the foundations of galaxy formation and developed some of the key ideas that still guide our understanding. But it was soon recognized that these theories needed to be refined—or even refuted.

    “There is only so much we can do with the pen-and-paper approach,” says Sijacki. “The equations we are working on are very complex and we have to solve them numerically. And it’s not just a single physical process, but many different mechanisms that we want to explain. Often when you put different bits of complex physics together, you can’t easily predict the outcome.”

    The other motivation for high-performance computing is the need for higher resolution models. This is because the physics in the real Universe occurs on a vast range of scales.

    “We’re talking about billions and trillions of resolution elements,” says Sijacki. “It requires massive parallel calculations on thousands of cores to evolve this really complex system with many resolution elements.”

    In recent years, high-performance computing resources have become more powerful and more widely available. New architectures and novel algorithms promise even greater efficiency and optimized parallelization.

    Jet feedback from active galactic nuclei. (A) Large-scale image of the gas density centered on a massive galaxy cluster. (B) High-velocity jet launched by the central supermassive black hole. (C) Cold disk-like structure around the SMBH from which the black hole is accreting. (D) 2D Voronoi mesh reconstruction and (E) velocity streamline map of a section of the jet, illustrating the massive increase in spatial resolution achieved by this simulation. Courtesy Bourne, Sijacki, and Puchwein.

    Given these advances, Sijacki projects a near-future where astrophysicists can, for the first time, perform simulations that can consistently track individual stars in a given galaxy and follow that galaxy within a cosmological framework.

    “Full predictive models of the evolution of our Universe is our ultimate goal,” says Sijacki. “We would like to have a theory that is completely predictive, free of ill-constrained parameters, where we can theoretically understand how the Universe was built and how the structures in the Universe came about. This is our guiding star.”

    Awards matter

    When asked about the significance of the award, Sijacki says that she is proud to have her research recognized—and to be associated with the name of Ada Lovelace.

    Perhaps more importantly, the award has already had an immediate effect on the female PhD students and post-docs at Cambridge’s Institute of Astronomy. Sijacki says the recognition motivates the younger generations of female scientists, by showing them that this is a possible career path that leads to success and recognition.

    “I have seen how my winning this award makes them more enthusiastic—and more ambitious,” says Sijacki. “I was really happy to see that.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

     
  • richardmitnick 10:27 am on May 7, 2019 Permalink | Reply
    Tags: Supercomputing, TRC - Translational Research Capability

    From Oak Ridge National Laboratory: “New research facility will serve ORNL’s growing mission in computing, materials R&D” 


    From Oak Ridge National Laboratory

    May 7, 2019
    Bill H Cabage
    cabagewh@ornl.gov
    865-574-4399

    Pictured in this early conceptual drawing, the Translational Research Capability planned for Oak Ridge National Laboratory will follow the design of research facilities constructed during the laboratory’s modernization campaign.

    Energy Secretary Rick Perry, Congressman Chuck Fleischmann and lab officials today broke ground on a multipurpose research facility that will provide state-of-the-art laboratory space for expanding scientific activities at the Department of Energy’s Oak Ridge National Laboratory.

    The new Translational Research Capability, or TRC, will be purpose-built for world-leading research in computing and materials science and will serve to advance the science and engineering of quantum information.

    “Through today’s groundbreaking, we’re writing a new chapter in research at the Translational Research Capability Facility,” said U.S. Secretary of Energy Rick Perry. “This building will be the home for advances in Quantum Information Science, battery and energy storage, materials science, and many more. It will also be a place for our scientists, researchers, engineers, and innovators to take on big challenges and deliver transformative solutions.”

    With an estimated total project cost of $95 million, the TRC, located in the central ORNL campus, will accommodate sensitive equipment, multipurpose labs, heavy equipment and inert environment labs. Approximately 75 percent of the facility will contain large, modularly planned and open laboratory areas with the rest as office and support spaces.

    “This research and development space will advance and support the multidisciplinary mission needs of the nation’s advanced computing, materials research, fusion science and physics programs,” ORNL Director Thomas Zacharia said. “The new building represents a renaissance in the way we carry out research, allowing more flexible alignment of our research activities to the needs of frontier research.”

    The flexible space will support the lab’s growing fundamental materials research to advance future quantum information science and computing systems. The modern facility will provide atomic fabrication and materials characterization capabilities to accelerate the development of novel quantum computing devices. Researchers will also use the facility to pursue advances in quantum modeling and simulation, leveraging a co-design approach to develop algorithms along with prototype quantum systems.

    The new laboratories will provide noise isolation, electromagnetic shielding and low vibration environments required for multidisciplinary research in quantum information science as well as materials development and performance testing for fusion energy applications. The co-location of the flexible, modular spaces will enhance collaboration among projects.

    At approximately 100,000 square feet, the TRC will be similar in size and appearance to another modern ORNL research facility, the Chemical and Materials Sciences Building, which was completed in 2011 and is located nearby.

    The facility’s design and location will also conform to sustainable building practices with an eye toward encouraging collaboration among researchers. The TRC will be centrally located in the ORNL main campus area on a brownfield tract that was formerly occupied by one of the laboratory’s earliest, Manhattan Project-era structures.

    ORNL began a modernization campaign shortly after UT-Battelle arrived in 2000 to manage the national laboratory. The new construction has enabled the laboratory to meet growing space and infrastructure requirements for rapidly advancing fields such as scientific computing while vacating legacy spaces with inherent high operating costs, inflexible infrastructure and legacy waste issues.

    The construction is supported by the Science Laboratory Infrastructure program of the DOE Office of Science.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.


     
  • richardmitnick 6:37 pm on May 6, 2019 Permalink | Reply
    Tags: Jülich Supercomputing Centre Germany, MEGWARE Cluster Module, Supercomputing

    From insideHPC: “Jülich installs first Cluster Module for DEEP-EST Exascale Project” 

    From insideHPC

    Researchers at the Jülich Supercomputing Centre have installed the first-ever Cluster Module from MEGWARE as part of the European DEEP-EST project for Exascale computing. The Cluster Module consists of a single rack with 50 Intel Xeon Scalable Processor-based dual-socket nodes connected by a Mellanox EDR-InfiniBand 100Gbps high-performance cluster fabric.

    MEGWARE has installed the first module, the Cluster Module, at the Jülich Supercomputing Centre. The two remaining compute modules (ESB and DAM) will follow by the end of this year.

    “The DEEP-EST Modular Supercomputer Architecture (MSA) is an innovative approach to build High-Performance Computing and High-Performance Data Analytics (HPDA) systems by coupling various compute modules, following a building-block principle. Each module is tailored to the needs of a specific group of applications, and all modules together behave as a single machine. This is ensured by connecting them through a high-speed network federation and operating them with a uniform system software and programming environment. This allows one application or workflow to be distributed over several modules, running each part of its code onto the best suited hardware module.”

    Creating a modular supercomputer that best fits the requirements of the diverse, increasingly complex, and newly emerging applications is the objective of DEEP-EST, an EU project launched on July 1, 2017, led and coordinated by the Jülich Supercomputing Centre (JSC). The DEEP-EST project builds a prototype with three compute modules: the Cluster Module (CM), the Extreme Scale Booster (ESB), and the Data Analytics Module (DAM). The CM is a general-purpose cluster and targets low/medium scalable applications, while the ESB is built as a cluster of accelerators to provide energy-efficient computing power to highly scalable codes. Last, but not least, the DAM addresses the specific needs of Machine/Deep Learning, Artificial Intelligence and Big Data applications and workloads.

    The CM is designed to support the full range of general-purpose HPC cluster applications and workloads. For efficiency, its integration is done using MEGWARE’s ColdCon direct liquid (hot-water) cooling and SlideSX-LC packaging technologies.

    “Based on several years of intensive experience in energy efficient computing, MEGWARE’s award winning direct liquid cooling solution represents a leading European HPC technology that is scalable and sustainable,” said Dr. Herbert Cornelius, Principal System Architect at MEGWARE. “Energy efficiency is one of several critical design points for future Exascale supercomputer solutions.”


    The Cluster Module is the first step in the installation of the DEEP-EST prototype, and an important milestone in JSC’s strategy around the Modular Supercomputing Architecture.

    “We see today how our users increasingly combine different simulation models to reproduce complex phenomena. They also employ both HPC and Data Analytics approaches. This diversifies our user-portfolio enormously, making it hardly possible to fulfill all needs with one supercomputer,” said Prof. Thomas Lippert, director of the Jülich Supercomputing Centre. “The DEEP-EST prototype will demonstrate that a Modular Supercomputer is much more flexible than a monolithic one, and matches very diverse application profiles in a cost-effective way.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 11:43 am on May 1, 2019 Permalink | Reply
    Tags: "The ‘Little’ Computer Cluster That Could" - The Parallel Distributed Systems Facility (PDSF) cluster, Supercomputing

    From Lawrence Berkeley National Lab: “The ‘Little’ Computer Cluster That Could” 

    Berkeley Logo

    From Lawrence Berkeley National Lab

    May 1, 2019
    Glenn Roberts Jr.
    geroberts@lbl.gov
    (510) 486-5582

    Decades before “big data” and “the cloud” were a part of our everyday lives and conversations, a custom computer cluster based at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) enabled physicists around the world to remotely and simultaneously analyze and visualize data.

    The PDSF computer cluster in 2003. (Credit: Berkeley Lab)

    The Parallel Distributed Systems Facility (PDSF) cluster, which had served as a steady workhorse in supporting groundbreaking and even Nobel-winning research around the world since the 1990s, switched off last month.

    NERSC PDSF

    During its lifetime the cluster and its dedicated support team racked up many computing achievements and innovations in support of large collaborative efforts in nuclear physics and high-energy physics. Some of these innovations have persevered and evolved in other systems.

    The cluster handled data for experiments that produce a primordial “soup” of subatomic particles to teach us about the makings of matter, search for intergalactic particle signals deep within Antarctic ice, and hunt for dark matter in a mile-deep tank of liquid xenon at a former mine site. It also handled data for a space observatory mapping the universe’s earliest light, and for Earth-based observations of supernovas.

    It supported research leading to the discoveries of the morphing abilities of ghostly particles called neutrinos, the existence of the Higgs boson and the related Higgs field that generates mass through particle interactions, and the accelerating expansion rate of the universe that is attributed to a mysterious force called dark energy.

    CERN CMS Higgs Event


    CERN ATLAS Higgs Event

    Lambda-Cold Dark Matter, Accelerated Expansion of the Universe, Big Bang-Inflation (timeline of the universe). Date: 2010. Credit: Alex Mittelmann, Coldcreation

    Dark Energy Camera Enables Astronomers a Glimpse at the Cosmic Dawn. CREDIT National Astronomical Observatory of Japan

    Some of PDSF’s collaboration users have transitioned to the Cori supercomputer at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC), with other participants moving to other systems. The transition to Cori gives users access to more computing power in an era of increasingly hefty and complex datasets and demands.

    NERSC

    NERSC Cray Cori II supercomputer at NERSC at LBNL, named after Gerty Cori, the first American woman to win a Nobel Prize in science

    NERSC Hopper Cray XE6 supercomputer


    LBL NERSC Cray XC30 Edison supercomputer


    The Genepool system is a cluster dedicated to the DOE Joint Genome Institute’s computing needs. Denovo is a smaller test system for Genepool that is primarily used by NERSC staff to test new system configurations and software.

    NERSC PDSF


    PDSF is a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations.

    Future:

    Cray Shasta Perlmutter SC18 AMD Epyc Nvidia pre-exascale supercomputer

    “A lot of great physics and science was done at PDSF,” said Richard Shane Canon, a project engineer at NERSC who served as a system lead for PDSF from 2003-05. “We learned a lot of cool things from it, and some of those things even became part of how we run our supercomputers today. It was also a unique partnership between experiments and a supercomputing facility – it was the first of its kind.”

    PDSF was small when compared to its supercomputer counterparts that handle a heavier load of computer processors, data, and users, but it had developed a reputation for being responsive and adaptable, and its support crew over the years often included physicists who understood the science as well as the hardware and software capabilities and limitations.

    “It was ‘The Little Engine That Could,’” said Iwona Sakrejda, a nuclear physicist who supported PDSF and its users for over a decade in a variety of roles at NERSC and retired from Berkeley Lab in 2015. “It was the ‘boutique’ computer cluster.”

    PDSF, because it was small and flexible, offered an R&D environment that allowed researchers to test out new ideas for analyzing and visualizing data. Such an environment may have been harder to find on larger systems, she said. Its size also afforded a personal touch.

    “When things didn’t work, they had more handholding,” she added, recalling the numerous researchers that she guided through the PDSF system – including early career researchers working on their theses.

    “It was gratifying. I developed a really good relationship with the users,” Sakrejda said. “I understood what they were trying to do and how their programs worked, which was important in creating the right architecture for what they were trying to accomplish.”

    She noted that because the PDSF system was constantly refreshed, it sometimes led to an odd assortment of equipment put together from different generations of hardware, in sharp contrast to the largely homogenous architecture of today’s supercomputers.

    PDSF participants included collaborations for the Sudbury Neutrino Observatory (SNO) in Canada, the Solenoidal Tracker at Brookhaven National Laboratory’s Relativistic Heavy Ion Collider (STAR), IceCube near the South Pole, Daya Bay in China, the Cryogenic Underground Observatory for Rare Events (CUORE) in Italy, the Large Underground Xenon (LUX), LUX-ZEPLIN (LZ), and MAJORANA experiments in South Dakota, the Collider Detector at Fermilab (CDF), and the ATLAS Experiment and A Large Ion Collider Experiment (ALICE) at Europe’s CERN laboratory, among others. The most data-intensive experiments use a distributed system of clusters like PDSF.

    SNOLAB, a Canadian underground physics laboratory at a depth of 2 km in Vale’s Creighton nickel mine in Sudbury, Ontario

    BNL/RHIC Star Detector

    U Wisconsin ICECUBE neutrino detector at the South Pole

    Daya Bay, approximately 52 kilometers northeast of Hong Kong and 45 kilometers east of Shenzhen, China

    CUORE experiment, at the Italian National Institute for Nuclear Physics’ (INFN’s) Gran Sasso National Laboratories (LNGS) in Italy, a search for neutrinoless double beta decay

    LBNL LZ project at SURF, Lead, SD, USA

    U Washington Majorana Demonstrator Experiment at SURF

    FNAL/Tevatron CDF detector

    CERN ATLAS Image Claudia Marcelloni ATLAS CERN

    CERN/ALICE Detector

    This chart shows the physics collaborations that used PDSF over the years, with the heaviest usage by the STAR and ALICE collaborations. (Credit: Berkeley Lab)

    The STAR collaboration was the original participant and had by far the highest overall use of PDSF, and the ALICE collaboration had grown to become one of the largest PDSF users by 2010. Both experiments have explored the formation and properties of an exotic superhot particle soup known as the quark-gluon plasma by colliding heavy particles.

    SNO researchers’ findings about neutrinos’ mass and ability to change into different forms or flavors led to the 2015 Nobel Prize in physics. And PDSF played a notable role in the early analyses of SNO data.

    Art McDonald, who shared that Nobel as director of the SNO Collaboration, said, “The PDSF computing facility was used extensively by the SNO Collaboration, including our collaborators at Berkeley Lab.”

    He added, “This resource was extremely valuable in simulations and data analysis over many years, leading to our breakthroughs in neutrino physics and resulting in the award of the 2015 Nobel Prize and the 2016 Breakthrough Prize in Fundamental Physics to the entire SNO Collaboration. We are very grateful for the scientific opportunities provided to us through access to the PDSF facility.”

    PDSF’s fast processing of data from the Daya Bay nuclear reactor-based experiment was also integral in precise measurements of neutrino properties.

    The cluster was a trendsetter for a so-called condo model in shared computing. This model allowed collaborations to buy a share of computing power and dedicated storage space that was customized for their own needs, and a participant’s allocated computer processors on the system could also be temporarily co-opted by other cluster participants when they were not active.

    In this condo analogy, “You could go use your neighbor’s house if your neighbor wasn’t using it,” said Canon, a former experimental physicist. “If everybody else was idle you could take advantage of the free capacity.” Canon noted that many universities have adopted this kind of model for their computer users.

    Importantly, the PDSF system was also designed to provide easy access and support for individual collaboration members rather than requiring access to be funneled through one account per project or experiment. “If everybody had to log in to submit their jobs, it just wouldn’t work in these big collaborations,” Canon said.

    The original PDSF cluster, called the Physics Detector Simulation Facility, was launched in March 1991 to support analyses and simulations for a planned U.S. particle collider project known as the Superconducting Super Collider. It was set up in Texas, the planned home for the collider, though the collider project was ultimately canceled in 1993.

    Superconducting Super Collider map, in the vicinity of Waxahachie, Texas. Cancelled by the U.S. Congress in 1993 because it showed no “immediate economic benefit”

    A diagram showing the Phase 3 design of the original PDSF system. (Credit: “Superconducting Super Collider: A Retrospective Summary 1989-1993,” Superconducting Super Collider Laboratory, Dallas, Texas)

    A 1994 retrospective report on the collider project notes that the original PDSF had been built up to perform a then-impressive 7 billion instructions per second and that the science need for PDSF to simulate complex particle collisions had driven “substantial technological advances” in the nation’s computer industry.

    At the time, PDSF was “the world’s most powerful high-energy physics computing facility,” the report also noted, and was built using non-proprietary systems and equipment from different manufacturers “at a fraction of the cost” of supercomputers.

    Longtime Berkeley Lab physicist Stu Loken, who had led the Lab’s Information and Computing Sciences Division from 1988-2000, had played a pivotal role in PDSF’s development and in siting the cluster at Berkeley Lab.

    PDSF moved to Berkeley Lab’s Oakland Scientific Facility in 2000 before returning to the lab’s main site. (Credit: Berkeley Lab)

    PDSF moved to Berkeley Lab in 1996 with a new name and a new role. It was largely rebuilt with new hardware and was moved to a computer center in Oakland, Calif., in 2000 before returning once again to the Berkeley Lab site.

    “A lot of the tools that we deployed to facilitate the data processing on PDSF are now being used by data users at NERSC,” said Lisa Gerhardt, a big-data architect at NERSC who worked on the PDSF system. She previously had served as a neutrino astrophysicist for the IceCube experiment.

    Gerhardt noted that the cluster was nimble and responsive because of its focused user community. “Having a smaller and cohesive user pool made it easier to have direct relationships,” she said.

    And Jan Balewski, computing systems engineer at NERSC who worked to transition PDSF users to the new system, said the scientific background of PDSF staff through the years was beneficial for the cluster’s users.

    Balewski, a former experimental physicist, said, “Having our background, we were able to discuss with users what they really needed. And maybe, in some cases, what they were asking for was not what they really needed. We were able to help them find a solution.”

    R. Jefferson “Jeff” Porter, a computer systems engineer and physicist in Berkeley Lab’s Nuclear Science Division who began working with the PDSF cluster and users as a postdoctoral researcher at Berkeley Lab in the mid-1990s, said, “PDSF was a resource that dealt with big data – many years before big data became a big thing for the rest of the world.”

    It had always used off-the-shelf hardware and was steadily upgraded – typically twice a year. Even so, it was dwarfed by its supercomputer counterparts. About seven years ago the PDSF cluster had about 1,500 computer cores, compared to about 100,000 on a neighboring supercomputer at NERSC at the time. A core is the part of a computer processor that performs calculations.

    Porter was later hired by NERSC to support grid computing, a distributed form of computing in which computers in different locations can work together to perform larger tasks. He returned to the Nuclear Science Division to lead the ALICE USA computing project, which established PDSF as one of about 80 grid sites for CERN’s ALICE experiment. Use of PDSF by ALICE was an easy fit, since the PDSF community “was at the forefront of grid computing,” Porter said.

    In some cases, the unique demands of PDSF cluster users would also lead to the adoption of new tools at supercomputer systems. “Our community would push NERSC in ways they hadn’t been thinking,” he said. CERN developed a system to distribute software that was adopted by PDSF about five years ago, and that has also been adopted by many scientific collaborations. NERSC put in a big effort, Porter said, to integrate this system into larger machines: Cori and Edison.

    PDSF’s configuration in 2017. (Credit: Berkeley Lab)

    Supporting multiple projects on a single system was a challenge for PDSF since each project had unique software needs, so Canon led the development of a system known as Chroot OS (CHOS) to enable each project to have a custom computing environment.

    Porter explained that CHOS was an early form of “container computing” that has since enjoyed widespread adoption.

    PDSF was run by a Berkeley Lab-based steering committee that typically had a member from each participating experiment and a member from NERSC, and Porter had served for about five years as the committee chair. He had been focused for the past year on how to transition users to the Cori supercomputer and other computing resources, as needed.

    Balewski said that the leap of users from PDSF to Cori brings them access to far greater computing power, and allows them to “ask questions they could never ask on a smaller system.”

    He added, “It’s like moving from a small town – where you know everyone but resources are limited – to a big city that is more crowded but also offers more opportunities.”

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Bringing Science Solutions to the World

    In the world of science, Lawrence Berkeley National Laboratory (Berkeley Lab) is synonymous with “excellence.” Thirteen Nobel prizes are associated with Berkeley Lab. Seventy Lab scientists are members of the National Academy of Sciences (NAS), one of the highest honors for a scientist in the United States. Thirteen of our scientists have won the National Medal of Science, our nation’s highest award for lifetime achievement in fields of scientific research. Eighteen of our engineers have been elected to the National Academy of Engineering, and three of our scientists have been elected into the Institute of Medicine. In addition, Berkeley Lab has trained thousands of university science and engineering students who are advancing technological innovations across the nation and around the world.

    Berkeley Lab is a member of the national laboratory system supported by the U.S. Department of Energy through its Office of Science. It is managed by the University of California (UC) and is charged with conducting unclassified research across a wide range of scientific disciplines. Located on a 202-acre site in the hills above the UC Berkeley campus that offers spectacular views of the San Francisco Bay, Berkeley Lab employs approximately 3,232 scientists, engineers and support staff. The Lab’s total costs for FY 2014 were $785 million. A recent study estimates the Laboratory’s overall economic impact through direct, indirect and induced spending on the nine counties that make up the San Francisco Bay Area to be nearly $700 million annually. The Lab was also responsible for creating 5,600 jobs locally and 12,000 nationally. The overall economic impact on the national economy is estimated at $1.6 billion a year. Technologies developed at Berkeley Lab have generated billions of dollars in revenues, and thousands of jobs. Savings as a result of Berkeley Lab developments in lighting and windows, and other energy-efficient technologies, have also been in the billions of dollars.

    Berkeley Lab was founded in 1931 by Ernest Orlando Lawrence, a UC Berkeley physicist who won the 1939 Nobel Prize in physics for his invention of the cyclotron, a circular particle accelerator that opened the door to high-energy physics. It was Lawrence’s belief that scientific research is best done through teams of individuals with different fields of expertise, working together. His teamwork concept is a Berkeley Lab legacy that continues today.

    A U.S. Department of Energy National Laboratory Operated by the University of California.

    University of California Seal

    DOE Seal

     
  • richardmitnick 11:12 am on April 23, 2019 Permalink | Reply
    Tags: DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics and astrophysics cosmology and nuclear physics all areas in which the UK is world-leading, Supercomputing

    From insideHPC: “40 Powers of 10 – Simulating the Universe with the DiRAC HPC Facility” 

    From insideHPC

    DiRAC is the UK’s integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology.


    49 minutes, worth your time
    In this video from the Swiss HPC Conference, Mark Wilkinson presents: 40 Powers of 10 – Simulating the Universe with the DiRAC HPC Facility.

    Dr. Mark Wilkinson is the Project Director at DiRAC.

    “DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics, and astrophysics, cosmology, and nuclear physics, all areas in which the UK is world-leading. DiRAC provides a variety of compute resources, matching machine architecture to the algorithm design and requirements of the research problems to be solved. As a single federated Facility, DiRAC allows more effective and efficient use of computing resources, supporting the delivery of the science programs across the STFC research communities. It provides a common training and consultation framework and, crucially, provides critical mass and a coordinating structure for both small- and large-scale cross-discipline science projects, the technical support needed to run and develop a distributed HPC service, and a pool of expertise to support knowledge transfer and industrial partnership projects. The on-going development and sharing of best-practice for the delivery of productive, national HPC services with DiRAC enables STFC researchers to produce world-leading science across the entire STFC science theory program.”


    Dr. Mark Wilkinson is the Project Director at DiRAC. He obtained his BA and MSc in Theoretical Physics at Trinity College Dublin and a DPhil in Theoretical Astronomy at the University of Oxford. Between 2000 and 2006, he was a post-doc at the Institute of Astronomy, Cambridge. He subsequently moved to the University of Leicester to take up a Royal Society University Research Fellowship. He is currently a Reader in the Theoretical Astrophysics Group of the Dept. of Physics & Astronomy.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 10:48 am on April 22, 2019 Permalink | Reply
    Tags: "Optimizing Network Software to Advance Scientific Discovery", CSI-The computer is installed at Brookhaven's Scientific Data and Computing Center, DiRAC-Distributed Research Using Advanced Computing, Intel's high-speed communication network to accelerate application codes for particle physics and machine learning, Supercomputing

    From Brookhaven National Lab: “Optimizing Network Software to Advance Scientific Discovery” 

    From Brookhaven National Lab

    April 16, 2019
    Ariana Tantillo
    atantillo@bnl.gov

    A team of computer scientists, physicists, and software engineers optimized software for Intel’s high-speed communication network to accelerate application codes for particle physics and machine learning.

    Brookhaven Lab collaborated with Columbia University, University of Edinburgh, and Intel to optimize the performance of a 144-node parallel computer built from Intel’s Xeon Phi processors and Omni-Path high-speed communication network. The computer is installed at Brookhaven’s Scientific Data and Computing Center, as seen above with technology engineer Costin Caramarcu.

    High-performance computing (HPC)—the use of supercomputers and parallel processing techniques to solve large computational problems—is of great use in the scientific community. For example, scientists at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory rely on HPC to analyze the data they collect at the large-scale experimental facilities on site and to model complex processes that would be too expensive or impossible to demonstrate experimentally.

    Modern science applications, such as simulating particle interactions, often require a combination of aggregated computing power, high-speed networks for data transfer, large amounts of memory, and high-capacity storage capabilities. Advances in HPC hardware and software are needed to meet these requirements. Computer and computational scientists and mathematicians in Brookhaven Lab’s Computational Science Initiative (CSI) are collaborating with physicists, biologists, and other domain scientists to understand their data analysis needs and provide solutions to accelerate the scientific discovery process.

    An HPC industry leader

    An image of the Xeon Phi Knights Landing processor die. A die is a pattern on a wafer of semiconducting material that contains the electronic circuitry to perform a particular function. Credit: Intel.

    For decades, Intel Corporation has been one of the leaders in developing HPC technologies. In 2016, the company released the Intel® Xeon Phi™ processors (formerly code-named “Knights Landing”), its second-generation HPC architecture that integrates many processing units (cores) per chip. The same year, Intel released the Intel® Omni-Path Architecture high-speed communication network. In order for the 5,000 to 100,000 individual computers, or nodes, in modern supercomputers to work together to solve a problem, they must be able to quickly communicate with each other while minimizing network delays.

    Soon after these releases, Brookhaven Lab and RIKEN, Japan’s largest comprehensive research institution, pooled their resources to purchase a small 144-node parallel computer built from Xeon Phi processors and two independent network connections, or rails, using Intel’s Omni-Path Architecture.

    The computer was installed at Brookhaven Lab’s Scientific Data and Computing Center, which is part of CSI.

    With the installation completed, physicist Chulwoo Jung and CSI computational scientist Meifeng Lin of Brookhaven Lab; theoretical physicist Christoph Lehner, a joint appointee at Brookhaven Lab and the University of Regensburg in Germany; Norman Christ, the Ephraim Gildor Professor of Computational Theoretical Physics at Columbia University; and theoretical particle physicist Peter Boyle of the University of Edinburgh worked in close collaboration with software engineers at Intel to optimize the network software for two science applications: particle physics and machine learning.

    “CSI had been very interested in the Intel Omni-Path Architecture since it was announced in 2015,” said Lin. “The expertise of Intel engineers was critical to implementing the software optimizations that allowed us to fully take advantage of this high-performance communication network for our specific application needs.”

    Network requirements for scientific applications

For many scientific applications, running one rank, or possibly a few ranks, per node on a parallel computer is much more efficient than running many ranks per node. Each rank is an independent process, identified by a unique index, that communicates with the other ranks through a standard interface known as the Message Passing Interface (MPI).
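To make the rank concept concrete, here is a minimal, generic MPI example (an illustrative sketch, not code from the Brookhaven–Columbia–Edinburgh project): each process learns its own rank and the total number of ranks at startup, so running one rank per node simply means the rank count equals the node count.

```c
/* Minimal MPI program: each process reports its rank.
 * Build with "mpicc hello_rank.c" and launch with "mpirun -np <N> ./a.out"
 * (file name and launch command are illustrative). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's identifier */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```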

    4
    A schematic of the lattice for quantum chromodynamics calculations. The intersection points on the grid represent quark values, while the lines between them represent gluon values.

For example, physicists seeking to understand how the early universe formed run complex numerical simulations of particle interactions based on the theory of quantum chromodynamics (QCD). This theory explains how elementary particles called quarks and gluons interact to form the particles we directly observe, such as protons and neutrons. Physicists model these interactions by using supercomputers that represent the three dimensions of space and the dimension of time in a four-dimensional (4D) lattice of equally spaced points, similar to that of a crystal. The lattice is split into smaller identical sub-volumes, and data need to be exchanged at the boundaries between the different sub-volumes. If there are multiple ranks per node, each rank hosts a different 4D sub-volume, so splitting a node's volume among more ranks creates more boundaries where data must be exchanged, and these extra transfers slow down the calculations.
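A toy calculation makes the boundary argument concrete (the 32^4 local volume below is a hypothetical size chosen for illustration, not a figure from the paper): splitting one node's lattice between two ranks shrinks each piece but increases the combined surface across which halo data must move.

```c
/* Toy halo-size calculation for a 4D lattice sub-volume of
 * dimensions a x b x c x d: each dimension contributes two 3D faces. */
#include <stdio.h>

static long surface_sites(long a, long b, long c, long d)
{
    return 2 * (b * c * d + a * c * d + a * b * d + a * b * c);
}

int main(void)
{
    long one_rank  = surface_sites(32, 32, 32, 32);      /* one 32^4 volume      */
    long two_ranks = 2 * surface_sites(32, 32, 32, 16);  /* two 32^3 x 16 pieces */

    printf("1 rank per node : %ld boundary sites\n", one_rank);   /* 262144 */
    printf("2 ranks per node: %ld boundary sites\n", two_ranks);  /* 327680 */
    return 0;
}
```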

    Software optimizations to advance science

    To optimize the network software for such a computationally intensive scientific application, the team focused on enhancing the speed of a single rank.

    “We made the code for a single MPI rank run faster so that a proliferation of MPI ranks would not be needed to handle the large communication load present for each node,” explained Christ.

The software within each MPI rank exploits the threaded parallelism available on Xeon Phi nodes. Threaded parallelism refers to the simultaneous execution of multiple streams of instructions, or threads, within a single process, all sharing that process's memory and other computing resources. With the optimized software, the team was able to create multiple communication channels within a single rank and to drive these channels using different threads.

    5
    Two-dimensional illustration of threaded parallelism. Key: green lines separate physical compute nodes; black lines separate MPI ranks; red lines are the communication contexts, with the arrows denoting communication between nodes or memory copy within a node via the Intel Omni-Path hardware.
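The sketch below shows the general shape of this approach (a hedged illustration, not the team's optimized code): the MPI library is initialized with full thread support, and each OpenMP thread of the single rank is then free to issue its own communication calls, effectively driving a separate channel.

```c
/* Sketch: one MPI rank whose OpenMP threads may each make MPI calls,
 * e.g. to drive separate communication channels concurrently.
 * Requires an MPI library that supports MPI_THREAD_MULTIPLE.
 * Build with something like "mpicc -fopenmp threads.c" (illustrative). */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "Full thread support not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        /* In a real code, each thread would post its own MPI sends and
         * receives here, for example for a different boundary face. */
        printf("rank %d, thread %d ready to communicate\n",
               rank, omp_get_thread_num());
    }

    MPI_Finalize();
    return 0;
}
```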

    The MPI software was now set up for the scientific applications to run more quickly and to take full advantage of the Intel Omni-Path communications hardware. But after implementing the software, the team members encountered another challenge: in each run, a few nodes would inevitably communicate slowly and hold the others back.

They traced this problem to the way that Linux—the operating system used by the majority of HPC platforms—manages memory. In its default mode, Linux divides memory into small chunks called pages. By reconfiguring Linux to use large (“huge”) memory pages, they resolved the issue. Increasing the page size means that far fewer pages are needed to map the virtual address space that an application uses, so address translations can be cached more effectively and memory can be accessed much more quickly.
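The snippet below illustrates the huge-page idea on Linux (a minimal sketch that assumes 2 MB huge pages have already been reserved by an administrator; it is not the specific configuration used at Brookhaven): mapping a buffer with huge pages cuts the number of page-table entries needed to cover it by a factor of 512.

```c
/* Sketch: request memory backed by 2 MB huge pages on Linux.
 * Assumes huge pages have been reserved beforehand, e.g. via
 * "echo 512 > /proc/sys/vm/nr_hugepages" (illustrative command, needs root). */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
    size_t len = 64UL * 1024 * 1024;   /* 64 MB buffer */

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap with MAP_HUGETLB failed (are huge pages reserved?)");
        return 1;
    }

    printf("Mapped %zu bytes: %zu huge pages instead of %zu 4 KB pages\n",
           len, len / (2UL * 1024 * 1024), len / 4096);

    munmap(buf, len);
    return 0;
}
```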

    With the software enhancements, the team members analyzed the performance of the Intel Omni-Path Architecture and Intel Xeon Phi processor compute nodes installed on Intel’s dual-rail “Diamond” cluster and the Distributed Research Using Advanced Computing (DiRAC) single-rail cluster in the United Kingdom.

    DiRAC is the UK’s integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology.

    For their analysis, they used two different classes of scientific applications: particle physics and machine learning. For both application codes, they achieved near-wirespeed performance—the theoretical maximum rate of data transfer. This improvement represents an increase in network performance that is between four and ten times that of the original codes.
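For scale, a quick back-of-the-envelope check (using the 100 Gb/s per-rail link rate of Omni-Path; the transfer figures below are made up for illustration) shows how a measured bandwidth compares with dual-rail wirespeed:

```c
/* Back-of-the-envelope comparison of an achieved transfer rate with
 * dual-rail "wirespeed". Link rate is Omni-Path's 100 Gb/s per rail;
 * the measured bytes and time below are hypothetical. */
#include <stdio.h>

int main(void)
{
    double link_gbits = 100.0;                       /* gigabits/s per rail */
    int    rails      = 2;
    double wirespeed  = rails * link_gbits / 8.0;    /* gigabytes/s */

    double bytes_moved = 2.0e9;                      /* hypothetical transfer */
    double seconds     = 0.085;
    double achieved    = bytes_moved / 1.0e9 / seconds;

    printf("wirespeed %.1f GB/s, achieved %.1f GB/s (%.0f%% of peak)\n",
           wirespeed, achieved, 100.0 * achieved / wirespeed);
    return 0;
}
```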

    “Because of the close collaboration between Brookhaven, Edinburgh, and Intel, these optimizations were made available worldwide in a new version of the Intel Omni-Path MPI implementation and a best-practice protocol to configure Linux memory management,” said Christ. “The factor of five speedup in the execution of the physics code on the Xeon Phi computer at Brookhaven Lab—and on the University of Edinburgh’s new, even larger 800-node Hewlett Packard Enterprise “hypercube” computer—is now being put to good use in ongoing studies of fundamental questions in particle physics.”

See the full article here.



    Please help promote STEM in your local schools.

    Stem Education Coalition

    BNL Campus


    BNL Center for Functional Nanomaterials

    BNL NSLS-II


    BNL NSLS II

    BNL RHIC Campus

    BNL/RHIC Star Detector

    BNL RHIC PHENIX

    One of ten national laboratories overseen and primarily funded by the Office of Science of the U.S. Department of Energy (DOE), Brookhaven National Laboratory conducts research in the physical, biomedical, and environmental sciences, as well as in energy technologies and national security. Brookhaven Lab also builds and operates major scientific facilities available to university, industry and government researchers. The Laboratory’s almost 3,000 scientists, engineers, and support staff are joined each year by more than 5,000 visiting researchers from around the world. Brookhaven is operated and managed for DOE’s Office of Science by Brookhaven Science Associates, a limited-liability company founded by Stony Brook University, the largest academic user of Laboratory facilities, and Battelle, a nonprofit, applied science and technology organization.
    i1

     
  • richardmitnick 11:15 am on April 13, 2019 Permalink | Reply
    Tags: , , , LLNL Penguin Computing Corona AMD EPYC Radeon Instinct Cluster, , Supercomputing   

    From insideHPC: “AMD Powers Corona Cluster for HPC Analytics at Livermore” 

    From insideHPC

    April 12, 2019
    Rich Brueckner

    Lawrence Livermore National Lab has deployed a 170-node HPC cluster from Penguin Computing. Based on AMD EPYC processors and Radeon Instinct GPUs, the new Corona cluster will be used to support the NNSA Advanced Simulation and Computing (ASC) program in an unclassified site dedicated to partnerships with American industry.

    1
The Corona cluster comprises AMD EPYC processors and AMD Radeon Instinct GPUs connected with Mellanox HDR 200 Gigabit InfiniBand.

In searching for a commercial processor that could handle the demands of HPC and data analytics, Matt Leininger, Deputy for Advanced Technology Projects at LLNL, said several factors influence the choice of CPU, including single-core performance, the number of cores, and memory performance per socket. All of these factors drove LLNL to seriously consider the EPYC processor.

“Our simulations require a balance of memory and compute capabilities. The number of high-performance memory channels and CPU cores on each AMD EPYC socket are a solution for our mission needs,” he said.

    The lab’s latest HPC cluster deployment—named Corona—is built with AMD EPYC CPUs and Radeon Instinct MI25 GPUs. “We are excited to have these high-end products and to apply them to our challenging HPC simulations,” said Leininger.

Simulations requiring hundreds of petaflops (quadrillions of floating-point operations per second) are run on the largest supercomputers at LLNL, which are among the fastest in the world. Supporting the larger systems are the Commodity Technology Systems (CTS), which Leininger calls “everyday workhorses” serving the LLNL user community.

    The new Corona cluster will bring new levels of machine learning capabilities to the CTS resources. The integration of HPC simulation and machine learning into a cognitive simulation capability is an active area of research at LLNL.

    “Coupling large scale deep learning with our traditional scientific simulation workload will allow us to dramatically increase scientific output and utilize our HPC resources more efficiently,” said LLNL Informatics Group leader and computer scientist Brian Van Essen. “These new Deep Learning enabled HPC systems are critical as we develop new machine learning algorithms and architectures that are optimized for scientific computing.”

    The computing platform is Penguin Computing’s XO1114GT platform, with nodes connected by Mellanox HDR InfiniBand networking technology.

    “We have folks thinking about what they can pull off on this machine that they couldn’t have done before,” Leininger said.

    CPU/GPU Powering Machine Learning in HPC

    “We’ve been working to understand how to enable HPC simulation using GPUs and also using machine learning in combination with HPC to solve some of our most challenging scientific problems,” Leininger said. “Even as we do more of our computing on GPUs, many of our codes have serial aspects that need really good single core performance. That lines up well with AMD EPYC.”

The EPYC processor-based Corona cluster will help LLNL use machine learning to conduct its simulations more efficiently through an active-learning approach, called Cognitive Simulation, that can be used to optimize solutions with a significant reduction in compute requirements. Leininger explained that multi-physics simulations, which involve extensive modeling and calculation of hydrodynamic and materials problems important to NNSA, are the lab’s most complicated. These simulations produce results across a range of parameter space that are used to construct error bars, which quantify uncertainty levels that must be understood and reduced.

    “We are looking to use some machine learning techniques where the machine would figure out how much of the parameter space we really need to explore or what part of it we need to explore more than others,” Van Essen said.

    Using EPYC-powered servers with the Radeon Instinct MI25 for machine learning, LLNL will be able to determine exactly where to explore further in order to detect what component is driving the majority of the error bars and significantly reduce time on task to achieve better science.
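A minimal sketch of this kind of uncertainty-driven selection (illustrative only; it is not LLNL's Cognitive Simulation software, and the numbers are invented) picks the candidate parameter point where the surrogate model is least certain and spends the next expensive simulation there:

```c
/* Sketch of uncertainty sampling: run the next full simulation at the
 * candidate parameter point with the largest predicted uncertainty.
 * The uncertainty values are invented for illustration. */
#include <stdio.h>

#define N_CANDIDATES 6

int main(void)
{
    double uncertainty[N_CANDIDATES] = {0.12, 0.31, 0.08, 0.44, 0.27, 0.19};

    int best = 0;
    for (int i = 1; i < N_CANDIDATES; i++)
        if (uncertainty[i] > uncertainty[best])
            best = i;

    printf("Next simulation: parameter point %d (uncertainty %.2f)\n",
           best, uncertainty[best]);
    return 0;
}
```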

See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 10:34 am on April 11, 2019 Permalink | Reply
    Tags: , , , , Supercomputing   

    From Science Node: “The end of an era” 

    Science Node bloc
    From Science Node

    10 Apr, 2019
    Alisa Alering

    For the last fifty years, computer technology has been getting faster and cheaper. Now that extraordinary progress is coming to an end. What happens next?

    John Shalf, department head for Computer Science at Berkeley Lab, has a few ideas. He’s going to share them in his keynote at ISC High Performance 2019 in Frankfurt, Germany (June 16-20), but he gave Science Node a sneak preview.

    Moore’s Law is based on Gordon Moore’s 1965 prediction that the number of transistors on a microchip doubles every two years, while the cost is halved. His prediction proved true for several decades. What’s different now?

    1
Double trouble. From 1965 to 2004, the number of transistors on a microchip doubled every two years while cost decreased. Now that you can’t get more transistors on a chip, high-performance computing is in need of a new direction. Data courtesy Dataquest/Intel.

    The end of Dennard scaling happened in 2004, when we couldn’t crank up the clock frequencies anymore on chips, so we moved to exponentially increasing parallelism in order to continue performance scaling. It was not an ideal solution, but it enabled us to continue some semblance of performance scaling. Now we’ve gotten to the point where we can’t squeeze any more transistors onto the chip.

If you can’t cram any more transistors on the chip, then we can’t continue to scale the number of cores as a means to scale performance. And we’ll get no power improvement: with the end of Moore’s Law, in order to get ten times more performance we would need ten times more power in the future. Capital equipment cost won’t improve either. Meaning that if I spend $100 million and can get a 100 petaflop machine today, then if I spend $100 million ten years from now, I’ll get essentially the same machine.
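To put rough numbers on that trend (a simple illustration, not figures from the interview): doubling every two years compounds to roughly a 30-fold gain per decade at constant power and cost, whereas with flat technology a tenfold performance gain simply costs ten times the power and money.

```c
/* Rough illustration: compounding improvement under doubling every
 * two years versus flat (post-Moore) technology. Link with -lm. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double years = 10.0, doubling_period = 2.0;
    double gain = pow(2.0, years / doubling_period);   /* ~32x in a decade */

    printf("Doubling every %.0f years: ~%.0fx performance per decade "
           "at constant power and cost\n", doubling_period, gain);
    printf("Flat technology: 10x performance needs ~10x power and ~10x cost\n");
    return 0;
}
```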

    That sounds fairly dire. Is there anything we can do?

There are three dimensions we can pursue: one is new architectures and packaging, the second is CMOS transistor replacements using new materials, and the third is new models of computation that are not necessarily digital.

    Let’s break it down. Tell me about architectures.

    2
    John Shalf, of Lawrence Berkeley National Laboratory, wants to consider all options—from new materials and specialization to industry partnerships–when it comes to imagining the future of high-performance computing. Courtesy John Shalf.

We need to change course and learn from our colleagues in other industries. Our friends in the phone business and in mega data centers are already pointing out the solution. Architectural specialization is one of the biggest sources of improvement in the iPhone. The A8 chip, introduced in 2014, had 29 different discrete accelerators. We’re now at the A11, and it has nearly 40 different discrete hardware accelerators. Future generations of chips are slowly squeezing out the CPUs in favor of special-function accelerators for different parts of their workload.

And for the mega data centers, Google is making its own custom chip. They weren’t seeing the kind of performance improvements they needed from Intel or Nvidia, so they’re building their own custom chips tailored to improve the performance of their workloads. So are Facebook and Amazon. The only community absent from this trend is HPC.

With Moore’s Law tapering off, the only way to get a leg up in performance is to go back to customization. The embedded-systems and ARM ecosystems are an example where, even though the chips are custom, the components—the little circuit designs on those chips—are reusable across many different disciplines. The new commodity is going to be these little IP blocks we arrange on the chip. We may need to add some IP blocks that are useful for scientific applications, but there’s a lot of IP reuse in that embedded ecosystem and we need to learn how to tap into that.

    How do new materials fit in?

    We’ve been using silicon for the past several decades because it is inexpensive and ubiquitous, and has many years of development effort behind it. We have developed an entire scalable manufacturing infrastructure around it, so it continues to be the most cost-effective route for mass-manufacture of digital devices. It’s pretty amazing, to use one material system for that long. But now we need to look at some new transistor that can continue to scale performance beyond what we’re able to wring out of silicon. Silicon is, frankly, not that great of a material when it comes to electron mobility.

    _________________________________________________________
    The Materials Project

    The current pace of innovation is extremely slow because the primary means available for characterizing new materials is to read a lot of papers. One solution might be Kristin Persson’s Materials Project, originally invented to advance the exploration of battery materials.

By scaling materials computations over supercomputing clusters, researchers can target the most promising compounds, helping to remove guesswork from materials design. The hope is that reapplying this technology to discover better electronic materials will speed the pace of discovery for new electronic devices.
In 2016, an eight-laboratory consortium was formed to push this idea at the DOE “Big Ideas Summit,” where grass-roots ideas from the labs are presented to the highest levels of DOE leadership. Read the whitepaper and elevator pitch here.

After the ‘Beyond Moore’s Law’ project was invited back for the 2017 Big Ideas Summit, the DOE created a Microelectronics Basic Research Needs (BRN) Workshop. The initial report from that meeting has been released, and the DOE’s FY20 budget includes a line item for microelectronics research.
    _________________________________________________________

    The problem is, we know historically that once you demonstrate a new device concept in the laboratory, it takes about ten years to commercialize it. Prior experience has shown a fairly consistent timeline of 10 years from lab to fab. Although there are some promising directions, nobody has demonstrated something that’s clearly superior to silicon transistors in the lab yet. With no CMOS replacement imminent, that means we’re already ten years too late! We need to develop tools and processes to accelerate the pace for discovery of more efficient microelectronic devices to replace CMOS and the materials that make them possible.

    So, until we find a new material for the perfect chip, can we solve the problem with new models of computing. What about quantum computing?

    New models would include quantum and neuromorphic computing. These models expand computing into new directions, but they’re best at computing problems that are done poorly using digital computing.

    I like to use the example of ‘quantum Excel.’ Say I balance my checkbook by creating a spreadsheet with formulas, and it tells me how balanced my checkbook is. If I were to use a quantum computer for that—and it would be many, many, many years in the future where we’d have enough qubits to do it, but let’s just imagine—quantum Excel would be the superposition of all possible balanced checkbooks.

    And a neuromorphic computer would say, ‘Yes, it looks correct,’ and then you’d ask it again and it would say, ‘It looks correct within an 80% confidence interval.’ Neuromorphic is great at pattern recognition, but it wouldn’t be as good for running partial differential equations and computing exact arithmetic.

    We really need to go back to the basics. We need to go back to ‘What are the application requirements?’

    Clearly there are a lot of challenges. What’s exciting about this time right now?

    3
    The Summit supercomputer at Oak Ridge National Laboratory operates at a top speed of 200 petaflops and is currently the world’s fastest computer. But the end of Moore’s Law means that to get 10x that performance in the future, we also would need 10x more power. Courtesy Carlos Jones/ORNL.

Computer architecture has become very, very important again. The previous era of exponential scaling created a much narrower space for innovation because the focus was general-purpose computing, the universal machine. The problems we now face open the door again for mathematicians and computer architects to collaborate on solving big problems together. And I think that’s very exciting. Those kinds of collaborations lead to really fun, creative, and innovative solutions to scientific problems of worldwide importance.

    The real issue is that our economic model for acquiring supercomputing systems will be deeply disrupted. Originally, systems were designed by mathematicians to solve important mathematical problems. However, the exponential improvement rates of Moore’s law ensured that the most general purpose machines that were designed for the broadest range of problems would have a superior development budget and, over time, would ultimately deliver more cost-effective performance than specialized solutions.

    The end of Moore’s Law spells the end of general purpose computing as we know it. Continuing with this approach dooms us to modest or even non-existent performance improvements. But the cost of customization using current processes is unaffordable.

    We must reconsider our relationship with industry to re-enable specialization targeted at our relatively small HPC market. Developing a self-sustaining business model is paramount. The embedded ecosystem (including the ARM ecosystem) provides one potential path forward, but there is also the possibility of leveraging the emerging open source hardware ecosystem and even packaging technologies such as Chiplets to create cost-effective specialization.

    We must consider all options for business models and all options for partnerships across agencies or countries to ensure an affordable and sustainable path forward for the future of scientific and technical computing.

See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

     