Tagged: insideHPC

  • richardmitnick 2:16 pm on November 25, 2019 Permalink | Reply
    Tags: "SDSC Conducts 50000+ GPU Cloudburst Experiment with Wisconsin IceCube Particle Astrophysics Center", , insideHPC, ,   

    From insideHPC: “SDSC Conducts 50,000+ GPU Cloudburst Experiment with Wisconsin IceCube Particle Astrophysics Center” 

    From insideHPC

    November 25, 2019

    SDSC Triton HP supercomputer

    SDSC Gordon-Simons supercomputer

    SDSC Dell Comet supercomputer


    U Wisconsin ICECUBE neutrino detector at the South Pole

    Researchers at SDSC and the Wisconsin IceCube Particle Astrophysics Center have successfully completed a computational experiment as part of a multi-institution collaboration that marshaled all of the GPUs (graphics processing units) available for sale globally across Amazon Web Services, Microsoft Azure, and the Google Cloud Platform.

    The chart shows the time evolution of the burst over the course of ~200 minutes. The black line is the number of GPUs used for science, peaking at 51,500 GPUs. Each color shows the number of GPUs purchased in a region of a cloud provider. The steep rise indicates the burst capability of the infrastructure to support short but intense computation for science. Credit: Igor Sfiligoi, SDSC/UC San Diego

    In all, some 51,500 GPU processors were used during the approximately two-hour experiment conducted on November 16 and funded under a National Science Foundation EAGER grant.

    The experiment used simulations from the IceCube Neutrino Observatory, an array of some 5,160 optical sensors deep within a cubic kilometer of ice at the South Pole. In 2017, researchers at the NSF-funded observatory found the first evidence of a source of high-energy cosmic neutrinos – subatomic particles that can emerge from their sources and pass through the universe unscathed, traveling for billions of light years to Earth from some of the most extreme environments in the universe.

    The experiment – completed just prior to the opening of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19) in Denver, CO – was coordinated by Frank Würthwein, SDSC Lead for High-Throughput Computing, and Benedikt Riedel, Computing Manager for the IceCube Neutrino Observatory and Global Computing Coordinator at WIPAC.

    Igor Sfiligoi, SDSC’s lead scientific software developer for high-throughput computing, and David Schultz, a production software manager with IceCube, conducted the actual run.

    “We focused this GPU cloud burst in the area of multi-messenger astrophysics, which is based on the observation and analysis of what we call ‘messenger’ signals, in this case neutrinos,” said Würthwein, also a physics professor at UC San Diego and executive director of the Open Science Grid (OSG), a multi-disciplinary research partnership specializing in high-throughput computational services funded by the NSF and the U.S. Department of Energy.

    “The NSF chose multi-messenger astronomy as one of its 10 Big Ideas to focus on during the next few years,” said Würthwein. “We now have instruments that can measure gravitational waves, neutrinos, and various forms of light to see the most violent events in the universe. We’re only starting to understand the physics behind such energetic celestial phenomena that can reach Earth from deepest space.”

    Exascale Extrapolations

    The net result was a peak of about 51,000 GPUs of various kinds, with an aggregate peak of about 350 PFLOP32s (fp32 petaflops, based on NVIDIA specifications), according to Sfiligoi.

    “For comparison, the Number 1 TOP500 HPC system, Summit (based at Oak Ridge National Laboratory), has a nominal performance of about 400 PFLOP32s. So, at peak, our cloud-based cluster provided almost 90% of the performance of Summit, at least for the purpose of IceCube simulations.”

    ORNL IBM AC922 SUMMIT supercomputer, No.1 on the TOP500. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy
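    A quick back-of-the-envelope check of those figures, using only the numbers quoted above (the breakdown by GPU model is not reproduced here):

```python
# Sanity check on the quoted aggregate figures (values taken from the article).
peak_gpus = 51_500
aggregate_pflop32 = 350.0      # reported aggregate fp32 peak, PFLOP32s
summit_pflop32 = 400.0         # nominal fp32 figure quoted for Summit

avg_tflops_per_gpu = aggregate_pflop32 * 1e3 / peak_gpus
print(f"Implied average: {avg_tflops_per_gpu:.1f} fp32 TFLOPS per GPU")
# ~6.8 TFLOPS/GPU -- plausible for a mix of older cloud GPUs (K80-class)
# and newer ones (V100-class, ~15.7 fp32 TFLOPS).
print(f"Fraction of Summit's nominal fp32 peak: {aggregate_pflop32 / summit_pflop32:.0%}")  # ~88%
```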

    The relatively short time span of the experiment showed the ability to conduct a massive amount of data processing within a very short period – an advantage for research projects that must meet a tight deadline. Francis Halzen, principal investigator for IceCube, a Distinguished Professor at the University of Wisconsin-Madison, and director of the university’s Institute for Elementary Particle Physics, foresaw this several years ago.

    “We have initiated an effort to improve the calibration of the instrument that will result in sensitivity improved by an estimated factor of four,” wrote Halzen. “We can apply this improvement to 10 years of archived data, thus obtaining the equivalent of 40 years of current IceCube data.”

    “We conducted this experiment with three goals in mind,” said IceCube’s Riedel. “One obvious goal was to produce simulations that will be used to do science with IceCube for multi-messenger astrophysics. But we also wanted to understand the readiness of our cyberinfrastructure for bursting into future Exascale-class facilities such as Argonne’s Aurora or Oak Ridge’s Frontier, when they become available. And more generally, we sought to determine how much GPU capacity can be bought today for a burst of an hour or so in the commercial cloud.”

    ____________________________________________________

    No one region contributed more than 11% of the total science output, showing the power of dHTC (distributed high-throughput computing) in aggregating resources globally to achieve large-scale computation. Credit: Igor Sfiligoi, SDSC/UC San Diego.

    “This was a social experiment as well,” added Würthwein. “We scavenged up all available GPUs on demand across 28 cloud regions on three continents – North America, Europe, and Asia. The results of this experiment tell us that we can elastically burst to very large scales of GPUs using the cloud, which matters because exascale computers don’t exist yet but are expected to come online in the coming years. The demo also shows that such bursting of massive data processing is suitable for a wide range of challenges across astronomy and other sciences. To the extent that the elasticity is there, we believe that this can be applied across all of scientific research to get results quickly.”
    ____________________________________________________

    HTCondor was used to integrate all purchased GPUs into a single resource pool, to which IceCube submitted its workflows from its home base in Wisconsin. This was accomplished by aggregating resources in each cloud region, and then aggregating those aggregators into a single global pool at SDSC.
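    For readers unfamiliar with HTCondor, a job destined for such a pool is just a description of an executable plus its resource requests. The following is a minimal, hypothetical sketch using the HTCondor Python bindings; the script name, resource requests and job count are placeholders, not IceCube’s actual workflow:

```python
import htcondor  # HTCondor Python bindings

# Hypothetical single-GPU job description; the real IceCube photon-propagation
# workload, executable and resource requests differ.
job = htcondor.Submit({
    "executable": "run_simulation.sh",      # placeholder script
    "arguments": "$(ProcId)",
    "request_gpus": "1",
    "request_cpus": "1",
    "request_memory": "4GB",
    "output": "sim_$(ProcId).out",
    "error": "sim_$(ProcId).err",
    "log": "burst.log",
})

schedd = htcondor.Schedd()                  # submit point of the aggregated pool
result = schedd.submit(job, count=10_000)   # queue many independent simulation jobs
print("Submitted cluster", result.cluster())
```

    Because each simulation job is independent, the same description scales from a campus cluster to the kind of globally aggregated pool described above.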

    “This is very similar to the production infrastructure that OSG operates for IceCube to aggregate dozens of ‘on-prem’ clusters into a single global resource pool across the U.S., Canada, and Europe,” said Sfiligoi.

    An additional experiment to reach even higher scales is likely to be made sometime around the Christmas and New Year holidays, when commercial GPU use is traditionally lower, and therefore availability of such GPUs for scientific research is greater.

    See the full article here .


    Please help promote STEM in your local schools.

    Stem Education Coalition

    Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. As one reader said, we’re sifting through all the news so you don’t have to!

    If you would like to contact me with suggestions, comments, corrections, errors or new company announcements, please send me an email at rich@insidehpc.com. Or you can send me mail at:

    insideHPC
    2825 NW Upshur
    Suite G
    Portland, OR 97239

    Phone: (503) 877-5048

     
  • richardmitnick 1:58 pm on October 24, 2019 Permalink | Reply
    Tags: Cray Storm supercomputer, High-Performance Computing Center of the University of Stuttgart (HLRS) in Germany, insideHPC   

    From insideHPC: “Cray CS-Storm Supercomputer coming to HLRS in Germany” 

    From insideHPC

    Today Cray announced that the High-Performance Computing Center of the University of Stuttgart (HLRS) in Germany has selected a new Cray CS-Storm GPU-accelerated supercomputer to advance its computing infrastructure in response to user demand for processing-intensive applications like machine learning and deep learning.


    The new Cray system is tailored for artificial intelligence (AI) and includes the Cray Urika-CS AI and Analytics suite, enabling HLRS to accelerate AI workloads, equip users to address complex computing problems, and process more data with higher-accuracy AI models in the engineering, automotive, energy, and environmental industries and in academia.

    “As we extend our service portfolio with AI, we require an infrastructure that can support the convergence of traditional high-performance computing applications and AI workloads to better support our users and customers,” said Prof. Dr. Michael Resch, director at HLRS. “We’ve found success working with our current Cray Urika-GX system for data analytics, and we are now at a point where AI and deep learning have become even more important as a set of methods and workflows for the HPC community. Our researchers will use the new CS-Storm system to power AI applications to achieve much faster results and gain new insights into traditional types of simulation results.”

    Supercomputer users at HLRS are increasingly asking for access to systems containing AI acceleration capabilities. With the GPU-accelerated CS-Storm system and Urika-CS AI and Analytics suite, which leverages popular machine intelligence frameworks like TensorFlow and PyTorch, HLRS can provide machine learning and deep learning services to its leading teaching and training programs, global partners and R&D. The Urika-CS AI and Analytics suite includes Cray’s Hyperparameter Optimization (HPO) and Cray Programming Environment Deep Learning Plugin, arming system users with the full potential of deep learning and advancing the services HLRS offers to its users interested in data analytics, machine learning and related fields.
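    Hyperparameter optimization in general (the concept behind tools like the Urika-CS HPO component, not Cray’s implementation) amounts to searching a configuration space and keeping the best-scoring model. A minimal, generic random-search sketch:

```python
import random

# Generic random search over a hyperparameter space -- an illustration of the
# concept only, not the Urika-CS HPO tool.
space = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -1),
    "batch_size":    lambda: random.choice([32, 64, 128, 256]),
    "dropout":       lambda: random.uniform(0.0, 0.5),
}

def train_and_score(params):
    """Placeholder: train a model with `params` and return a validation score."""
    return random.random()   # stand-in for a real TensorFlow/PyTorch training run

candidates = [{k: draw() for k, draw in space.items()} for _ in range(50)]
best = max(candidates, key=train_and_score)
print("Best configuration found:", best)
```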

    “The future will be driven by the convergence of modeling and simulation with AI and analytics and we’re honored to be working with HLRS to further their AI initiatives by providing advanced computing technology for the Center’s engineering and HPC training and research endeavors,” said Peter Ungaro, president and CEO at Cray, a Hewlett Packard Enterprise company. “HLRS has the opportunity to apply AI to improve and scale data analysis for the benefit of its core research areas, such as looking at trends in industrial HPC usage, creating models of car collisions, and visualizing black holes. The Cray CS-Storm combined with the unique Cray-CS AI and Analytics suite will allow HLRS to better tackle converged AI and simulation workloads in the exascale era.”

    In addition to the Cray CS-Storm architecture and Cray-CS AI and Analytics suite, the system will feature NVIDIA V100 Tensor Core GPUs and Intel Xeon Scalable processors.

    “The convergence of AI and scientific computing has accelerated the pace of scientific progress and is helping solve the world’s most challenging problems,” said Paresh Kharya, Director of Product Management and Marketing at NVIDIA. “Our work with Cray and HLRS on their new GPU-accelerated system will result in a modern HPC infrastructure that addresses the demands of the Center’s research community to combine simulation with the power of AI to advance science, find cures for disease, and develop new forms of energy.”

    See the full article here .


     
  • richardmitnick 12:24 pm on October 23, 2019 Permalink | Reply
    Tags: Cray Archer2, insideHPC

    From insideHPC: “ARCHER2 to be first Cray Shasta System in Europe” 

    From insideHPC

    October 22, 2019

    Today Cray, a Hewlett Packard Enterprise company, announced a £48 million contract award in the UK to expand its high-performance computing capabilities with Cray’s next-generation Shasta supercomputer. The new ARCHER2 supercomputer will be the first Shasta system announced in EMEA and the second system worldwide used for academic research. ARCHER2 will be the UK’s most powerful supercomputer and will be equipped with the revolutionary Slingshot interconnect, Cray ClusterStor high-performance storage, the Cray Shasta Software platform, and 2nd Gen AMD EPYC processors. The new supercomputer will deliver 11x higher performance than its predecessor, ARCHER.

    UK Research and Innovation (UKRI) has once again contracted Cray to build the follow-up to the ARCHER supercomputer. ARCHER2 is reported to offer up to 11x the throughput of the previous ARCHER system, which went into service in late 2013. ARCHER2 will be powered by roughly 12,000 EPYC Rome 64-core CPUs across 5,848 compute nodes, each node carrying two of the 64-core processors. The total core count is 748,544 (1,497,088 threads), with 1.57 PB of memory across the entire system. The CPU speed is listed as 2.2 GHz, which appears to be the base clock, suggesting EPYC 7742 CPUs with a 225 W TDP. Specs like this generate significant heat, so ARCHER2 will be cooled by 23 Shasta Mountain direct-liquid-cooled compute cabinets and the associated liquid-cooling cabinets. Connectivity between compute groups is provided by Cray’s next-generation 100 Gbps Slingshot network. AMD GPUs are also part of the system, though information on which AMD GPU models will be used has not yet been published. Estimated peak performance is 28 PFLOP/s, and the transition from ARCHER to ARCHER2 is expected to begin in Q1 2020 and be completed in late 1H 2020 if all goes as planned.
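    A quick check of that arithmetic, plus a rough peak-performance estimate (the 16 double-precision FLOPs per core per cycle figure for EPYC Rome is an assumption used for illustration, not an official number):

```python
# Sanity-checking the ARCHER2 node and core figures quoted above.
nodes = 5_848
cores_per_node = 2 * 64                      # two 64-core EPYC Rome CPUs per node
cores = nodes * cores_per_node
print(f"{cores:,} cores, {2 * cores:,} threads")   # 748,544 cores / 1,497,088 threads

# Rough nominal peak at base clock -- an estimate, not an official figure.
clock_ghz = 2.2
flops_per_core_per_cycle = 16                # assumed for EPYC Rome (2x 256-bit FMA units)
peak_pflops = cores * clock_ghz * flops_per_core_per_cycle / 1e6
print(f"~{peak_pflops:.0f} PFLOP/s at base clock")   # ~26, in line with the quoted 28 PFLOP/s
```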

    “ARCHER2 will be an important resource for the UK’s research community, providing them with the capability to pursue investigations which are not possible using current resources,” said Professor Lynn Gladden, executive chair of the Engineering and Physical Sciences Research Council (EPSRC). “The new system delivered by Cray will greatly increase the potential for researchers to make discoveries across fields such as physics, chemistry, healthcare and technology development.”

    The new Cray Shasta-based ARCHER2 system will replace the existing ARCHER Cray XC30 in 2020 and be an even greater capability resource for academic researchers and industrial users from the UK, Europe and the rest of the world. At rates previously unattainable, the new supercomputer will achieve 11X higher performance with only a 27% increase in grid power. The ARCHER2 project provides resources for exploration in research disciplines including oil and gas, sustainability and natural resources, mental and physical health, oceanography, atomistic structures, and technology advancement.

    “We’re pleased to continue supporting UKRI’s mission and provide the most advanced high-end computing resources for the UK’s science and research endeavors,” said Peter Ungaro, president and CEO at Cray, a Hewlett Packard Enterprise company. “As traditional modeling and simulation applications and workflows converge with AI and analytics, a new Exascale Era architecture is required. Shasta will uniquely provide this new capability and ARCHER2 will be the first of its kind in Europe, as its next-gen architecture will provide UK and neighboring scientists and researchers the ability to meet their research requirements across a broad range of disciplines, faster.”

    The new Shasta system will be the third Cray supercomputer delivered to UKRI, with the previous systems being HECToR and ARCHER. ARCHER2 will be supported by 2nd Gen AMD EPYC processors.


    “AMD is incredibly proud to continue our collaboration with Cray to deliver what will be the most powerful supercomputer in the UK, helping to process data faster and reduce the time it takes to reach critical scientific conclusions,” said Forrest Norrod, senior vice president and general manager, AMD Datacenter and Embedded Systems Group. “Investments in high-performance computing technology are imperative to keep up with today’s increasingly complex problems and explosive data growth. The 2nd Gen AMD EPYC processors paired with Cray Shasta will provide a powerful resource for the next generation of research in the UK when ARCHER2 is delivered next year.”

    See the full article here .


     
  • richardmitnick 10:17 am on October 14, 2019 Permalink | Reply
    Tags: "Supercomputing the Building Blocks of the Universe", , insideHPC, , ,   

    From insideHPC: “Supercomputing the Building Blocks of the Universe” 

    From insideHPC

    October 13, 2019

    In this special guest feature, ORNL profiles researcher Gaute Hagen, who uses the Summit supercomputer to model scientifically interesting atomic nuclei.

    Gaute Hagen uses ORNL’s Summit supercomputer to model scientifically interesting atomic nuclei. To validate models, he and other physicists compare computations with experimental observations. Credit: Carlos Jones/ORNL

    At the nexus of theory and computation, physicist Gaute Hagen of the Department of Energy’s Oak Ridge National Laboratory runs advanced models on powerful supercomputers to explore how protons and neutrons interact to “build” an atomic nucleus from scratch. His fundamental research improves predictions about nuclear energy, nuclear security and astrophysics.

    “How did matter that forms our universe come to be?” asked Hagen. “How does matter organize itself based on what we know about elementary particles and their interactions? Do we fully understand how these particles interact?”

    The lightest nuclei, hydrogen and helium, formed during the Big Bang. Heavier elements, up to iron, are made in stars by progressively fusing those lighter nuclei. The heaviest nuclei form in extreme environments when lighter nuclei rapidly capture neutrons and undergo beta decays.

    For example, building nickel-78, a neutron-rich nucleus that is especially strongly bound, or “doubly magic,” requires 28 protons and 50 neutrons interacting through the strong force. “To solve the Schrödinger equation for such a huge system is a tremendous challenge,” Hagen said. “It is only possible using advanced quantum mechanical models and serious computing power.”
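    To get a feel for why this is hard, consider a naive configuration-counting estimate (the model-space sizes below are purely illustrative assumptions; real calculations use advanced methods that avoid this brute-force enumeration):

```python
from math import comb

# Hypothetical, illustrative model space: distribute the nucleons of nickel-78
# over a fixed set of single-particle states and count the configurations.
proton_states, neutron_states = 40, 60   # assumed single-particle states
Z, N = 28, 50                            # protons and neutrons in nickel-78

n_configs = comb(proton_states, Z) * comb(neutron_states, N)
print(f"Naive many-body basis size: {n_configs:.2e}")   # ~4e20 configurations
```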

    Through DOE’s Scientific Discovery Through Advanced Computing program, Hagen participates in the NUCLEI project to calculate nuclear structure and reactions from first principles; its collaborators represent 7 universities and 5 national labs. Moreover, he is the lead principal investigator of a DOE Innovative and Novel Computational Impact on Theory and Experiment award of time on supercomputers at Argonne and Oak Ridge National Laboratories for computations that complement part of the physics addressed under NUCLEI.

    Theoretical physicists build models and run them on supercomputers to simulate the formation of atomic nuclei and study their structures and interactions. Theoretical predictions can then be compared with data from experiments at new facilities producing increasingly neutron-rich nuclei. If the observations are close to the predictions, the models are validated.

    ‘Random walk’

    “I never planned to become a physicist or end up at Oak Ridge,” said Hagen, who hails from Norway. “That was a random walk.”

    Graduating from high school in 1994, he planned to follow in the footsteps of his father, an economics professor, but his grades were not good enough to get into the top-ranked Norwegian School of Economics in Bergen. A year of mandatory military service in the King’s Guard gave Hagen fresh perspective on his life. At 20, he entered the University of Bergen and earned a bachelor’s degree in the philosophy of science. Wanting to continue for a doctorate, but realizing he lacked math and science backgrounds that would aid his dissertation, he signed up for classes in those fields—and a scientist was born. He went on to earn a master’s degree in nuclear physics.

    Entering a PhD program, he used pen and paper or simple computer codes for calculations of the Schrödinger equation pertaining to two or three particles. One day his advisor introduced him to University of Oslo professor Morten Hjorth-Jensen, who used advanced computing to solve physics problems.

    “The fact that you could use large clusters of computers in parallel to solve for several tens of particles was intriguing to me,” Hagen said. “That changed my whole perspective on what you can do if you have the right resources and employ the right methods.”

    Hagen finished his graduate studies in Oslo, working with Hjorth-Jensen and taking his computing class. In 2005, collaborators of his new mentor—ORNL’s David Dean and the University of Tennessee’s Thomas Papenbrock—sought a postdoctoral fellow. A week after receiving his doctorate, Hagen found himself on a plane to Tennessee.

    For his work at ORNL, Hagen used a numerical technique to describe systems of many interacting particles, such as atomic nuclei containing protons and neutrons. He collaborated with experts worldwide who were specializing in different aspects of the challenge and ran his calculations on some of the world’s most powerful supercomputers.

    “Computing had taken such an important role in the work I did that having that available made a big difference,” he said. In 2008, he accepted a staff job at ORNL.

    That year Hagen found another reason to stay in Tennessee—he met the woman who became his wife. She works in TV production and manages a vintage boutique in downtown Knoxville.

    Hagen, his wife and stepson spend some vacations at his father’s farm by the sea in northern Norway. There the physicist enjoys snowboarding, fishing and backpacking, “getting lost in remote areas, away from people, where it’s quiet and peaceful. Back to the basics.”

    Summiting

    Hagen won a DOE early career award in 2013. Today, his research employs applied mathematics, computer science and physics, and the resulting descriptions of atomic nuclei enable predictions that guide earthly experiments and improve understanding of astronomical phenomena.

    A central question he is trying to answer is: what is the size of a nucleus? The difference between the radii of neutron and proton distributions—called the “neutron skin”— has implications for the equation-of-state of neutron matter and neutron stars.
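    For reference, the neutron skin is conventionally defined as the difference between the root-mean-square radii of the neutron and proton distributions:

```latex
\Delta R_{\mathrm{skin}} = \sqrt{\langle r_n^{2} \rangle} - \sqrt{\langle r_p^{2} \rangle}
```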

    In 2015, a team led by Hagen predicted properties of the neutron skin of the calcium-48 nucleus; the results were published in Nature Physics. In progress or planned are experiments by others to measure various neutron skins. The COHERENT experiment at ORNL’s Spallation Neutron Source did so for argon-40 by measuring how neutrinos—particles that interact only weakly with nuclei—scatter off of this nucleus. Studies of parity-violating electron scattering on lead-208 and calcium-48—topics of the PREX2 and CREX experiments, respectively—are planned at Thomas Jefferson National Accelerator Facility.

    One recent calculation in a study Hagen led solved a 50-year-old puzzle about why beta decays of atomic nuclei are slower than expected based on the beta decays of free neutrons. Other calculations explore isotopes to be made and measured at DOE’s Facility for Rare Isotope Beams, under construction at Michigan State University, when it opens in 2022.

    Hagen’s team has made several predictions about neutron-rich nuclei observed at experimental facilities worldwide. For example, 2016 predictions for the magicity of nickel-78 were confirmed at RIKEN in Japan and published in Nature this year. Now the team is developing methods to predict behavior of neutron-rich isotopes beyond nickel-78 to find out how many neutrons can be added before a nucleus falls apart.

    “Progress has exploded in recent years because we have methods that scale more favorably with the complexity of the system, and we have ever-increasing computing power,” Hagen said. At the Oak Ridge Leadership Computing Facility, he has worked on Jaguar (1.75 peak petaflops), Titan (27 peak petaflops) and Summit [above] (200 peak petaflops) supercomputers. “That’s changed the way that we solve problems.”

    ORNL OLCF Jaguar Cray Linux supercomputer

    ORNL Cray XK7 Titan Supercomputer, once the fastest in the world, to be decommissioned

    His team currently calculates the probability of a process called neutrino-less double-beta decay in calcium-48 and germanium-76. This process has yet to be observed but if seen would imply the neutrino is its own anti-particle and open a path to physics beyond the Standard Model of Particle Physics.

    Looking to the future, Hagen eyes “superheavy” elements—lead-208 and beyond. Superheavies have never been simulated from first principles.

    “Lead-208 pushes everything to the limits—computing power and methods,” he said. “With this next generation computer, I think simulating it will be possible.”

    See the full article here .


     
  • richardmitnick 10:15 am on October 13, 2019 Permalink | Reply
    Tags: HPC in Australia, insideHPC, The Pawsey Supercomputing Centre

    From insideHPC: “Video: The Pawsey Supercomputing Centre, SKA, and HPC in Australia” 

    From insideHPC

    October 12, 2019
    Rich Brueckner

    Magnus Cray XC40 supercomputer at Pawsey Supercomputer Centre Perth Australia

    Galaxy Cray XC30 Series Supercomputer at Pawsey Supercomputer Centre Perth Australia


    In this video from the HPC User Forum at Argonne, Mark Stickells presents: HPC and Data Down Under: The Pawsey Supercomputing Centre, SKA, and HPC in Australia.

    “The Pawsey Supercomputing Centre is an unincorporated joint venture between CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia. It is supported by the Western Australian and Federal Governments. The Centre is one of two, Tier-1, High Performance Computing facilities in Australia, whose primary function is to accelerate scientific research for the benefit of the nation. Our service and expertise in supercomputing, data, cloud services and visualisation, enables research across a spread of domains including astronomy, life sciences, medicine, energy, resources and artificial intelligence.”


    See the full article here .


     
  • richardmitnick 1:29 pm on October 9, 2019 Permalink | Reply
    Tags: "Harvard Names New Lenovo HPC Cluster after Astronomer Annie Jump Cannon", , , insideHPC,   

    From insideHPC: “Harvard Names New Lenovo HPC Cluster after Astronomer Annie Jump Cannon” 

    From insideHPC

    October 9, 2019

    Harvard has deployed a liquid-cooled supercomputer from Lenovo at its FASRC computing center. The system, named “Cannon” in honor of astronomer Annie Jump Cannon, is a large-scale HPC cluster supporting scientific modeling and simulation for thousands of Harvard researchers.

    Assembled with the support of the Faculty of Arts and Sciences but now serving many Harvard units, Cannon occupies more than 10,000 square feet, with hundreds of racks spanning three data centers separated by 100 miles. The primary compute is housed in MGHPCC, FASRC’s green (LEED Platinum) data center in Holyoke, MA. Other systems, including storage, login, virtual machines, and specialty compute, are housed in FASRC’s Boston and Cambridge facilities.


    “This new cluster will have 30,000 cores of Intel 8268 “Cascade Lake” processors. Each node will have 48 cores and 192 GB of RAM. The interconnect is HDR 100 Gbps Infiniband (IB) connected in a single Fat Tree with 200 Gbps IB core. The entire system is water cooled which will allow us to run these processors at a much higher clock rate of ~3.4GHz. In addition to the general purpose compute resources we are also installing 16 SR670 servers each with four Nvidia V100 GPUs and 384 GB of RAM all connected by HDR IB.”


    Highlights:

    Compute: The Cannon cluster is primarily comprised of 670 Lenovo SD650 NeXtScale servers, part of their new liquid-cooled Neptune line. Each chassis unit contains two nodes, each containing two Intel 8268 “Cascade Lake” processors and 192GB RAM per node. The nodes are interconnected by HDR 100 Gbps Infiniband (IB) in a single Fat Tree with a 200 Gbps IB core. The liquid cooling allows for efficient heat extraction while running higher clock speeds.
    Storage: FASRC now maintains over 40 PB of storage, and this keeps growing. Robust home directories are housed on enterprise-grade Isilon storage, while faster Lustre filesystems serve more performance-driven needs such as scratch and research shares. Our middle tier laboratory storage uses a mix of Lustre, Gluster and NFS filesystems. See our storage page for more details.
    Interconnect: Odyssey has two underlying networks: A traditional TCP/IP network and low-latency InfiniBand networks that enable high-throughput messaging for inter-node parallel-computing and fast access to Lustre mounted storage. The IP network topology connects the three data centers together and presents them as a single contiguous environment to FASRC users.
    Software: The core operating system is CentOS. FASRC maintains the configuration of the cluster and all related machines and services via Puppet. Cluster job scheduling is provided by SLURM (Simple Linux Utility for Resource Management) across several shared partitions, processing approximately 29,000,000 jobs per year.
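    The node-level numbers above imply the following aggregates (illustrative arithmetic only; the 48 cores per node quoted above correspond to two 24-core Intel Xeon 8268 processors):

```python
# Aggregates implied by the Cannon "Highlights" above.
nodes = 670
cores_per_node = 2 * 24          # two Intel Xeon 8268 "Cascade Lake" CPUs, 24 cores each
ram_per_node_gb = 192

print(f"{nodes * cores_per_node:,} cores total")            # 32,160 (article rounds to ~30,000)
print(f"{nodes * ram_per_node_gb / 1024:.0f} TB of RAM")     # ~126 TB across the main partition
print(f"{29_000_000 / (365 * 24 * 60):.0f} jobs per minute, on average")   # ~55
```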


    See the full article here .


     
  • richardmitnick 12:42 pm on October 8, 2019 Permalink | Reply
    Tags: insideHPC, Traverse supercomputer

    From insideHPC: “Traverse Supercomputer to accelerate fusion research at Princeton” 

    From insideHPC

    October 8, 2019
    Rich Brueckner

    Princeton’s High-Performance Computing Research Center recently hosted a ribbon-cutting ceremony for Traverse, its newest supercomputer. Traverse is a 1.4 petaflop HPC cluster that ranks among the top 500 systems in the world.

    “Traverse is a mini version of ORNL’s Summit, thereby providing a stepping stone for the research community to one of the world’s fastest supercomputers,” said Curt Hillegas, associate CIO, Research Computing. “Getting experience using Traverse will allow our research groups to adapt their codes, so they can use the current leadership-class machines and be best prepared for the new exascale systems — capable of at least one exaFLOPS, or a billion billion calculations per second — expected to come online in the upcoming two years.”

    Exascale speeds are expected to help fusion researchers finally clear the remaining hurdles in the development of safe and sustainable fusion energy. “At that scale we will be able to simulate and optimize fusion reactors, speeding the deployment of fusion energy in the global battle against climate change,” explained Steven Cowley, PPPL director. “We are very grateful to the University for this marvelous facility.”

    Shown at Princeton’s Sept. 30 ribbon-cutting for the Traverse supercomputer are, from left to right: Craig Ferguson, deputy director for operations and chief operating officer at the Princeton Plasma Physics Laboratory (PPPL); Steven Cowley, director of PPPL; David McComas, Princeton University’s vice president for PPPL; Chelle Reno, Princeton University’s assistant vice president for operations for PPPL; and Jay Dominick, Princeton University’s vice president for information technology and chief information officer. Photo: Denise Applewhite, Office of Communications.

    Plasma, the hot ionized gas that fuels fusion reactions, must be heated to very high temperatures for the particles to fuse and release their energy. The focus of much fusion research is preventing the swings in density and temperature that cause instabilities such as plasma disruptions, edge localized modes and energetic-particle driven modes. Machine learning (ML) techniques are helping researchers create better models for rapid control and containment of plasma.

    “Artificial intelligence (AI) and machine learning techniques could be a game changer,” said C.S. Chang, who heads the Center for High-fidelity Boundary Plasma Simulation at PPPL. “Due to the complicated nonlinear physics involved in these problems, using a supercomputer became a necessity for theoretical understanding. PPPL scientists will use Traverse to attack many of these problems in experiments, to collaborate with domestic and international researchers, and to help predict plasma performance in ITER, the international plasma research project using the world’s largest magnetic fusion device, or tokamak.”

    The AI advantages for scientific discovery are numerous, explained Chang. The hope is that equations will be solved much faster without going through traditional time-consuming numerical processes; experimental and theoretical data will be used to formulate simple equations that govern the physics processes; and the plasma will be controlled almost instantaneously, in millisecond time-frames too fast for human intervention.

    “A GPU-dominated computer such as Traverse is ideal for such AI/ML studies,” said Chang. “Solving these and other important physics and AI/ML problems on Traverse will greatly enhance the capabilities of graduate students, postdoctoral scientists and researchers, and their ability to advance these highly impactful areas in fusion and computational science research worldwide.”

    “Traverse is a major initiative in the University-DOE partnership,” McComas said. “Princeton and the U.S. Department of Energy have a long-standing commitment to the shared missions of fundamental research, world-leading education and fusion as a safe energy source. With the launch of Traverse, we look forward to even stronger connections between the University, PPPL and the DOE, and to accelerating leading-edge research needed to make fusion an abundant, safe and sustainable energy source for the U.S. and humanity.”

    See the full article here .


     
  • richardmitnick 10:39 am on October 7, 2019 Permalink | Reply
    Tags: "Supercomputing Earth’s Geomagnetism with Blue Waters", , insideHPC   

    From insideHPC: “Supercomputing Earth’s Geomagnetism with Blue Waters” 

    October 6, 2019


    Researchers are using the Blue Waters supercomputer at NCSA to better understand geomagnetic variations and their underlying mechanisms, so that better forecasting models can be developed.

    NCSA U Illinois Urbana-Champaign Blue Waters Cray Linux XE/XK hybrid machine supercomputer

    Deep in the center of the Earth is a fluid outer core that generates Earth’s magnetic field like a magnet, with its two magnetic poles aligning closely with the geographic north and south poles. This alignment has long been used by mankind for navigation. But the magnetic field of the Earth plays a far more critical role in protecting the Earth’s habitats by providing a strong magnetic shield that deflects the solar wind, coronal mass ejections and solar energetic particles.


    But the Earth’s core is not a conventional magnet: its magnetic field, called the geomagnetic field, changes substantially in both space and time due to turbulent dynamo action within the core. Thus it is very challenging to accurately predict geomagnetic variations even several years into the future. Dr. Nikolaos Pavlis, a scientist with the National Geospatial-Intelligence Agency, and Dr. Weijia Kuang, a geophysicist in the Geodesy & Geophysics Lab at the NASA Goddard Space Flight Center in Greenbelt, Maryland, have been using the Blue Waters supercomputer to better understand this complex phenomenon.

    Dr. Weijia Kuang from NASA Goddard

    Dr. Kuang recently answered some questions about this research via email.

    Q: What can you tell about your research?

    A: The collaboration that we are working on is in the area of geomagnetism, an important discipline of Earth science. The research goal of this collaboration is, in one sentence, utilizing geomagnetic observations and geodynamo models to make accurate forecasts of geomagnetic temporal variation on five-year to 20-year time scales. To reach this goal, we focus first on numerically understanding the forecast accuracy convergence with the ensemble size used for the ensemble Kalman filter type algorithm employed in our system.

    There are a lot of technical details embedded in the above short description. Therefore I am writing a few more details and hope they are helpful. It is well known that Earth has possessed a strong magnetic field (called the geomagnetic field) for much of its history (~4.5 billion years). This field is dominantly dipolar at the Earth’s surface and aligns approximately with the spin axis of the Earth, with the two poles pointing approximately north and south, respectively. The field is similar to the magnetic field of a simple bar magnet. This north-south alignment has been used by mankind for navigation for several thousands of years.

    Like many other geophysical quantities, the geomagnetic field changes in time and in space. Its changes in time are called “secular variation” (SV). Such changes are due to vigorous fluid motion, called convection, in the Earth’s fluid outer core which is approximately 3000 km below the surface.

    The fundamental geodynamical process governing the core convection and the geomagnetic field is called “geodynamo.” At present, numerical modeling is the main tool to understand this dynamical process, its consequence on geomagnetic variation that is observable at the Earth’s surface, and its relevance to Earth’s evolution on geological time scales. Effort on accurate forecast of SV serves both the fundamental science and societal application needs.
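    The ensemble Kalman filter mentioned in the answer above is a standard data-assimilation scheme: an ensemble of model states is nudged toward observations in proportion to covariances estimated from the ensemble itself. The following is a minimal, generic numpy sketch of one stochastic (perturbed-observation) analysis step, for illustration only; it is not the group’s assimilation code.

```python
import numpy as np

def enkf_update(X, y, H, R, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    X : (n, N) ensemble of model states    y : (m,) observations
    H : (m, n) observation operator        R : (m, m) observation error covariance
    """
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)          # state anomalies
    HX = H @ X
    HA = HX - HX.mean(axis=1, keepdims=True)       # observation-space anomalies
    P_yy = HA @ HA.T / (N - 1) + R                 # innovation covariance
    P_xy = A @ HA.T / (N - 1)                      # state/observation cross-covariance
    K = np.linalg.solve(P_yy, P_xy.T).T            # Kalman gain, K = P_xy P_yy^{-1}
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X + K @ (Y - HX)                        # updated (analysis) ensemble

# Toy usage: a 6-variable state, 2 observations, 20 ensemble members.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))
X_new = enkf_update(X, rng.normal(size=2), np.eye(2, 6), 0.1 * np.eye(2), rng)
```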

    Q. How are you using Blue Waters for this research?

    A: We use Blue Waters for two main research tasks: (1) obtaining large ensemble of high-resolution geodynamo simulation solutions; and for (2) testing of forecast accuracy convergence with the ensemble size. These two can provide the knowledge on optimal ensemble sizes for geomagnetic forecast with given numerical resolutions and forecast accuracies. As you will find in the answers to the next two questions, the optimal ensemble size ensures cost-effective means for our research.

    Q. How many cores are you using on Blue Waters? How long do your runs take?

    A: Our project is computationally expensive. If we use 128 cores as the nominal usage for a single geodynamo simulation run, then 512 simultaneous runs will use 65,536 cores (or 2,048 nodes). However, due to research and technical reasons, we have tested so far only 1,024 cores (32 nodes).

    Q. Would this research be possible without Blue Waters?

    A. One main bottleneck of our research is the computing resource, in particular the CPU time. A typical geodynamo simulation requires ~10^13 floating-point operations or “flops” with our current numerical resolution (100 × 100 × 100 in the three-dimensional space) and will require ~10^17 flops (100 petaflops) if higher resolutions are used for “Earth-like” parameters. Geomagnetic data assimilation can require three orders of magnitude more CPU time with ~1,000 ensemble members. If we look at it from the wall-clock time perspective, a single geodynamo simulation run can take up to two weeks (depending on numerical resolution and number of nodes used) on Blue Waters. Therefore, an ensemble of 512 simulation runs (which is expected to be typical) could last 10 years if they were executed sequentially. Blue Waters will enable us to have the entire set of ensemble runs executed simultaneously (parallel computation), thus allowing assimilation runs to be completed in a time frame comparable to that of a single run. Without Blue Waters (or any comparable computing facilities), we would have to scale back our ensemble size in order to complete all simulations within a reasonable time frame. This will certainly limit our ability to achieve meaningful research and application goals.
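    Rough arithmetic on the figures quoted in this answer (the one-week-per-run figure is an assumption; the answer says runs can take up to two weeks):

```python
# Ensemble-sizing arithmetic from the figures quoted above.
cores_per_run = 128
ensemble_size = 512
total_cores = cores_per_run * ensemble_size
print(f"{total_cores:,} cores ({total_cores // 32:,} nodes at 32 cores/node)")  # 65,536 / 2,048

# Sequential vs. parallel wall-clock time, assuming ~1 week per geodynamo run.
weeks_per_run = 1
print(f"Sequential: ~{weeks_per_run * ensemble_size / 52:.0f} years; "
      f"all runs in parallel: ~{weeks_per_run} week")   # ~10 years vs ~1 week
```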

    See the full article here .


     
  • richardmitnick 11:26 am on September 29, 2019 Permalink | Reply
    Tags: EMC3, insideHPC

    From insideHPC: “LANL teams with Arm for Extreme-scale Computing” 

    From insideHPC

    September 29, 2019


    Los Alamos National Laboratory and Arm are teaming up to make efficient, workload-optimized processors tailored to the extreme-scale computing requirements of the Laboratory’s national-security mission. The collaboration addresses the challenges of connecting more and more processors as high performance computers become larger and more powerful.


    “We are designing supercomputer systems today that will be deployed in the coming decade, and efficient processors and computer systems tailored to the Laboratory’s specific workloads are critical to maintaining our national leadership in HPC,” said Stephen Poole, chief HPC architect at Los Alamos National Laboratory. “With these new technologies, we will be able to expand and refine our research capabilities to enhance discoveries and increase our overall efficiencies in our mission applications.”

    High performance computers play a pivotal role in Los Alamos’ mission of maintaining the nation’s nuclear stockpile and understanding complicated physics through extreme-scale simulations that can take months to complete on today’s fastest computers. Along with Los Alamos, leaders in other industries, such as energy and film, will benefit from more efficient computers. The petroleum industry utilizes supercomputers to simulate and analyze underground petroleum reservoirs and geological substrates to guide investments of hundreds of millions of dollars in developing and processing oil fields. The film industry relies heavily on HPC systems to create and render detailed animations in blockbuster movies.

    Arm’s highly flexible processor architectures are well-suited for tailored design applications and potential customization for the Laboratory’s extreme-scale computing needs. Los Alamos’ applications will benefit directly from Arm’s work within the collaboration to build out the software and hardware ecosystems. With Los Alamos, the collaboration will focus on future proposed enhancements to the overall processor architecture to meet the demands of information discovery at scale for the lab’s next-generation applications.

    “This close collaboration with EMC3 is expected to start bearing performance results in near-term systems as well as some of our future systems in design,” said Gary Grider, division leader for HPC at Los Alamos National Laboratory.


    The collaborative development will be done under the Efficient Mission Centric Computing Consortium (EMC3), centered at LANL’s Ultra-Scale Systems Research Center (USRC). The EMC3 consortium’s mission is to investigate efficient ultra-scale computing and networking architectures, applications and environments, and to provide the most efficient computing architectures needed for U.S. industry and national security.

    “The collaboration between Arm and Los Alamos represents a significant milestone in enabling next generation ultra-large scale HPC,” said Eric Van Hensbergen, Arm Research Fellow. “High-end ultra-scale compute users will benefit from future tailored enhancements to the base level Arm designs, including architecture blocks and cache management, which will have a direct impact on efficiency and performance for workload-optimized processors.”

    See the full article here .


     
  • richardmitnick 3:33 pm on September 20, 2019 Permalink | Reply
    Tags: "A Simulation Booster for Nanoelectronics", , , insideHPC,   

    From insideHPC: “A Simulation Booster for Nanoelectronics” 

    From insideHPC

    September 20, 2019
    Simone Ulmer

    ETH Zurich bloc

    ETH Zürich

    Self-heating in a so-called fin field-effect transistor (FinFET) at high current densities. Each constituent silicon atom is colored according to its temperature (Image: Jean Favre, CSCS)

    Two research groups from ETH Zürich have developed a method that can simulate nanoelectronics devices and their properties realistically, quickly and efficiently. This offers a ray of hope for the industry and data centre operators alike, both of which are struggling with the (over)heating that comes with increasingly small and powerful transistors – and with the high resulting electricity costs for cooling.

    Chip manufacturers are already assembling transistors that measure just a few nanometres across. They are much smaller than a human hair, whose diameter is approximately 20,000 nanometres in the case of finer strands. Now, demand for increasingly powerful supercomputers is driving the industry to develop components that are even smaller and yet more powerful at the same time.

    One of the 2019 Gordon Bell Prize Finalists

    However, in addition to physical laws that make it harder to build ultra-scaled transistors, the problem of the ever increasing heat dissipation is putting manufacturers in a tricky situation – partly due to steep rises in cooling requirements and the resulting demand for energy. Cooling the computers already accounts for up to 40 percent of power consumption in some data centres, as the research groups led by ETH professors Torsten Hoefler and Mathieu Luisier report in their latest study, which they hope will allow a better approach to be developed. With their study, the researchers are now nominated for the ACM Gordon Bell Prize, the most prestigious prize in the area of supercomputers, which is awarded annually at the SC supercomputing conference in the United States.

    To make today’s nanotransistors more efficient, the research group led by Luisier from the Integrated Systems Laboratory (IIS) at ETH Zürich simulates transistors using software named OMEN, which is a so-called quantum transport simulator. OMEN runs its calculations based on what is known as density functional theory (DFT), allowing a realistic simulation of transistors in atomic resolution and at the quantum mechanical level. This simulation visualises how electrical current flows through the nanotransistor and how the electrons interact with crystal vibrations, thus enabling researchers to precisely identify locations where heat is produced. In turn, OMEN also provides useful clues as to where there is room for improvement.

    Improving transistors using optimised simulations

    Until now, conventional programming methods and supercomputers only permitted researchers to simulate heat dissipation in transistors consisting of around 1,000 atoms, as data communication between the processors and memory requirements made it impossible to produce a realistic simulation of larger objects. Most computer programs do not spend most of their time performing computing operations, but rather moving data between processors, main memory and external interfaces. According to the scientists, OMEN also suffered from a pronounced bottleneck in communication, which curtailed performance. “The software is already used in the semiconductor industry, but there is considerable room for improvement in terms of its numerical algorithms and parallelisation,” says Luisier.

    Until now, the parallelization of OMEN was designed according to the physics of the electro-thermal problem, as Luisier explains. Now, Ph.D. student Alexandros Ziogas and the postdoc Tal Ben-Nun – working under Hoefler, head of the Scalable Parallel Computing Laboratory at ETH Zürich – have not looked at the physics but rather at the dependencies between the data. They reorganised the computing operations according to these dependencies, effectively without considering the underlying physics. In optimising the code, they had the help of two of the most powerful supercomputers in the world – “Piz Daint” at the Swiss National Supercomputing Centre (CSCS) and “Summit” at Oak Ridge National Laboratory in the US, the latter being the fastest supercomputer in the world.

    Cray Piz Daint Cray XC50/XC40 supercomputer of the Swiss National Supercomputing Center (CSCS)

    ORNL IBM AC922 SUMMIT supercomputer, No.1 on the TOP500. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy

    According to the researchers, the resulting code – dubbed DaCe OMEN – produced simulation results that were just as precise as those from the original OMEN software.

    For the first time, DaCe OMEN has reportedly made it possible for researchers to produce a realistic simulation of transistors ten times the size, made up of 10,000 atoms, on the same number of processors – and up to 14 times faster than the original method took for 1,000 atoms. Overall, DaCe OMEN is more efficient than OMEN by two orders of magnitude: on Summit, it was possible to simulate, among other things, a realistic transistor up to 140 times faster, with a sustained performance of 85.45 petaflops – and indeed to do so in double precision on 4,560 compute nodes. This extreme boost in computing speed has earned the researchers a nomination for the Gordon Bell Prize.
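    Put another way, using the ~200 peak petaflops figure quoted for Summit earlier on this page (an approximate, nominal number):

```python
# Per-node and whole-machine view of the sustained DaCe OMEN performance.
sustained_pflops = 85.45
nodes = 4_560
print(f"{sustained_pflops * 1e3 / nodes:.1f} TFLOP/s sustained per node")      # ~18.7
print(f"~{sustained_pflops / 200:.0%} of Summit's ~200 PFLOP/s nominal peak")  # ~43%
```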

    Data-centric programming

    The scientists achieved this optimisation by applying the principles of data-centric parallel programming (DAPP), which was developed by Hoefler’s research group. Here, the aim is to minimise data transport and therefore communication between the processors. “This type of programming allows us to very accurately determine not only where this communication can be improved on various levels of the program, but also how we can tune specific computing-intensive sections, known as computational kernels, within the calculation for a single state,” says Ben-Nun. This multilevel approach makes it possible to optimise an application without having to rewrite it every time. Data movements are also optimised without modifying the original calculation – and for any desired computer architecture. “When we optimise the code for the target architecture, we’re now only changing it from the perspective of the performance engineer, and not that of the programmer – that is, the researcher who translates the scientific problem into code,” says Hoefler. This, he says, leads to the establishment of a very simple interface between computer scientists and interdisciplinary programmers.
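    As a generic illustration of the data-movement idea (plain numpy, not DaCe or OMEN code): the two functions below compute the same quantity, but the second streams the array once in chunks instead of materialising intermediate arrays, cutting memory traffic without changing the arithmetic.

```python
import numpy as np

x = np.random.rand(10_000_000)

def unfused(x):
    # Three separate passes over the data, with two large temporaries.
    a = np.sin(x)
    b = a * a
    return b.sum()

def fused(x):
    # One streaming pass in cache-friendly chunks, no large temporaries.
    total = 0.0
    for chunk in np.array_split(x, 100):
        s = np.sin(chunk)
        total += np.dot(s, s)   # sum of sin(x)**2 for this chunk
    return total

assert np.isclose(unfused(x), fused(x))
```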

    The application of DaCe OMEN has shown that the most heat is generated near the end of the nanotransistor channel and revealed how it spreads from there and affects the whole system. The scientists are convinced that the new process for simulating electronic components of this kind has a variety of potential applications. One example is in the production of lithium batteries, which can lead to some unpleasant surprises when they overheat.

    Data-centric programming is an approach that ETH Professor Torsten Hoefler has been pursuing for a number of years with a goal of putting the power of supercomputers to more efficient use. In 2015, Hoefler received an ERC Starting Grant for his project, Data Centric Parallel Programming (DAPP).

    See the full article here .


     