Tagged: HPC

  • richardmitnick 2:00 pm on June 15, 2021 Permalink | Reply
    Tags: "Forthcoming revolution will unveil the secrets of matter", European High Performance Computer Joint Undertaking (EU), Exaflop computers, HPC

    From CNRS-The National Center for Scientific Research [Centre national de la recherche scientifique] (FR) : “Forthcoming revolution will unveil the secrets of matter” 

    Martin Koppe

    ©Sikov /Stock.Adobe.com

    Provided adapted software can be developed, exascale computing, a new generation of supercomputers, will offer massive power to model the properties of molecules and materials, while taking into account their fundamental interactions and quantum mechanics. The TREX-Targeting Real Chemical accuracy at the EXascale (EU) project is set to meet the challenge.

    One quintillion operations per second. Exaflop computers – from the prefix exa-, or 10^18, and flops, the number of floating-point operations that a computer can perform in one second – will offer this colossal computing power, as long as specifically designed programs and codes are available. An international race is thus underway to produce these impressive machines, and to take full advantage of their capacities. The European Commission is financing ambitious projects that are preparing the way for exascale, which is to say any form of high-performance computing that reaches an exaflop. The Targeting Real Chemical accuracy at the EXascale (TREX)[1] programme focuses on highly precise computing methods in the fields of chemistry and materials physics.

    Compute nodes of the Jean Zay supercomputer, the first French converged supercomputer between intensive calculations and artificial intelligence. After its extension in the summer of 2020, it attained 28 petaflops, or 28 quadrillion operations per second, thanks to its 86,344 cores supported by 2,696 GPU accelerators.
    © Cyril FRESILLON / IDRIS / CNRS Photothèque.

    Officially inaugurated in October 2020, TREX is part of the broader European High Performance Computing Joint Undertaking (EuroHPC JU), whose goal is to ensure Europe is a player alongside the United States and China in exascale computing. “The Japanese have already achieved exascale by lowering computational precision,” enthuses Anthony Scemama, a researcher at the LCPQ-Laboratoire de Chimie et Physique Quantiques (FR),[2] and one of the two CNRS coordinators of TREX. “A great deal of work remains to be done on codes if we want to take full advantage of these future machines.”

    Exascale computing will probably use GPUs as well as traditional processors, or CPUs. These graphics processors were originally developed for video games, but they have enjoyed increasing success in data-intensive computing applications. Here again, their use will entail rewriting programs to fully harness their power for those applications that will need it.

    “Chemistry researchers already have various computing techniques for producing simulations, such as modelling the interaction of light with a molecule,” Scemama explains. “TREX focuses on cases where the computing methods for a realistic and predictive description of the physical phenomena controlling chemical reactions are too costly.”

    “TREX is an interdisciplinary project that also includes physicists,” stresses CNRS researcher and project coordinator Michele Casula, at the Institute of Mineralogy, Material Physics and Cosmochemistry [Institut de minéralogie, de physique des matériaux et de cosmochimie] (FR).[3] “Our two communities need computing methods that are powerful enough to accurately predict the behaviour of matter, which often requires far too much computation time for conventional computers.”
    The TREX team has identified several areas for applications. First of all, and surprising though it may seem, the physicochemical properties of water have not been sufficiently modelled. The best ab initio simulations – those based on fundamental interactions – are wrong by a few degrees when trying to estimate its boiling point.

    Improved water models will enable us to more effectively simulate the behaviour of proteins, which continually evolve in aqueous environments. The applications being developed in connection with the TREX project could have a significant impact on research in biology and pharmacy. For example, nitrogenases, which make essential contributions to life, transform nitrogen gas into ammonia, a form that can be used by organisms. However, the theoretical description of the physicochemical mechanisms used by this enzyme is not accurate enough under current models. Exascale computing should also improve experts’ understanding of highly correlated materials such as superconductors, which are characterised by the substantial interactions between the electrons they are made of.

    “The microscopic understanding of their functioning remains an unresolved issue, one that has nagged scientists ever since the 1980s,” Casula points out. “It is one of the major open problems in condensed matter physics. When mastered, these materials will, among other things, be able to transport electricity with no loss of energy.” 2D materials are also involved, especially those used in solar panels to convert light into power.

    “To model matter using quantum mechanics means relying on equations that become exponentially more complex, such as the Schrödinger equation, whose number of coordinates increases with the system,” Casula adds. “In order to solve them in simulations, we either have to use quantum computers, or further explore the power of silicon analogue chips with exascale computing, along with suitable algorithms.”

    To achieve this, TREX members are counting on Quantum Monte Carlo (QMC), and developing libraries to integrate it into existing codes. “We are fortunate to have a method that perfectly matches exascale machines,” Scemama exclaims. QMC is particularly effective at numerically calculating observable values – the quantum equivalent of classical physical values – bringing into play quantum interactions between multiple particles.

    Modelling of electron trajectories in an aggregate of water, created by the QMC programme developed at the LCPQ in Toulouse (southwestern France). © Anthony Scemama / Laboratoire de Chimie et Physique Quantiques.

    “The full computation of these observables is too complex,” Casula stresses. “Accurately estimating them using deterministic methods could take more time than the age of the Universe. Simply put, QMC will not solve everything, but instead provides a statistical sampling of results. Exaflop computers could draw millions of samples per second, and thanks to statistical tools such as the central limit theorem, the more of these values we have, the closer we get to the actual result. We can thus obtain an approximation that is accurate enough to help researchers, all within an acceptable amount of time.”
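
    The statistical argument Casula describes can be illustrated with a plain Monte Carlo estimate. The sketch below is not a QMC code and is not taken from TREX: it estimates a toy one-dimensional integral by random sampling (the integrand, sample counts, seed, and function names are all assumptions chosen for illustration) and reports the standard error, which the central limit theorem predicts will shrink roughly as one over the square root of the number of samples.

```python
import math
import random


def mc_estimate(f, n_samples, rng):
    """Estimate the mean of f(x) for x uniform on [0, 1), plus its standard error."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        value = f(rng.random())
        total += value
        total_sq += value * value
    mean = total / n_samples
    # Sample variance of f; clamp at zero to guard against rounding.
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    # Central limit theorem: the error of the mean scales as 1/sqrt(n_samples).
    std_error = math.sqrt(variance / n_samples)
    return mean, std_error


def integrand(x):
    """Toy integrand (an assumption for this sketch, not a physical observable)."""
    return math.exp(-x * x)


# Closed-form value of the integral of exp(-x^2) over [0, 1], used as a reference.
EXACT = 0.5 * math.sqrt(math.pi) * math.erf(1.0)

rng = random.Random(12345)  # fixed seed so the run is repeatable
results = []
for n in (100, 10_000, 1_000_000):
    mean, err = mc_estimate(integrand, n, rng)
    results.append((n, mean, err))
    print(f"n={n:>9,d}  estimate={mean:.6f}  std_error={err:.6f}  "
          f"abs_diff={abs(mean - EXACT):.6f}")
```

    Going from 100 to 1,000,000 samples should cut the reported standard error by a factor of about 100, which is why a machine that can draw far more samples per second reaches a given accuracy sooner.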

    With regard to the study of matter, an exascale machine can provide a good description of the electron cloud and its interaction with nuclei. That is not the only advantage. “When configured properly, these machines may use thirty times more energy than classical supercomputers, but in return will produce a thousand times more computing power,” Scemama believes. “Researchers could launch very costly calculations, and use the results to build simpler models for future use.”

    The TREX team nevertheless insists that above all else, it creates technical and predictive tools for other researchers, who will then seek to develop concrete applications. Ongoing exchanges have made it possible to share best practices and feedback among processor manufacturers, physicists, chemists, researchers in high-performance computing, and TREX’s two computing centres.


    [1] In addition to the CNRS, the project includes Versailles Saint-Quentin-en-Yvelines University [Université de Versailles Saint-Quentin-en-Yvelines – UVSQ] (FR), the University of Twente [Universiteit Twente] (NL), the University of Vienna [Universität Wien] (AT), Lodz University of Technology [Politechnika Łódzka] (PL), the International School for Advanced Studies [Scuola Internazionale Superiore di Studi Avanzati] (IT), the MPG Institutes (DE), and the Slovak University of Technology in Bratislava [Slovenská technická univerzita v Bratislave] (STU) (SK), as well as the Cineca (IT) and Jülich Supercomputing Centre [Forschungszentrum Jülich] (DE) supercomputing centres, and the MEGWARE [Deutsche Megware] Computer HPC Systems & Solutions (DE) and Trust-IT Services | Phidias (FR) companies.
    [2] Laboratoire de chimie et physique quantiques (CNRS / Université Toulouse III – Paul Sabatier).
    [3] CNRS / National Museum of Natural History [Muséum National d’Histoire Naturelle] (MNHN) (FR) / Sorbonne University [Sorbonne Université] (FR).

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    CNRS-The National Center for Scientific Research [Centre national de la recherche scientifique](FR) is the French state research organisation and is the largest fundamental science agency in Europe.

    In 2016, it employed 31,637 staff, including 11,137 tenured researchers, 13,415 engineers and technical staff, and 7,085 contractual workers. It is headquartered in Paris and has administrative offices in Brussels; Beijing; Tokyo; Singapore; Washington D.C.; Bonn; Moscow; Tunis; Johannesburg; Santiago de Chile; Israel; and New Delhi.

    The CNRS was ranked No. 3 in 2015 and No. 4 in 2017 by the Nature Index, which measures the largest contributors to papers published in 82 leading journals.

    The CNRS operates on the basis of research units, which are of two kinds: “proper units” (UPRs) are operated solely by the CNRS, and “joint units” (UMRs – French: Unité mixte de recherche) are run in association with other institutions, such as universities or INSERM. Members of joint research units may be either CNRS researchers or university employees (maîtres de conférences or professeurs). Each research unit has a numeric code attached and is typically headed by a university professor or a CNRS research director. A research unit may be subdivided into research groups (“équipes”). The CNRS also has support units, which may, for instance, supply administrative, computing, library, or engineering services.

    In 2016, the CNRS had 952 joint research units, 32 proper research units, 135 service units, and 36 international units.

    The CNRS is divided into 10 national institutes:

    Institute of Chemistry (INC)
    Institute of Ecology and Environment (INEE)
    Institute of Physics (INP)
    Institute of Nuclear and Particle Physics (IN2P3)
    Institute of Biological Sciences (INSB)
    Institute for Humanities and Social Sciences (INSHS)
    Institute for Computer Sciences (INS2I)
    Institute for Engineering and Systems Sciences (INSIS)
    Institute for Mathematical Sciences (INSMI)
    Institute for Earth Sciences and Astronomy (INSU)

    The National Committee for Scientific Research, which is in charge of the recruitment and evaluation of researchers, is divided into 47 sections (e.g. section 41 is mathematics, section 7 is computer science and control, and so on). Research groups are affiliated with one primary institute and an optional secondary institute; the researchers themselves belong to one section. For administrative purposes, the CNRS is divided into 18 regional divisions (including four for the Paris region).

    Some selected CNRS laboratories

    APC laboratory
    Centre d’Immunologie de Marseille-Luminy
    Centre d’Etude Spatiale des Rayonnements
    Centre européen de calcul atomique et moléculaire
    Centre de Recherche et de Documentation sur l’Océanie
    CINTRA (joint research lab)
    Institut de l’information scientifique et technique
    Institut de recherche en informatique et systèmes aléatoires
    Institut d’astrophysique de Paris
    Institut de biologie moléculaire et cellulaire
    Institut Jean Nicod
    Laboratoire de Phonétique et Phonologie
    Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
    Laboratory for Analysis and Architecture of Systems
    Laboratoire d’Informatique de Paris 6
    Laboratoire d’informatique pour la mécanique et les sciences de l’ingénieur
    Observatoire océanologique de Banyuls-sur-Mer

  • richardmitnick 3:43 pm on June 19, 2018 Permalink | Reply
    Tags: ClusterStor, Cray Introduces All Flash Lustre Storage Solution Targeting HPC, HPC, L300F a scalable all-flash storage solution, Lustre 2.11

    From HPC Wire: “Cray Introduces All Flash Lustre Storage Solution Targeting HPC” 

    June 19, 2018
    John Russell


    Citing the rise of IOPS-intensive workflows and more affordable flash technology, Cray today introduced the L300F, a scalable all-flash storage solution whose primary use case is to support high IOPS rates to and from a scratch storage pool in the Lustre file system. Cray also announced that sometime in August it would begin supporting Lustre 2.11, which was released in April. This rapid productizing of Lustre’s latest release is likely to be appreciated by the user community, which sometimes criticizes vendors for being slow to commercialize the latest features of the open source parallel file system.

    “Lustre 2.11 has been one of the drivers for us because it has unique performance enhancements, usability enhancements, and we think some of those features will pair nicely with a flash-based solution that’s sitting underneath the file system,” said Mark Wiertalla, product marketing director.

    The broader driver is the rise in use cases with demanding IOPS characteristics often including files of small size. Hard disk drives, by their nature, handle these workloads poorly. Cray cites AI, for example, as a good use case with high IOPS requirements.


    Here’s a brief description from Cray of how L300F fits into the Cray ClusterStor systems:

    Unlike the existing building blocks in the ClusterStor family, which use a 5U84 form factor (5 rack units high/84 drive slots) mainly for Hard Disk Drives (HDD), the L300F is a 2U24 form factor filled exclusively with Solid State Drives (SSD).
    Like the existing building blocks (L300 and L300N), the L300F features two embedded server modules in a high availability configuration for the Object Storage Server (OSS) functionality of the open source, parallel file system Lustre.
    Like the existing building blocks, the L300F converges the Lustre Object Storage Servers (OSS) and the Object Storage Targets (OST) in the same building block for linear scalability.
    Like all ClusterStor building blocks the L300F is purpose-engineered to deliver the most effective parallel file system storage infrastructure for the leadership class of supercomputing environments.

    The existing L300 model is an all-HDD Lustre solution, well suited for environments using applications with large, sequential I/O workloads. The L300N model, by contrast, is a hybrid SSD/HDD solution with flash-accelerated NXD software that redirects I/O to the appropriate storage medium, delivering cost-effective, consistent performance on mixed I/O workloads while shielding the application, file system and users from complexity through transparent flash acceleration.

    In positioning L300F, Cray said, “L300F enables users such as engineers, researchers and scientists to dramatically reduce the runtime of their applications allowing jobs to reliably complete within their required schedule, supporting more iterations and faster time to insight. Supplementing Cray’s ClusterStor portfolio with an all-flash storage option, the ClusterStor L300F integrates with and complements the existing L300/L300N models to provide a comprehensive storage architecture. It allows customers to address performance bottlenecks without needlessly overprovisioning HDD storage capacity, creating a cost-competitive solution for improved application run time.”


    Analysts are likewise bullish on flash. “Flash is poised to become an essential technology in every HPC storage solution,” said Jeff Janukowicz, IDC’s Research vice president, Solid State Drives and Enabling Technologies. “It has the unique role of satisfying the high-performance appetite of artificial intelligence applications even while helping customers optimize their storage budget for big data. With the ClusterStor L300F, Cray has positioned itself to be at the leading edge of next generation of HPC storage solutions.”

    According to Cray, the L300F simplifies storage management for storage administrators, allowing them to stand up a high-performance flash pool within their existing Lustre file system using existing tools and skills. “This eliminates the need for product-unique training or to administer a separate file system. Using ClusterStor Manager, administrators can reduce the learning curve and accelerate time-to-proficiency, thereby improving ROI. When coupled with Cray’s exclusive monitoring application Cray View for ClusterStor, administrators get an end-to-end view of Lustre jobs, network status and storage system performance. Cray View for ClusterStor provides visibility into job runtime variability, event correlation, trend analysis and offers custom alerts based on any selected metric,” according to the announcement.

    Price remains an issue for flash. It’s currently about 13X more expensive than hard disk on a per-terabyte basis. “But when flash is viewed on a dollar per IOPS basis, it is a small fraction of the cost compared to hard disk drives. What our customers are telling us is they have unlocked that secret. Now they can think about use cases and say here’s three of them that make sense immediately. That’s how they will deploy it. They’ll use it as a tactical tool,” said Wiertalla.

    “We see the L300F allowing many customers to start testing the waters with flash storage. We are seeing RFPs [and] we think we are going to see, as the delta in prices between flash and disk narrows over the next 3-5 years, that customers will find incrementally new use cases where flash becomes cost competitive and they will adopt it gradually. Maybe in the 2020s we’ll start to see customers think about putting file systems exclusively on flash.”
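
    Wiertalla's contrast between cost per terabyte and cost per IOPS can be made concrete with a small calculation. In the sketch below only the 13X price ratio comes from the article; the hard disk price and both IOPS-per-terabyte figures are hypothetical placeholders, not vendor data, and the class and variable names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Medium:
    """A storage medium described by price and random-I/O density."""
    name: str
    dollars_per_tb: float
    iops_per_tb: float

    @property
    def dollars_per_iops(self) -> float:
        # Cost of one unit of sustained IOPS for this medium.
        return self.dollars_per_tb / self.iops_per_tb


PRICE_RATIO = 13.0            # flash vs. disk per terabyte, as quoted in the article
HDD_DOLLARS_PER_TB = 30.0     # hypothetical placeholder
HDD_IOPS_PER_TB = 20.0        # hypothetical placeholder
FLASH_IOPS_PER_TB = 50_000.0  # hypothetical placeholder

hdd = Medium("hdd", HDD_DOLLARS_PER_TB, HDD_IOPS_PER_TB)
flash = Medium("flash", HDD_DOLLARS_PER_TB * PRICE_RATIO, FLASH_IOPS_PER_TB)

for medium in (hdd, flash):
    print(f"{medium.name:>5}: ${medium.dollars_per_tb:8.2f}/TB  "
          f"${medium.dollars_per_iops:.4f}/IOPS")

# Fraction of the disk cost that flash represents per unit of IOPS.
iops_cost_fraction = flash.dollars_per_iops / hdd.dollars_per_iops
print(f"flash costs {iops_cost_fraction:.2%} of disk per IOPS")
```

    Under these assumed figures flash is far cheaper per IOPS despite the 13X premium per terabyte; real numbers would shift the size of the gap, so the placeholders should be replaced with measured values before drawing conclusions.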

    Given that Cray is approaching the first anniversary of its acquisition of the ClusterStor portfolio, it is likely to showcase the line at ISC2018 (booth #E-921) next week (see HPCwire article, Cray Moves to Acquire the Seagate ClusterStor Line) and perhaps issue other news in its storage line.

    See the full article here .

    HPCwire is the #1 news and information resource covering the fastest computers in the world and the people who run them. With a legacy dating back to 1987, HPCwire has enjoyed a reputation for world-class editorial and topnotch journalism, making it the portal of choice selected by science, technology and business professionals interested in high performance and data-intensive computing. For topics ranging from late-breaking news and emerging technologies in HPC, to new trends, expert analysis, and exclusive features, HPCwire delivers it all and remains the HPC community’s most reliable and trusted resource. Don’t miss a thing – subscribe now to HPCwire’s weekly newsletter recapping the previous week’s HPC news, analysis and information at: http://www.hpcwire.com.

  • richardmitnick 9:26 am on January 10, 2018 Permalink | Reply
    Tags: HPC

    From HPC Wire: “Momentum Builds for US Exascale” 

    January 9, 2018
    Alex R. Larzelere


    2018 looks to be a great year for the U.S. exascale program. The last several months of 2017 revealed a number of important developments that help put the U.S. quest for exascale on a solid foundation. In my last article, I provided a description of the elements of the High Performance Computing (HPC) ecosystem and its importance for advancing and sustaining this strategically important technology. It is good to report that the U.S. exascale program seems to be hitting the full range of ecosystem elements.

    As a reminder, the National Strategic Computing Initiative (NSCI) assigned the U.S. Department of Energy (DOE) Office of Science (SC) and the National Nuclear Security Administration (NNSA) to execute a joint program to deliver capable exascale computing that emphasizes sustained performance on relevant applications and analytic computing to support their missions. The overall DOE program is known as the Exascale Computing Initiative (ECI) and is funded by the SC Advanced Scientific Computing Research (ASCR) program and the NNSA Advanced Simulation and Computing (ASC) program.

    Elements of the ECI include the procurement of exascale class systems and the facility investments in site preparations and non-recurring engineering. Also, ECI includes the Exascale Computing Project (ECP) that will conduct the Research and Development (R&D) in the areas of middleware (software stack), applications, and hardware to ensure that exascale systems will be productively usable to address Office of Science and NNSA missions.

    In the area of hardware – the last part of 2017 revealed a number of important developments. First and most visible, is the initial installation of the SC Summit system at Oak Ridge National Laboratory (ORNL) and the NNSA Sierra system at Lawrence Livermore National Laboratory (LLNL).

    ORNL IBM Summit Supercomputer

    LLNL IBM Sierra ATS2 supercomputer

    Both systems are being built by IBM using Power9 processors with Nvidia GPU co-processors. The machines will have two Power9 CPUs per system board and will use a Mellanox InfiniBand interconnection network.

    Beyond that, the architecture of each machine is slightly different. The ORNL Summit machine will use six Nvidia Volta GPUs per two Power9 CPUs on a system board and will use NVLink to connect to 512 GB of memory. The Summit machine will use a combination of air and water cooling. The LLNL Sierra machine will use four Nvidia Voltas and 256 GB of memory connected with the two Power9 CPUs per board. The Sierra machine will use only air cooling. As was reported by HPCwire in November 2017, the peak performance of the Summit machine will be about 200 petaflops and the Sierra machine is expected to be about 125 petaflops.

    Installation of both the Summit and Sierra systems is currently underway with about 279 racks (without system boards) and the interconnection network already installed at each lab. Now that IBM has formally released the Power9 processors, the racks will soon start being populated with the boards that contain the CPUs, GPUs and memory. Once that is completed, the labs will start their acceptance testing, which is expected to be finished later in 2018.

    Another important piece of news about the DOE exascale program is the clarification of the status of the Argonne National Laboratory (ANL) Aurora machine.

    Depiction of ANL ALCF Cray Shasta Aurora supercomputer

    This system was part of the collaborative CORAL procurement that also selected the Sierra and Summit machines. The Aurora system is being manufactured by Intel with Cray Inc. acting as the system integrator. The machine was originally scheduled to be an approximately 180 peak petaflops system using the Knights Hill third generation Phi processors. However, during SC17, we learned that Intel is removing the Knights Hill chip from its roadmap. This explains why, during the September ASCR Advisory Committee (ASCAC) meeting, Barb Helland, the Associate Director of the ASCR office, announced that the Aurora system would be delayed to 2021 and upgraded to 1,000 petaflops (aka 1 exaflops).

    The full details of the revised Aurora system are still under wraps. We have learned that it is going to use “novel” processor technologies, but exactly what that means is unclear. The ASCR program subjected the new Aurora design to an independent outside review. It found, “The hardware choices/design within the node is extremely well thought through. Early projections suggest that the system will support a broad workload.” The review committee even suggested that, “The system as presented is exciting with many novel technology choices that can change the way computing is done.” The Aurora system is in the process of being “re-baselined” by the DOE. Hopefully, once that is complete, we will get a better understanding of the meaning of “novel” technologies. If things go as expected, the changes to Aurora will allow the U.S. to achieve exascale by 2021.

    An important, but sometimes overlooked, aspect of the U.S. exascale program is the number of computing systems that are being procured, tested and optimized by the ASCR and ASC programs as part of the buildup to exascale. Other “pre-exascale” systems include the 8.6 petaflops Mira computer at ANL and the 14 petaflops Cori system at Lawrence Berkeley National Lab (LBNL).

    ANL ALCF MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

    NERSC Cray Cori II supercomputer at NERSC at LBNL

    The NNSA also has the 14.1 petaflops Trinity system at Los Alamos National Lab (LANL). Up to 20 percent of these precursor machines will serve as testbeds to enable computing science R&D needed to ensure that the U.S. exascale systems will be able to productively address important national security and discovery science objectives.

    The last, but certainly not least, bit of hardware news is that the ASCR and ASC programs are expected to start their next computer system procurement processes in early 2018. During her presentation to the U.S. Consortium for the Advancement of Supercomputing (USCAS), Barb Helland told the group that she expects that the Request for Proposals (RFP) will soon be released for the follow-ons to the Summit and Sierra systems. These systems, to be delivered in the 2021-2023 timeframe, are expected to provide in excess of one exaFLOP/s of performance. The procurement process to be used will be similar to the CORAL procurement and will be a collaboration between the DOE-SC ASCR and NNSA ASC programs. The ORNL exascale system will be called Frontier and the LLNL system will be known as El Capitan.

    2017 also saw significant developments for the people element of the U.S. HPC ecosystem. As was previously reported, at last September’s ASCAC meeting, Paul Messina announced that he would be stepping down as the ECP Director on October 1st. Doug Kothe, who was previously the applications development lead, was announced as the new ECP Director. Upon taking the Director job, Kothe, with his deputy, Stephen Lee of LANL, instituted a process to review the organization and management of the ECP. At the December ASCAC conference call, Kothe reported that the review had been completed and resulted in a number of changes. This included paring down ECP from five to four components (applications development, software technology, hardware and integration, and project management). He also reported that ECP has implemented a more structured management approach that includes a revised work breakdown structure (WBS) and additional milestones, new key performance parameters and risk management approaches. Finally, the new ECP Director reported that they had established an Extended Leadership Team with a number of new faces.

    Another important element of the HPC ecosystem is the people doing the R&D and other work needed to keep the ecosystem going. The DOE ECI involves a huge number of people. Last year, there were about 500 researchers who attended the ECP Principal Investigator meeting and there are many more involved in other DOE/NNSA programs and from industry. The ASCR and ASC programs are involved with a number of programs to educate and train future members of the HPC ecosystem. Such programs include the ASCR and ASC co-funded Computational Science Graduate Fellowship (CSGF) and the Early Career Research Program. The NNSA offers similar opportunities. Both the ASCR and ASC programs continue to coordinate with National Science Foundation educational programs to ensure that America’s top computational science talent continues to flow into the ecosystem.

    Finally, in addition to people and hardware, the U.S. program continues to develop the software stack (aka middleware) and end users’ applications to ensure that exascale will be used productively. Doug Kothe reported that ECP has adopted standard Software Development Kits. These SDKs are designed to support the goal of building a comprehensive, coherent software stack that enables application developers to productively write highly parallel applications that effectively target diverse exascale architectures. Kothe also reported that ECP is making good progress in developing applications software. This includes the implementation of innovative approaches that include Machine Learning to utilize the GPUs that are part of the future exascale computers.

    All in all – the last several months of 2017 have set the stage for a very exciting 2018 for the U.S. exascale program. It has been about 5 years since the ORNL Titan supercomputer came onto the stage at #1 on the TOP500 list.

    ORNL Cray XK7 Titan Supercomputer

    Over that time, other more powerful DOE computers have come online (Trinity, Cori, etc.) but they were overshadowed by Chinese and European systems.

    LANL Cray XC40 Trinity supercomputer

    It remains unclear whether or not the upcoming exascale systems will put the U.S. back on the top of the supercomputing world. However, the recent developments help to reassure us that the country is not going to give up its computing leadership position without a fight. That is great news because for more than 60 years, the U.S. has sought leadership in high performance computing for the strategic value it provides in the areas of national security, discovery science, energy security, and economic competitiveness.

    See the full article here .

  • richardmitnick 11:45 am on May 24, 2017 Permalink | Reply
    Tags: HPC, HPC heard around the world

    From Science Node: “HPC heard around the world” 

    Science Node

    19 May, 2017
    Tristan Fitzpatrick

    A new advanced computing alliance indicates cooperation and collaboration are alive and well in the global research community.

    It was an international celebration in Barcelona, Spain, as representatives from three continents met at PRACEdays17 to sign a memorandum of understanding (MoU), formalizing a new era in advanced research computing.

    On hand to recognize the partnership were John Towns, principal investigator of the Extreme Science and Engineering Discovery Environment (XSEDE), Serge Bogaerts, managing director of the Partnership for Advanced Computing in Europe (PRACE), and Masahiro Seki, president of the Japanese Research Organization for Information Science and Technology (RIST).


    “We are excited about this development and fully expect this effort will support the growing number of international collaborations emerging across all fields of scholarship,” says Towns.

    As steward of the US supercomputing infrastructure, XSEDE will share their socio-technical platform that integrates and coordinates the advanced digital services that support contemporary science across the country.

    See the full article here.

    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

  • richardmitnick 1:37 pm on January 12, 2017 Permalink | Reply
    Tags: Argo project, Hobbes project, HPC, XPRESS project

    From ASCRDiscovery via D.O.E. “Upscale computing” 

    Department of Energy



    January 2017
    No writer credit

    National labs lead the push for operating systems that let applications run at exascale.

    Image courtesy of Sandia National Laboratories.

    For high-performance computing (HPC) systems to reach exascale – a billion billion calculations per second – hardware and software must cooperate, with orchestration by the operating system (OS).

    But getting from today’s computing to exascale requires an adaptable OS – maybe more than one. Computer applications “will be composed of different components,” says Ron Brightwell, R&D manager for scalable systems software at Sandia National Laboratories.

    “There may be a large simulation consuming lots of resources, and some may integrate visualization or multi-physics.” That is, applications might not use all of an exascale machine’s resources in the same way. Plus, an OS aimed at exascale also must deal with changing hardware. HPC “architecture is always evolving,” often mixing different kinds of processors and memory components in heterogeneous designs.

    As computer scientists consider scaling up hardware and software, there’s no easy answer for when an OS must change. “It depends on the application and what needs to be solved,” Brightwell explains. On top of that variability, he notes, “scaling down is much easier than scaling up.” So rather than try to grow an OS from a laptop to an exascale platform, Brightwell thinks the other way. “We should try to provide an exascale OS and runtime environment on a smaller scale – starting with something that works at a higher scale and then scale down.”

    To explore the needs of an OS and conditions to run software for exascale, Brightwell and his colleagues conducted a project called Hobbes, which involved scientists at four national labs – Oak Ridge (ORNL), Lawrence Berkeley, Los Alamos and Sandia – plus seven universities. To perform the research, Brightwell – with Terry Jones, an ORNL computer scientist, and Patrick Bridges, a University of New Mexico associate professor of computer science – earned an ASCR Leadership Computing Challenge allocation of 30 million processor hours on Titan, ORNL’s Cray XK7 supercomputer.

    ORNL Cray XK7 Titan Supercomputer

    The Hobbes OS supports multiple software stacks working together, as indicated in this diagram of the Hobbes co-kernel software stack. Image courtesy of Ron Brightwell, Sandia National Laboratories.

    Brightwell made a point of including the academic community in developing Hobbes. “If we want people in the future to do OS research from an HPC perspective, we need to engage the academic community to prepare the students and give them an idea of what we’re doing,” he explains. “Generally, OS research is focused on commercial things, so it’s a struggle to get a pipeline of students focusing on OS research in HPC systems.”

    The Hobbes project involved a variety of components, but for the OS side, Brightwell describes it as trying to understand applications as they become more sophisticated. They may have more than one simulation running in a single OS environment. “We need to be flexible about what the system environment looks like,” he adds, so with Hobbes, the team explored using multiple OSs in applications running at extreme scale.

    As an example, Brightwell notes that the Hobbes OS envisions multiple software stacks working together. The OS, he says, “embraces the diversity of the different stacks.” An exascale system might let data analytics run on multiple software stacks, but still provide the efficiency needed in HPC at extreme scales. This requires a computer infrastructure that supports simultaneous use of multiple, different stacks and provides extreme-scale mechanisms, such as reducing data movement.

    Part of Hobbes also studied virtualization, which uses a subset of a larger machine to simulate a different computer and operating system. “Virtualization has not been used much at extreme scale,” Brightwell says, “but we wanted to explore it and the flexibility that it could provide.” Results from the Hobbes project indicate that virtualization for extreme scale can provide performance benefits at little cost.

    Other HPC researchers besides Brightwell and his colleagues are exploring OS options for extreme-scale computing. For example, Pete Beckman, co-director of the Northwestern-Argonne Institute of Science and Engineering at Argonne National Laboratory, runs the Argo project.

    A team of 25 collaborators from Argonne, Lawrence Livermore National Laboratory and Pacific Northwest National Laboratory, plus four universities, created Argo, an OS that starts with a single Linux-based OS and adapts it to extreme scale.

    When comparing the Hobbes OS to Argo, Brightwell says, “we think that without getting in that Linux box, we have more freedom in what we do, other than design choices already made in Linux. Both of these OSs are likely trying to get to the same place but using different research vehicles to get there.” One distinction: The Hobbes project uses virtualization to explore the use of multiple OSs working on the same simulation at extreme scale.

    As the scale of computation increases, an OS must also support new ways of managing a system’s resources. To explore some of those needs, Thomas Sterling, director of Indiana University’s Center for Research in Extreme Scale Technologies, developed ParalleX, an advanced execution model for computations. Brightwell leads a separate project called XPRESS to support the ParalleX execution model. Rather than computing’s traditional static methods, ParalleX implementations use dynamic adaptive techniques.
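
    To make the contrast between static and dynamic methods concrete, here is a minimal Python sketch. It is not ParalleX or XPRESS code and does not reflect their APIs. The function names, the two-worker setup and the task costs are all hypothetical, chosen only to show why deciding work placement at run time can help when tasks are uneven.

```python
import heapq

def static_makespan(task_costs, n_workers):
    """Assign tasks in fixed contiguous blocks decided before execution."""
    if not task_costs:
        return 0
    block = -(-len(task_costs) // n_workers)  # ceiling division
    loads = [sum(task_costs[i:i + block])
             for i in range(0, len(task_costs), block)]
    return max(loads)  # the slowest worker determines total run time

def dynamic_makespan(task_costs, n_workers):
    """Hand each task, in order, to whichever worker becomes free first."""
    finish_times = [0] * n_workers  # min-heap of per-worker finish times
    heapq.heapify(finish_times)
    for cost in task_costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)

# Hypothetical workload: four expensive tasks followed by four cheap ones.
costs = [8, 8, 8, 8, 1, 1, 1, 1]
print(static_makespan(costs, 2))   # 32: one worker gets all the expensive tasks
print(dynamic_makespan(costs, 2))  # 18: work is balanced as workers free up
```

    In this toy case the static split leaves one worker with 32 units of work while the other finishes after 4, whereas the dynamic assignment finishes in 18. Real runtime systems face far more complicated trade-offs, including the cost of moving data between workers.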

    More work is always necessary as computation works toward extreme scales. “The important thing in going forward from a runtime and OS perspective is the ability to evaluate technologies that are developing in terms of applications,” Brightwell explains. “For high-end applications to pursue functionality at extreme scales, we need to build that capability.” That’s just what Hobbes and XPRESS – and the ongoing research that follows them – aim to do.

    See the full article here.

    The mission of the Energy Department is to ensure America’s security and prosperity by addressing its energy, environmental and nuclear challenges through transformative science and technology solutions.
