Tagged: Exascale computing

  • richardmitnick 5:54 am on July 1, 2017
    Tags: Exascale computing

    From LLNL: “National labs, industry partners prepare for new era of computing through Centers of Excellence” 


    Lawrence Livermore National Laboratory

    June 30, 2017
    Jeremy Thomas
    thomas244@llnl.gov
    925-422-5539

    IBM employees and Lab code and application developers held a “Hackathon” event in June to work on coding challenges for a predecessor system to the Sierra supercomputer. Through the ongoing Centers of Excellence (CoE) program, employees from IBM and NVIDIA have been on-site to help LLNL developers transition applications to the Sierra system, which will have a completely different architecture than the Lab has had before. Photo by Jeremy Thomas/LLNL

    The Department of Energy’s drive toward the next generation of supercomputers, “exascale” machines capable of more than a quintillion (10^18) calculations per second, isn’t simply to boast about having the fastest processing machines on the planet. At Lawrence Livermore National Laboratory (LLNL) and other DOE national labs, these systems will play a vital role in the National Nuclear Security Administration’s (NNSA) core mission of ensuring the safety and performance of the nation’s nuclear stockpile in the absence of underground testing.

    The driving force behind faster, more robust computing power is the need for simulations and codes that offer higher resolution, are increasingly predictive and incorporate more complex physics. It’s an evolution that is changing the way the national labs’ application and code developers approach design. To aid in the transition and prepare researchers for pre-exascale and exascale systems, LLNL has brought experts from IBM and NVIDIA together with Lab computer scientists in a Center of Excellence (CoE), a co-design strategy born out of the need for vendors and government to work together to optimize emerging supercomputing systems.

    “There are disruptive machines coming down the pike that are changing things out from under us,” said Rob Neely, an LLNL computer scientist and Weapon Simulation & Computing Program coordinator for Computing Environments. “We need a lot of time to prepare; these applications need insight, and who better to help us with that than the companies who will build the machines? The idea is that when a machine gets here, we’re not caught flat-footed. We want to hit the ground running right away.”

    While LLNL’s exascale system isn’t scheduled for delivery until 2023, Sierra, the Laboratory’s pre-exascale system, is on track to begin installation this fall and will begin running science applications at full machine scale by early next spring.

    LLNL IBM Sierra supercomputer

    Built by IBM and NVIDIA, Sierra will have about six times more computing power than LLNL’s current behemoth, Sequoia.

    Sequoia at LLNL

    The Sierra system is new territory for the Lab in that it combines two kinds of hardware — IBM CPUs and NVIDIA GPUs — each with its own memory, and it presents a programming model more complex than LLNL scientists have worked with in the past. In the meantime, Lab scientists are receiving guidance from experts from the two companies and are using a small predecessor system that is already running some components and has some of the technological features that Sierra will have.
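    Sierra’s actual programming environment is not detailed in the article, so the following is only a minimal sketch, assuming a directive-based offload model (OpenMP 4.5 “target” directives) as one of several ways to program a node whose CPUs and GPUs have separate memories; CUDA and portability layers such as RAJA are alternatives. The map clauses make the two memory spaces explicit.

        // Minimal sketch (not LLNL production code): offload a loop to a GPU.
        // The map clauses spell out the copies between the CPU's memory and the
        // GPU's separate memory on a hybrid node.
        #include <vector>

        void scale_and_add(std::vector<double>& y, const std::vector<double>& x, double a)
        {
            double* yp = y.data();
            const double* xp = x.data();
            const long n = static_cast<long>(y.size());

            // Copy x and y to the device, run the loop there, copy y back to the host.
            #pragma omp target teams distribute parallel for \
                    map(to: xp[0:n]) map(tofrom: yp[0:n])
            for (long i = 0; i < n; ++i)
                yp[i] += a * xp[i];
        }

    Restructuring an application often amounts to rewriting many such loops and rethinking where their data lives, which is why the changes described below can run to hundreds or thousands of lines.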

    LLNL’s Center of Excellence, which began in 2014, involves about a half dozen IBM and NVIDIA personnel on-site, plus a number of remote collaborators who work with Lab developers. The team is on hand to answer questions from Lab computer scientists, educate LLNL personnel in best practices for coding hybrid systems, develop optimization strategies, debug, and advise on the global code restructuring that often is needed to obtain performance. The CoE is a symbiotic relationship — LLNL scientists get a feel for how Sierra will operate, and IBM and NVIDIA gain better insight into the Lab’s needs and into what the machines they build are capable of.

    “We see how the systems we design and develop are being used and how effective they can be,” said IBM research staff member Leopold Grinberg, who works on the LLNL site. “You really need to get into the mind of the developer to understand how they use the tools. To sit next to the developers’ seats and let them drive, to observe them, gives us a good idea of what we are doing right and what needs to be improved. Our experts have an intimate knowledge of how the system works, and having them side-by-side with Lab employees is very useful.”

    Sierra, Grinberg explained, will use a completely different system architecture than any used before at LLNL. It’s not only faster than any machine the Lab has had; it also has different tools built into the compilers and programming models. In some cases, the changes developers need to make are substantial, requiring the restructuring of hundreds or thousands of lines of code. Through the CoE, Grinberg said he’s learning more about how the system will be used for production scientific applications.

    “It’s a constant process of learning for everybody,” Grinberg said. “It’s fun, it’s challenging. We gather the knowledge and it’s also our job to distribute it. There’s always some knowledge to be shared. We need to bring the experience we have with heterogeneous systems and emerging programming models to the lab, and help people generate updated codes or find out what can be kept as is to optimize the system we build. It’s been very fruitful for both parties.”

    The CoE strategy is additionally being implemented at Oak Ridge National Laboratory, which is bringing in a heterogeneous system of its own called Summit. Other CoE programs are in place at Los Alamos and Lawrence Berkeley national laboratories. Each CoE has a similar goal of preparing computational scientists with the tools they will need to utilize pre-exascale and exascale systems. Because Livermore is new to using GPUs for the bulk of its computing power, the work for Sierra places a heavy emphasis on figuring out which sections of a multi-physics application are the most performance-critical and on the code restructuring that must take place to use the system most effectively.
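    The emphasis on finding the performance-critical sections follows from ordinary speedup arithmetic (Amdahl’s law, added here as context rather than quoted from the article): the fraction of runtime that is actually moved onto the GPUs bounds the overall gain.

        S = \frac{1}{(1 - f) + f/s}, \qquad \text{e.g. } f = 0.9,\; s = 10 \;\Rightarrow\; S = \frac{1}{0.1 + 0.09} \approx 5.3

    Here f is the fraction of runtime that is accelerated and s is the speedup of that fraction, so even a tenfold GPU speedup of 90 percent of a code yields only about a fivefold overall gain, and the remaining 10 percent dominates.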

    “Livermore and Oak Ridge scientists are really pushing the boundaries of the scale of these GPU-based systems,” said Max Katz, a solutions architect at NVIDIA who spends four days a week at LLNL as a technical adviser. “Part of our motivation is to understand machine learning and how to make it possible to merge high-performance computing with the applications demanded by industry. The CoE is essential because it’s difficult for any one party to predict how these CPU/GPU systems will behave together. Each one of us brings in expertise and by sharing information, it makes us all more well-rounded. It’s a great opportunity.”

    In fact, the opportunity was so compelling that in 2016 the CoE was augmented with a three-year institutional component (dubbed the Institutional Center of Excellence, or iCOE) to ensure that other mission-critical efforts at the Laboratory could also participate. This has added nine application development efforts, including one in data science, and expanded the number of IBM and NVIDIA personnel. By working cooperatively, many more types of applications can be explored, and performance solutions can be developed and shared among all of the CoE code teams.

    “At the end of the iCOE project, the real value will be not only that some important institutional applications run well, but that every directorate at LLNL will have trained staff with expertise in using Sierra, and we’ll have documented lessons learned to help train others,” said Bert Still, leader for Application Strategy (Livermore Computing).

    Steve Rennich, a senior HPC developer-technology engineer with NVIDIA, visits the Lab once a week to help LLNL scientists port mission-critical applications optimized for CPUs over to NVIDIA GPUs, which have an order of magnitude greater computing power than CPUs. Besides writing bug-free code, Rennich said, the goal is to improve performance enough to meet the Lab’s considerable computing requirements.

    “The challenge is they’re fairly complex codes so to do it correctly takes a fair amount of attention to detail,” Rennich said. “It’s about making sure the new system can handle as large a model as the Lab needs. These are colossal machines, so when you create applications at this scale, it’s like building a race car. To take advantage of this increase in performance, you need all the pieces to fit and work together.”

    Current plans are to continue the existing Center of Excellence at LLNL at least into 2019, when Sierra is fully operational. Until then, having experts working shoulder-to-shoulder with Lab developers to write code will be a huge benefit to all parties, said LLNL’s Neely, who wants the collaboration to publish its discoveries and share them with the broader computing community.

    “We’re focused on the issue at hand, and moving things toward getting ready for these machines is hugely beneficial,” Neely said. “These are very large applications developed over decades, so ultimately it’s the code teams that need to be ready to take this over. We’ve got to make this work because we need to ensure the safety and performance of the U.S. stockpile in the absence of nuclear testing. We’ve got the right teams and people to pull this off.”

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition
    LLNL Campus

    Operated by Lawrence Livermore National Security, LLC, for the Department of Energy’s National Nuclear Security Administration

     
  • richardmitnick 10:08 am on April 26, 2017
    Tags: Building the Bridge to Exascale, ECP- Exascale Computing Project, Exascale computing

    From OLCF at ORNL: “Building the Bridge to Exascale” 


    Oak Ridge National Laboratory

    OLCF

    April 18, 2017 [Where was this hiding?]
    Katie Elyce Jones

    Building an exascale computer—a machine that could solve complex science problems at least 50 times faster than today’s leading supercomputers—is a national effort.

    To oversee the rapid research and development (R&D) of an exascale system by 2023, the US Department of Energy (DOE) created the Exascale Computing Project (ECP) last year. The project brings together experts in high-performance computing from the six DOE laboratories with the nation’s most powerful supercomputers—Oak Ridge, Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, and Sandia—and project members work closely with computing facility staff from the member laboratories.

    ORNL IBM Summit supercomputer depiction.

    At the Exascale Computing Project’s (ECP’s) annual meeting in February 2017, Oak Ridge Leadership Computing Facility (OLCF) staff discussed OLCF resources that could be leveraged for ECP research and development, including the facility’s next flagship supercomputer, Summit, expected to go online in 2018.

    At the first ECP annual meeting, held January 29–February 3 in Knoxville, Tennessee, about 450 project members convened to discuss collaboration in breakout sessions focused on project organization and upcoming R&D milestones for applications, software, hardware, and exascale systems focus areas. During facility-focused sessions, senior staff from the Oak Ridge Leadership Computing Facility (OLCF) met with ECP members to discuss opportunities for the project to use current petascale supercomputers, test beds, prototypes, and other facility resources for exascale R&D. The OLCF is a DOE Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL).

    “The ECP’s fundamental responsibilities are to provide R&D to build exascale machines more efficiently and to prepare the applications and software that will run on them,” said OLCF Deputy Project Director Justin Whitt. “The facilities’ responsibilities are to acquire, deploy, and operate the machines. We are currently putting advanced test beds and prototypes in place to evaluate technologies and enable R&D efforts like those in the ECP.”

    ORNL has a unique connection to the ECP. The Tennessee-based laboratory is the location of the project office that manages collaboration within the ECP and among its facility partners. ORNL’s Laboratory Director Thom Mason delivered the opening talk at the conference, highlighting the need for coordination in a project of this scope.

    On behalf of facility staff, Mark Fahey, director of operations at the Argonne Leadership Computing Facility, presented the latest delivery and deployment plans for upcoming computing resources during a plenary session. From the OLCF, Project Director Buddy Bland and Director of Science Jack Wells provided a timeline for the availability of Summit, OLCF’s next petascale supercomputer, which is expected to go online in 2018; it will be at least 5 times more powerful than the OLCF’s 27-petaflop Titan supercomputer.

    ORNL Cray XK7 Titan Supercomputer.
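    As rough arithmetic implied by those figures (an estimate, not an official specification):

        5 \times 27\ \text{petaflops} = 135\ \text{petaflops} \approx 1.4 \times 10^{17}\ \text{flop/s},

    still short of the exascale mark of 10^18 flop/s by roughly a factor of seven, which is why Summit is described as a bridge to exascale rather than an exascale machine.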

    “Exascale hardware won’t be around for several more years,” Wells said. “The ECP will need access to Titan, Summit, and other leadership computers to do the work that gets us to exascale.”

    Wells highlighted the spring 2017 call for Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, proposals, which will give 2-year projects the first opportunity for computing time on Summit. OLCF staff also introduced a handful of computing architecture test beds—including the developmental environment for Summit known as Summitdev, NVIDIA’s deep learning and accelerated analytics system DGX-1, an experimental cluster of ARM 64-bit compute nodes, and a Cray XC40 cluster of 168 nodes known as Percival—that are now available for OLCF users.

    In addition to leveraging facility resources for R&D, the ECP must understand the future needs of facilities to design an exascale system that is ready for rigorous computational science simulations. Facilities staff can offer insight about the level of performance researchers will expect from science applications on exascale systems and estimate the amount of space and electrical power that will be available in the 2023 timeframe.

    “Getting to capable exascale systems will require careful coordination between the ECP and the user facilities,” Whitt said.

    One important collaboration so far was the development of a request for information, or RFI, for exascale R&D that the ECP released in February to industry vendors. The RFI enables the ECP to evaluate potential software and hardware technologies for exascale systems—a step in the R&D process that facilities often undertake. Facilities will later release requests for proposals when they are ready to begin building exascale systems.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.


    The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.

    ORNL’s supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. On the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

    The OLCF delivered on that original promise in 2008, when its Cray XT “Jaguar” system ran the first scientific applications to exceed 1,000 trillion calculations a second (1 petaflop). Since then, the OLCF has continued to expand the limits of computing power, unveiling Titan in 2013, which is capable of 27 petaflops.


    ORNL Cray XK7 Titan Supercomputer

    Titan is one of the first hybrid architecture systems—a combination of graphics processing units (GPUs) and the more conventional central processing units (CPUs) that have served as number crunchers in computers for decades. The parallel structure of GPUs makes them uniquely suited to process an enormous number of simple computations quickly, while CPUs are capable of tackling more sophisticated computational algorithms. The complementary combination of CPUs and GPUs allows Titan to reach its peak performance.

    The OLCF gives the world’s most advanced computational researchers an opportunity to tackle problems that would be unthinkable on other systems. The facility welcomes investigators from universities, government agencies, and industry who are prepared to perform breakthrough research in climate, materials, alternative energy sources and energy storage, chemistry, nuclear physics, astrophysics, quantum mechanics, and the gamut of scientific inquiry. Because it is a unique resource, the OLCF focuses on the most ambitious research projects—projects that provide important new knowledge or enable important new technologies.

     
  • richardmitnick 1:17 pm on April 4, 2017
    Tags: Exascale computing

    From OLCF via TheNextPlatform: “Machine Learning, Analytics Play Growing Role in US Exascale Efforts” 


    Oak Ridge National Laboratory

    OLCF


    TheNextPlatform

    April 4, 2017
    Jeffrey Burt

    Exascale computing promises to bring significant changes to both the high-performance computing space and eventually enterprise datacenter infrastructures.

    The systems, which are being developed in multiple countries around the globe, promise 50 times the performance of the 20-petaflop systems that are now among the fastest in the world, along with corresponding improvements in such areas as energy efficiency and physical footprint. The systems need to be powerful enough to run the increasingly complex applications being used by engineers and scientists, but they can’t be so expensive to acquire or run that only a handful of organizations can use them.

    At the same time, the emergence of high-level data analytics and machine learning is forcing some changes in the exascale efforts in the United States, changes that play a role in everything from the software stacks that are being developed for the systems to the competition with Chinese companies that also are aggressively pursuing exascale computing. During a talk last week at the OpenFabrics Workshop in Austin, Texas, Al Geist, from the Oak Ridge National Laboratory and CTO of the Exascale Computing Project (ECP), outlined the work the ECP is doing to develop exascale-capable systems within the next few years. Throughout his talk, Geist also mentioned that over the past 18 months, the emergence of data analytics and machine learning in the mainstream has expanded the scientists’ thoughts on what exascale computing will entail, both for HPC as well as enterprises.

    “In the future, there will be more and more drive to be able to have a machine that can solve a wider breadth of problems … that would require machine learning to be able to do the analysis on the fly inside the computer rather than having it be written out to disk and analyzed later,” Geist said.

    The amount of data being generated continues to increase on a massive scale, driven by everything from the proliferation of mobile devices to the rapid adoption of cloud computing. HPC organizations and enterprises are looking for ways to collect, store and analyze this data in near real-time to be able to make immediate research and business decisions. Machine learning and artificial intelligence (AI) are increasingly being used to help accelerate the collection and analysis of the data. In addition, AI and machine learning are at the heart of many emerging fields of technology, from new cybersecurity techniques to such systems as self-driving cars.

    The ECP is taking the growing role of data analytics and machine learning into consideration in the work it is doing, Geist said. That growing role can be seen in the applications being developed for exascale computing. For example, the application development effort touches on a range of areas, from climate and chemistry to genomics, seismology and cosmology. There also is a project underway to develop applications for cancer research and prevention, and increasingly that work includes the use of data sciences and machine learning.


    In addition, ECP – which was launched in 2016 – originally had created four application development co-design centers with relatively narrow focuses, such as efficient exascale discretizations and particle applications. One, on online data analysis and reduction, touched on the rising data science effort in the tech industry. In recent months, the program added a fifth co-design center, this one targeting graph and combinatorial methods for enabling exascale applications. The new center was created more specifically to deal with the challenges presented by data analytics and machine learning, he said.

    In a more general sense, both of those emerging technologies also have an impact on the types of systems that the ECP is seeking from vendors like IBM, Intel and Nvidia, and on the growing competition with China in the exascale race. The ECP is looking for systems that not only are capable of exascale-level computing, but also can be used by a wide range of organizations, with the idea that the innovations developed for and used in HPC environments will cascade down into enterprise and commodity machines. Geist said they should be usable by a broad array of users, not just a small number of “hero programmers.” Given that, the ECP is looking for vendors to develop systems that can meet the various requirements laid out in the program, such as enabling extreme parallelism, creating new memory and storage technologies that can handle the scaling, ensuring high reliability, and keeping energy consumption to 20 to 30 megawatts.

    However, that doesn’t necessarily mean creating radical designs or highly advanced or novel architectures. If the vendors can create exascale-capable systems without resorting to radical solutions, “that would be fine,” Geist said. “In fact, we’d probably prefer that.” At the same time, ECP officials understand that to develop the first exascale-capable computer by 2021, as planned, it will take some novel approaches to design and architecture. The hope is that by the time the next such systems follow in 2023, the need for such radical approaches won’t be necessary.

    “We’re not trying to build a stunt machine,” Geist said, adding that the goal for these systems is not just how many FLOPs they can produce, but how much science they can produce. “We want to build something that’s going to be useful for the nation and for science.”

    Such demands are a key differentiator in how the United States and China are approaching exascale computing. China already has three exascale projects underway and a prototype called the Tianhe-3 that is scheduled to be ready by 2018. Much of the talk about China’s efforts has been about the amount of money the Chinese government is investing in the projects. At the same time, China is not as limited by legacy technologies as is the United States, Geist said.

    “They can build a chip that is very much geared toward very low energy and high performance without having to worry about legacy apps, or does it run in a smartphone or does it run in a server market,” he said. “They can build a one-off chip.”

    U.S. vendors, by contrast, have to build systems that can run a broad range of new and legacy applications, and components that can be used in other systems. They have to be able to handle myriad workloads, which is why the ECP expanded its efforts in the areas of data analytics and machine learning. These are increasingly important technologies that will have wide application in HPC and enterprise computing as well as consumer devices, he said. Vendors know that, and they have to take it into account when thinking about innovations around exascale computing.

    The application of emerging technologies will be critical in a wide range of technology areas, so innovations in architectures need to be able to address the needs of both the supercomputing and enterprise worlds. For example, in the HPC space, organizations are turning to machine learning to accelerate the simulation workloads they’re running for such tasks as quality assurance, Geist said.

    “In the United States, it’s really important to the health of the ecosystem,” Geist said. “If you’re only going to sell two machines a year or three machines a year, you can’t make a business out of that and you’ll go out of business. Therefore, those chips and those technologies need to be expanded into new markets. This is what the U.S. has to struggle with, to make sure we can meet the requirements of this high-performance level that we want to get to while still meeting the needs of all the other areas. This is why we see this expansion into data analytics and machine learning, which seems to have a much bigger market outside of just the HPC world.”

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.


     
  • richardmitnick 1:37 pm on January 12, 2017
    Tags: Argo project, Exascale computing, Hobbes project, ORNL Cray XK7 Titan supercomputer, XPRESS project

    From ASCRDiscovery via D.O.E. “Upscale computing” 

    DOE Main

    Department of Energy

    ASCRDiscovery

    January 2017
    No writer credit

    National labs lead the push for operating systems that let applications run at exascale.

    Image courtesy of Sandia National Laboratories.

    For high-performance computing (HPC) systems to reach exascale – a billion billion calculations per second – hardware and software must cooperate, with orchestration by the operating system (OS).

    But getting from today’s computing to exascale requires an adaptable OS – maybe more than one. Computer applications “will be composed of different components,” says Ron Brightwell, R&D manager for scalable systems software at Sandia National Laboratories.

    “There may be a large simulation consuming lots of resources, and some may integrate visualization or multi-physics.” That is, applications might not use all of an exascale machine’s resources in the same way. Plus, an OS aimed at exascale also must deal with changing hardware. HPC “architecture is always evolving,” often mixing different kinds of processors and memory components in heterogeneous designs.

    As computer scientists consider scaling up hardware and software, there’s no easy answer for when an OS must change. “It depends on the application and what needs to be solved,” Brightwell explains. On top of that variability, he notes, “scaling down is much easier than scaling up.” So rather than try to grow an OS from a laptop to an exascale platform, Brightwell thinks the other way. “We should try to provide an exascale OS and runtime environment on a smaller scale – starting with something that works at a higher scale and then scale down.”

    To explore the needs of an OS and conditions to run software for exascale, Brightwell and his colleagues conducted a project called Hobbes, which involved scientists at four national labs – Oak Ridge (ORNL), Lawrence Berkeley, Los Alamos and Sandia – plus seven universities. To perform the research, Brightwell – with Terry Jones, an ORNL computer scientist, and Patrick Bridges, a University of New Mexico associate professor of computer science – earned an ASCR Leadership Computing Challenge allocation of 30 million processor hours on Titan, ORNL’s Cray XK7 supercomputer.

    ORNL Cray XK7 Titan Supercomputer

    The Hobbes OS supports multiple software stacks working together, as indicated in this diagram of the Hobbes co-kernel software stack. Image courtesy of Ron Brightwell, Sandia National Laboratories.

    Brightwell made a point of including the academic community in developing Hobbes. “If we want people in the future to do OS research from an HPC perspective, we need to engage the academic community to prepare the students and give them an idea of what we’re doing,” he explains. “Generally, OS research is focused on commercial things, so it’s a struggle to get a pipeline of students focusing on OS research in HPC systems.”

    The Hobbes project involved a variety of components, but for the OS side, Brightwell describes it as trying to understand applications as they become more sophisticated. They may have more than one simulation running in a single OS environment. “We need to be flexible about what the system environment looks like,” he adds, so with Hobbes, the team explored using multiple OSs in applications running at extreme scale.

    As an example, Brightwell notes that the Hobbes OS envisions multiple software stacks working together. The OS, he says, “embraces the diversity of the different stacks.” An exascale system might let data analytics run on multiple software stacks, but still provide the efficiency needed in HPC at extreme scales. This requires a computer infrastructure that supports simultaneous use of multiple, different stacks and provides extreme-scale mechanisms, such as reducing data movement.
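    Hobbes’ own interfaces are not shown in the article; the sketch below is a generic illustration, using plain MPI, of the underlying idea of running a simulation component and an analytics component side by side inside one job so data can be shared without a round trip to disk. The three-to-one split of ranks is a hypothetical choice.

        // Generic sketch (not the Hobbes or Argo API): partition one parallel job
        // into a simulation component and an in-machine analytics component.
        #include <mpi.h>

        int main(int argc, char** argv)
        {
            MPI_Init(&argc, &argv);

            int world_rank = 0, world_size = 0;
            MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
            MPI_Comm_size(MPI_COMM_WORLD, &world_size);

            // Hypothetical split: the first three quarters of the ranks run the
            // simulation stack, the rest run the analytics stack.
            const int color = (world_rank < 3 * world_size / 4) ? 0 : 1;
            MPI_Comm component;
            MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &component);

            if (color == 0) {
                // run_simulation(component);  // placeholder for the simulation code
            } else {
                // run_analytics(component);   // placeholder for in-situ analysis
            }

            MPI_Comm_free(&component);
            MPI_Finalize();
            return 0;
        }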

    Part of Hobbes also studied virtualization, which uses a subset of a larger machine to simulate a different computer and operating system. “Virtualization has not been used much at extreme scale,” Brightwell says, “but we wanted to explore it and the flexibility that it could provide.” Results from the Hobbes project indicate that virtualization for extreme scale can provide performance benefits at little cost.

    Other HPC researchers besides Brightwell and his colleagues are exploring OS options for extreme-scale computing. For example, Pete Beckman, co-director of the Northwestern-Argonne Institute of Science and Engineering at Argonne National Laboratory, runs the Argo project.

    A team of 25 collaborators from Argonne, Lawrence Livermore National Laboratory and Pacific Northwest National Laboratory, plus four universities, created Argo, an OS that starts from a single Linux-based OS and adapts it to extreme scale.

    When comparing the Hobbes OS to Argo, Brightwell says, “we think that without getting in that Linux box, we have more freedom in what we do, other than design choices already made in Linux. Both of these OSs are likely trying to get to the same place but using different research vehicles to get there.” One distinction: The Hobbes project uses virtualization to explore the use of multiple OSs working on the same simulation at extreme scale.

    As the scale of computation increases, an OS must also support new ways of managing a system’s resources. To explore some of those needs, Thomas Sterling, director of Indiana University’s Center for Research in Extreme Scale Technologies, developed ParalleX, an advanced execution model for computations. Brightwell leads a separate project called XPRESS to support the ParalleX execution model. Rather than computing’s traditional static methods, ParalleX implementations use dynamic adaptive techniques.
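    ParalleX itself is a much richer execution model (implemented in runtimes such as HPX); the sketch below uses only standard C++ to illustrate the general shift the article describes, from statically partitioned work to tasks that a runtime schedules dynamically as resources become free.

        // Illustrative contrast with static work division (not ParalleX/HPX code):
        // each chunk becomes a task, and the runtime decides when and where it runs.
        #include <functional>
        #include <future>
        #include <numeric>
        #include <vector>

        double process_chunk(const std::vector<double>& chunk)
        {
            // Stand-in for an irregular amount of work per chunk.
            return std::accumulate(chunk.begin(), chunk.end(), 0.0);
        }

        double process_all(const std::vector<std::vector<double>>& chunks)
        {
            std::vector<std::future<double>> tasks;
            for (const auto& c : chunks)
                tasks.push_back(std::async(std::launch::async, process_chunk, std::cref(c)));

            double total = 0.0;
            for (auto& t : tasks)
                total += t.get();   // results are collected as tasks finish
            return total;
        }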

    More work is always necessary as computation works toward extreme scales. “The important thing in going forward from a runtime and OS perspective is the ability to evaluate technologies that are developing in terms of applications,” Brightwell explains. “For high-end applications to pursue functionality at extreme scales, we need to build that capability.” That’s just what Hobbes and XPRESS – and the ongoing research that follows them – aim to do.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    The mission of the Energy Department is to ensure America’s security and prosperity by addressing its energy, environmental and nuclear challenges through transformative science and technology solutions.

     
  • richardmitnick 10:09 am on December 12, 2016
    Tags: Exascale computing

    From PPPL: “PPPL physicists win funding to lead a DOE exascale computing project” 


    PPPL

    October 27, 2016 [Just now out on social media.]
    Raphael Rosen

    PPPL physicist Amitava Bhattacharjee. (Photo by Elle Starkman/PPPL Office of Communications)

    A proposal from scientists at the U.S. Department of Energy’s (DOE) Princeton Plasma Physics Laboratory (PPPL) has been chosen as part of a national initiative to develop the next generation of supercomputers. Known as the Exascale Computing Project (ECP), the initiative will include a focus on exascale-related software, applications, and workforce training.

    Once developed, exascale computers will perform a billion billion operations per second, a rate 50 to 100 times faster than the most powerful U.S. computers now in use. The fastest computers today operate at the petascale and can perform a million billion operations per second. Exascale machines in the United States are expected to be ready in 2023.

    The PPPL-led multi-institutional project, titled High-Fidelity Whole Device Modeling of Magnetically Confined Fusion Plasmas, was selected during the ECP’s first round of application development funding, which distributed $39.8 million. The overall project will receive $2.5 million a year for four years to be distributed among all the partner institutions, including Argonne, Lawrence Livermore, and Oak Ridge national laboratories, together with Rutgers University, the University of California, Los Angeles, and the University of Colorado, Boulder. PPPL itself will receive $800,000 per year; the project it leads was one of 15 selected for full funding, and the only one dedicated to fusion energy. Seven additional projects received seed funding.

    The application efforts will help guide DOE’s development of a U.S. exascale ecosystem as part of President Obama’s National Strategic Computing Initiative (NSCI). DOE, the Department of Defense and the National Science Foundation have been designated as NSCI lead agencies, and ECP is the primary DOE contribution to the initiative.

    The ECP’s multi-year mission is to maximize the benefits of high performance computing (HPC) for U.S. economic competitiveness, national security and scientific discovery. In addition to applications, the DOE project addresses hardware, software, platforms and workforce development needs critical to the effective development and deployment of future exascale systems. The ECP is supported jointly by DOE’s Office of Science and the National Nuclear Security Administration within DOE.

    PPPL has been involved with high-performance computing for years. PPPL scientists created the XGC code, which models the behavior of plasma in the boundary region where the plasma’s ions and electrons interact with each other and with neutral particles produced by the tokamak’s inner wall. The high-performance code is maintained and updated by PPPL scientist C.S. Chang and his team.

    PPPL scientist C.S. Chang

    XGC runs on Titan, the fastest computer in the United States, at the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility at Oak Ridge National Laboratory.

    ORNL Cray Titan Supercomputer

    The calculations needed to model the behavior of the plasma edge are so complex that the code uses 90 percent of the computer’s processing capabilities. Titan performs at the petascale, completing a million billion calculations each second, and the DOE was primarily interested in proposals by institutions that possess petascale-ready codes that can be upgraded for exascale computers.

    The PPPL proposal lays out a four-year plan to combine XGC with GENE, a computer code that simulates the behavior of the plasma core. GENE is maintained by Frank Jenko, a professor at the University of California, Los Angeles. Combining the codes would give physicists a far better sense of how the core plasma interacts with the edge plasma at a fundamental kinetic level, giving a comprehensive view of the entire plasma volume.

    Leading the overall PPPL proposal is Amitava Bhattacharjee, head of the Theory Department at PPPL. Co-principal investigators are PPPL’s Chang and Andrew Siegel, a computational scientist at the University of Chicago.

    The multi-institutional effort will develop a full-scale computer simulation of fusion plasma. Unlike current simulations, which model only part of the hot, charged gas, the proposed simulations will display the physics of an entire plasma all at once. The completed model will integrate the XGC and GENE codes and will be designed to run on exascale computers.
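    The coupling framework itself is still to be developed, so the following is only a schematic sketch with made-up types, not the XGC/GENE interface: each step, a core solver and an edge solver advance their own regions and then exchange state at the interface between them.

        // Schematic core-edge coupling loop (hypothetical placeholder physics).
        #include <vector>

        struct Region {
            std::vector<double> state;     // stand-in for a full plasma model
            std::vector<double> boundary;  // data received from the other region
            void advance(double dt) {      // placeholder update, not real physics
                for (double& s : state) s += dt;
            }
        };

        void coupled_step(Region& core, Region& edge, double dt)
        {
            core.advance(dt);
            edge.advance(dt);
            // Each code hands its interface data to the other before the next step.
            edge.boundary = core.state;
            core.boundary = edge.state;
        }

        int main()
        {
            Region core{std::vector<double>(1000, 1.0), {}};
            Region edge{std::vector<double>(1000, 0.1), {}};
            for (int step = 0; step < 100; ++step)
                coupled_step(core, edge, 0.01);
            return 0;
        }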

    The modeling will enable physicists to understand plasmas more fully, allowing them to predict plasma behavior within doughnut-shaped fusion facilities known as tokamaks. The exascale computing fusion proposal focuses primarily on ITER, the international tokamak being built in France to demonstrate the feasibility of fusion power.

    ITER experimental tokamak nuclear fusion reactor, being built next to the Cadarache facility in Saint-Paul-lès-Durance in the south of France

    But the proposal will be developed with other applications in mind, including stellarators, another variety of fusion facility.

    Wendelstein 7-X stellarator, built in Greifswald, Germany

    Better predictions can lead to better-engineered facilities and more efficient fusion reactors. Currently, support for this work comes from DOE’s Advanced Scientific Computing Research (ASCR) program.

    “This will be a team effort involving multiple institutions,” said Bhattacharjee. He noted that PPPL will be involved in every aspect of the project, including working with applied mathematicians and computer scientists on the team to develop the simulation framework that will couple GENE with XGC on exascale computers.

    “You need a very-large-scale computer to calculate the multiscale interactions in fusion plasmas,” said Chang. “Whole-device modeling is about simulating the whole thing: all the systems together.”

    Because plasma behavior is immensely complicated, developing an exascale computer is crucial for future research. “Taking into account all the physics in a fusion plasma requires enormous computational resources,” said Bhattacharjee. “With the computer codes we have now, we are already pushing on the edge of the petascale. The exascale is very much needed in order for us to have greater realism and truly predictive capability.”

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Princeton Plasma Physics Laboratory is a U.S. Department of Energy national laboratory managed by Princeton University. PPPL, on Princeton University’s Forrestal Campus in Plainsboro, N.J., is devoted to creating new knowledge about the physics of plasmas — ultra-hot, charged gases — and to developing practical solutions for the creation of fusion energy. Results of PPPL research have ranged from a portable nuclear materials detector for anti-terrorist use to universally employed computer codes for analyzing and predicting the outcome of fusion experiments. The Laboratory is managed by the University for the U.S. Department of Energy’s Office of Science, which is the largest single supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.

     
  • richardmitnick 3:28 pm on November 23, 2016
    Tags: Computerworld, Exascale computing

    From ALCF via Computerworld: “U.S. sets plan to build two exascale supercomputers” 

    Argonne Lab
    News from Argonne National Laboratory

    Cray Aurora supercomputer at the Argonne Leadership Computing Facility

    MIRA IBM Blue Gene Q supercomputer at the Argonne Leadership Computing Facility

    ALCF


    COMPUTERWORLD

    Nov 21, 2016
    Patrick Thibodeau


    The U.S. believes it will be ready to seek vendor proposals to build two exascale supercomputers — costing roughly $200 million to $300 million each — by 2019.

    The two systems will be built at the same time and will be ready for use by 2023, although it’s possible one of the systems could be ready a year earlier, according to U.S. Department of Energy officials.

    But the scientists and vendors developing exascale systems do not yet know whether President-Elect Donald Trump’s administration will change directions. The incoming administration is a wild card. Supercomputing wasn’t a topic during the campaign, and Trump’s dismissal of climate change as a hoax, in particular, has researchers nervous that science funding may suffer.

    At the annual supercomputing conference SC16 last week in Salt Lake City, a panel of government scientists outlined the exascale strategy developed by President Barack Obama’s administration. When the session was opened to questions, the first two were about Trump. One attendee quipped that “pointed-head geeks are not going to be well appreciated.”

    Another person in the audience, John Sopka, a high-performance computing software consultant, asked how the science community will defend itself from claims that “you are taking the money from the people and spending it on dreams,” referring to exascale systems.

    Paul Messina, a computer scientist and distinguished fellow at Argonne National Laboratory who heads the Exascale Computing Project, appeared sanguine. “We believe that an important goal of the exascale computing project is to help economic competitiveness and economic security,” said Messina. “I could imagine that the administration would think that those are important things.”

    Politically, there ought to be a lot in HPC’s favor. A broad array of industries rely on government supercomputers to conduct scientific research, improve products, attack disease, create new energy systems and understand climate, among many other fields. Defense and intelligence agencies also rely on large systems.

    The ongoing exascale research funding (the U.S. budget is $150 million this year) will help with advances in software, memory, processors and other technologies that ultimately filter out to the broader commercial market.

    This is very much a global race, which is something the Trump administration will have to be mindful of. China, Europe and Japan are all developing exascale systems.

    China plans to have an exascale system ready by 2020. These nations see exascale — and the computing advances required to achieve it — as a pathway to challenging America’s tech dominance.

    “I’m not losing sleep over it yet,” said Messina, of the possibility that the incoming Trump administration may have different supercomputing priorities. “Maybe I will in January.”

    The U.S. will award the exascale contracts to vendors with two different architectures. This is not a new approach and is intended to help keep competition at the highest end of the market. Recent supercomputer procurements include systems built on the IBM Power architecture, Nvidia’s Volta GPU and Cray-built systems using Intel chips.

    The timing of these exascale systems — ready for 2023 — is also designed to take advantage of the upgrade cycles at the national labs. The large systems that will be installed in the next several years will be ready for replacement by the time exascale systems arrive.

    The last big performance milestone in supercomputing occurred in 2008 with the development of a petaflop system. An exaflop is 1,000 petaflops, and building an exascale system is challenging because of the limits of Moore’s Law, the 1960s-era observation that the number of transistors on a chip doubles about every two years.

    “Now we’re at the point where Moore’s Law is just about to end,” said Messina in an interview. That means the key to building something faster “is by having much more parallelism, and many more pieces. That’s how you get the extra speed.”

    An exascale system will solve a problem 50 times faster than the 20-petaflop systems in use in government labs today.
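    Put as arithmetic:

        1\ \text{exaflop} = 10^{18}\ \text{flop/s} = 1000\ \text{petaflops} = 50 \times 20\ \text{petaflops}.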

    Development work has begun on the systems and applications that can utilize hundreds of millions of simultaneous parallel events. “How do you manage it — how do you get it all to work smoothly?” said Messina.

    Another major problem is energy consumption. An exascale machine can be built today using current technology, but such a system would likely need its own power plant. The U.S. wants an exascale system that can operate on 20 megawatts and certainly no more than 30 megawatts.

    Scientists will have to come up with a way “to vastly reduce the amount of energy it takes to do a calculation,” said Messina. The applications and software development are critical because most of the energy is used to move data. And new algorithms will be needed.
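    A back-of-the-envelope figure, implied by the 20-megawatt target rather than quoted from officials, shows how aggressive that goal is:

        \frac{20\ \text{MW}}{10^{18}\ \text{flop/s}} = \frac{2 \times 10^{7}\ \text{J/s}}{10^{18}\ \text{flop/s}} = 20\ \text{picojoules per operation},

    a budget that has to cover not just the arithmetic but the far more expensive movement of the data each operation needs.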

    About 500 people are working at universities and national labs on the DOE’s coordinated effort to develop the software and other technologies exascale will need.

    Aside from the cost of building the systems, the U.S. will spend millions funding the preliminary work. Vendors want to maintain the intellectual property of what they develop. If it cost, for instance, $50 million to develop a certain aspect of a system, the U.S. may ask the vendor to pay 40% of that cost if they want to keep the intellectual property.

    A key goal of the U.S. research funding is to avoid creation of one-off technologies that can only be used in these particular exascale systems.

    “We have to be careful,” Terri Quinn, a deputy associate director for HPC at Lawrence Livermore National Laboratory, said at the SC16 panel session. “We don’t want them (vendors) to give us capabilities that are not sustainable in a business market.”

    The work under way will help ensure that the technology research is far enough along to enable the vendors to respond to the 2019 request for proposals.

    Supercomputers can deliver advances in modeling and simulation. Instead of building physical prototypes of something, a supercomputer can allow modeling virtually. This can speed the time it takes something to get to market, whether a new drug or car engine. Increasingly, HPC is used in big data and is helping improve cybersecurity through rapid analysis; artificial intelligence and robotics are other fields with strong HPC demand.

    China will likely beat the U.S. in developing an exascale system, but the real test will be their usefulness.

    Messina said the U.S. approach is to develop an exascale ecosystem involving vendors, universities and the government. The hope is that the exascale systems will not only have a wide range of applications ready for them, but also applications that are relatively easy to program. Messina wants to see these systems quickly put to immediate and broad use.

    “Economic competitiveness does matter to a lot of people,” said Messina.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science. For more visit http://www.anl.gov.

    About ALCF

    The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community.

    We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and expertise.

    ALCF projects cover many scientific disciplines, ranging from chemistry and biology to physics and materials science. Examples include modeling and simulation efforts to:

    Discover new materials for batteries
    Predict the impacts of global climate change
    Unravel the origins of the universe
    Develop renewable energy technologies

    Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science

    Argonne Lab Campus

     
  • richardmitnick 12:59 pm on November 12, 2016
    Tags: Exascale computing

    From Argonne Leadership Computing Facility “Exascale Computing Project announces $48 million to establish four exascale co-design centers” 

    ANL Lab
    News from Argonne National Laboratory

    Argonne Leadership Computing Facility
    A DOE Office of Science user facility

    November 11, 2016
    Mike Bernhardt

    Co-design and integration of hardware, software, applications and platforms, is essential to deploying exascale-class systems that will meet the future requirements of the scientific communities these systems will serve. Credit: Andre Schleife, UIUC

    The U.S. Department of Energy’s (DOE’s) Exascale Computing Project (ECP) today announced that it has selected four co-design centers as part of a 4-year $48 million funding award. The first year is funded at $12 million, and is to be allocated evenly among the four award recipients.

    The ECP is responsible for the planning, execution and delivery of technologies necessary for a capable exascale ecosystem to support the nation’s exascale imperative, including software, applications, hardware and early testbed platforms.

    Exascale refers to computing systems at least 50 times faster than the nation’s most powerful supercomputers in use today.

    According to Doug Kothe, ECP Director of Application Development: “Co-design lies at the heart of the Exascale Computing Project. ECP co-design, an intimate interchange of the best that hardware technologies, software technologies and applications have to offer each other, will be a catalyst for delivery of exascale-enabling science and engineering solutions for the U.S.”

    “By targeting common patterns of computation and communication, known as ‘application motifs,’ we are confident that these ECP co-design centers will knock down key performance barriers and pave the way for applications to exploit all that capable exascale has to offer,” he said.

    The development of capable exascale systems requires an interdisciplinary engineering approach in which the developers of the software ecosystem, the hardware technology and a new generation of computational science applications are collaboratively involved in a participatory design process referred to as co-design.

    The co-design process is paramount to ensuring that future exascale applications adequately reflect the complex interactions and trade-offs associated with the many new—and sometimes conflicting—design options, enabling these applications to tackle problems they currently can’t address.

    According to ECP Director Paul Messina, “The establishment of these and future co-design centers is foundational to the creation of an integrated, usable and useful exascale ecosystem. After a lengthy review, we are pleased to announce that we have initially selected four proposals for funding. The establishment of these co-design centers, following on the heels of our recent application development awards, signals the momentum and direction of ECP as we bring together the necessary ecosystem and infrastructure to drive the nation’s exascale imperative.”

    The four selected co-design proposals and their principal investigators are as follows:

    CODAR: Co-Design Center for Online Data Analysis and Reduction at the Exascale

    Principal Investigator: Ian Foster, Argonne National Laboratory Distinguished Fellow

    This co-design center will focus on overcoming the rapidly growing gap between compute speed and storage input/output rates by evaluating, deploying and integrating novel online data analysis and reduction methods for the exascale. Working closely with Exascale Computing Project applications, CODAR will undertake a focused co-design process that targets both common and domain-specific data analysis and reduction methods, with the goal of allowing application developers to choose and configure methods to output just the data needed by the application. CODAR will engage directly with providers of ECP hardware, system software, programming models, data analysis and reduction algorithms and applications in order to better understand and guide tradeoffs in the development of exascale systems, applications and software frameworks, given constraints relating to application development costs, application fidelity, performance portability, scalability and power efficiency.

    “Argonne is pleased to be leading CODAR efforts in support of the Exascale Computing Project,” said Argonne Distinguished Fellow Ian Foster. “We aim in CODAR to co-optimize applications, data services and exascale platforms to deliver the right bits in the right place at the right time.”

    Block-Structured AMR Co-Design Center

    Principal Investigator: John Bell, Lawrence Berkeley National Laboratory

    The Block-Structured Adaptive Mesh Refinement Co-Design Center will be led by Lawrence Berkeley National Laboratory with support from Argonne National Laboratory and the National Renewable Energy Laboratory. The goal is to develop a new framework, AMReX, to support the development of block-structured adaptive mesh refinement algorithms for solving systems of partial differential equations with complex boundary conditions on exascale architectures. Block-structured adaptive mesh refinement provides a natural framework in which to focus computing power on the most critical parts of the problem in the most computationally efficient way possible. Block-structured AMR is already widely used to solve many problems relevant to DOE. Specifically, at least six of the 22 exascale application projects announced last month—in the areas of accelerators, astrophysics, combustion, cosmology, multiphase flow and subsurface flow—will rely on block-structured AMR as part of the ECP.
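
    As a toy illustration of the tagging step at the heart of block-structured AMR, the sketch below splits a domain into fixed-size blocks and flags only those where the solution changes sharply, so refinement (and compute) is concentrated there. The block size, threshold and function names are illustrative assumptions and do not reflect AMReX’s actual interface.

```python
import numpy as np

def flag_blocks_for_refinement(u, block=16, tol=0.1):
    """Return block indices whose maximum solution gradient exceeds tol.

    Toy version of block-structured AMR tagging: split the domain into
    fixed-size blocks and mark for refinement only the blocks where the
    solution varies sharply.
    """
    gy, gx = np.gradient(u)            # finite-difference gradient per grid index
    grad = np.hypot(gx, gy)
    flagged = []
    for i in range(0, u.shape[0], block):
        for j in range(0, u.shape[1], block):
            if grad[i:i + block, j:j + block].max() > tol:
                flagged.append((i // block, j // block))
    return flagged

# A smooth field with one sharp front: only blocks near the front are flagged.
x, y = np.meshgrid(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
u = np.tanh((x - 0.5) / 0.02)
print(f"{len(flag_blocks_for_refinement(u))} of {(128 // 16) ** 2} blocks flagged")
```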

    “This co-design center reflects the important role of adaptive mesh refinement in accurately simulating problems at scales ranging from the edges of flames to global climate to the makeup of the universe, and how AMR will be critical to tackling problems at the exascale,” said David Brown, director of Berkeley Lab’s Computational Research Division. “It’s also important to note that AMR will be a critical component in a third of the 22 exascale application projects announced in September, which will help ensure that researchers can make productive use of exascale systems when they are deployed.”

    Center for Efficient Exascale Discretizations (CEED)

    Principal Investigator: Tzanio Kolev, Lawrence Livermore National Laboratory

    Fully exploiting future exascale architectures will require a rethinking of the algorithms used in the large-scale applications that advance many science areas vital to DOE and the National Nuclear Security Administration (NNSA), such as global climate modeling, turbulent combustion in internal combustion engines, nuclear reactor modeling, additive manufacturing, subsurface flow and national security applications. The newly established Center for Efficient Exascale Discretizations aims to help these DOE and NNSA applications to take full advantage of exascale hardware by using state-of-the-art ‘high-order discretizations’ that provide an order of magnitude performance improvement over traditional methods.

    In simple mathematical terms, discretization denotes the process of dividing a geometry into finite elements, or building blocks, in preparation for analysis. This process, which can dramatically improve application performance, involves making simplifying assumptions to reduce demands on the computer, but with minimal loss of accuracy. Recent developments in supercomputing make it increasingly clear that the high-order discretizations, which CEED is focused on, have the potential to achieve optimal performance and deliver fast, efficient and accurate simulations on exascale systems.
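
    The flavor of the argument for high-order methods can be seen with a one-line experiment: for a smooth integrand, a handful of high-order quadrature points beats many low-order points by many digits of accuracy. The sketch below uses Gauss-Legendre quadrature as a stand-in; CEED’s actual finite- and spectral-element discretizations are far more elaborate.

```python
import numpy as np

# Integrate a smooth function on [-1, 1]; the exact integral of e^x is e - 1/e.
f = np.exp
exact = np.e - 1.0 / np.e

# Low-order rule, many points: composite trapezoid on 64 equally spaced points.
x = np.linspace(-1.0, 1.0, 64)
trap = np.sum((f(x[1:]) + f(x[:-1])) * np.diff(x)) / 2.0

# High-order rule, few points: 8-point Gauss-Legendre quadrature.
pts, wts = np.polynomial.legendre.leggauss(8)
gauss = np.sum(wts * f(pts))

print(f"trapezoid, 64 points:     error = {abs(trap - exact):.1e}")
print(f"Gauss-Legendre, 8 points: error = {abs(gauss - exact):.1e}")
```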

    The CEED Co-Design Center is a research partnership of two DOE labs and five universities. Partners include Lawrence Livermore National Laboratory; Argonne National Laboratory; the University of Illinois Urbana-Champaign; Virginia Tech; the University of Tennessee, Knoxville; the University of Colorado Boulder; and Rensselaer Polytechnic Institute.

    “The CEED team I have the privilege to lead is dedicated to the development of next-generation discretization software and algorithms that will enable a wide range of applications to run efficiently on future hardware,” said CEED director Tzanio Kolev of Lawrence Livermore National Laboratory. “Our co-design center is focused first and foremost on applications. We bring to this enterprise a collaborative team of application scientists, computational mathematicians and computer scientists with a strong track record of delivering performant software on leading-edge platforms. Collectively, we support hundreds of users in national labs, industry and academia, and we are committed to pushing simulation capabilities to new levels across an ever-widening range of applications.”

    Co-design center for Particle Applications (CoPA)

    Principal Investigator: Tim Germann, Los Alamos National Laboratory

    This co-design center will serve as a centralized clearinghouse for particle-based ECP applications, communicating their requirements and evaluating potential uses and benefits of ECP hardware and software technologies using proxy applications. Particle-based simulation approaches are ubiquitous in computational science and engineering: each particle interacts with its environment through direct particle-particle interactions at shorter ranges, through particle-mesh interactions with a local field set up by longer-range effects, or both. The center will develop and disseminate best practices in code portability, data layout and movement, and performance optimization, delivered as sustainable, interoperable, co-designed numerical recipes for particle-based methods that meet application requirements within the design space of ECP software technologies and the constraints of exascale hardware. The ultimate goal is the creation of scalable, open exascale software platforms suitable for use by a variety of particle-based simulations.
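
    To illustrate the short-range particle-particle pattern the center targets, the sketch below uses a cell list: particles are binned into cells at least as wide as the interaction cutoff, so each particle examines only nearby cells instead of all N particles. The function, parameters and periodic-box setup are illustrative assumptions, not CoPA software.

```python
import numpy as np
from collections import defaultdict

def count_pairs_within(pos, cutoff, box):
    """Count particle pairs closer than `cutoff` using a cell list.

    Each particle is binned into a cell of side >= cutoff, so only the 27
    neighboring cells need to be searched. Assumes a cubic periodic box
    with at least 3 cells per dimension.
    """
    ncell = int(box // cutoff)
    side = box / ncell
    cells = defaultdict(list)
    for i, p in enumerate(pos):
        cells[tuple((p // side).astype(int) % ncell)].append(i)

    pairs = 0
    for (cx, cy, cz), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    other = cells.get(((cx + dx) % ncell, (cy + dy) % ncell,
                                       (cz + dz) % ncell), [])
                    for i in members:
                        for j in other:
                            if i < j:                          # count each pair once
                                d = pos[i] - pos[j]
                                d -= box * np.round(d / box)   # minimum-image wrap
                                if np.dot(d, d) < cutoff ** 2:
                                    pairs += 1
    return pairs

rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 10.0, size=(500, 3))   # 500 particles in a 10 x 10 x 10 box
print("pairs within cutoff:", count_pairs_within(pos, cutoff=1.0, box=10.0))
```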

    “Los Alamos is delighted to be leading the Co-Design Center for Particle-Based Methods: From Quantum to Classical, Molecular to Cosmological, which builds on the success of ExMatEx, the Exascale Co-Design Center for Materials in Extreme Environments,” said John Sarrao, Associate Director for Theory, Simulation, and Computation at Los Alamos. “Advancing deterministic particle-based methods is essential for simulations at the exascale, and Los Alamos has long believed that co-design is the right approach for advancing these frontiers. We look forward to partnering with our colleague laboratories in successfully executing this important element of the Exascale Computing Project.”

    About ECP

    The ECP is a collaborative effort of two DOE organizations — the Office of Science and the National Nuclear Security Administration. As part of President Obama’s National Strategic Computing Initiative, ECP was established to develop a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures, and workforce development, to meet the scientific and national security mission needs of DOE in the mid-2020s timeframe.

    About the Office of Science

    DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

    About NNSA

    Established by Congress in 2000, NNSA is a semi-autonomous agency within DOE, responsible for enhancing national security through the military application of nuclear science. NNSA maintains and enhances the safety, security and effectiveness of the U.S. nuclear weapons stockpile without nuclear explosive testing; works to reduce the global danger from weapons of mass destruction; provides the U.S. Navy with safe and effective nuclear propulsion; and responds to nuclear and radiological emergencies in the United States and abroad.

    See the full article here .

    Please help promote STEM in your local schools.
    STEM Icon
    Stem Education Coalition
    Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science. For more visit http://www.anl.gov.

    The Advanced Photon Source at Argonne National Laboratory is one of five national synchrotron radiation light sources supported by the U.S. Department of Energy’s Office of Science to carry out applied and basic research to understand, predict, and ultimately control matter and energy at the electronic, atomic, and molecular levels, provide the foundations for new energy technologies, and support DOE missions in energy, environment, and national security. To learn more about the Office of Science X-ray user facilities, visit http://science.energy.gov/user-facilities/basic-energy-sciences/.

    Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science

    Argonne Lab Campus

     
  • richardmitnick 4:24 pm on May 27, 2016 Permalink | Reply
    Tags: , Exascale computing, ,   

    From SLAC: “SLAC’s New Computer Science Division Teams with Stanford to Tackle Data Onslaught” 


    SLAC Lab

    1
    Alex Aiken, director of the new SLAC Computer Science Division, in the Stanford Research Computing Facility. Built by Stanford on the SLAC campus, this high-performance computing data center opened in 2013; it is used by more than 230 principal investigators and 1,100 students. (SLAC National Accelerator Laboratory)

    Alex Aiken, director of the new Computer Science Division at the Department of Energy’s SLAC National Accelerator Laboratory, has been thinking a great deal about the coming challenges of exascale computing, defined as a billion billion calculations per second, roughly 50 times faster than the most powerful supercomputers in use today. Reaching this milestone is such a big challenge that it’s expected to take until the mid-2020s and require entirely new approaches to programming, data management and analysis, and numerous other aspects of computing.

    SLAC and Stanford, Aiken believes, are in a great position to join forces and work toward these goals while advancing SLAC science.

    “The kinds of problems SLAC scientists have are at such an extreme scale that they really push the limits of all those systems,” he says. “We believe there is an opportunity here to build a world-class Department of Energy computer science group at SLAC, with an emphasis on large-scale data analysis.”

    Even before taking charge of the division on April 1, Aiken had his feet in both worlds, working on DOE-funded projects at Stanford. He’ll continue in his roles as professor and chair of the Stanford Computer Science Department while building the new SLAC division.

    Solving Problems at the Exascale

    SLAC has a lot of tough computational problems to solve, from simulating the behavior of complex materials, chemical reactions and the cosmos to analyzing vast torrents of data from the upcoming LCLS-II and LSST projects. SLAC’s Linac Coherent Light Source (LCLS) is a DOE Office of Science User Facility.

    SLAC LCLS-II line

    LSST Camera, built at SLAC
    LSST Interior
    LSST telescope, currently under construction at Cerro Pachón, Chile

    LSST, the Large Synoptic Survey Telescope, will survey the entire Southern Hemisphere sky every few days from a mountaintop in Chile starting in 2022. It will produce 6 million gigabytes of data per year – the equivalent of shooting roughly 800,000 images with an 8-megapixel digital camera every night. And the LCLS-II X-ray laser, which begins operations in 2020, will produce a thousand times more data than today’s LCLS.
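
    Those figures hang together arithmetically; a quick back-of-the-envelope check (the bytes-per-pixel value is an illustrative assumption, not an LSST specification):

```python
# Rough sanity check of the LSST data-rate analogy.
images_per_night = 800_000
megapixels = 8
bytes_per_pixel = 2.5          # assumed raw depth, for illustration only

per_image_mb = megapixels * 1e6 * bytes_per_pixel / 1e6   # ~20 MB per image
per_night_tb = images_per_night * per_image_mb / 1e6      # ~16 TB per night
per_year_gb = per_night_tb * 1e3 * 365                    # ~5.8 million GB per year

print(f"{per_image_mb:.0f} MB/image, {per_night_tb:.0f} TB/night, "
      f"{per_year_gb / 1e6:.1f} million GB/year")
```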

    The DOE has led U.S. efforts to develop high-performance computing for decades, and computer science is increasingly central to the DOE mission, Aiken says. One of the big challenges across a number of fields is to find ways to process data on the fly, so researchers can obtain rapid feedback to make the best use of limited experimental time and determine which data are interesting enough to analyze in depth.

    The DOE recently launched the Exascale Computing Initiative (ECI), led by the Office of Science and National Nuclear Security Administration, as part of a broader National Strategic Computing Initiative. It aims to develop capable exascale computing systems for science, national security and energy technology development by the mid-2020s.

    Staffing up and Enhancing Collaborations

    On the Stanford side, the university has been performing world-class computer science – a field Aiken loosely describes as “How do you make computers useful for a variety of things that people want to do with them?” – for more than half a century. But since faculty members mainly work with graduate students and postdoctoral researchers, projects tend to be limited to the 3- to 5-year lifespan of those positions.

    The new SLAC division will provide a more stable basis for the type of long-term collaboration needed to solve the most challenging scientific problems. Stanford computer scientists have already been involved with the LSST project, and Aiken himself is working on new exascale computing initiatives at SLAC: “That’s where I’m spending my own research time.”

    He is in the process of hiring four SLAC staff scientists, with plans to eventually expand to a group of 10 to 15 researchers and two initial joint faculty positions. The division will eventually be housed in the Photon Science Laboratory Building that’s now under construction, maximizing their interaction with researchers who use intensive computing for ultrafast science and biology. Stanford graduate students and postdocs will also be an important part of the mix.

    While initial funding is coming from SLAC and Stanford, Aiken says he will be applying for funding from the DOE’s Advanced Scientific Computing Research program, the Exascale Computing Initiative and other sources to make the division self-sustaining.

    Two Sets of Roots

    Aiken came to Stanford in 2003 from the University of California, Berkeley, where he was a professor of engineering and computer science. Before that he spent five years at IBM Almaden Research Center.

    He received a bachelor’s degree in computer science and music from Bowling Green State University in 1983 and a PhD from Cornell in 1988. Aiken met his wife, Jennifer Widom, in a music practice room when they were graduate students (he played trombone, she played trumpet). Widom is now a professor of computer science and electrical engineering at Stanford and senior associate dean for faculty and academic affairs for the School of Engineering. Avid and adventurous travelers, they have taken their son and daughter, both now grown, on trekking, backpacking, scuba diving and sailing trips all over the world.

    The roots of the new SLAC Computer Science Division go back to fall 2014, when Aiken began meeting with two key faculty members – Stanford Professor Pat Hanrahan, a computer graphics researcher who was a founding member of Pixar Animation Studios and has received three Academy Awards for rendering and computer graphics, and SLAC/Stanford Professor Tom Abel, director of the Kavli Institute for Particle Astrophysics and Cosmology, who specializes in computer simulations and visualizations of cosmic phenomena. The talks quickly drew in other faculty and staff, and led to a formal proposal late last year that outlined potential synergies between SLAC, Stanford and Silicon Valley firms that develop computer hardware and software.

    “Modern algorithms that exploit new computer architectures, combined with our unique data sets at SLAC, will allow us to do science that is greater than the sum of its parts,” Abel said. “I am so looking forward to having more colleagues at SLAC to discuss things like extreme data analytics and how to program exascale computers.”

    Aiken says he has identified eight Stanford computer science faculty members and a number of SLAC researchers with LCLS, LSST, the Particle Astrophysics and Cosmology Division, the Elementary Particle Physics Division and the Accelerator Directorate who want to get involved. “We keep hearing from more SLAC people who are interested,” he says. “We’re looking forward to working with everyone!”

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition

    SLAC Campus
    SLAC is a multi-program laboratory exploring frontier questions in photon science, astrophysics, particle physics and accelerator research. Located in Menlo Park, California, SLAC is operated by Stanford University for the DOE’s Office of Science.

     
  • richardmitnick 4:18 pm on March 29, 2016 Permalink | Reply
    Tags: , Exascale computing, ,   

    From LLNL: “Lawrence Livermore and IBM collaborate to build new brain-inspired supercomputer” 


    Lawrence Livermore National Laboratory

    Mar. 29, 2016
    Don Johnston
    johnston19@llnl.gov
    925-423-4902

    1
    The 16-chip IBM TrueNorth platform Lawrence Livermore will receive later this week. The scalable platform will process the equivalent of 16 million neurons and 4 billion synapses and consume the energy equivalent of a hearing-aid battery – a mere 2.5 watts of power. Photo courtesy of IBM.

    Chip-architecture breakthrough accelerates path to exascale computing; helps computers tackle complex cognitive tasks such as pattern recognition and sensory processing

    Lawrence Livermore National Laboratory (LLNL) today announced it will receive a first-of-a-kind brain-inspired supercomputing platform for deep learning developed by IBM Research. Based on a breakthrough neurosynaptic computer chip called IBM TrueNorth, the scalable platform will process the equivalent of 16 million neurons and 4 billion synapses and consume the energy equivalent of a hearing-aid battery – a mere 2.5 watts of power.

    The brain-like, neural network design of the IBM Neuromorphic System is able to infer complex cognitive tasks such as pattern recognition and integrated sensory processing far more efficiently than conventional chips.

    1
    DARPA SyNAPSE 16 chip board with IBM TrueNorth


    Access mp4 video here .

    The new system will be used to explore new computing capabilities important to the National Nuclear Security Administration’s (NNSA) missions in cybersecurity, stewardship of the nation’s nuclear weapons stockpile and nonproliferation. NNSA’s Advanced Simulation and Computing (ASC) program will evaluate machine-learning applications, deep-learning algorithms and architectures and conduct general computing feasibility studies. ASC is a cornerstone of NNSA’s Stockpile Stewardship Program to ensure the safety, security and reliability of the nation’s nuclear deterrent without underground testing.

    “Neuromorphic computing opens very exciting new possibilities and is consistent with what we see as the future of high performance computing and simulation at the heart of our national security missions,” said Jim Brase, LLNL deputy associate director for Data Science. “The potential capabilities neuromorphic computing represents and the machine intelligence that these will enable will change how we do science.”

    The technology represents a fundamental departure from the computer designs that have been prevalent for the past 70 years, and could be a powerful complement in the development of next-generation supercomputers able to perform at exascale speeds, roughly 50 times faster than today’s most advanced petaflop systems (a petaflop is a quadrillion floating point operations per second). Like the human brain, neurosynaptic systems require significantly less electrical power and volume.

    “The low power consumption of these brain-inspired processors reflects industry’s desire and a creative approach to reducing power consumption in all components for future systems as we set our sights on exascale computing,” said Michel McCoy, LLNL program director for Weapon Simulation and Computing.

    “The delivery of this advanced computing platform represents a major milestone as we enter the next era of cognitive computing,” said Dharmendra Modha, IBM fellow and chief scientist of Brain-inspired Computing, IBM Research. “We value our partnerships with the national labs. In fact, prior to design and fabrication, we simulated the IBM TrueNorth processor using LLNL’s Sequoia supercomputer. This collaboration will push the boundaries of brain-inspired computing to enable future systems that deliver unprecedented capability and throughput, while minimizing the capital, operating and programming costs – keeping our nation at the leading edge of science and technology.”

    A single TrueNorth processor consists of 5.4 billion transistors wired together to create an array of 1 million digital neurons that communicate with one another via 256 million electrical synapses. It consumes 70 milliwatts of power running in real time and delivers 46 billion synaptic operations per second, using orders of magnitude less energy than a conventional computer running inference on the same neural network. TrueNorth was originally developed under the auspices of the Defense Advanced Research Projects Agency’s (DARPA) Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE) program, in collaboration with Cornell University.

    Under terms of the $1 million contract, LLNL will receive a 16-chip TrueNorth system representing a total of 16 million neurons and 4 billion synapses. LLNL also will receive an end-to-end ecosystem to create and program energy-efficient machines that mimic the brain’s abilities for perception, action and cognition. The ecosystem consists of a simulator; a programming language; an integrated programming environment; a library of algorithms as well as applications; firmware; tools for composing neural networks for deep learning; a teaching curriculum; and cloud enablement.
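
    The system totals follow directly from the single-chip figures quoted above; the quick check below sums them (the quoted 2.5-watt system figure presumably includes board-level overhead beyond the chips themselves, which is an assumption, not an IBM specification):

```python
# Scale the quoted single-chip TrueNorth figures to the 16-chip system.
chips = 16
neurons = chips * 1_000_000        # 16 million digital neurons
synapses = chips * 256_000_000     # ~4.1 billion synapses, quoted as 4 billion
chip_power_w = chips * 0.070       # 16 x 70 mW = ~1.1 W for the chips alone

print(f"{neurons:,} neurons, {synapses:,} synapses, {chip_power_w:.2f} W (chips only)")
```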

    Lawrence Livermore computer scientists will collaborate with IBM Research, partners across the Department of Energy complex and universities to expand the frontiers of neurosynaptic architecture, system design, algorithms and software ecosystem.

    See the full article here .

    Please help promote STEM in your local schools.

    STEM Icon

    Stem Education Coalition
    LLNL Campus

    Operated by Lawrence Livermore National Security, LLC, for the Department of Energy’s National Nuclear Security Administration.

     
  • richardmitnick 5:56 am on March 18, 2016 Permalink | Reply
    Tags: , Exascale computing,   

    From NSF: “Envisioning supercomputers of the future” 

    nsf
    National Science Foundation

    March 17, 2016
    Aaron Dubrow,
    NSF
    703-292-4489
    adubrow@nsf.gov

    Makeda Easter, Texas Advanced Computing Center
    512-471-8217
    makeda@tacc.utexas.edu

    Project to test operating systems for future exascale computers

    Last year, President Obama announced the National Strategic Computing Initiative (NSCI), an executive order to increase research, development and deployment of high performance computing (HPC) in the United States, with the National Science Foundation, the Department of Energy and the Department of Defense as the lead agencies.

    One of NSCI’s objectives is to accelerate research and development that can lead to future exascale computing systems — computers capable of performing one billion billion calculations per second (also known as an exaflop). Exascale computers will advance research, enhance national security and give the U.S. a competitive economic advantage.

    Experts believe simply improving existing technologies and architectures will not get us to exascale levels. Instead, researchers will need to rethink the entire computing paradigm — from power, to memory, to system software — to make exascale systems a reality.

    The Argo Project is a three-year collaborative effort, funded by the Department of Energy, to develop a new approach for extreme-scale system software. The project involves the efforts of 40 researchers from three national laboratories and four universities working to design and prototype an exascale operating system and the software to make it useful.

    To test their new ideas, the research team is using Chameleon, an experimental environment for large-scale cloud computing research supported by the National Science Foundation and hosted by the University of Chicago and the Texas Advanced Computing Center (TACC).

    Chameleon — funded by a $10 million award from the NSFFutureCloud program — is a re-configurable testbed that lets the research community experiment with novel cloud computing architectures and pursue new, architecturally-enabled applications of cloud computing.

    “Cloud computing has become a dominant method of providing computing infrastructure for Internet services,” said Jack Brassil, a program officer in NSF’s division of Computer and Network Systems. “But to design new and innovative compute clouds and the applications they will run, academic researchers need much greater control, diversity and visibility into the hardware and software infrastructure than is available with commercial cloud systems today.”

    The NSFFutureCloud testbeds provide the types of capabilities Brassil described.

    Using Chameleon, the team is testing four key aspects of the future system:

    The Global Operating System, which handles machine configuration, resource allocation and launching applications.

    The Node Operating System, which is based on Linux and provides interfaces for better control of future exascale architectures.

    The concurrency runtime Argobots, a novel infrastructure that efficiently distributes work among computing resources.

    BEACON (the Backplane for Event and Control Notification), a framework that gathers data on system performance and sends it to various controllers to take appropriate action.

    Chameleon’s unique, reconfigurable infrastructure lets researchers bypass some issues that would have come up if the team were running the project on a typical high-performance computing system.

    For instance, developing the Node Operating System requires researchers to change the operating system kernel — the computer program that controls all the hardware components of a system and allocates them to applications.

    “There are not a lot of places where we can do that,” said Swann Perarnau, a postdoctoral researcher at Argonne National Laboratory and collaborator on the Argo Project. “HPC machines in production are strictly controlled, and nobody will let us modify such a critical component.”

    However, Chameleon lets scientists modify and control the system from top to bottom, supporting a wide variety of cloud research methods and architectures not available elsewhere.

    “The Argo project didn’t have the right hardware or the manpower to maintain the infrastructure needed for proper integration and testing of the entire software stack,” Perarnau added. “While we had full access to a small cluster, I think we saved weeks of additional system setup time, and many hours of maintenance work, by switching to Chameleon.”

    One of the major challenges in reaching exascale is energy usage and cost. During last year’s Supercomputing Conference, the researchers demonstrated the ability to dynamically control the power usage of 20 nodes during a live demonstration running on Chameleon.

    They released a paper this week describing their approach to power management for future exascale systems and will present the results at the Twelfth Workshop on High-Performance, Power-Aware Computing (HPPAC’16) in May.
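
    The paper itself is not reproduced here, but dynamic power management of a node set generally takes the shape of a feedback loop: measure each node’s draw, compare the total against a global budget, and adjust per-node caps. The sketch below is a generic proportional version under assumed numbers, not the Argo team’s implementation; real runtimes enforce caps through hardware power-limiting interfaces and use richer policies.

```python
import random

def rebalance_power(node_watts, budget_watts):
    """Scale per-node power caps so the total stays within a global budget.

    Generic proportional controller: if the nodes collectively exceed the
    budget, every node's cap is scaled down by the same factor.
    """
    total = sum(node_watts)
    scale = min(1.0, budget_watts / total)
    return [w * scale for w in node_watts]

# 20 nodes drawing 250-350 W each, capped to an assumed 5.5 kW enclosure budget.
draws = [random.uniform(250.0, 350.0) for _ in range(20)]
caps = rebalance_power(draws, budget_watts=5500.0)
print(f"measured {sum(draws):.0f} W -> capped to {sum(caps):.0f} W")
```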

    The Argo team is working with industry partners, including Cray, Intel and IBM, to explore which techniques and features would be best suited for the Department of Energy’s next supercomputer.

    “Argo was founded to design and prototype exascale operating systems and runtime software,” Perarnau said. “We believe some of the new techniques and tools we have developed can be tested on petascale systems and refined for exascale platforms.”

    See the full article here .

    Please help promote STEM in your local schools.
    STEM Icon

    Stem Education Coalition

    The National Science Foundation (NSF) is an independent federal agency created by Congress in 1950 "to promote the progress of science; to advance the national health, prosperity, and welfare; to secure the national defense." NSF is the funding source for approximately 24 percent of all federally supported basic research conducted by America’s colleges and universities. In many fields such as mathematics, computer science and the social sciences, NSF is the major source of federal backing.


     