Tagged: Summit supercomputer

  • richardmitnick 12:23 pm on December 28, 2018
    Tags: Summit supercomputer, UT Students Get Bite-Sized Bits of Big Data Centers in ORNL-Led Course

    From Oak Ridge Leadership Computing Facility: “UT Students Get Bite-Sized Bits of Big Data Centers in ORNL-Led Course” 

    Oak Ridge National Laboratory

    From Oak Ridge Leadership Computing Facility

    20 Dec, 2018
    Rachel Harken

    Image Credit: Genevieve Martin, ORNL

    This fall, staff at the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) once again contributed to the “Introduction to Data Centers” course at the University of Tennessee, Knoxville (UT).

    Now in its fourth year, the class had the largest and most diverse enrollment yet, with four disciplines represented: computer engineering, computer science, electrical engineering, and industrial engineering. This year’s students toured the data centers at the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility located at ORNL, earlier this fall as part of the course.

    The multidisciplinary course, part of UT’s data center technology and management minor, introduces students to the many topics involved in building and operating a data center. Because running a data center requires knowledge in a multitude of areas, no one discipline typically covers the broad spectrum of topics involved.

    “We bring in a lot of disciplinary experts from ORNL,” said Stephen McNally, operations manager at the OLCF and the course organizer. “We cover the mechanical and electrical components, but we also focus on project management, commissioning, overall requirements-gathering, and networking.” The current curriculum was developed by McNally, UT interim dean of the College of Engineering Mark Dean, UT professor David Icove, and ORNL project specialist Jennifer Goodpasture.

    The students enrolled in the course are provided a request for proposals at the beginning of the year, and they work together throughout the semester to submit a 20- to 30-page proposal to meet the requirements. Because students are often restricted to classes within their majors, the course stresses the interplay between disciplines and showcases areas that might previously have been out of reach.

    “Hiring someone straight out of school to do what a data center person does is really difficult, because you have to understand so much about so many different disciplines,” McNally said. “This is primarily why we have such a low talent pool for data center–related jobs. We built this class to help solve that problem.”

    The course is opening new opportunities for some students. Two of the students in this year’s class received scholarships to Infrastructure Masons (iMasons), an organization that brings digital infrastructure experts together to network, learn, and collaborate. The students’ enrollment in the course through the new minor degree program qualified them to apply.

    Aside from the opportunity to apply for the iMasons scholarship, students learned from new data center professionals in industry this year. One of the course’s new speakers was Frank Hutchison of SH Data Technologies, who talked about his role in building Tennessee’s first tier 3 data center. Tier 3 data centers are designed for 99.982 percent availability, which works out to less than two hours of downtime per year.
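
    Those availability percentages translate directly into an annual downtime budget, which is worth seeing as plain arithmetic. A minimal sketch in C (illustrative only; the percentages are the Uptime Institute’s published tier thresholds):

    #include <stdio.h>

    /* Annual downtime implied by an availability percentage.
       The thresholds are the Uptime Institute's published tier figures. */
    int main(void) {
        const double hours_per_year = 365.0 * 24.0; /* 8,760 hours */
        const double availability[] = { 99.671, 99.741, 99.982, 99.995 }; /* Tiers 1-4 */

        for (int tier = 0; tier < 4; tier++) {
            double downtime = hours_per_year * (1.0 - availability[tier] / 100.0);
            printf("Tier %d: %.3f%% available -> %.1f hours of downtime per year\n",
                   tier + 1, availability[tier], downtime);
        }
        return 0; /* Tier 3 works out to roughly 1.6 hours per year */
    }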

    “This was the most engaging class we’ve had by far,” McNally said. “These students really got to see how these different disciplines work together to run, build, and operate data centers, and we are excited to continue bringing these folks in and helping to bridge this talent gap in the workforce.”

    The team is excited that the course continues to gain traction with students at UT and is making plans to accommodate more students next fall. The course is also under consideration for expansion into a professional certification program or a distance-learning course.

    In addition to McNally and Goodpasture, the ORNL team contributing to the course includes Jim Serafin, Jim Rogers, Kathlyn Boudwin, Justin Whitt, Darren Norris, David Grant, Rick Griffin, Saeed Ghezawi, Brett Ellis, Bart Hammontree, Scott Milliken, Gary Rogers, and Kris Torgerson.

    See the full article here.


    Please help promote STEM in your local schools.

    Stem Education Coalition

    ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.

    The Oak Ridge Leadership Computing Facility (OLCF) was established at Oak Ridge National Laboratory in 2004 with the mission of accelerating scientific discovery and engineering progress by providing outstanding computing and data management resources to high-priority research and development projects.

    ORNL’s supercomputing program has grown from humble beginnings to deliver some of the most powerful systems in the world. On the way, it has helped researchers deliver practical breakthroughs and new scientific knowledge in climate, materials, nuclear science, and a wide range of other disciplines.

    The OLCF delivered on that original promise in 2008, when its Cray XT “Jaguar” system ran the first scientific applications to exceed 1,000 trillion calculations a second (1 petaflop). Since then, the OLCF has continued to expand the limits of computing power, unveiling Titan in 2013, which is capable of 27 petaflops.


    ORNL Cray XK7 Titan Supercomputer

    Titan is one of the first hybrid-architecture systems—a combination of graphics processing units (GPUs) and the more conventional central processing units (CPUs) that have served as number crunchers in computers for decades. The parallel structure of GPUs makes them uniquely suited to processing an enormous number of simple computations quickly, while CPUs are capable of tackling more sophisticated computational algorithms. The complementary combination of CPUs and GPUs allows Titan to reach its peak performance.

    ORNL IBM AC922 SUMMIT supercomputer. Credit: Carlos Jones, Oak Ridge National Laboratory/U.S. Dept. of Energy

    With a peak performance of 200,000 trillion calculations per second (200 petaflops), Summit will be roughly eight times more powerful than ORNL’s previous top-ranked system, Titan. For certain scientific applications, Summit will also be capable of more than three billion billion mixed-precision calculations per second, or 3.3 exaops. Summit will provide unprecedented computing power for research in energy, advanced materials, and artificial intelligence (AI), among other domains, enabling scientific discoveries that were previously impractical or impossible.
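
    Restating those headline figures as arithmetic makes the comparison concrete. A small sketch using only the numbers quoted above (note that 200/27 is closer to 7.4x, which the article rounds to eight):

    #include <stdio.h>

    /* Peak-performance figures quoted above, restated as arithmetic. */
    int main(void) {
        const double titan_petaflops  = 27.0;   /* Titan peak */
        const double summit_petaflops = 200.0;  /* Summit peak */
        const double summit_exaops    = 3.3;    /* Summit mixed-precision peak */

        printf("Summit/Titan peak ratio: %.1fx\n",
               summit_petaflops / titan_petaflops);
        printf("200 petaflops = %.0f trillion calculations per second\n",
               summit_petaflops * 1000.0);
        printf("3.3 exaops = %.1f billion billion calculations per second\n",
               summit_exaops);
        return 0;
    }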

    The OLCF gives the world’s most advanced computational researchers an opportunity to tackle problems that would be unthinkable on other systems. The facility welcomes investigators from universities, government agencies, and industry who are prepared to perform breakthrough research in climate, materials, alternative energy sources and energy storage, chemistry, nuclear physics, astrophysics, quantum mechanics, and the gamut of scientific inquiry. Because it is a unique resource, the OLCF focuses on the most ambitious research projects—projects that provide important new knowledge or enable important new technologies.

     
  • richardmitnick 1:12 pm on January 19, 2018
    Tags: Summit supercomputer

    From OLCF: “Optimizing Miniapps for Better Portability” 

    Oak Ridge National Laboratory

    OLCF

    January 17, 2018
    Rachel Harken

    When scientists run their scientific applications on massive supercomputers, the last thing they want to worry about is optimizing their codes for new architectures. Computer scientist Sunita Chandrasekaran at the University of Delaware is taking steps to make sure they don’t have a reason to worry.

    Chandrasekaran collaborates with a team at the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) to optimize miniapps, smaller pieces of large applications that can be extracted and fine-tuned to run on GPU architectures. Chandrasekaran and her PhD student, Robert Searles, have taken on the task of porting (adapting) one such miniapp, Minisweep, to OpenACC—a directive-based programming model that allows users to run a code on multiple computing platforms without having to change or rewrite it.
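
    For readers new to OpenACC, the key idea is that parallelism is expressed as compiler directives on ordinary loops rather than as a rewrite in a GPU-specific language. A minimal, self-contained sketch (a toy vector update, not Minisweep itself):

    #include <stdio.h>

    #define N 1000000

    /* Toy illustration of the OpenACC directive style: the pragma asks an
       OpenACC-aware compiler to offload and parallelize the loop; compiled
       without OpenACC support, the pragma is ignored and the loop runs
       serially. This is not Minisweep, just the programming model. */
    int main(void) {
        static float x[N], y[N];
        for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
        for (int i = 0; i < N; i++)
            y[i] = 2.0f * x[i] + y[i];

        printf("y[0] = %f\n", y[0]); /* expect 4.0 */
        return 0;
    }

    The same source can then target NVIDIA GPUs or multicore CPUs depending on compiler flags, which is exactly the portability the team is after.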

    Minisweep performs a “sweep” computation across a grid (pictured)—representative of a 3D volume in space—to calculate the positions, energies, and flows of neutrons in a nuclear reactor. The yellow cube marks the beginning location of the sweep. The green cubes are dependent upon information from the yellow cube, the blue cubes are dependent upon information from the green cubes, and so forth. In practice, sweeps are performed from each of the eight corners of the cube simultaneously.

    Minisweep is particularly important because it represents approximately 80–99 percent of the computation time of Denovo, a 3D code for radiation transport in nuclear reactors being used in a current DOE Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, project. Minisweep is also being used in benchmarking for the Oak Ridge Leadership Computing Facility’s (OLCF’s) new Summit supercomputer.

    ORNL IBM Summit supercomputer depiction

    Summit is scheduled to be in full production in 2019 and will be the next leadership-class system at the OLCF, a DOE Office of Science User Facility located at ORNL.

    Created from Denovo by OLCF computational scientist Wayne Joubert, Minisweep works by “sweeping” diagonally across grid cells that represent points in space, allowing it to track the positions, flows, and energies of neutrons in a nuclear reactor. Cubes in the grid cell represent a number of these qualities and depend on information from previous cubes in the grid.
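
    The dependency pattern described here is a classic wavefront: every cell on a given diagonal depends only on already-computed upwind neighbors, so the cells of each diagonal are mutually independent and can be processed in parallel. A much-simplified 2D sketch of that idea (hypothetical code; the real Minisweep kernel is 3D, tracks energies and angles, and sweeps from all eight corners):

    #include <stdio.h>

    #define NX 4
    #define NY 4

    /* Simplified 2D wavefront sweep: cell (i, j) depends on its left and
       lower neighbors, so all cells with the same i + j form one wavefront. */
    int main(void) {
        double cell[NX][NY] = { { 0.0 } };
        cell[0][0] = 1.0; /* the sweep's starting corner */

        for (int wave = 1; wave < NX + NY - 1; wave++) {
            /* Every (i, j) with i + j == wave is independent of the others. */
            for (int i = 0; i < NX; i++) {
                int j = wave - i;
                if (j < 0 || j >= NY) continue;
                double left  = (i > 0) ? cell[i - 1][j] : 0.0;
                double below = (j > 0) ? cell[i][j - 1] : 0.0;
                cell[i][j] = 0.5 * (left + below); /* placeholder physics */
            }
        }
        printf("cell[%d][%d] = %f\n", NX - 1, NY - 1, cell[NX - 1][NY - 1]);
        return 0;
    }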

    “Scientists need to know how neutrons are flowing in a reactor because it can help them figure out how to build the radiation shield around it,” Chandrasekaran said. “Using Denovo, physicists can simulate this flow of neutrons, and with a faster code, they can compute many different configurations quickly and get their work done faster.”

    Minisweep has already been ported to multicore platforms using the OpenMP programming interface and to GPU accelerators using the lower-level programming language CUDA. ORNL computer scientists and ORNL Miniapps Port Collaboration organizers Tiffany Mintz and Oscar Hernandez knew that porting these kinds of codes to OpenACC would equip them for use on different high-performance computing architectures.

    Chandrasekaran and Searles have been using the Summit early access system, Summitdev, and the Cray XK7 Titan supercomputer at the OLCF to test Minisweep since mid-2017.

    ORNL Cray XK7 Titan Supercomputer

    Visualization of a nuclear reactor simulation on Titan.

    Now, they’ve successfully enabled Minisweep to run on parallel architectures using OpenACC for fast execution on the targeted computer. An option to port to these types of systems without compromising performance didn’t previously exist.

    Whereas the code typically sweeps in eight directions from the diagonal corners of a cube inward, the team saw that with only one sweep, the OpenACC version performed on par with CUDA.

    “We saw OpenACC performing as well as CUDA on an NVIDIA Volta GPU, which is a state-of-the-art GPU card,” Searles said. “That’s huge for us to take away, because we are normally lucky to get performance that’s even 85 percent of CUDA. That one sweep consistently showed us about 0.3 or 0.4 seconds faster, which is significant at the problem size we used for measuring performance.”
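
    Timing differences of a few tenths of a second at a fixed problem size are typically captured with a simple wall-clock harness around the kernel. A generic sketch of that kind of measurement (hypothetical; not the team’s actual benchmark code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Stand-in for a compute kernel under test. */
    static void kernel(double *a, long n) {
        for (long i = 0; i < n; i++)
            a[i] = a[i] * 1.000001 + 0.5;
    }

    int main(void) {
        const long n = 50000000; /* fixed problem size */
        double *a = calloc(n, sizeof *a);
        if (a == NULL) return 1;

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        kernel(a, n);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double seconds = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("kernel time: %.3f s\n", seconds);

        free(a);
        return 0;
    }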

    Chandrasekaran and the team at ORNL will continue optimizing Minisweep to get the application up and “sweeping” from all eight corners of a grid cell. Other radiation transport applications and one for DNA sequencing may be able to take advantage of Minisweep for multiple GPU architectures such as Summit—and even exascale systems—in the future.

    “I’m constantly trying to look at how I can package these kinds of tools from a user’s perspective,” Chandrasekaran said. “I take applications that are essential for these scientists’ research and try to find out how to make them more accessible. I always say: write once, reuse multiple times.”

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition


     
  • richardmitnick 11:23 am on October 9, 2017
    Tags: Summit supercomputer

    From Science Node: “US Coalesces Plans for First Exascale Supercomputer: Aurora in 2021” 

    Science Node

    September 27, 2017
    Tiffany Trader

    ANL ALCF Cray Aurora supercomputer

    At the Advanced Scientific Computing Advisory Committee (ASCAC) meeting in Arlington, Va., yesterday (Sept. 26), it was revealed that the “Aurora” supercomputer is on track to be the United States’ first exascale system. Aurora, originally the third pillar of the CORAL “pre-exascale” project, will still be built by Intel and Cray for Argonne National Laboratory, but the delivery date has shifted from 2018 to 2021 and the target capability has been expanded from 180 petaflops to 1,000 petaflops (1 exaflop).

    The fate of the Argonne Aurora “CORAL” supercomputer has been in limbo since the system failed to make it into the U.S. DOE budget request, while the same budget proposal called for an exascale machine “of novel architecture” to be deployed at Argonne in 2021.

    Until now, the only official word from the U.S. Exascale Computing Project was that Aurora was being “reviewed for changes and would go forward under a different timeline.”

    Officially, the contract has been “extended,” and not cancelled, but the fact remains that the goal of the Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (CORAL) initiative to stand up two distinct pre-exascale architectures was not met.

    According to sources we spoke with, a number of people at the DOE are not pleased with the Intel/Cray partnership (Intel is the prime contractor, Cray the subcontractor). It’s understood that the two companies could not deliver the 180–200 petaflops system by next year, as the original contract called for. Now Intel/Cray will push forward with an exascale system some 50 times larger than any system they have yet stood up.

    It’s our understanding that the cancellation of Aurora is not a DOE budgetary measure, as has been speculated, and that the DOE and Argonne wanted Aurora. Although it was referred to as an “interim” or “pre-exascale” machine, the scientific and research community was counting on that system, was eager to begin using it, and regarded it as a valuable system in its own right. The non-delivery is regarded as disruptive to the scientific and research communities.

    Another question: since Intel/Cray failed to deliver Aurora and have moved on to a larger exascale system contract, why hasn’t their original CORAL contract been cancelled and put out again to bid?

    With increased global competitiveness, it seems that the DOE stakeholders did not want to further delay the non-IBM/Nvidia side of the exascale track. Conceivably, they could have done a rebid for the Aurora system, but that would leave them with an even bigger gap if they had to spin up a new vendor/system supplier to replace Intel and Cray.

    Starting the bidding process over again would delay progress toward exascale – and it might even have been the death knell for exascale by 2021, but Intel and Cray now have a giant performance leap to make and three years to do it. There is an open question on the processor front as the retooled Aurora will not be powered by Phi/Knights Hill as originally proposed.

    These events invite an obvious comparison with the IBM-led effort, and IBM/Nvidia/Mellanox are looking very good by it. The other CORAL thrusts — Summit at Oak Ridge and Sierra at Lawrence Livermore — are on track, with Summit several weeks ahead of Sierra, although it now looks like neither will make the cut-off for entry onto the November Top500 list, as many had speculated.

    ORNL IBM Summit supercomputer depiction

    LLNL IBM Sierra supercomputer

    We reached out to representatives from Cray, Intel and the Exascale Computing Project (ECP) seeking official comment on the revised Aurora contract. Cray and Intel declined to comment and we did not hear back from ECP by press time. We will update the story as we learn more.

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition

    Science Node is an international weekly online publication that covers distributed computing and the research it enables.

    “We report on all aspects of distributed computing technology, such as grids and clouds. We also regularly feature articles on distributed computing-enabled research in a large variety of disciplines, including physics, biology, sociology, earth sciences, archaeology, medicine, disaster management, crime, and art. (Note that we do not cover stories that are purely about commercial technology.)

    In its current incarnation, Science Node is also an online destination where you can host a profile and blog, and find and disseminate announcements and information about events, deadlines, and jobs. In the near future it will also be a place where you can network with colleagues.

    You can read Science Node via our homepage, RSS, or email. For the complete iSGTW experience, sign up for an account or log in with OpenID and manage your email subscription from your account preferences. If you do not wish to access the website’s features, you can just subscribe to the weekly email.”

     
  • richardmitnick 11:58 am on October 12, 2016
    Tags: "Oak Ridge Scientists Are Writing Code That Not Even The World's Fastest Computers Can Run (Yet), Department of Energy’s Exascale Computing Project, , , Summit supercomputer,   

    From ORNL via Nashville Public Radio: “Oak Ridge Scientists Are Writing Code That Not Even The World’s Fastest Computers Can Run (Yet)” 

    Oak Ridge National Laboratory

    Nashville Public Radio

    Oct 10, 2016
    Emily Siner

    The current supercomputer at Oak Ridge National Lab, Titan, will be replaced by what could be the fastest computer in the world, Summit — and even that won’t be fast enough for some of the programs being written at the lab. Image credit: Oak Ridge National Laboratory, U.S. Dept. of Energy

    ORNL IBM Summit supercomputer depiction

    Scientists at Oak Ridge National Laboratory are starting to build applications for a supercomputer that might not go live for another seven years.

    The lab recently received more than $5 million from the Department of Energy to start developing several long-term projects.

    Thomas Evans’s research is among those funded, and it’s a daunting task: his team is trying to predict how small sections of particles inside a nuclear reactor will behave over a long period of time.

    The more precisely they can simulate nuclear reactors on a computer, the better engineers can build them in real life.

    “Analysts can use that [data] to design facilities, experiments and working engineering platforms,” Evans says.

    But these very elaborate simulations that Evans is creating take so much computing power that they cannot run on Oak Ridge’s current supercomputer, Titan — nor will they be able to run on the lab’s new supercomputer, Summit, which could be the fastest in the world when it goes live in two years.

    So Evans is thinking ahead, he says, “to ultimately harness the power of the next generation — technically two generations from now — of supercomputing.

    “And of course, the challenge is, that machine doesn’t exist yet.”

    The current estimate is that this exascale computer, as it’s called, will be several times faster than Summit and go live around 2023. And it could very well take that long for Evans’s team to write code for it.

    The machine won’t just be faster, Evans says. It’s also going to work in a totally new way, which changes how applications are written.

    “In other words, I can’t take a simulation code that we’ve been using now and just drop it in the new machine and expect it to work,” he says.

    The computer will not necessarily be housed at Oak Ridge, but Tennessee researchers are playing a major role in the Department of Energy’s Exascale Computing Project. In addition to Evans’s nuclear reactor project, scientists at Oak Ridge will be leading the development of two other applications, including one that will simulate complex 3D printing. They’ll also assist in developing nine other projects.

    Doug Kothe, who leads the lab’s exascale application development, says the goal is not just to think ahead to 2023. The code that the researchers write should be able to run on any supercomputer built in the next several decades, he says.

    Despite the difficulty, working on incredibly fast computers is also an exciting prospect, Kothe says.

    “For a lot of very inquisitive scientists who love challenges, it’s just a way cool toy that you can’t resist.”

    See the full article here.

    Please help promote STEM in your local schools.

    Stem Education Coalition


     