National Climate-Computing Research Center



The National Climate-Computing Research Center (NCRC) was formed as a collaboration between Oak Ridge National Laboratory (ORNL) and the National Oceanic and Atmospheric Administration (NOAA) to explore a variety of research topics in the climate sciences. It is one of three petascale computing facilities that are part of the Oak Ridge Leadership Computing Facility (OLCF).

The center’s principal resource is Gaea, a Cray supercomputer installed at ORNL and named for the Mother Earth figure of Greek mythology, which runs climate simulations for NOAA and its research partners. Under a collaborative Work for Others Agreement, the machine is owned by DOE and operated by ORNL’s managing contractor, UT-Battelle, on behalf of the NOAA customer.

Contract

On November 5, 2010, the U.S. Department of Energy (DOE) awarded Cray Inc. a $47 million subcontract to provide high-performance computing (HPC) and scientific services for climate modeling collaborations between Oak Ridge National Laboratory (ORNL) and the National Oceanic and Atmospheric Administration (NOAA). At ORNL, Cray delivers the Climate Modeling and Research System (CMRS), which includes a supercomputer named “Gaea” for simulating climate.[1]

History

A supercomputer installed to crunch numbers for the National Oceanic and Atmospheric Administration (NOAA) and its research partners has begun climate simulations at Oak Ridge National Laboratory (ORNL).

“The name of the machine is Gaea, or Mother Earth, from Greek mythology,” says Jim Rogers, Director of Operations at ORNL’s National Center for Computational Sciences (NCCS), which houses the new Cray XT6 machine. Rogers also directs the National Climate-Computing Research Center (NCRC) project at ORNL that includes Gaea. As part of a collaborative Work for Others Agreement, the supercomputer is owned by DOE and operated by ORNL’s managing contractor, UT-Battelle, on behalf of the NOAA customer.

Gaea occupies the same half-acre computer room as Jaguar, the Cray XT5 system run by ORNL and funded by the Department of Energy’s (DOE’s) Office of Science, and Kraken, the Cray XT5 system run by the University of Tennessee and ORNL and funded by the National Science Foundation.

Cray will deliver the Gaea HPC system through a series of upgrades that will culminate in a petascale system by the end of 2011.

In June 2010, installation concluded for a 260-teraflop (trillion calculations per second) Cray XT6 system with 2,576 AMD “Magny-Cours” 12-core, 2.1 GHz processors. After passing a series of acceptance tests, Gaea was released to early users. In September, nearly a dozen users began ramping up their data production.

In June 2011, a 720-teraflop Cray XE6 system will be added to Gaea. It will employ the AMD Interlagos 16-core processor. After the installation of that second system, the original 260-teraflop system will be upgraded with the same AMD Interlagos processor, increasing its peak performance to 386 teraflops.

The aggregate Gaea system will have a total memory size of 248 terabytes and a peak calculating capability of 1.1 petaflops (quadrillion floating point operations per second), bringing the number of petascale systems at ORNL, the world’s most powerful computing complex, to three.
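These peak figures can be sanity-checked with a short back-of-the-envelope calculation. The sketch below (Python, illustrative only) assumes 4 double-precision floating-point operations per core per clock cycle, the figure commonly quoted for the Magny-Cours generation; that per-cycle rate is an assumption rather than something stated above.

    # Rough check of the quoted peak-performance figures.
    # Assumption: 4 double-precision flops per core per clock cycle
    # (commonly quoted for AMD "Magny-Cours" processors; not stated in the article).
    FLOPS_PER_CORE_PER_CYCLE = 4

    processors = 2576            # XT6 partition, per the article
    cores_per_processor = 12     # 12-core Magny-Cours
    clock_hz = 2.1e9             # 2.1 GHz

    cores = processors * cores_per_processor
    xt6_peak_tf = cores * clock_hz * FLOPS_PER_CORE_PER_CYCLE / 1e12
    print(f"XT6 peak: {xt6_peak_tf:.0f} teraflops")        # ~260, matching the article

    # Aggregate peak after the 2011 upgrades, using the article's own numbers.
    aggregate_pf = (720 + 386) / 1000
    print(f"Aggregate peak: {aggregate_pf:.1f} petaflops")  # ~1.1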

This next-generation HPC system is liquid-cooled using Cray’s ECOphlex™ technology, which employs a refrigerant to remove most of the 2.2 MW heat load. The technology is significantly more energy-efficient than the air-cooling systems typically found in other leading-edge HPC systems.

Other elements of the Climate Modeling and Research System (CMRS) anchored by Gaea include two separate Lustre parallel file systems that handle data sets that are among the world’s largest. The first, the LTFS, is a high-capacity file system based on the DataDirect Networks (DDN) SFA10000 that can stage up to 3.6 petabytes of information between ORNL and NOAA facilities. In addition, there is a high-speed file system, also using the DDN SFA10000, with more than a petabyte of storage that provides fast scratch space for the compute partitions.

NOAA research partners access the system remotely through high-speed wide-area connections. Two 10-gigabit (billion bit) lambdas, or optical waves, pass data to NOAA’s national research network through peering points at Atlanta and Chicago.
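To put the wide-area links in perspective, the rough sketch below (Python) estimates how long a petabyte-scale transfer would take over the two lambdas; it assumes both links can be driven at full rate in aggregate and ignores protocol overhead, so the result is illustrative only.

    # Rough estimate of transfer time over two 10 Gb/s wide-area links.
    # Assumptions: full aggregate line rate, no protocol overhead.
    link_rate_bps = 10e9                                   # one 10-gigabit lambda
    n_links = 2
    aggregate_bytes_per_s = n_links * link_rate_bps / 8    # ~2.5 GB/s

    dataset_bytes = 1e15                                   # a 1-petabyte data set (decimal)
    seconds = dataset_bytes / aggregate_bytes_per_s
    print(f"~{seconds / 86400:.1f} days to move 1 PB")     # roughly 4.6 days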

Mission

The NCRC project supports end users at multiple sites across the U.S. and on multiple computational resources. It does so primarily through Gaea, which provides the computing capacity for NOAA climate simulations and supplies the resulting model output and other data to the project’s research partners.


Gaea

Gaea, a Cray XE6 supercomputer, is the primary computational resource of the NCRC project; end-user support on the system is handled by the individual NOAA sites that use it. The machine is being delivered through a series of upgrades that will culminate in a petascale system by the end of 2011. The aggregate system will have a total memory of 248 terabytes, a peak calculating capability of 1.1 petaflops (quadrillion floating-point operations per second), and access to the two CMRS Lustre file systems: the high-capacity file system, based on the DataDirect Networks SFA10000, which can stage up to 3.6 petabytes of information, and the high-speed file system, with more than a petabyte of storage, which provides fast scratch space.

External links

Official website
OLCF Celebrates 25 Years of HPC Leadership

References