Introduction to the
Supercomputer “Columbia”
By
Desira Stover, Zhizhou Wang
- Built by NASA, Intel, SGI, and Voltaire; completed on Oct. 26, 2004
- Named to honor the crew of the Space Shuttle Columbia, lost in Feb. 2003
Columbia System Facts (1)
• Based on SGI® NUMAflex™ architecture
20 SGI® Altix™ 3700 superclusters, each with 512 processors
Global shared memory across 512 processors
• 10,240 Intel Itanium® 2 processors
Current processor speed: 1.5 gigahertz
Current cache: 6 megabytes
• 1 terabyte of memory per 512 processors, with 20
terabytes total memory
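The totals on this slide follow from the per-supercluster figures. A minimal sketch of the arithmetic, using only the numbers quoted above (the variable names are illustrative, and treating 1 TB as 1024 GB is an assumption):

```python
# Figures taken from the slide above.
SUPERCLUSTERS = 20          # SGI Altix 3700 superclusters
PROCS_PER_CLUSTER = 512     # processors per supercluster
MEM_PER_CLUSTER_TB = 1      # terabytes of memory per 512 processors

total_procs = SUPERCLUSTERS * PROCS_PER_CLUSTER
total_mem_tb = SUPERCLUSTERS * MEM_PER_CLUSTER_TB

# Assumption: 1 TB is counted as 1024 GB for the per-processor figure.
mem_per_proc_gb = MEM_PER_CLUSTER_TB * 1024 / PROCS_PER_CLUSTER

print(total_procs)       # 10240
print(total_mem_tb)      # 20
print(mem_per_proc_gb)   # 2.0
```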
Columbia System Facts (2)
• Operating Environment
Linux® based operating system
PBS Pro™ job scheduler
Intel® Fortran/C/C++ compiler
SGI® ProPack™ 3.2 software
• Interconnect
SGI® NUMAlink™
InfiniBand network
10 gigabit Ethernet
1 gigabit Ethernet
• Storage
Online: 440 terabytes of Fibre Channel RAID storage
Archive storage capacity: 10 petabytes
SGI Altix 3700 Hardware
• An adaptation of SGI's Origin 3000 systems, which use
SGI's NUMAflex global shared-memory architecture;
- The NUMAflex design enables the CPU, memory, I/O, interconnect,
graphics, and storage to be packaged into modular components, or
"bricks".
• Each Altix 3700 at NASA consists of 128 CPU bricks (C-bricks), 112
router bricks, and 4 I/O bricks.
• Each C-brick on the Altix contains 2 nodes with 4 processors in total,
2 SHUBs, ~7.6 GB of memory, one network interface, and one I/O
interface.
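The brick counts on this slide can be checked against the per-supercluster totals quoted earlier (512 processors and about 1 terabyte of memory). A rough sketch, assuming the 4 processors of a C-brick are split evenly across its 2 nodes; the variable names are illustrative only:

```python
# Figures taken from the slide above.
C_BRICKS = 128            # CPU bricks per Altix 3700
NODES_PER_BRICK = 2
PROCS_PER_BRICK = 4
MEM_PER_BRICK_GB = 7.6    # approximate, as quoted (~7.6 GB)

# Assumption: processors are divided evenly between the two nodes.
procs_per_node = PROCS_PER_BRICK // NODES_PER_BRICK

procs_per_system = C_BRICKS * PROCS_PER_BRICK
mem_per_system_gb = C_BRICKS * MEM_PER_BRICK_GB

print(procs_per_node)                 # 2
print(procs_per_system)               # 512
print(round(mem_per_system_gb, 1))    # 972.8, i.e. roughly 1 terabyte
```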
SGI Altix 3700 Software
• Operating System: SGI Linux Environment 7.2 with SGI
ProPack (Red Hat 7.2 with Linux 2.4.21)
• Compilers: Intel IPF (Itanium Processor Family) compilers: C/C++,
Fortran for Linux; GNU compilers: C, Fortran 77
• Filesystem software: XFS 64-bit journaled filesystem; CXFS
shared filesystem
• Other software: debuggers, libraries, performance analysis tools,
Linux system utilities …
Intel Itanium 2 Processor (1)
• The Itanium chip is based on the IA-64 (Intel Architecture, 64-bit)
architecture, which implements EPIC (Explicitly Parallel Instruction
Computing) technology.
• The Itanium processors use long instruction words. Specifically, three
instructions are grouped into a 128-bit bundle. Each instruction is 41
bits wide; the remaining 5 bits form a template field.
• Four memory-load operations per cycle can be delivered from the L2
cache to the floating-point register file.
• Branch predication: instructions from both sides of a branch are
guarded by predicates and executed, and only the results from the path
whose predicate is true are kept. Conventional processors instead try
to guess which branch to take (branch prediction).
• Speculative loads: the processor looks ahead in the instruction stream
and loads the required data from memory early.
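The bundle format described above (three 41-bit instructions in 128 bits) leaves 5 bits over; in IA-64 these hold a template field. The following is a minimal sketch of packing and unpacking such a bundle as a Python integer. The function names are hypothetical, and the bit positions (template in the lowest 5 bits, followed by slots 0 to 2) are stated here as an assumption about the layout rather than taken from the slides.

```python
SLOT_BITS = 41        # width of one instruction slot
TEMPLATE_BITS = 5     # width of the template field
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slot0, slot1, slot2):
    """Pack a 5-bit template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for index, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each instruction slot must fit in 41 bits")
        # Assumed layout: slot i starts at bit 5 + 41 * i.
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, (slot0, slot1, slot2))."""
    template = bundle & TEMPLATE_MASK
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    )
    return template, slots


bundle = pack_bundle(0x10, 1, 2, 3)
print(TEMPLATE_BITS + 3 * SLOT_BITS)   # 128
print(unpack_bundle(bundle))           # (16, (1, 2, 3))
```

The slot values used here are arbitrary placeholders, not real instruction encodings.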
Intel Itanium 2 Processor (2)
Specifications:
• 32KB of L1 cache, split between an L1 instruction cache and an L1 data cache
• 256KB L2 unified (instruction and data) cache
• 6MB L3 unified (instruction and data) cache
• CPU clock: 1.5 GHz
• Operating Systems: Windows* Server 2003 (64-bit), HP-UX* 11i,
Red Hat Linux*, SuSE Linux*, MSC.Linux*, United Linux*,
OpenVMS.
Voltaire ISR9288 InfiniBand Cluster Switch
Key Features:
• Up to 288 InfiniBand 4X (10 Gbps) ports in a 14U enclosure;
alternatively, 96 InfiniBand 12X (30 Gbps) ports
• 5.76 Tbps of full bisectional switch bandwidth in a Fat-Tree (CLOS)
architecture
• Less than 420 nanoseconds of latency between any two ports
• Optional multiprotocol connectivity with up to 132 GbE ports and up
to 132 2 Gb FC ports
• Hot-swappable components, redundant management blades, power
supplies and fans meet stringent availability requirements
• InfiniBand™ specification 1.1 compliant
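The quoted 5.76 Tbps can be related to the port counts above. A rough sketch of the arithmetic, assuming the figure counts every port at its full rate in both directions (full duplex); that reading is an assumption that happens to match the numbers, not something the slide states:

```python
# Figures taken from the slide above.
PORTS_4X = 288
GBPS_PER_4X_PORT = 10
PORTS_12X = 96
GBPS_PER_12X_PORT = 30

# Assumption: the headline bandwidth counts both directions of each link.
DIRECTIONS = 2

one_way_4x_gbps = PORTS_4X * GBPS_PER_4X_PORT
one_way_12x_gbps = PORTS_12X * GBPS_PER_12X_PORT
total_tbps = one_way_4x_gbps * DIRECTIONS / 1000

print(one_way_4x_gbps)    # 2880
print(one_way_12x_gbps)   # 2880
print(total_tbps)         # 5.76
```

Both port configurations give the same aggregate rate, which is consistent with the 12X ports being offered as an alternative to the 4X ports.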
Benchmarks and Rank
• Sustained performance of 51.87 teraflops (trillion floating-point
operations per second)
• Peak performance of 60.96 teraflops
• Ranked 2nd on the TOP500 list
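The two performance figures above can be compared directly, and the peak figure can be related to the processor clock. A minimal sketch, assuming each Itanium 2 processor completes 4 floating-point operations per cycle (two fused multiply-add units); that per-cycle figure and the variable names are assumptions not taken from the slides:

```python
# Figures taken from the slides.
SUSTAINED_TFLOPS = 51.87
PEAK_TFLOPS = 60.96
CLOCK_GHZ = 1.5
TOTAL_PROCS = 10240

# Assumption: 4 floating-point operations per cycle per processor.
FLOPS_PER_CYCLE = 4

efficiency = SUSTAINED_TFLOPS / PEAK_TFLOPS
peak_per_proc_gflops = CLOCK_GHZ * FLOPS_PER_CYCLE
full_system_peak_tflops = TOTAL_PROCS * peak_per_proc_gflops / 1000

# Processor count implied by the quoted peak under the assumption above.
implied_procs = round(PEAK_TFLOPS * 1000 / peak_per_proc_gflops)

print(round(efficiency * 100, 1))   # 85.1 (percent of peak sustained)
print(peak_per_proc_gflops)         # 6.0
print(full_system_peak_tflops)      # 61.44
print(implied_procs)                # 10160
```

Under this assumption the quoted peak corresponds to slightly fewer than the full 10,240 processors. This is arithmetic only; the slides do not say how many processors took part in the benchmark run.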
NASA’s Columbia Supercomputer
Used in:
• Shuttle ascent simulation and modeling
• Fuel liner analysis conducted for the NASA Engineering
Safety Center and Return to Flight program
• Modeling the surface speed of the ocean on a unique “cubesphere” grid
for accurate depiction of the poles
• Space and life science, mission safety, aeronautics,
and Earth sciences…
References:
• http://www.nas.nasa.gov
• http://www.sgi.com
• http://www.intel.com
• http://www.voltaire.com
• http://www.top500.org