PHENIX Computing Center in Japan (CC-J)

Takashi Ichihara (RIKEN and RIKEN BNL Research Center)

Presented on 08/02/2000 at the CHEP2000 conference, Padova, Italy

Contents

1. Overview
2. Concept of the system
3. System requirement
4. Other requirements as a Regional Computing Center
5. Plan and current status
6. WG for constructing the CC-J (CC-J WG)
7. Current configuration of the CC-J
8. Photographs of the CC-J
9. Linux CPU farm
10. Linux NFS performance vs. kernel
11. HPSS current configuration
12. HPSS performance test
13. WAN performance test
14. Summary






PHENIX CC-J: Overview

▪ PHENIX Regional Computing Center in Japan (CC-J) at RIKEN

▪ Scope
  • Principal site of computing for PHENIX simulation
    – The CC-J aims to cover most of the simulation tasks of the whole PHENIX experiment
  • Regional Asian computing center
  • Center for the analysis of RHIC spin physics

▪ Architecture
  • Essentially follows the architecture of the RHIC Computing Facility (RCF) at BNL

▪ Construction
  • R&D for the CC-J started in April '98 at RBRC
  • Construction began in April '99 and extends over a three-year period
  • 1/3 scale of the CC-J will be operational in April 2000


Concept of the CC-J System

[Figure: concept of the CC-J system. The PHENIX CC-J at RIKEN (HPSS with an STK tape robot, 15 TB big disk, SMP servers, PC farms of 10k SPECint95 for analysis and simulation) and the RCF PHENIX facility at BNL (HPSS with an STK tape robot, 40 TB big disk, SMP servers, CAS, CRS for track reconstruction, raw data at 20 MB/s) exchange DSTs and simulated data in two ways: through data duplicating facilities at both sites (tape drive units duplicating 50 GB/volume tapes) and over the APAN/ESnet WAN.]




System Requirement for the CC-J

▪ Annual data amount

    DST               150 TB
    micro-DST          45 TB
    Simulated data     30 TB
    Total             225 TB

▪ CPU (SPECint95)

    Simulation         8200
    Sim. reconst.      1300
    Sim. ana.           170
    Theor. model        800
    Data analysis      1000
    Total             11470

▪ Hierarchical storage system
  • Handles a data amount of 225 TB/year
  • Total I/O bandwidth: 112 MB/s
  • HPSS system

▪ Data duplication facility
  • Export/import of DST and simulated data

▪ Disk storage system
  • 15 TB capacity
  • All RAID system
  • I/O bandwidth: 520 MB/s
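
The totals on this slide can be rechecked with a few lines of Python. This is only a sketch: the figures are copied from the slide, while the dictionary names and the derived average rate (decimal units, 1 TB = 10^6 MB, data moved once per year) are assumptions added for illustration and are not part of the original requirement.

```python
# Figures copied from the requirement slide; the names are illustrative only.
annual_data_tb = {"DST": 150, "micro-DST": 45, "Simulated data": 30}
cpu_specint95 = {
    "Simulation": 8200,
    "Sim. reconst.": 1300,
    "Sim. ana.": 170,
    "Theor. model": 800,
    "Data analysis": 1000,
}

total_data_tb = sum(annual_data_tb.values())
total_cpu = sum(cpu_specint95.values())

# Assumption: decimal units and a uniform flow over one 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 3600
avg_mb_per_s = total_data_tb * 1_000_000 / SECONDS_PER_YEAR

print(f"Total data: {total_data_tb} TB/year")
print(f"Total CPU : {total_cpu} SPECint95")
print(f"Average rate to move the yearly volume once: {avg_mb_per_s:.1f} MB/s")
```

The average rate is far below the quoted 112 MB/s tape I/O bandwidth, which is a peak figure rather than a yearly average.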




Other Requirements as a Regional Computing Center

▪ Software environment
  • The software environment of the CC-J should be compatible with the PHENIX offline software environment at the RHIC Computing Facility (RCF) at BNL
  • AFS accessibility (/afs/rhic)
  • Objectivity/DB accessibility (replication to be tested soon)

▪ Data accessibility
  • Need to exchange 225 TB/year of data with the RCF
  • Most of the data exchange will be done with SD3 tape cartridges (50 GB/volume)
  • Part of the data exchange will be done over the WAN
  • The CC-J will use the Asia-Pacific Advanced Network (APAN) for the US-Japan connection
    – http://www.apan.net/
  • APAN currently has 70 Mbps of bandwidth for the Japan-US connection
  • 10-30% of the APAN bandwidth (7-21 Mbps) is expected to be usable for this project
    – 75-230 GB/day (27-82 TB/year) will be transferred over the WAN
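
The WAN volume estimate above follows from a simple unit conversion, sketched here in Python. The function name `wan_volume` and the use of decimal units (1 GB = 1000 MB, 365-day year) are assumptions made for this illustration; only the 70 Mbps link and the 10-30% share come from the slide.

```python
SECONDS_PER_DAY = 24 * 3600

def wan_volume(link_mbps, share):
    """Return (usable Mbps, GB/day, TB/year) for a given share of a link.

    Hypothetical helper; assumes decimal units and continuous transfer.
    """
    usable_mbps = link_mbps * share
    mb_per_s = usable_mbps / 8                      # bits -> bytes
    gb_per_day = mb_per_s * SECONDS_PER_DAY / 1000
    tb_per_year = gb_per_day * 365 / 1000
    return usable_mbps, gb_per_day, tb_per_year

for share in (0.10, 0.30):
    mbps, gb_day, tb_year = wan_volume(70, share)
    print(f"{share:.0%} of 70 Mbps = {mbps:.0f} Mbps: "
          f"{gb_day:.0f} GB/day, {tb_year:.1f} TB/year")
```

Even at the upper end this covers only a fraction of the 225 TB/year exchange, which is consistent with most of the data travelling on SD3 cartridges.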










Plan and current status of the CC-J

[Timeline chart, April 1998 to April 2002. Recoverable entries:]

▪ RBRC (BNL): R&D for the CC-J; CC-J front end at BNL; prototype of CPU farms; data duplication facility
▪ RIKEN Wako: CC-J construction in three phases: Phase 1 (1/3 scale), Phase 2 (2/3 scale), Phase 3 (full scale)
▪ Milestones
  • CC-J Working Group formed (Oct. 1998)
  • CC-J review at BNL (Dec. 1998)
  • HPSS software/hardware installation (March 1999, supplementary budget)
  • CC-J starts operation at 1/3 scale (April 2000)
  • Full-scale CC-J (Mar. 2002)
  • PHENIX experiment at RHIC

                            Jan. 2000   Mar. 2001   Mar. 2002
CPU farm (number)                  64         200         300
CPU farm (SPECint95)             1500        5900       10700
Tape storage size (TB)            100         100         100
Disk storage size (TB)              2          10          15
Tape drive (number)                 4           7          10
Tape I/O (MB/s)                    45          78         112
Disk I/O (MB/s)                   100         400         600
SUN SMP server unit                 2           4           6
HPSS server unit                    5           5           5










Working Group for the CC-J construction (CC-J WG)

▪ The CC-J WG is the main body constructing the CC-J

Role                                 Task                     Member
manager                              Servers, Network, HPSS   T. Ichihara (RIKEN and RBRC)
technical manager                    HPSS                     Y. Watanabe (RIKEN and RBRC)
computer scientists                  CPU farms, HPSS          N. Hayashi (RIKEN)
                                     Batch queue system       S. Sawada (KEK)
                                     System monitor           S. Yokkaichi (Kyoto Univ.)
scientific programming coordinator   coordination             H. En'yo (Kyoto and RBRC)
                                     AFS mirroring            H. Hamagaki (CNS, U-Tokyo)
front-end BNL                        Data duplication         Y. Watanabe (RIKEN and RBRC)
                                     Software environment     Y. Goto (RBRC)
                                     Prototype CPU farms      A. Taketani (RIKEN)

▪ Regular bi-weekly meetings are held at RIKEN Wako to discuss technical items, project plans, etc.

▪ A CC-J WG mailing list was created (mail traffic: 1600 mails/year)






Current configuration of the CC-J

[Figure: system and network diagram of the PHENIX Computing Center in Japan, current configuration updated on 14 Jan. 2000. Recoverable labels:]

  • CPU farm: 32 Pentium II (450 MHz) + 32 Pentium III (600 MHz), 256 MB memory/CPU, Alta cluster × 4 boxes, Redhat 5.2 Linux, 100BaseT × 32, serial lines, Alta cluster control WS on a private address
  • NFS servers: two SUN E450 (one with 4 CPUs and 1 GB memory, one with 2 CPUs and 1 GB memory) with 288 GB and 1.6 TB RAID disk
  • Network: Gigabit switches #1 and #2 (L3, Alteon 180, 9 kB MTU Jumbo Frame), Catalyst 2948G, 1000BaseSX links, HIPPI switch EPS-1000, SP router (Ascend GRF), RIKEN LAN switch connecting the main building and the computer building
  • HPSS: IBM RS/6000-SP, silver node × 5 (AIX 4.3.2) acting as HPSS server, disk movers and tape movers; HPSS cache RAID disks; HPSS control WS
  • Tape: STK tape robot (100 TB) with 4 RedWood SD3 drives, SUN ACSLS workstation
  • Other: Compaq DS20 AFS server (experimental), RIKEN supercomputer

Photographs of the PHENIX CC-J at RIKEN

[Photographs of the CC-J hardware. Captions:]

  • StorageTek (STK) tape robot (100 TB [250 TB]); a second caption gives 100 TB [240 TB]
  • HPSS server (IBM RS-6000/SP)
  • CPU farm of 64 CPUs
  • 1.6 TB RAID5 disk
  • Two SUN E450 data servers
  • Uninterruptible power supply (UPS)

Linux CPU farms

▪ Memory requirement: 200-300 MB/CPU for a simulation chain

▪ Node specification
  • Motherboard: ASUS p2b
  • Dual CPU/node (currently 64 CPUs in total)
  • Pentium II (450 MHz) 32 CPUs + Pentium III (600 MHz) 32 CPUs
  • 512 MB memory/node (1 GB swap/node)
  • 14 GB HD/node (system 4 GB, work 10 GB)
  • 100BaseT Ethernet interface (DECchip Tulip)
  • Linux Redhat 5.2 (kernel 2.2.11 + nfsv3 patch)
  • Portable Batch System (PBS V2.1) for batch queuing
  • AFS is accessed through NFS (no AFS client is installed on the Linux PCs)
  • The /afs/rhic contents are mirrored daily to a local disk file system

▪ PC assembly (Alta cluster)
  • Remote hardware reset/power control, remote CPU temperature monitor
  • Serial port login from the neighboring node (minicom) for maintenance (fsck etc.)






Linux NFS performance vs. kernel

▪ NFS performance test using the bonnie benchmark with a 2 GB file
  • NFS server: SUN Enterprise 450 (Solaris 2.6), 4 CPUs (400 MHz), 1 GB memory
  • NFS client: Linux RH5.2, dual Pentium III (600 MHz), 512 MB memory

Kernel          NFS Write    NFS Write   NFS Read     NFS Read
                (per char)   (block)     (per char)   (block)
2.2.11          0.6 MB/s     0.5 MB/s    4.7 MB/s      5.4 MB/s
2.2.11+nfsv3    7.1 MB/s     6.5 MB/s    6.4 MB/s      9.8 MB/s
2.2.14          1.1 MB/s     1.9 MB/s    4.7 MB/s      5.8 MB/s
2.2.14+nfsv3    5.5 MB/s     5.6 MB/s    6.2 MB/s     10.2 MB/s

▪ NFS performance of the recent Linux kernel (2.2.14) seems to have improved

▪ The nfsv3 patch is still useful for the recent kernel (2.2.14)
  – Currently we are using kernel 2.2.11 + nfsv3 patch
  – The nfsv3 patch is available from http://www.fys.uio.no/~trondmy/src/
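
To make the effect of the nfsv3 patch easier to read off the table, the gain per column can be computed as a ratio. The following Python sketch uses the throughput values from the table above; the dictionary layout and the helper name `speedup` are assumptions introduced here for illustration.

```python
# Throughput in MB/s, copied from the table:
# (write per char, write block, read per char, read block)
nfs_mb_per_s = {
    "2.2.11":       (0.6, 0.5, 4.7, 5.4),
    "2.2.11+nfsv3": (7.1, 6.5, 6.4, 9.8),
    "2.2.14":       (1.1, 1.9, 4.7, 5.8),
    "2.2.14+nfsv3": (5.5, 5.6, 6.2, 10.2),
}
COLUMNS = ("write/char", "write/block", "read/char", "read/block")

def speedup(base, patched):
    """Ratio patched/base for each benchmark column (hypothetical helper)."""
    return [p / b for b, p in zip(nfs_mb_per_s[base], nfs_mb_per_s[patched])]

for kernel in ("2.2.11", "2.2.14"):
    ratios = speedup(kernel, kernel + "+nfsv3")
    summary = ", ".join(f"{c} x{r:.1f}" for c, r in zip(COLUMNS, ratios))
    print(f"{kernel}: {summary}")
```

The write columns show the largest gains, which matches the slide's conclusion that the patch remains useful for kernel 2.2.14.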




Current HPSS hardware configuration

• IBM RS6000-SP
  – 5 nodes (silver node: quadruple PowerPC 604e 332 MHz CPUs per node)
  – Core server: 1, disk movers: 2, tape movers: 2
  – SP switch (300 MB/s) and 1000BaseSX NIC (OEM of Alteon)
• A StorageTek Powderhorn tape robot
  – 4 Redwood drives and 2000 SD3 cartridges (100 TB) dedicated to HPSS
  – The robot is shared with other HSM systems
  – 6 drives and 3000 cartridges for other HSM systems
• Gigabit Ethernet
  – Alteon ACE180 switch for Jumbo Frame (9 kB MTU)
  – Use of Jumbo Frame reduces the CPU utilization for transfers
  – Cisco Catalyst 2948G for distribution to 100BaseT
• Cache disk: 700 GB (total), 5 components
  – 3 SSA loops (50 GB each)
  – 2 FW-SCSI RAID (270 GB each)
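
The capacity figures on this slide can be cross-checked with a short calculation. This is a sketch under stated assumptions: the 50 GB per SD3 cartridge comes from the data accessibility slide, decimal units are used, and all variable names are illustrative.

```python
SD3_CARTRIDGE_GB = 50          # 50 GB/volume, as quoted for SD3 cartridges
hpss_cartridges = 2000

tape_capacity_tb = hpss_cartridges * SD3_CARTRIDGE_GB / 1000

# Cache disk components in GB: 3 SSA loops and 2 FW-SCSI RAID units.
cache_components_gb = [50] * 3 + [270] * 2
cache_total_gb = sum(cache_components_gb)

print(f"HPSS tape capacity: {tape_capacity_tb:.0f} TB")
print(f"Cache disk: {len(cache_components_gb)} components, {cache_total_gb} GB")
```

The component sum comes to 690 GB, so the 700 GB quoted on the slide is presumably a rounded figure.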




Performance test of parallel ftp (pftp) of HPSS

▪ pput from SUN E450: 12 MB/s for one pftp connection
  • Gigabit Ethernet, Jumbo Frame (9 kB MTU)

▪ pput from Linux: 6 MB/s for one pftp connection
  • 100BaseT - Gigabit Ethernet - Jumbo Frame (defragmentation on a switch)

▪ In total, about 50 MB/s pftp performance was obtained for pput










WAN performance test

▪ RIKEN (12 Mbps) - IMnet - APAN (70 Mbps) - STAR TAP - ESnet - BNL
  • Round-trip time (RTT) for RIKEN-BNL: 170 ms
  • The file transfer rate is limited to 47 kB/s for an 8 kB TCP window size (Solaris default)
  • A large TCP window size is necessary to obtain a high transfer rate
  • RFC 1323 (TCP Extensions for High Performance, May 1992) describes the method of using a large TCP window size (> 64 kB)

TCP window size   FTP transfer rate   Theoretical limit
                  (observed)          for 170 ms RTT
    8 kB               41 kB/s             47 kB/s
   16 kB               87 kB/s             94 kB/s
   32 kB              163 kB/s            188 kB/s
   64 kB              288 kB/s            376 kB/s
  128 kB              453 kB/s            752 kB/s
  256 kB              585 kB/s           1500 kB/s
  512 kB              641 kB/s           3010 kB/s

▪ High ftp performance (641 kB/s = 5 Mbps) was obtained for a single ftp connection using a large TCP window size (512 kB) across the Pacific Ocean (RTT = 170 ms)
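
The theoretical limit column corresponds to one TCP window per round trip, that is, window size divided by RTT. The Python sketch below reproduces that column and compares it with the observed rates. The function names are illustrative, and treating the 12 Mbps RIKEN link as a second upper bound (1500 kB/s in decimal units) is an assumption added here, not a statement from the slide.

```python
RTT_S = 0.170                  # RIKEN-BNL round-trip time from the slide
RIKEN_LINK_MBPS = 12           # RIKEN access link from the slide

# Observed FTP rates (kB/s) per TCP window size (kB), copied from the table.
observed_kb_per_s = {8: 41, 16: 87, 32: 163, 64: 288, 128: 453, 256: 585, 512: 641}

def window_limit_kb_per_s(window_kb, rtt_s=RTT_S):
    """At most one full window can be in flight per round trip."""
    return window_kb / rtt_s

def bound_kb_per_s(window_kb):
    """Assumed overall bound: window limit capped by the access link rate."""
    link_kb_per_s = RIKEN_LINK_MBPS * 1000 / 8
    return min(window_limit_kb_per_s(window_kb), link_kb_per_s)

for window_kb, observed in observed_kb_per_s.items():
    limit = window_limit_kb_per_s(window_kb)
    print(f"{window_kb:>4} kB window: observed {observed:>4} kB/s, "
          f"window limit {limit:6.0f} kB/s ({observed / limit:.0%} of limit)")
```

For small windows the observed rate stays close to the window limit, while for the largest windows the window limit exceeds what the 12 Mbps link could carry, so the window is no longer the only constraint.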


Summary

▪ The construction of the PHENIX Computing Center in Japan (CC-J) at the RIKEN Wako campus began in April 1999 and will extend over a three-year period.

▪ The CC-J is intended as the principal site of computing for PHENIX simulation, a regional PHENIX Asian computing center, and a center for the analysis of RHIC spin physics.

▪ The CC-J will handle about 220 TB of data per year, and the total CPU performance is planned to be 10,000 SPECint95 in 2002.

▪ The CPU farm of 64 processors (RH5.2, kernel 2.2.11 with nfsv3 patch) is stable.

▪ About 50 MB/s pftp performance was obtained for HPSS access.

▪ High ftp performance (641 kB/s = 5 Mbps) was obtained for a single ftp connection using a large TCP window size (512 kB) across the Pacific Ocean (RTT = 170 ms).

▪ Stress tests for the entire system were carried out successfully.

▪ Replication of the Objectivity/DB over the WAN will be tested soon.

▪ CC-J operation will start in April 2000.


