Testing MVAPICH-uDAPL with Voltaire IB on Solaris 10 x64


                      Liang PENG, Lip-Kian NG


InfiniBand is fast becoming one of the major high-performance network interconnects for
contemporary data centers. As adoption of the InfiniBand Architecture (IBA) widens,
applications that run over the network, such as MPI programs, need software-level
support. MVAPICH is a software package that enables users to develop and run
MPICH programs over InfiniBand. MVAPICH-uDAPL, a new release of MVAPICH,
uses uDAPL (User-level Direct Access Programming Library) for interoperability
across various network interconnects. It has been tested on Linux but not fully tested in
the Solaris 10 environment. This technical brief describes the steps required to set up
MVAPICH-uDAPL on Solaris 10 with Voltaire IB equipment.

Keywords: MPICH, MVAPICH, InfiniBand, uDAPL

                     Asia Pacific Science and Technology Center
                                Sun Microsystems Inc.

1. Introduction

InfiniBand (IB) Architecture is an emerging open-standard, high-performance network
interconnect that is being widely adopted in HPC clusters, data centers, and file/storage
systems. Compared with other network architectures, it offers low latency, high
bandwidth, and enhanced reliability, availability, and serviceability (RAS). Because its
connectivity model differs from the existing TCP/IP over Ethernet, extra support is
required before existing parallel applications can run over InfiniBand.
MVAPICH is a software package that aims to solve this problem.

uDAPL (User-level Direct Access Programming Library) defines a standard,
device-independent interface for accessing the transport mechanisms of RDMA (Remote
Direct Memory Access) capable networks. In effect, it provides a software layer between
applications and the various network interconnects, so that applications can run over
different interconnects.

MVAPICH, developed by the Network-Based Computing Laboratory at The Ohio State
University, is an MPI-1 implementation that enables MPICH programs to run on
InfiniBand networks. It uses the Verbs Level Interface (VAPI), developed by Mellanox
Technologies. MVAPICH is being used by more than 260 organizations worldwide to
extract the potential of InfiniBand networking technology for designing high-end
computing systems and servers. It is also distributed by many IBA vendors as part of
their software bundles for parallel computing.

MVAPICH-uDAPL is based on MVAPICH utilizing the uDAPL interface and has been
extensively tested on Linux. (Note: MVAPICH2, which is based on MPICH2, is already
using uDAPL.)

The following sections describe the steps required to set up a test environment for
MVAPICH-uDAPL with Voltaire equipment and Solaris 10.

2. Pre-requisites

Before moving on, ensure that your test environment satisfies the following requirements:
   1. The Voltaire ISR9024 switch software version is at least 3.4.1 Build 291. To
       verify, log in to the switch console and execute the command “version show”. (See
       Appendix A on “How to upgrade the switch software” if required.)

        > telnet voltaire
        Connected to voltaire.
        Escape character is '^]'.

        ISR9024-0712 login: admin

        Welcome to Voltaire Switch ISR9024-0712

        ISR9024-0712> version show
        ISR 9224 version: 3.4.1
                date:    Aug 14 2005 12:13:02 PM
                build Id:291

   2. The following Solaris packages are installed on all your systems.

       SUNWib           Sun InfiniBand Framework
       SUNWtavor        Sun Tavor HCA Driver
       SUNWipoib        Sun IP over InfiniBand
       SUNWudaplr       Direct Access Transport (DAT) registry package (root)
       SUNWudaplu       Direct Access Transport (DAT) registry packages (usr)
       SUNWudapltu      Service Provider for Tavor packages (usr)
       SUNWudapltr      Service Provider for Tavor packages (root)

   3. Sun Studio 10 or GCC (or another compiler) is installed on at least one system.
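
The first two prerequisites can also be checked from a shell. The following is a minimal
sketch, not part of the original procedure. The function names check_switch_version and
missing_packages are hypothetical. The first expects the text printed by “version show”
on standard input, and the second relies on the standard Solaris “pkginfo -q” command.
The sketch assumes a POSIX shell and a POSIX awk (on Solaris 10, nawk or
/usr/xpg4/bin/awk rather than the legacy /usr/bin/awk).

```sh
#!/bin/sh
# Hypothetical helpers for checking prerequisites 1 and 2.

# Packages listed in prerequisite 2.
REQUIRED_PKGS="SUNWib SUNWtavor SUNWipoib SUNWudaplr SUNWudaplu SUNWudapltu SUNWudapltr"

# check_switch_version MIN_VERSION MIN_BUILD
# Reads "version show" output on stdin. Succeeds if the reported version is
# newer than MIN_VERSION, or equal to it with a build Id >= MIN_BUILD.
check_switch_version() {
    awk -v min_ver="$1" -v min_build="$2" '
        /version:/  { ver = $NF }
        /build Id:/ { sub(/.*build Id:[ \t]*/, ""); build = $0 + 0 }
        END {
            n = split(ver, have, ".")
            m = split(min_ver, want, ".")
            len = (n > m) ? n : m
            # Compare the dotted version numbers component by component.
            for (i = 1; i <= len; i++) {
                if (have[i] + 0 > want[i] + 0) exit 0
                if (have[i] + 0 < want[i] + 0) exit 1
            }
            # Versions are equal, so the build Id decides.
            if (build + 0 >= min_build + 0) exit 0
            exit 1
        }'
}

# missing_packages
# Prints the name of every required package that pkginfo does not report
# as installed; prints nothing when all of them are present.
missing_packages() {
    for pkg in $REQUIRED_PKGS; do
        pkginfo -q "$pkg" || echo "$pkg"
    done
}
```

For example, with the “version show” output saved in a file, “check_switch_version 3.4.1 291
< version.txt” succeeds only if the switch meets the minimum, and running missing_packages
on each host lists any packages that still need to be installed.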

3. Voltaire IB on Solaris

The first step is to set up IPoIB. IPoIB implements IP over InfiniBand, which is required
by uDAPL, which, in turn, is required by MVAPICH-uDAPL.

The latest version of Solaris already provides good support for IPoIB: the OS can talk
to the IB switch and the subnet manager to get the IB network interface ready.
The key point is to make sure that your IB switch is running the latest version of its
software.

After booting the hosts, you should see output similar to the following:

# cfgadm -a
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t1d0                 disk         connected    configured   unknown
hca:66A00980041DD              IB-HCA       connected    configured   ok
ib                             IB-Fabric    connected    configured   ok
ib::66A00A00041DD,ffff,ipib    IB-VPPA      connected    configured   ok
ib::66A01A00041DD,ffff,ipib    IB-VPPA      connected    unconfigured unknown
ib::daplt,0                    IB-PSEUDO    connected    configured   ok
ib::rpcib,0                    IB-PSEUDO    connected    configured   ok
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
usb1/3                         unknown      empty        unconfigured ok

The ib::66A00A00041DD,ffff,ipib entry, whose occupant state is “configured”, shows that
one of the IPoIB ports is configured. It can be brought up by using the “ifconfig”
command as follows:

1) Plumb the interface and bring it up:
     ifconfig ibd0 plumb
     ifconfig ibd0 <ip address/class> up
2) Create the file /etc/hostname.ibd0 containing the hostname for the ibd0 interface.
3) Update the /etc/hosts file.
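
The check and the two file updates above can be sketched in shell. This is an
illustration under stated assumptions rather than the original procedure: the function
names list_configured_ipoib and persist_ipoib, the hostname s0-ib, and the address
192.168.10.1 are all hypothetical, and a POSIX shell and awk are assumed (on Solaris 10,
nawk or /usr/xpg4/bin/awk).

```sh
#!/bin/sh
# list_configured_ipoib
# Reads "cfgadm -a" output on stdin and prints the Ap_Id of every IPoIB
# attachment point (type IB-VPPA) whose occupant state is "configured".
list_configured_ipoib() {
    awk '$2 == "IB-VPPA" && $1 ~ /,ipib$/ && $4 == "configured" { print $1 }'
}

# persist_ipoib ROOT IFNAME HOSTNAME ADDRESS
# Writes ROOT/etc/hostname.IFNAME and appends "ADDRESS HOSTNAME" to
# ROOT/etc/hosts unless HOSTNAME is already listed there.
# ROOT is "" on a live system; a scratch directory allows a dry run.
persist_ipoib() {
    root=$1; ifname=$2; host=$3; addr=$4
    echo "$host" > "$root/etc/hostname.$ifname"
    # awk exits 0 only if the hostname already appears in a non-comment line.
    awk -v h="$host" '
        /^#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }' "$root/etc/hosts" 2>/dev/null ||
        printf '%s\t%s\n' "$addr" "$host" >> "$root/etc/hosts"
}
```

For example, “cfgadm -a | list_configured_ipoib” shows which IPoIB ports are ready, and
“persist_ipoib /tmp/dryrun ibd0 s0-ib 192.168.10.1” (with /tmp/dryrun/etc created first)
previews the file changes before they are made on the real system as root.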

4. Solaris uDAPL

As mentioned in the previous section, Solaris uDAPL is required by MVAPICH-uDAPL.
uDAPL can be enabled with the following steps:
Step 1         Log in as root.
Step 2         Ensure that IPoIB has been configured (see Section 3).
Step 3         Set up uDAPL.
               # datadm -a /usr/share/dat/SUNWudaplt.conf

Step 4         Verify the registration.
               # datadm -v
               ibd0 u1.2 nonthreadsafe default SUNW.1.0 " " "driver_name=tavor"

Step 5         Run the “Echo” and “Throughput” test programs.
               If you encounter errors like "[pid 1016][tid 1] failure: dat_ep_connect
               (0x40000) ", it is likely that one of the ports is not working.
               A workaround is to disable APM (Automatic Path Migration) in uDAPL
               as follows:
               # modload /kernel/drv/amd64/daplt
               # echo "daplka_apm/W 0" | mdb -kw
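
The verification in Step 4 can be scripted. The sketch below is hypothetical (the
function name udapl_registered is not part of Solaris) and assumes that “datadm -v”
prints one registry entry per line, with the interface name in the first field, as in
the sample output above.

```sh
#!/bin/sh
# udapl_registered IFNAME
# Reads "datadm -v" output on stdin. Succeeds if some entry names the
# interface IFNAME in its first field and carries driver_name=tavor.
udapl_registered() {
    awk -v ifname="$1" '
        $1 == ifname && /driver_name=tavor/ { found = 1 }
        END { exit !found }'
}
```

For example, “datadm -v | udapl_registered ibd0 && echo registered” prints “registered”
only when the Tavor service provider is registered for ibd0.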


5. MVAPICH-uDAPL

The testbed is now ready for MVAPICH-uDAPL, which can be installed and tested with the
following steps:

Step 1         Download and decompress mvapich-udapl.tar.gz. This package is
               available at
Step 2         Modify the Makefile as appropriate (e.g., paths).
Step 3         Execute the install script. This will configure, compile and install
               mvapich (installing mpich).
Step 4         Ensure that you can ssh between the hosts via the IB network without
               being prompted for a password.
Step 5         Run the test examples in osu_benchmarks subdirectory.
               # ../bin/mpicc osu_latency.c -o ./lat
               # ../bin/mpirun_rsh -np 2 s0 s1 ./lat
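
Before launching the benchmarks, the ssh requirement in Step 4 can be checked with a
short script. This is a minimal sketch under two assumptions: that Step 4 means ssh must
work without any interactive prompt, and that the installed ssh client accepts the
OpenSSH-style option “-o BatchMode=yes”. The function name check_noninteractive_ssh is
hypothetical.

```sh
#!/bin/sh
# check_noninteractive_ssh HOST...
# Runs "true" on each host with ssh in batch mode, so that ssh fails instead
# of prompting. Prints one status line per host and returns non-zero if any
# host could not be reached without interaction.
check_noninteractive_ssh() {
    rc=0
    for host in "$@"; do
        if ssh -o BatchMode=yes "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            rc=1
        fi
    done
    return $rc
}
```

With the host names used in Step 5, “check_noninteractive_ssh s0 s1” should print
“ok: s0” and “ok: s1” before mpirun_rsh is started.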

6. Summary

Testing MVAPICH-uDAPL is rather straightforward once IPoIB and Solaris uDAPL are
set up, which is the relatively harder part. However, Solaris already supports uDAPL
well, and with the latest software from the IB vendor, this should not be very
difficult.

Appendix A: “How to upgrade the switch software”

Step 1       Download the software to a server which has ftp service. The required software version
             is available for internal evaluation at
Step 2       Connect to the serial console of the switch using a terminal emulation program (such as
             HyperTerminal or minicom) with the following settings:
                 -   Terminal mode     :   VT100
                 -   Baud              :   38400
                 -   Parity            :   No Parity
                 -   Stop Bit          :   1 stop bit
                 -   Flow Control      :   None
Step 3       Log in to the serial console (default id/passwd: admin/123456)
Step 4       Switch to privileged mode
                 ISR9024-0712> enable
                 password: (default: voltaire)
Step 5       Now, switch to config mode
                 ISR9024-0712# config
Step 6       Set up the ftp client parameters with the ftp server IP, username and password.
                 ISR9024-0712(config)# ftp
                 ISR9024-0712(config-ftp)# server <ip address of ftp server>
                 ISR9024-0712(config-ftp)# username <ftp login id>
                 ISR9024-0712(config-ftp)# password <passwd of ftp login id>
Step 7       Verify the saved parameters.
                 ISR9024-0712(config-ftp)# ftp show
                 ftp configuration
                 remote server:
                 user:          lkng
Step 8       Exit back to privileged mode.
                 ISR9024-0712(config-ftp)# exit
                 ISR9024-0712(config)# exit
Step 9       Update the software.
                ISR9024-0712# update software <path to dir containing IB
Step 10      Follow the on-screen instructions. Ensure that the switch is rebooted.

Appendix B: Benchmarking Results

    Operating System:
•   Solaris 10 x86 GA (using 64-bit kernel)

    System Specs:
•   Sun Fire V20z (Quantity: 2)
•   2 x AMD Opteron 248 (2.2 GHz)
•   4 GB RAM
•   SilverStorm HCA 7000* (Tavor-based chipset), operating at 64-bit, PCI-X 133 MHz
    * Only 1 IB port is used per V20z for the benchmark

    Compiler:
•   Sun Studio 10 for OpenSolaris

    InfiniBand Switch:
•   Voltaire ISR-9024 with software version: 3.4.01 build 291

Raw Data:
# ../bin/mpirun_rsh -np 2 gossamer-s10-ib0 thebe-s10-ib0 ./lat
# OSU MPI Latency Test (Version 2.0)
# Size      Latency (us)
0           5.47
1           5.46
2           5.49
4           5.50
8           5.45
16          5.47
32          5.57
64          5.83
128         6.10
256         6.75
512         7.59
1024        9.40
2048        12.85
4096        16.70
8192        27.46
16384       45.03
32768       67.24
65536       111.63
131072      200.43
262144      380.07
524288      845.86
1048576     1711.77
2097152     3429.15
4194304     7665.47

# ../bin/mpirun_rsh -np 2 gossamer-s10-ib0 thebe-s10-ib0 ./bw
# OSU MPI Bandwidth Test (Version 2.0)
# Size      Bandwidth (MB/s)
1           0.374532
2           0.756010
4           1.517577
8           3.034254
16          6.064554
32          12.148535
64          24.275470
128         48.191070
256         95.628320
512         182.003999
1024        311.438483
2048        441.974643
4096        553.221484
8192        608.187460
16384       575.003290
32768       616.319981
65536       641.218134
131072      652.672824
262144      658.141677
524288      660.719962
1048576     662.619698
2097152     663.128774
4194304     659.032950

# ../bin/mpirun_rsh -np 2 gossamer-s10-ib0 thebe-s10-ib0 ./bibw
# OSU MPI Bidirectional Bandwidth Test (Version 2.0)
# Size      Bi-Bandwidth (MB/s)
1           0.387679
2           0.765917
4           1.549074
8           3.167827
16          6.082385
32          12.813614
64          25.867568
128         50.700913
256         95.189403
512         191.089340
1024        356.154557
2048        478.434808
4096        530.510893
8192        622.108311
16384       638.937314
32768       721.420721
65536       772.218356
131072      798.551897
262144      799.746212
524288      754.240231
1048576     743.392685
2097152     743.953594
4194304     728.102526
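
The peak value of a bandwidth run can be extracted from saved benchmark output with a
few lines of awk. The sketch below is illustrative only: the function name osu_peak is
hypothetical, and it assumes the two-column format shown above, with “#” marking
header lines.

```sh
#!/bin/sh
# osu_peak
# Reads OSU benchmark output (two columns: message size, value) on stdin,
# skips "#" header lines, and prints the size and value of the row with the
# largest value, e.g. the peak bandwidth of a bw or bibw run.
osu_peak() {
    awk '
        /^#/ || NF != 2 { next }
        $2 + 0 > max    { max = $2 + 0; size = $1; value = $2 }
        END             { if (size != "") print size, value }'
}
```

Applied to the raw data above, this reports a peak of 663.128774 MB/s at a message size
of 2097152 bytes for the bandwidth test, and 799.746212 MB/s at 262144 bytes for the
bidirectional bandwidth test.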
