Beowulf: Parallel Library Benchmarks on IBM Netfinity Clustering System

This report describes a compatibility test of the NAG Parallel Library on an IBM Netfinity
clustering system.


PC Server:
Linux Clustering System
Workstation: IBM Netfinity x 12 (each node has two 750 MHz Pentium III processors)
Network: Giganet cLAN


Software:
Linux: Turbo Linux Server 6.1, kernel 2.2.15-8
Compiler: g77 Ver.1.1.2-30
Message Passing Library: MPI/Pro


Communication:
Each node in the cluster has two network interfaces. One is ordinary Fast Ethernet, and the other is
'Giganet cLAN', which provides high-bandwidth data communication between the PCs in the cluster.
cLAN can use the Virtual Interface Architecture (VIA). The reported peak performance of cLAN
using VIA is 1.25 Gbps. MPI/Pro lets us choose which protocol to use for communication by
setting an environment variable.


Benchmark program:
The benchmark applied a number of routines from the NAG Parallel Library for the iterative
solution of sparse systems of linear equations to a realistic problem derived from the
discretisation of a PDE. The mathematical model was the time-dependent diffusion equation on a
3-D rectangular domain, discretised using the Crank-Nicolson scheme. The resulting equations
were solved using the Conjugate Gradient (CG) iterative method.
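
As a rough illustration of this kind of computation, here is a minimal serial sketch in Python. It is not the NAG Parallel Library code used in the benchmark. It assumes a uniform n^3 grid, zero Dirichlet boundary values, unpreconditioned CG, and a single parameter r = D*dt/h^2 combining the diffusion coefficient, time step and mesh spacing. All function names are hypothetical.

```python
def laplacian(u, n):
    """Apply the 7-point stencil (without the 1/h^2 factor) on an n^3 grid,
    treating every value outside the grid as zero (Dirichlet boundary)."""
    def at(i, j, k):
        if 0 <= i < n and 0 <= j < n and 0 <= k < n:
            return u[(i * n + j) * n + k]
        return 0.0

    out = [0.0] * (n ** 3)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[(i * n + j) * n + k] = (
                    at(i - 1, j, k) + at(i + 1, j, k)
                    + at(i, j - 1, k) + at(i, j + 1, k)
                    + at(i, j, k - 1) + at(i, j, k + 1)
                    - 6.0 * at(i, j, k)
                )
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def apply_lhs(u, n, r):
    """Left-hand operator of Crank-Nicolson: (I - r/2 * L) u."""
    return [x - 0.5 * r * l for x, l in zip(u, laplacian(u, n))]


def apply_rhs(u, n, r):
    """Right-hand operator of Crank-Nicolson: (I + r/2 * L) u."""
    return [x + 0.5 * r * l for x, l in zip(u, laplacian(u, n))]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=500):
    """Solve A x = b for a symmetric positive definite A given as matvec.
    Returns the solution and the number of iterations performed."""
    x = [0.0] * len(b)
    res = list(b)              # residual b - A x for the initial guess x = 0
    p = list(res)              # first search direction
    rs_old = dot(res, res)
    if rs_old == 0.0:
        return x, 0
    for it in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        res = [ri - alpha * api for ri, api in zip(res, ap)]
        rs_new = dot(res, res)
        if rs_new ** 0.5 < tol:
            return x, it
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(res, p)]
        rs_old = rs_new
    return x, max_iter


def crank_nicolson_step(u, n, r):
    """Advance u by one time step: solve (I - r/2 L) u_new = (I + r/2 L) u."""
    b = apply_rhs(u, n, r)
    return conjugate_gradient(lambda v: apply_lhs(v, n, r), b)
```

Calling `crank_nicolson_step(u, n, r)` repeatedly advances the solution one time step at a time. In a parallel version, the stencil application and the dot products are the steps that need communication between processes.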




Numerical Algorithms Group  630-971-2337 naginfo@nag.com  www.nag.com

Processor Selection:
There are several ways to distribute processes over the processors in the cluster. For
example, to use 2 processors, you can use both processors on one node, or you can use one
processor on each of two nodes. The processor selection policy can affect performance, as the
results below show.
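
The two policies can be sketched as a mapping from process rank to node index. This is a hypothetical helper for illustration only. It assumes nodes are filled in order, which may not match how the actual launcher places processes. The 12 nodes and two processors per node come from the hardware description above.

```python
def assign_nodes(n_procs, procs_per_node, n_nodes=12):
    """Return the node index for each process rank, filling nodes in order.

    procs_per_node=1 uses one processor on each node;
    procs_per_node=2 uses both processors on each node.
    """
    if procs_per_node not in (1, 2):
        raise ValueError("each node has two processors")
    if n_procs > n_nodes * procs_per_node:
        raise ValueError("not enough processors for this policy")
    return [rank // procs_per_node for rank in range(n_procs)]
```

With this sketch, `assign_nodes(2, 2)` gives `[0, 0]` (both processes on one node) and `assign_nodes(2, 1)` gives `[0, 1]` (one process on each of two nodes).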


Result:
The following data report parallel speed-ups. The wall-clock (elapsed) solution time was
averaged over the time steps, and these averages were used to compute the speed-ups.
Measurements were taken while varying:
                   problem size
                   number of processes
                   communication protocol
                   processor selection policy



Case 1
               Problem size: 32^3, 48^3 and 64^3
               Number of processes: 1, 2, 4, 8 and 10
               Communication protocol: VIA
               Processor selection policy: one process per node
      

               n-procs     32^3     48^3     64^3
               1           1.00     1.00     1.00
               2           1.96     1.98     1.95
               4           3.49     3.88     3.75
               8           5.32     7.14     7.30
               10          5.90     8.43     8.86

[Figure: speed-up versus number of processes (n-procs) for problem sizes 32^3, 48^3 and 64^3, with the linear speed-up line for reference.]
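
To read the Case 1 figures against the linear reference, it can help to convert speed-up to parallel efficiency (speed-up divided by the number of processes). The sketch below assumes the usual definition of speed-up, S(p) = T(1) / T(p). The values are copied from the Case 1 table, and the function names are illustrative.

```python
# Speed-ups from the Case 1 table (VIA, one process per node).
CASE1_SPEEDUP = {
    1: {"32^3": 1.00, "48^3": 1.00, "64^3": 1.00},
    2: {"32^3": 1.96, "48^3": 1.98, "64^3": 1.95},
    4: {"32^3": 3.49, "48^3": 3.88, "64^3": 3.75},
    8: {"32^3": 5.32, "48^3": 7.14, "64^3": 7.30},
    10: {"32^3": 5.90, "48^3": 8.43, "64^3": 8.86},
}


def efficiency(speedup, n_procs):
    """Parallel efficiency: 1.0 corresponds to linear speed-up."""
    return speedup / n_procs


def efficiency_table(speedups):
    """Convert a {n_procs: {size: speedup}} table to efficiencies."""
    return {
        p: {size: efficiency(s, p) for size, s in row.items()}
        for p, row in speedups.items()
    }


if __name__ == "__main__":
    for p, row in efficiency_table(CASE1_SPEEDUP).items():
        print(p, "  ".join(f"{size}: {e:.2f}" for size, e in row.items()))
```

For example, at 10 processes the 64^3 problem has an efficiency of 8.86 / 10, about 0.89, while the 32^3 problem has 5.90 / 10 = 0.59.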




Case 2
              Problem size: 32^3, 48^3 and 64^3
              Number of processes: 1, 2, 4, 8 and 10
              Communication protocol: TCP/IP
              Processor selection policy: one process per node
      

              n-procs     32^3     48^3     64^3
              1           1.00     1.00     1.00
              2           1.49     1.84     1.89
              4           1.64     2.98     3.28
              8           1.38     3.67     5.07
              10          1.33     3.71     5.65



[Figure: speed-up versus number of processes (n-procs) for problem sizes 32^3, 48^3 and 64^3, with the linear speed-up line for reference.]
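
Cases 1 and 2 differ only in the communication protocol, so one way to compare them is the ratio of their speed-ups at each process count. The sketch below uses the values from the two tables. Reading this ratio as a ratio of elapsed times assumes the single-process times were the same in both cases, which the report does not state. The names are illustrative.

```python
SIZES = ("32^3", "48^3", "64^3")

# Speed-ups copied from the Case 1 (VIA) and Case 2 (TCP/IP) tables.
VIA = {
    1: (1.00, 1.00, 1.00),
    2: (1.96, 1.98, 1.95),
    4: (3.49, 3.88, 3.75),
    8: (5.32, 7.14, 7.30),
    10: (5.90, 8.43, 8.86),
}
TCP = {
    1: (1.00, 1.00, 1.00),
    2: (1.49, 1.84, 1.89),
    4: (1.64, 2.98, 3.28),
    8: (1.38, 3.67, 5.07),
    10: (1.33, 3.71, 5.65),
}


def speedup_ratio(a, b):
    """Ratio a/b of speed-ups for every process count present in both tables."""
    return {
        p: {size: a[p][i] / b[p][i] for i, size in enumerate(SIZES)}
        for p in sorted(a)
        if p in b
    }


if __name__ == "__main__":
    for p, row in speedup_ratio(VIA, TCP).items():
        print(p, "  ".join(f"{size}: {row[size]:.2f}" for size in SIZES))
```

At 10 processes the ratio is about 4.4 for the 32^3 problem (5.90 / 1.33) and about 1.6 for the 64^3 problem (8.86 / 5.65), so the gap between the two protocols is largest for the smallest problem.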




Case 3
              Problem size: 32^3, 48^3, 64^3 and 80^3
              Number of processes: 1, 2, 4, 8, 16 and 20
              Communication protocol: VIA
              Processor selection policy: two processes per node
    

              n-procs     32^3     48^3     64^3     80^3
              1           1.00     1.00     1.00     -
              2           1.52     1.51     1.48     -
              4           2.64     2.98     2.86     2.88
              8           3.89     5.43     5.44     4.84
              16          4.46     8.26     9.67     7.23
              20          4.46     8.91     11.09    7.85


[Figure: speed-up versus number of processes (n-procs) for problem sizes 32^3, 48^3, 64^3 and 80^3, with the linear speed-up line for reference.]
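
Cases 1 and 3 both use VIA and differ in the processor selection policy, so they can be compared at the process counts and problem sizes they share (1, 2, 4 and 8 processes; 32^3, 48^3 and 64^3). The sketch below computes the Case 3 speed-up relative to Case 1 from the table values. The names are illustrative.

```python
SIZES = ("32^3", "48^3", "64^3")

# Speed-ups from Case 1 (one process per node) and Case 3 (two per node),
# restricted to the problem sizes reported in both tables.
ONE_PER_NODE = {
    1: (1.00, 1.00, 1.00),
    2: (1.96, 1.98, 1.95),
    4: (3.49, 3.88, 3.75),
    8: (5.32, 7.14, 7.30),
    10: (5.90, 8.43, 8.86),
}
TWO_PER_NODE = {
    1: (1.00, 1.00, 1.00),
    2: (1.52, 1.51, 1.48),
    4: (2.64, 2.98, 2.86),
    8: (3.89, 5.43, 5.44),
    16: (4.46, 8.26, 9.67),
    20: (4.46, 8.91, 11.09),
}


def relative_speedup(a, b):
    """Ratio a/b of speed-ups for every process count present in both tables."""
    return {
        p: {size: a[p][i] / b[p][i] for i, size in enumerate(SIZES)}
        for p in sorted(a)
        if p in b
    }


if __name__ == "__main__":
    for p, row in relative_speedup(TWO_PER_NODE, ONE_PER_NODE).items():
        print(p, "  ".join(f"{size}: {row[size]:.2f}" for size in SIZES))
```

For example, with 2 processes on the 32^3 problem, two processes on one node reach 1.52 / 1.96, about 0.78, of the speed-up obtained with one process on each of two nodes. With 8 processes on the 64^3 problem the figure is 5.44 / 7.30, about 0.75.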



Note:
              I used the BLAS included in the NAG Parallel Library instead of the Intel BLAS. The
              speed-up profile would be different if the Intel BLAS were used.



