MPICH2 Version 1.1 Installation Guide
author: Justin Rutherford
last edited by: Dr. Phil Pfeiffer, Derrick Southerland
last updated: 19 June 2007


Background
MPICH2 is a freely available implementation of the Message Passing Interface (MPI) standard for
distributed communication. This document describes how to install MPICH2 and validate the
installation. The guide assumes the BASH shell, version 3.1-6.2.
Prerequisites
   Package dependencies: none
   Other:
     a working directory into which to download the MPICH2 source
     a target directory into which to install MPICH2
     write access to /etc/profile
     write access to /etc/hosts
     create/write access to the root home directory
     create/write access to the home directories of any users who require access to MPICH2
     password-less SSH configured on all MPICH2 clients
     Fedora Core 5 with the default “Development” packages installed
     gcc and gfortran version 4.1.0

Installation Procedure
Notes: There are known issues with installing MPICH2 on a 64-bit operating system. These instructions
are specific to 32-bit platforms.
1.) Download the package from <http://www-unix.mcs.anl.gov/mpi/mpich/>.
2.) Change the current working directory to the directory where MPICH2 was downloaded, and extract
    the package, using a command like
        tar -zxvf mpich2-1.0.4p1.tar.gz
    Then change the current working directory to the MPICH2 source folder, using a command like
        cd mpich2-1.0.4p1
3.) Configure the MPICH2 install.
     Use the CC, CXX, and F77 environment variables to specify the gcc, g++, and gfortran program
       names, respectively.
     Use the “--prefix=” parameter to specify the installation directory.
     Use the “--disable-f90” option to skip building the Fortran 90 bindings, leaving Fortran 77
       support only.
     It is recommended to install MPICH2 as root. If not currently logged in as root, then execute the
       “su” command before running this step to gain root access in the terminal temporarily. Typing
       “exit” in the root shell returns to normal user permissions.
     End the command with “2>&1 | tee configure.log”, which displays both standard output and error
       output and also writes them to configure.log for later review.
       su
       CC=gcc CXX=g++ F77=gfortran ./configure --prefix=/usr/local/mpich2 \
           --disable-f90 2>&1 | tee configure.log
4.) Make MPICH2, copying output and error messages to log files.
        make 2>&1 | tee make.log
        make install 2>&1 | tee install.log
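
One caveat when piping a build through tee: the shell reports the exit status of tee, not of make, so a failed build can look successful to a script. The following is a minimal sketch, assuming bash as this guide does; run_logged is a hypothetical helper name and is not part of MPICH2.

```shell
#!/bin/bash
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of MPICH2): runs COMMAND, copies its standard
# output and error output to LOGFILE via tee, and returns COMMAND's own
# exit status instead of tee's.
run_logged() {
    local log=$1
    shift
    "$@" 2>&1 | tee "$log"
    # PIPESTATUS (bash-specific) holds the exit status of every stage of the
    # most recent pipeline; index 0 is the command, index 1 is tee.
    return "${PIPESTATUS[0]}"
}
```

With this helper, the two commands in step 4 could be chained as `run_logged make.log make && run_logged install.log make install`, so that the install only runs if the build succeeded.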
5.) Add the MPICH2 binary directory to the system path. There are two ways to do this.
      a.) To add the directory to all users' PATH variables, add the following line to the /etc/profile file.
          This requires root permissions and must be included prior to the "export PATH" line.
          PATH=/usr/local/mpich2/bin:$PATH
      b.) To add the directory to only a specific user's PATH, add the following line to the .bash_profile
          file inside of that user's home directory.
          export PATH=/usr/local/mpich2/bin:$PATH
      In order for these actions to take effect, the user must log out, and then log back in.
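
If a profile file is sourced more than once, a plain assignment adds the same directory to PATH repeatedly. A minimal sketch of a guard against that, assuming the /usr/local/mpich2 prefix from step 3; prepend_path is a hypothetical helper name.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not already
# listed, so that re-reading the profile does not grow PATH.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path /usr/local/mpich2/bin   # bin directory under the step 3 prefix
export PATH
```

After logging back in, running `command -v mpdboot` should print a path under /usr/local/mpich2/bin if the directory was added correctly.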
6.) MPICH2 requires that a “secret word” be stored in a configuration file whose permissions prevent
    other users from accessing it. Non-root users keep this file in their home directory (~/.mpd.conf);
    root uses /etc/mpd.conf instead.
        For root:
          [root@node1 ~]$ echo "secretword=TheSecretWordFoo" > /etc/mpd.conf
          [root@node1 ~]$ chmod 600 /etc/mpd.conf

        For non-root user:
          [user@node1 ~]$ echo "secretword=TheSecretWordFoo" > ~/.mpd.conf
          [user@node1 ~]$ chmod 600 ~/.mpd.conf

    Since running MPICH2 jobs as root is not recommended, root configuration should not be needed.
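
The echo and chmod commands above leave a short interval in which the new file may be readable by others, depending on the current umask. A minimal sketch that closes this gap by setting the umask before the file is created; write_mpd_conf is a hypothetical helper name, not an MPICH2 tool.

```shell
# Hypothetical helper: write an mpd configuration file that is never
# readable by other users, not even briefly.
# Usage: write_mpd_conf FILE SECRETWORD
write_mpd_conf() {
    (
        umask 077                              # files created here get mode 600
        printf 'secretword=%s\n' "$2" > "$1"
    )
    chmod 600 "$1"                             # also tightens a pre-existing file
}
```

A non-root user would call it as `write_mpd_conf ~/.mpd.conf TheSecretWordFoo`. The umask change is made inside a subshell so that it does not affect the rest of the session.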
7.) Use mpdcheck to check the MPICH2 configuration. If this command produces any output, MPICH2
    is not configured correctly. To see detailed information on how to fix a problem, use the -l flag.
        [user@node1 ~]$ mpdcheck
        *** the fq hostname seems to be localhost
        *** first ipaddr for this host (via origin) is: 127.0.0.1

        [user@node1 ~]$ mpdcheck -l
            **********
            Your fully qualified hostname seems to be set to 'localhost'.
            This generally means that your machine's /etc/hosts file contains a line
            similar to this:
                127.0.0.1 mybox1 localhost.localdomain localhost
            You probably want to remove your hostname from this line and place it on
            a line by itself with your ipaddress, like this:
                $ipaddr mybox1
            **********
            **********
            Your unqualified hostname resolves to 127.0.0.1, which is
            the IP address reserved for localhost. This likely means that
            you have a line similar to this one in your /etc/hosts file:
                127.0.0.1   $uqhn
            This should perhaps be changed to the following:
                127.0.0.1   localhost.localdomain localhost
            **********
      This printout shows a common problem. If MPICH2 resolves a system’s hostname to the 127.0.0.1
      (localhost) address, it cannot use that system to connect to other nodes. To fix this, edit /etc/hosts:
        Before:
         [root@node1 ~]$ vi /etc/hosts
         # Do not remove the following line, or various programs
         # that require network functionality will fail.
         127.0.0.1        localhost.localdomain            localhost       node1

        After:
         [root@node1 ~]$ vi /etc/hosts
         # Do not remove the following line, or various programs
         # that require network functionality will fail.
         127.0.0.1        localhost.localdomain            localhost
         192.168.0.2      node1

        Use a single tab to separate an IP address from its hostname, and a hostname from its host alias.
        Using other combinations of whitespace to separate these tokens can cause certain services to hang
        on system boot. For example, sendmail can take minutes instead of seconds to start when
        confronted with a misconfigured /etc/hosts file.
        Replace the IP address 192.168.0.2 in the above example with the computer's own LAN address.
        Once this step is complete, disable and then enable your network adapter so the changes take
        effect. Then run mpdcheck again. There should no longer be any errors. However, if other errors
        do occur, then the -l flag should explain how to fix them.
         [user@node1 ~]$ mpdcheck
         [user@node1 ~]$
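
The loopback problem described in this step can also be spotted by inspecting the hosts file directly. A minimal sketch, assuming the standard hosts file layout of an address followed by names; maps_to_loopback is a hypothetical helper name and does not replace mpdcheck.

```shell
# Hypothetical helper: succeed (exit 0) if NAME appears on a 127.0.0.1 line
# of the given hosts file, which is the misconfiguration described above.
# Usage: maps_to_loopback NAME [HOSTS_FILE]
maps_to_loopback() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                     # drop comments before matching
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

For example, `maps_to_loopback "$(hostname)" && echo "hostname is on the loopback line"` prints a warning only when /etc/hosts still needs the edit shown above.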


8.) In any directory, create a file called mpd.hosts. Inside this file, use one line to name each MPICH2-
    capable host on the network. The mpd.hosts file can also be used to specify the number of processors
    per machine by adding a colon (:) and the number of CPUs after the hostname. An example of a two-
    processor machine is shown for node3.
         node1
         node2
         node3:2
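
When the file grows, it can help to total the hosts and CPUs it describes before booting the ring. A minimal sketch, assuming only the "host" and "host:N" line formats shown above; count_mpd_hosts is a hypothetical helper name.

```shell
# Hypothetical helper: print "<hosts> <cpus>" for an mpd.hosts file,
# counting one CPU for any host without an explicit ":N" suffix.
# Usage: count_mpd_hosts FILE
count_mpd_hosts() {
    awk -F: '
        /^[ \t]*$/ { next }                    # skip blank lines
        { hosts++; cpus += ($2 == "" ? 1 : $2) }
        END { print hosts + 0, cpus + 0 }
    ' "$1"
}
```

For the three-line example file above, the function reports three hosts and four CPUs.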

9.) Start the MPICH2 ring using the mpdboot command. The following flags are useful to know:
         If mpdboot is being executed from a directory different from mpd.hosts, use the -f flag to specify
          its location.
         To specify the number of hosts to boot, use the -n flag.
         To explicitly state the number of CPUs to use per host, use the “--ncpus=” flag.
         [user@node1 ~]$ mpdboot -n 3
         [user@node1 ~]$
         --or--
         [user@node1 ~]$ mpdboot -n 6 --ncpus=2 -f /home/user/mpd.hosts
         [user@node1 ~]$

    If the mpdboot command fails, it may be for one of the following reasons:
         A firewall is blocking connections between the boot host and the hostname listed in the error
          message. Fix this problem by disabling your software firewalls.
         The mpd service is already running on the host named in the error message. Fix this
          problem by running the command mpdallexit on the problem host.
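
Since the prerequisites call for password-less SSH on every host, it may also be worth confirming that before suspecting the causes above. A minimal sketch that tries a non-interactive login to each host in an mpd.hosts file; check_ssh_hosts and the SSH_PROBE override variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: report hosts in an mpd.hosts file that refuse a
# non-interactive SSH login. Returns non-zero if any host fails.
# SSH_PROBE (hypothetical) may name a substitute command for dry runs.
# Usage: check_ssh_hosts FILE
check_ssh_hosts() {
    local hosts_file=$1 line host failed=0
    while IFS= read -r line; do
        host=${line%%:*}                       # strip an optional ":N" suffix
        [ -z "$host" ] && continue
        # BatchMode=yes makes ssh fail rather than prompt for a password;
        # stdin comes from /dev/null so ssh cannot consume the hosts file.
        if ! ${SSH_PROBE:-ssh -o BatchMode=yes -o ConnectTimeout=5} \
                "$host" true </dev/null; then
            echo "no password-less SSH to $host" >&2
            failed=1
        fi
    done < "$hosts_file"
    return "$failed"
}
```

A host reported here would need its SSH keys set up before mpdboot can start a daemon on it.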

10.) Confirm that the hosts in the MPICH2 ring can communicate with each other by using the mpdtrace
     command. This program returns a list of all hosts in the ring.
       [user@node1 ~]$ mpdtrace
       node1
       node2
       node3
       [user@node1 ~]$
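
On larger rings, comparing the mpdtrace listing against mpd.hosts by eye becomes error-prone. A minimal sketch that prints any host named in mpd.hosts but absent from the listing, assuming the names in both match exactly (for example, both unqualified); missing_from_ring is a hypothetical helper name.

```shell
# Hypothetical helper: list hosts named in an mpd.hosts file that are absent
# from a ring listing read on standard input (one hostname per line).
# Usage: mpdtrace | missing_from_ring mpd.hosts
missing_from_ring() {
    local ring host
    ring=$(cat)                                # capture the ring listing
    sed -e 's/:.*//' -e '/^[[:space:]]*$/d' "$1" |
    while IFS= read -r host; do
        # -x matches whole lines only, -F treats the name as a fixed string
        printf '%s\n' "$ring" | grep -qxF -- "$host" || printf '%s\n' "$host"
    done
}
```

Empty output means every host in the file appears in the ring.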


11.) Finally, shut the MPICH2 ring down. This will stop the MPICH2 daemon on all connected hosts.
       [user@node1 ~]$ mpdallexit
       [user@node1 ~]$

								