Install and use Globus and Condor

					               Israel Academic Grid (IAG)


                 Introduction to
              Globus with Condor-G

                 Itzhak Ben Akiva (TAU)
                     David Front (WI)



01 May 2003           Globus with Condor-G   1
                  Agenda
•   Grid security and certificates
•   Globus
•   Condor-G
•   Condor-G submission examples
•   References




           Grid Security Infrastructure (GSI)
• GSI is a set of tools, libraries and
  protocols used in Globus to allow users
  and applications to securely access
  resources.
• Based on a public key infrastructure,
  with certificate authorities and X.509
  certificates
[Figure: GSI building blocks]
   – PKI (CAs and certificates) for credentials
   – SSL/TLS for authentication and message protection
   – Proxies and delegation for secure single sign-on

      Public Key Infrastructure (PKI)
• PKI allows you to know that a given
  public key belongs to a given user
• PKI builds on asymmetric
  encryption:
   – Each entity has two keys: public
     and private
   – Data encrypted with one key can
     only be decrypted with the other
   – The private key is known only to
     the entity
• The public key is given to the world
  encapsulated in an X.509 certificate
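
The two-key property described on this slide can be illustrated with textbook RSA on deliberately tiny numbers. This is only a sketch: the primes, the exponent and the function names are assumptions chosen for the example, and real GSI credentials use far larger keys plus padding.

```python
# Toy RSA with tiny primes: illustrative only, NOT secure.
# All numbers and names here are assumptions chosen for the example.

def make_keypair(p=61, q=53, e=17):
    """Return ((e, n), (d, n)): a public key and a private key."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return (e, n), (d, n)

def apply_key(key, value):
    """Transform an integer smaller than n with the given key."""
    exponent, n = key
    return pow(value, exponent, n)

public_key, private_key = make_keypair()
message = 65

# Encrypted with the public key, readable only with the private key.
ciphertext = apply_key(public_key, message)
recovered = apply_key(private_key, ciphertext)

# Transformed with the private key, checkable by anyone with the public key.
signature = apply_key(private_key, message)
verified = apply_key(public_key, signature)
```

Either key undoes the other, which is why the private key must stay with its owner while the public key can be published.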

                        Certificates
• Certificates link a public key to the identity of a “subject”
  (a person, organization, or device):
    – Associated with use of the private key
    – Used by a “relying party”
• Certificate Authorities (CAs) are responsible for establishing
  identity
• After a key pair is generated, the CA digitally signs the public key
  together with the identity, making it a certificate


[Figure: a certificate carries a Name, Issuer, Public Key, and Signature, by analogy with a driver's license (John Doe, 755 E. Woodlawn, Urbana IL 61801) bearing the State of Illinois seal]
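
How a CA signature binds a name to a public key can be sketched as follows. This is a toy under stated assumptions: it uses textbook RSA with tiny primes instead of real X.509 encoding, and the CA name, subject and key values are made up for the example.

```python
import hashlib

# Toy CA key pair (textbook RSA, tiny primes): illustrative only, NOT secure.
CA_N = 61 * 53                    # modulus
CA_E = 17                         # public exponent, known to relying parties
CA_D = pow(CA_E, -1, 60 * 52)     # private exponent, known only to the CA

def digest(subject, public_key):
    """Hash the certificate fields to an integer smaller than the modulus."""
    data = f"{subject}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N

def issue_certificate(subject, public_key):
    """The CA binds a subject name to a public key by signing their digest."""
    signature = pow(digest(subject, public_key), CA_D, CA_N)
    return {"name": subject, "issuer": "Toy CA",
            "public_key": public_key, "signature": signature}

def verify_certificate(cert):
    """A relying party checks the signature using only the CA's public key."""
    expected = digest(cert["name"], cert["public_key"])
    return pow(cert["signature"], CA_E, CA_N) == expected

cert = issue_certificate("John Doe", 123457)
# A certificate whose signature was altered no longer verifies.
tampered = dict(cert, signature=(cert["signature"] + 1) % CA_N)
```

The relying party never needs the CA's private key; trusting the CA's public key is enough to check any certificate it issued.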
      Certificates: what CA to use?
The following possibilities go from the least secure to the most secure:

1. Anyone can become a certificate authority:
    http://www.onlamp.com/pub/a/onlamp/2003/02/06/linuxhacks.html
2. Free certificate authorities:
    – http://www.thawte.com/
    – http://www.verisign.com/ ...
3. Globus CA:
   http://www-fp.globus.org/gt2.4/admin/guide-verify.html#cert
   enforces some security restrictions, such as:
    – The domain of the host that gets a certificate must be the same as the requester's email domain
4. EDG CA:
   http://igc.services.cnrs.fr/Datagrid-fr/english/index.html
   EDG does not honor Globus certificates.
   EDG grants certificates only to people who are known personally to an
   authorized third person.

For each non-Globus CA, every site must configure that CA as a trusted one.
Globus is trusted by default, and EDG has RPMs for adding it as a trusted CA.
Recommendation: until Machba supplies certificates, use the Globus CA.
  Globus with Condor-G. A resource broker to be added later




                         Globus
Gurus:
       Ian Foster
       Carl Kesselman


Globus Project™
       www.globus.org


Globus is a bag of tools


Grid software projects use Globus

                  What is Globus?
• A research and development project
  that enables the application of Grid concepts
  to scientific and engineering computing.
• The Globus Toolkit lets you build Grids and develop Grid applications
• Globus Project research targets technical challenges, and the
  Globus Toolkit supplies a set of services and software
  libraries to support Grids and Grid applications:
    –   (GRAM) resource management
    –   (GSI) security
    –   (MDS) information infrastructure
    –   (GASS) data management
    –   (HBM) fault detection
    –   (Nexus and globus_io) portability + communication


   The Globus Toolkit contents
• A “bag of services”:
   components to develop grid applications + programming tools
• Components have a C application programming interface (API)
• Some components have Java classes and/or command line tools
• Prototypes of
    – higher-level components (resource brokers, co-allocators)
    – and services.
• Others use the Globus Toolkit to develop:
    – higher-level services,
    – application frameworks,
    – and scientific/engineering applications
  Example:
    – Condor-G uses Globus for its high-throughput computing
      framework
  Globus Toolkit: GRAM + GSI
• Globus Resource Allocation Manager (GRAM)
    –   Resource allocation
    –   Process creation
    –   Monitoring
    –   Management
    –   Maps requests expressed in a Resource Specification Language (RSL)
        into commands to local schedulers and computers.
• Grid Security Infrastructure (GSI)
    –   A single-sign-on, run-anywhere authentication service,
    –   local control over access rights
    –   mapping from global to local user identities.
    –   Smartcard support increases credential security.
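
The mapping from global to local user identities is typically configured in a grid-mapfile, where each line pairs a quoted certificate subject with a local account name. The lookup can be sketched as below; the subjects and account names are hypothetical, and the parser is an illustration, not Globus code.

```python
import shlex

# Hypothetical grid-mapfile content: quoted certificate subject, then local user.
GRID_MAPFILE = '''
# subject (distinguished name)            local account
"/O=Grid/OU=Example/CN=Jane Researcher" jane
"/O=Grid/OU=Example/CN=John Doe" jdoe
'''

def parse_grid_mapfile(text):
    """Return a dict mapping certificate subjects to local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, local_user = shlex.split(line)[:2]
        mapping[subject] = local_user
    return mapping

def local_account(mapping, subject):
    """Map a global identity to a local one; None means no local access."""
    return mapping.get(subject)

mapping = parse_grid_mapfile(GRID_MAPFILE)
```

Because the file lives on each resource, the site keeps local control over who is mapped to which account.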

 Globus Toolkit: MDS + GASS
• Monitoring and Discovery Service (MDS):
    – Extensible Grid information service
    – Combines data discovery mechanisms with the Lightweight
      Directory Access Protocol (LDAP).
    – Uniform framework for providing and accessing system
      configuration and status information, such as:
         • Compute server configuration
         • Network status,
         • Locations of replicated datasets
• Global Access to Secondary Storage (GASS):
    – Implements a variety of automatic and programmer-managed
      data movement and data access strategies
    – Enables remote programs to read and write local data.

Globus Toolkit: Nexus, globus_io, HBM, GPT

• Nexus and globus_io:
   communication services for heterogeneous environments:
   – multimethod communication
   – multithreading
   – single-sided operations
• The Heartbeat Monitor (HBM):
   Allows system administrators or ordinary users to
   – detect failure of system components
   – detect failure of application processes
• Globus Packaging Tool (GPT):
   – VDT's Pacman installs both Globus and Condor-G.
     Hence, it is more appropriate for Condor-G users.
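
The failure detection idea behind the Heartbeat Monitor on this slide can be sketched with generic heartbeat bookkeeping. This is not the Globus HBM API: the class, its methods and the component names are assumptions made for illustration.

```python
# Generic heartbeat bookkeeping; this is not the Globus HBM API.

class HeartbeatMonitor:
    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a component is suspect
        self.last_seen = {}      # component name -> time of last heartbeat

    def beat(self, component, now):
        """Record a heartbeat received from a component at time `now`."""
        self.last_seen[component] = now

    def failed(self, now):
        """Return the components whose last heartbeat is older than the timeout."""
        return sorted(name for name, seen in self.last_seen.items()
                      if now - seen > self.timeout)

monitor = HeartbeatMonitor(timeout=30)
monitor.beat("gatekeeper", now=0)
monitor.beat("job-manager", now=0)
monitor.beat("gatekeeper", now=40)   # only the gatekeeper keeps reporting
suspects = monitor.failed(now=50)
```

Passing the clock value in explicitly keeps the sketch deterministic; a real monitor would read the system time.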
        Layered Grid Architecture
   (By Analogy to Internet Architecture)
[Figure: Grid layers, top to bottom, with the matching Internet protocol layer]
• Application (Internet: Application)
• Collective: “Coordinating multiple resources”:
  ubiquitous infrastructure services,
  app-specific distributed services (Internet: Application)
• Resource: “Sharing single resources”:
  negotiating access, controlling use (Internet: Application)
• Connectivity: “Talking to things”: communication
  (Internet protocols) & security (Internet: Transport, Internet)
• Fabric: “Controlling things locally”: access
  to, & control of, resources (Internet: Link)

                The Globus Toolkit in One Slide
       Grid protocols (GSI, GRAM, …) enable resource sharing within
          virtual orgs; toolkit provides reference implementation
[Figure: Globus Toolkit services and their interactions]
• GSI (Grid Security Infrastructure): the user authenticates and creates a proxy credential
• GRAM (Grid Resource Allocation & Management): reliable remote invocation of the
  Gatekeeper (factory), which creates user processes #1 and #2, each with its own proxy
• MDS-2 (Monitoring/Discovery Service): processes register with a Reporter
  (registry + discovery); soft state registration and enquiry with the
  GIIS: Grid Information Index Server (discovery)
• Other GSI-authenticated remote service requests go to other services (e.g. GridFTP)
        Protocols (and APIs) are central to Globus toolkit

Globus Toolkit: missing, weak, plans 1
• (GRAM) resource management
    – Condor-G adds: reliable job submission
    – (EDG) Resource broker: choose Globus resource to submit job
    – Globus plans: support end-to-end performance management and
      fault tolerance via network scheduling, advance reservations, and
      policy-based authorization.
• (GSI) security
    Using X.509 certificates has various limitations. For example:
    – If a user does not use a pass phrase,
      anyone who gets hold of her private key can use her certificate.
• (MDS) information infrastructure
    – LDAP: too weak for frequently changing information (EDG uses
      an RDBMS instead of LDAP)



Globus Toolkit: missing, weak, plans 2
• (GASS) data management
    – Globus's current replica management capabilities are limited
    – Globus plans: provide high-performance access
      to large amounts of data (terabytes or petabytes).
• (HBM) fault detection
• (Nexus and globus_io) portability + communication
• Others
    – Weak accounting
    – “The firewall problem”: in order to submit a Globus job, some Internet
      ports must be opened. This is a security problem.
    – Weak fabric tools:
        • Installation/configuration:
              – VDT adds configurable installation via Pacman
              – EDG adds client-server installation + updating via LCFG
        • Weak monitoring tools
    – Restricted support for Windows.
     Globus support for Windows
• A port of the Globus Toolkit to the Windows XP/2000
  platform is under development/test.
• To use Grid resources from Windows systems or turn
  Windows systems into Grid resources:
    – The Java CoG Kit (http://www.globus.org/cog/) provides access
      to Grid services via the Java programming language, available
      on Windows.
    – A Java-based GRAM service is currently being developed.
    – The Condor software from the Condor Project at the University of
      Wisconsin (http://www.cs.wisc.edu/condor/) provides job
      management services that allow you to submit jobs to a local
      service that then submits your jobs to remote resources for
      execution. Condor can use Grid resources to execute these jobs.
      Condor is available for Windows.

        Globus toolkit (re)structure
[Figure: three stacks, each built on GSI]
• Compute Resource: GRAM (with job managers) and MDS
• Data Resource: GridFTP and MDS
• Other Service or Application: ???
• Mechanisms shown: service naming, reliable invocation, notification, soft state management



Lots of good mechanisms, but (with the exception of GSI) not that easily
incorporated into other systems

    Service Oriented Architecture (SOA)
New buzzwords:
• Services (in addition to protocols and APIs)
• Open Grid Services Architecture (OGSA)
• Web services
• SOAP
• XML
[Figure: the Service Provider publishes to the Service Registry; the Service Requestor finds the service in the registry and binds to the provider]
OGSA may become standard


         The Grid Service =
Interfaces/Behaviors + Service Data
[Figure: anatomy of a Grid service]
• GridService interface (required): service data query, explicit destruction, soft-state lifetime
• … other interfaces … (optional):
    – Standard: notification, service creation, service registry, authorization, manageability
    – + application-specific interfaces
• Service data elements
• Binding properties: reliable invocation, authentication
• Implementation
• Hosting environment/runtime (“C”, J2EE, .NET, …)

                             Condor-G
Guru:
       Miron Livny


Condor Project:
       http://www.cs.wisc.edu/condor

Condor-G manual:
       http://www.cs.wisc.edu/condor/manual/v6.4/5_2Condor_G.html



Condor is a scheduler, similar to PBS, LSF and others

Condor-G is the job submission part of Condor.
It adds reliable job submission to Globus.
     Condor-G as seen from Globus
• Condor-G adds reliable job submission to Globus. It lets you:
    –   submit jobs into a queue
    –   have a log detailing the life cycle of your jobs
    –   manage your input and output files
    –   along with everything else you expect from a job queuing system.
• Condor-G does more than the Globus Toolkit's globusrun command:
    –   It allows you to submit many jobs at once
    –   and then to monitor those jobs with a convenient interface
    –   receive notification when jobs complete or fail
    –   maintain your Globus credentials, which may expire while a job runs
• Condor-G is a fault-tolerant system:
  If your machine crashes, you can still perform all of these functions
  when your machine returns to life.




     Condor-G as seen from Condor
•   Condor-G is a Globus-enabled version of the Condor scheduler.
    It uses Globus to handle inter-organizational problems like:
     – Security
     – Resource management for supercomputers,
     – Executable staging.
    Hence: The same Condor tools that access local resources are now able to
    use the Globus protocols to access resources at multiple sites.
•   Condor-G manages both a queue of jobs and the resources from one or
    more sites where those jobs can execute.
    It communicates with these resources and transfers files to and from these
    resources using Globus mechanisms, such as:
     – GSI
     – GRAM protocol for job submission,
     – and a local GASS server for file transfers.
•   The relationship is mutual:
    Condor can be used to submit jobs to systems managed by Globus.
    Globus tools can be used to submit jobs to systems managed by Condor.


        How Condor-G interacts with
             Globus protocols




       Submitting a job to Condor-G:
                example 1
Run your compiled program on a different Globus resource:

•   Make sure the Condor service is running on the Condor server
     (not explained here).
•   Make sure you have your Grid credentials, and create a proxy:
    grid-proxy-init
•   To submit a job:
    condor_submit <submit-description-file-name>
•   The following sample runs a job on the Origin2000 at NCSA:
     executable = test
     globusscheduler = modi4.ncsa.uiuc.edu/jobmanager
     universe = globus
     output = test.out
     log = test.log
     queue
•   The executable for this example is transferred from the local machine to the
    remote machine.
•   By default, Condor transfers the executable,
    as well as any files specified by the input command.
•   This executable must be compiled for the correct intended platform.
        Submitting a job to Condor-G
             example 1 cont.
•   The globusscheduler command depends on the scheduling software
    available on the remote resource.
    This required command will change based on the Grid resource intended for
    execution of the job.
•   All Condor-G jobs are submitted to the globus universe. Hence:
    universe = globus
    is always required in the submit description file.
•   IO:
    No input file is specified for this example job.
    Any output (file specified by the output) or error (file specified by the error)
    is transferred from the remote machine to the local machine as it is
    produced.
    This implies that these files may be incomplete in the case where the
    executable does not finish running on the remote resource.
    The job log file is maintained on the local machine.
•   To submit this job to Condor-G for execution on the remote machine, use:
    condor_submit test.submit
    where test.submit is the name of the submit description file.
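
When many similar jobs are needed, the submit description file from example 1 can be generated by a script. The helper below is a sketch, not part of Condor: the function name and its parameters are assumptions, and only the submit commands shown on the slides are used.

```python
import os
import tempfile

def submit_description(executable, globusscheduler, output, log, extra=None):
    """Build the text of a Condor-G submit description file."""
    commands = {
        "executable": executable,
        "globusscheduler": globusscheduler,
        "universe": "globus",   # always required for Condor-G jobs
        "output": output,
        "log": log,
    }
    commands.update(extra or {})
    lines = [f"{name} = {value}" for name, value in commands.items()]
    lines.append("queue")       # queue goes on its own line, after the commands
    return "\n".join(lines) + "\n"

text = submit_description(
    executable="test",
    globusscheduler="modi4.ncsa.uiuc.edu/jobmanager",
    output="test.out",
    log="test.log",
)

# Write the file; it could then be passed to condor_submit.
path = os.path.join(tempfile.mkdtemp(), "test.submit")
with open(path, "w") as handle:
    handle.write(text)
```

The generated file has the same five commands as the sample, followed by queue.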


         Submitting a job to Condor-G
              example 1 cont.
Example output from condor_q for this submission looks like:

% condor_q
-- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
  7.0 epaulson 3/26 14:08 0+00:00:00 I 0 0.0 test
1 jobs; 1 idle, 0 running, 0 held

After a short time, Globus accepts the job.
Running condor_q again now shows:

% condor_q
 -- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi
   ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    7.0 epaulson 3/26 14:08 0+00:01:15 R 0 0.0 test
1 jobs; 0 idle, 1 running, 0 held

Then, very shortly after that, the queue will be empty again, because the job has finished:

% condor_q
 -- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi
  ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
  0 jobs; 0 idle, 0 running, 0 held
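
A script that waits for jobs to finish can read the summary line of this output. The parser below is a sketch that assumes the summary format shown in the sample output on this slide; other Condor versions may print it differently.

```python
import re

# Pattern for the summary line as it appears in the sample output above.
SUMMARY = re.compile(r"(\d+) jobs; (\d+) idle, (\d+) running, (\d+) held")

def queue_summary(condor_q_output):
    """Extract job counts from the last summary line of condor_q output."""
    matches = SUMMARY.findall(condor_q_output)
    if not matches:
        raise ValueError("no summary line found")
    total, idle, running, held = map(int, matches[-1])
    return {"jobs": total, "idle": idle, "running": running, "held": held}

sample = """\
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
 7.0 epaulson 3/26 14:08 0+00:01:15 R 0 0.0 test
1 jobs; 0 idle, 1 running, 0 held
"""
summary = queue_summary(sample)
```

A job count of zero then means the queue is empty, as in the last listing.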


       Submitting a job to Condor-G
                example 2
Run the (prestaged) Unix ls program on a different Globus resource:

    executable = /bin/ls
    Transfer_Executable = false
    globusscheduler = vulture.cs.wisc.edu/jobmanager
    universe = globus
    output = ls-test.out
    log = ls-test.log
    queue

• The executable is pre-staged. Being on the remote machine, there is
  no need to transfer it before execution.
• The required globusscheduler and universe commands are present.
• The command Transfer_Executable = FALSE identifies the executable as
  being pre-staged.
  In this case, the executable command gives the path to the executable
  on the remote machine.


        Submitting a job to Condor-G
                 example 3
    Submit a Perl script to be run as a Condor job.
    The Perl script both lists and sets environment variables for a job.
•   Save the following Perl script with the name env-test.pl, to be used as a Condor job
    executable:
          #!/usr/bin/env perl
          foreach $key (sort keys(%ENV))
          { print "$key = $ENV{$key}\n" }
          exit 0;
•   Run the Unix command chmod 755 env-test.pl to make the Perl script
    executable.
•   Create the following submit description file:
          executable = env-test.pl
          globusscheduler = biron.cs.wisc.edu/jobmanager
          universe = globus
          environment = foo=bar;zot=qux
          output = env-test.out
          log = env-test.log
          queue
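
The environment command in this submit file packs several variables into one semicolon-separated string. How that string maps to the variables the job sees can be sketched as below; the function is an illustration of the format used on this slide, not Condor's own parser.

```python
def parse_environment(value):
    """Split 'name=value;name=value' into a dict, ignoring empty entries."""
    env = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        name, _, val = entry.partition("=")
        env[name.strip()] = val.strip()
    return env

env = parse_environment("foo=bar;zot=qux")
```

Each name in the resulting dict corresponds to one variable that env-test.pl would list.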



           Submitting a job to Condor-G
                example 3 cont.
•    When the job has completed, the output file env-test.out should contain something
     like this:

GLOBUS_GRAM_JOB_CONTACT = https://biron.cs.wisc.edu:36213/30905/1020633947/
GLOBUS_GRAM_MYJOB_CONTACT = URLx-nexus://biron.cs.wisc.edu:36214
GLOBUS_LOCATION = /usr/local/globus
GLOBUS_REMOTE_IO_URL = /home/epaulson/.globus/.gass_cache/globus_gass_cache_1020633948
HOME = /home/epaulson
LANG = en_US
LOGNAME = epaulson
X509_USER_PROXY = /home/epaulson/.globus/.gass_cache/globus_gass_cache_1020633951
foo = bar
zot = qux




       Submitting a job to Condor-G
            example 3 cont.
Of particular interest is the GLOBUS_REMOTE_IO_URL environment variable:
Condor-G automatically starts up a GASS remote I/O server on the
     submitting machine.
Because of the potential for either side of the connection to fail,
     the URL for the server cannot be passed directly to the job.
Instead, it is put into a file, and the GLOBUS_REMOTE_IO_URL
     environment variable points to this file.
Remote jobs can read this file and use the URL it contains to access the
     remote GASS server running inside Condor-G.
If the location of the GASS server changes (for example, if Condor-G
     restarts), Condor-G will contact the Globus gatekeeper and update this
     file on the machine where the job is running.
It is therefore important that all accesses to the remote GASS server
     check this file for the latest location.
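
The rule above, re-read the file before every access, can be sketched as follows. Only the GLOBUS_REMOTE_IO_URL variable name comes from the slides; the function name, the temporary file and the URLs are assumptions used to simulate a Condor-G restart.

```python
import os
import tempfile

def current_gass_url(environ=os.environ):
    """Re-read the file named by GLOBUS_REMOTE_IO_URL and return the URL in it."""
    with open(environ["GLOBUS_REMOTE_IO_URL"]) as handle:
        return handle.read().strip()

# Simulation with a temporary file and made-up URLs (assumptions for the demo).
url_file = os.path.join(tempfile.mkdtemp(), "gass_url")
fake_env = {"GLOBUS_REMOTE_IO_URL": url_file}

with open(url_file, "w") as handle:
    handle.write("https://submit.example.org:20001\n")
first = current_gass_url(fake_env)

# Condor-G restarts and rewrites the file with the new server location.
with open(url_file, "w") as handle:
    handle.write("https://submit.example.org:20002\n")
second = current_gass_url(fake_env)
```

Caching the first value would leave the job talking to a server that no longer exists, which is why the function opens the file on every call.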

        Submitting a job to Condor-G
                last example
A Perl script that uses the GASS server in Condor-G to copy input files
   to the execute machine.
   (The remote job copies /etc/hosts from the submit machine and prints it.)
#!/usr/bin/env perl
use FileHandle;
use Cwd;
STDOUT->autoflush();
# Read the current GASS server URL from the file Condor-G maintains
$gassUrl = `cat $ENV{GLOBUS_REMOTE_IO_URL}`;
chomp $gassUrl;
$ENV{LD_LIBRARY_PATH} = $ENV{GLOBUS_LOCATION} . "/lib";
$urlCopy = $ENV{GLOBUS_LOCATION} . "/bin/globus-url-copy";
# globus-url-copy needs a full pathname
$pwd = getcwd();
print "$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts\n\n";
`$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts`;
# Print the copied file
open(FILE, "temporary.hosts") or die "Cannot open temporary.hosts: $!";
while (<FILE>) { print $_; }
close(FILE);
exit 0;

         Submitting a job to Condor-G
             last example Cont.
•   The submit file:
          executable = gass-example.pl
          globusscheduler = biron.cs.wisc.edu/jobmanager
          universe = globus
          output = gass.out
          log = gass.log
          queue

•   There are two optional submit description file commands of note: x509userproxy
    and globusrsl.

    1) The x509userproxy command specifies the path to an X.509 proxy, as:
     x509userproxy = /path/to/proxy

     –   If this optional command is not present in the submit description file,
         then Condor-G checks the value of the environment variable X509_USER_PROXY for the
         location of the proxy.
     –   If this environment variable is not present, then Condor-G looks for the proxy in the file
         /tmp/x509up_u0000,
         where the trailing zeros in this file name are replaced with the Unix user id.
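
The three-step search order for the proxy can be sketched as a small function. This mirrors the description on this slide and is not Condor-G source code; the function name, its parameters and the sample paths are assumptions.

```python
import os

def find_proxy(submit_commands, environ=None, uid=None):
    """Return the proxy path using the three-step search order described above."""
    environ = os.environ if environ is None else environ
    uid = os.getuid() if uid is None else uid
    if "x509userproxy" in submit_commands:      # 1. submit description file
        return submit_commands["x509userproxy"]
    if "X509_USER_PROXY" in environ:            # 2. environment variable
        return environ["X509_USER_PROXY"]
    return f"/tmp/x509up_u{uid}"                # 3. default location

from_file = find_proxy({"x509userproxy": "/path/to/proxy"}, environ={}, uid=1000)
from_env = find_proxy({}, environ={"X509_USER_PROXY": "/home/user/proxy"}, uid=1000)
default = find_proxy({}, environ={}, uid=1000)
```

With user id 1000 and neither setting present, the default resolves to /tmp/x509up_u1000.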



       Submitting a job to Condor-G
           last example Cont.

     2) The globusrsl command is used to add additional attribute
     settings to a job's RSL string, as:
         globusrsl = (name=value)(name=value)
     An example of this command in a submit description file:
        globusrsl = (project=Test_Project)
     This example's attribute name for the additional RSL is project,
     and the value assigned is Test_Project.
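
The (name=value)(name=value) form can be produced from a dictionary of attributes. The helper below is a sketch: its name is an assumption, and the queue=short pair in the second call is a hypothetical attribute value added only to show two pairs side by side.

```python
def globusrsl(attributes):
    """Format a dict of extra RSL attributes as a globusrsl submit command."""
    pairs = "".join(f"({name}={value})" for name, value in attributes.items())
    return f"globusrsl = {pairs}"

# The example from the slide.
line = globusrsl({"project": "Test_Project"})

# Two attributes; the second pair is hypothetical.
two = globusrsl({"project": "Test_Project", "queue": "short"})
```

The first call reproduces the line from the slide, globusrsl = (project=Test_Project).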




        Limitations of Condor-G
• No checkpoints.
• No matchmaking.
• File transfer is limited. There are no file transfer
  mechanisms for files other than the executable,
  stdin, stdout, and stderr.
• No job exit codes: the exit code of a job is not
  available.
• Limited platform availability. Condor-G is only
  available on Linux, Solaris, Digital UNIX, and
  IRIX. HP-UX support will hopefully be available
  later.

                           References
    Globus Project www.globus.org
• Overviews of Grid computing:
      Anatomy of the grid:
         http://www-fp.globus.org/research/papers.html#anatomy
      Physiology of the grid:
         http://www-fp.globus.org/research/papers.html#OGSA
      Older, extensive:
         The Grid: Blueprint for a New Computing Infrastructure,
         I. Foster and C. Kesselman (Eds), Morgan Kaufmann, 1999.
• Globus FAQ http://www-fp.globus.org/about/faq/general.html
• Globus installation http://www-fp.globus.org/gt2/admin/guide-verify.html
• Condor-G manual:
  http://www.cs.wisc.edu/condor/manual/v6.4/5_2Condor_G.html
• A topical school on Grid computing will be held in Vico Equense, Italy
  during the last two weeks of July, 2003.
  For details, send an email to grid-chool@ggf.org.
 Global Grid Forum www.gridforum.org
