
MPI Jobs

Special Jobs

Claudio Cherubino

INFN - Catania

Outline





• MPI jobs on gLite



• DAG



• Job Collection



• Parametric jobs








MPI Overview



• Running parallel jobs is essential for many modern computing applications.

• The most widely used library for parallel job support is MPI (Message Passing Interface).

• At present, parallel jobs can run only inside a single Computing Element (CE);
  - several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.

Requirements & settings





• For an MPI job to run, the following requirements MUST be satisfied:
  - The MPICH software must be installed, and its location included in the PATH environment variable, on every WN of the CE.
  - Some MPI applications require a filesystem shared among the WNs in order to run.
  - The variable VO_<VO name>_SW_DIR contains the name of a directory if there is a SHARED filesystem.
  - The variable VO_<VO name>_SW_DIR contains "." if there is NO SHARED filesystem.










• From the user's point of view, a job is run as MPI by setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well.

• E.g.:

    …
    JobType = "MPICH";
    NodeNumber = 4;
    …

• The NodeNumber attribute defines the number of CPUs required by the application.






• When the JobType and NodeNumber attributes are included in a JDL, the User Interface (UI) automatically adds the following expression to the JDL Requirements expression, in order to find the best resource where the job can be executed:

    (other.GlueCEInfoTotalCPUs >= NodeNumber) &&
    Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)
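
The effect of the added Requirements expression can be illustrated with a small Python sketch. This is only a model of the matchmaking condition, not the code used by the UI or the WMS; the CE records (`ce-a`, `ce-b`, `ce-c`) and the helper name `satisfies_mpi_requirements` are hypothetical and invented for illustration.

```python
def satisfies_mpi_requirements(ce, node_number):
    """Model of: (other.GlueCEInfoTotalCPUs >= NodeNumber) &&
    Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)."""
    enough_cpus = ce["GlueCEInfoTotalCPUs"] >= node_number
    has_mpich = "MPICH" in ce["GlueHostApplicationSoftwareRunTimeEnvironment"]
    return enough_cpus and has_mpich


# Hypothetical CE records (assumption: values made up for this example).
candidate_ces = [
    {"name": "ce-a", "GlueCEInfoTotalCPUs": 8,
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["MPICH", "LCG-2"]},
    {"name": "ce-b", "GlueCEInfoTotalCPUs": 2,
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["MPICH"]},
    {"name": "ce-c", "GlueCEInfoTotalCPUs": 16,
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["LCG-2"]},
]

# NodeNumber = 4, as in the JDL fragment of the example.
matching = [ce["name"] for ce in candidate_ces
            if satisfies_mpi_requirements(ce, 4)]
print(matching)
```

In this sketch only `ce-a` qualifies: `ce-b` publishes MPICH but has too few CPUs, and `ce-c` has enough CPUs but does not publish MPICH.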










MPI exercise





• Create the file "mpi-glite.jdl" inside $HOME/EXAMPLES/gLite/Other with the following content:

    [
    Type = "Job";
    JobType = "MPICH";
    Executable = "cpi";
    NodeNumber = 2;
    StdOutput = "cpi.out";
    StdError = "cpi.err";
    InputSandbox = {"cpi"};
    OutputSandbox = {"cpi.err","cpi.out"};
    RetryCount = 0;
    ]








MPI submission



[glite-tutor] /home/claudio > edg-job-submit -o id mpi-glite.jdl

Selected Virtual Organisation name (from proxy certificate extension): gilda
Connecting to host glite-rb.ct.infn.it, port 7772
Logging to host glite-rb.ct.infn.it, port 9002

========== glite-job-submit Success ======================
The job has been successfully submitted to the Network Server.
Use glite-job-status command to check job current status. Your job identifier is:

- https://glite-rb.ct.infn.it:9000/bsrbbzbcXZWSzU3iUYlm6g

The job identifier has been saved in the following file:
/home/claudio/id
==========================================================










MPI status and output





• Query the status of the job with the following command:

    [glite-tutor] /home/claudio > edg-job-status -i id
    …………………………………………….

• When the status of the job is "DONE", retrieve the output with the following command:

    [glite-tutor] /home/claudio > edg-job-get-output -i id
    ……………………………………………








MPI on the web…





• LCG-2 User Guide Manuals Series
  - https://edms.cern.ch/file/454439/LCG-2-UserGuide.pdf





• http://oscinfo.osc.edu/training/



• http://www.netlib.org/mpi/index.html



• http://www-unix.mcs.anl.gov/mpi/learning.html



• http://www.ncsa.uiuc.edu/








Workload Manager Proxy










WMProxy overview





• WMProxy (Workload Manager Proxy)
  - is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface
  - has been designed to efficiently handle a large number of requests for job submission and control to the WMS
  - has a service interface that addresses the Web Services and SOA architecture standards, in particular adhering to WS-I
  - is developed in C++, using gSOAP 2.7.6b as the SOAP stub generator






New request types



• Support for the new request types relies heavily on newly developed JDL converters and on the DAG submission support
  - all JDL conversions are performed on the server
  - a single submission covers several jobs
• All new request types can be monitored and controlled through a single handle (the request id)
  - each sub-job can, however, be followed up and controlled independently through its own id
• "Smarter" WMS client commands/API
  - allow submission of DAGs, collections and parametric jobs exploiting the concept of "shared sandbox"
  - allow automatic generation and submission of collections and DAGs from sets of JDL files located in user-specified directories on the UI






WMProxy : submission & monitoring





• To submit jobs with WMProxy, it is mandatory to delegate credentials first:

    glite-wms-job-delegate-proxy -d del_ID

• The submission and monitoring commands are slightly different, but most of the "old" options are supported:

    glite-wms-job-submit -d del_ID collection.jdl

    glite-wms-job-status jobID

    glite-wms-job-output \
        https://glite-rb.ct.infn.it:9000/LHIIGaCVdl7Olmsz0jpI_g








DAG job





• A DAG job is a set of jobs where the input, output, or execution of one or more jobs can depend on other jobs
• Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs and the edges identify the dependencies



[Figure: example DAG drawn in three rows: nodeA at the top; nodeB, nodeC and NodeF in the middle; nodeD at the bottom]




JDL structure










Attribute: Nodes










Attribute: Dependencies










DAG jdl



• The node description could also be done here, instead of using separate files.

    [
      type = "dag";
      max_nodes_running = 4;
      nodes = [
        nodeA = [
          file ="nodes/nodeA.jdl" ;
        ];
        nodeB = [
          file ="nodes/nodeB.jdl" ;
        ];
        nodeC = [
          file ="nodes/nodeC.jdl" ;
        ];
        nodeD = [
          file ="nodes/nodeD.jdl";
        ];
        dependencies = {
          {nodeA, nodeB},
          {nodeA, nodeC},
          { {nodeB,nodeC}, nodeD }
        }
      ];
    ]
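
The ordering implied by the `dependencies` attribute can be worked out with a short Python sketch. It assumes that each pair reads as "the first element must finish before the second can start", and that a nested set such as `{nodeB,nodeC}` means all listed nodes. The function name `execution_waves` is hypothetical, and the sketch only derives the ordering constraints; it does not model how the WMS actually schedules nodes or applies `max_nodes_running`.

```python
def execution_waves(nodes, dependencies):
    """Group nodes into waves: a node is ready once all its parents are done."""
    parents = {node: set() for node in nodes}
    for before, after in dependencies:
        # Either side of a pair may be a single node or a group of nodes.
        befores = before if isinstance(before, (list, tuple)) else [before]
        afters = after if isinstance(after, (list, tuple)) else [after]
        for child in afters:
            parents[child].update(befores)

    done, waves = set(), []
    while len(done) < len(nodes):
        ready = sorted(n for n in nodes
                       if n not in done and parents[n] <= done)
        if not ready:
            raise ValueError("dependency cycle: the graph is not a DAG")
        waves.append(ready)
        done.update(ready)
    return waves


# Nodes and dependencies transcribed from the DAG JDL example.
nodes = ["nodeA", "nodeB", "nodeC", "nodeD"]
dependencies = [
    ("nodeA", "nodeB"),
    ("nodeA", "nodeC"),
    (("nodeB", "nodeC"), "nodeD"),
]
waves = execution_waves(nodes, dependencies)
print(waves)  # [['nodeA'], ['nodeB', 'nodeC'], ['nodeD']]
```

Under these assumptions nodeA runs first, nodeB and nodeC may then run in parallel, and nodeD starts only after both have finished.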




Job Collection



• A job collection is a set of independent jobs that the user wants to submit and monitor via a single request
• Jobs of a collection are submitted as DAG nodes without dependencies
• The JDL contains a list of classads, which describe the sub-jobs

    [
      Type = "collection";
      VirtualOrganisation = "gilda";
      nodes = {
        [ ],
        [ ],

      };
    ]








'Scattered' Input Sandboxes

• The Input Sandbox can contain
  - file paths on the UI machine (i.e. the usual way)
  - URIs pointing to files on a remote gridFTP/HTTPS server

    InputSandbox = {
      "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
      "https://ghemon.cnaf.infn.it:8443/data/idat_1",
      "file:///home/pacio/myconf" };

• A base URI to be applied to all sandbox files can also be specified

    InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";

• Only local files (file://) are uploaded to the WMS node
• Files pointed to by URIs are downloaded directly to the WN by the JobWrapper just before the job is started
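
The split between files uploaded to the WMS node and files fetched on the WN can be sketched in Python as follows. This is only an illustration of the rule stated above, not the WMS implementation; the function name `split_sandbox` is hypothetical, and treating an entry without a URI scheme as a local UI path is an assumption based on "the usual way" of listing files.

```python
from urllib.parse import urlparse


def split_sandbox(entries):
    """Separate sandbox entries into local files and remote URIs."""
    uploaded_to_wms = []   # local files: plain paths or file:// URIs
    fetched_on_wn = []     # remote URIs, e.g. gsiftp:// or https://
    for entry in entries:
        scheme = urlparse(entry).scheme
        if scheme in ("", "file"):
            uploaded_to_wms.append(entry)
        else:
            fetched_on_wn.append(entry)
    return uploaded_to_wms, fetched_on_wn


# Entries transcribed from the InputSandbox example.
input_sandbox = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "https://ghemon.cnaf.infn.it:8443/data/idat_1",
    "file:///home/pacio/myconf",
]
local, remote = split_sandbox(input_sandbox)
print("uploaded to the WMS node:", local)
print("downloaded on the WN:", remote)
```

With the example sandbox, only `file:///home/pacio/myconf` ends up in the first list; the gsiftp and https entries are left for download on the WN.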








Job collection example



• All nodes will share the Input Sandbox declared at the top level of the collection.

    [
      type = "collection";
      InputSandbox = {"date.sh"};
      RetryCount = 0;
      nodes = {
        [
          file ="jobs/job1.jdl" ;
        ],
        [
          [
            Executable = "/bin/sh";
            Arguments = "date.sh";
            Stdoutput = "date.out";
            StdError = "date.err";
            OutputSandbox ={"date.out", "date.err"};
          ]
        ],
        [
          file ="jobs/job3.jdl" ;
        ]
      };
    ]




Parametric Job



• A parametric job is a job where one or more of its attributes are

parameterized

• Values of attributes vary according to a parameter

[

JobType = "Parametric";

Executable = "/bin/sh";

Arguments = "md5.sh input_PARAM_.txt";

InputSandbox = {"md5.sh", "input_PARAM_.txt"};

StdOutput = "out_PARAM_.txt";

StdError = "err_PARAM_.txt";

Parameters = 4;

ParameterStart = 1;

ParameterStep = 1;

OutputSandbox = {"out_PARAM_.txt",

"err_PARAM_.txt"};

]



• Job monitoring and management are always done through a single jobID, as if it were a single job (see the submission of collections)










Parametric Job / 2



• Parameters can also be a list of strings
• The InputSandbox (if present) must be consistent with the parameters

    [ui-test] /home/giorgio/param > cat param2.jdl
    [
    JobType = "Parametric";
    Executable = "/bin/cat";
    Arguments = "input_PARAM_.txt";
    InputSandbox = "input_PARAM_.txt";
    StdOutput = "myoutput_PARAM_.txt";
    StdError = "myerror_PARAM_.txt";
    Parameters = {earth,moon,mars};
    OutputSandbox = {"myoutput_PARAM_.txt"};
    ]

    [ui-test] /home/giorgio/param > ls
    inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl
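
The `_PARAM_` substitution used by parametric jobs can be sketched in Python. This is a simplified model, not the server-side JDL converter: the function name `expand_parametric` is hypothetical, and for the integer form the sketch ASSUMES that values run from ParameterStart up to, but not including, Parameters in steps of ParameterStep. That boundary behaviour should be checked against the JDL attributes specification.

```python
def expand_parametric(template, parameters, start=0, step=1):
    """Return one attribute dict per parameter value.

    `start` and `step` are only used when `parameters` is an integer;
    their defaults are choices made for this sketch.
    """
    if isinstance(parameters, int):
        # ASSUMPTION: half-open range [start, parameters).
        values = range(start, parameters, step)
    else:
        values = parameters  # list of strings, used as given

    jobs = []
    for value in values:
        job = {}
        for key, attr in template.items():
            if isinstance(attr, str):
                job[key] = attr.replace("_PARAM_", str(value))
            elif isinstance(attr, list):
                job[key] = [a.replace("_PARAM_", str(value)) for a in attr]
            else:
                job[key] = attr
        jobs.append(job)
    return jobs


# List-of-strings form, transcribed from param2.jdl.
template = {
    "Executable": "/bin/cat",
    "Arguments": "input_PARAM_.txt",
    "StdOutput": "myoutput_PARAM_.txt",
    "OutputSandbox": ["myoutput_PARAM_.txt"],
}
string_jobs = expand_parametric(template, ["earth", "moon", "mars"])
print([job["Arguments"] for job in string_jobs])

# Integer form: Parameters = 4, ParameterStart = 1, ParameterStep = 1.
int_jobs = expand_parametric({"Arguments": "md5.sh input_PARAM_.txt"},
                             4, start=1, step=1)
print([job["Arguments"] for job in int_jobs])
```

Note that a plain textual substitution turns `input_PARAM_.txt` into `inputearth.txt`, so the file names in the sandbox have to match the parameter values exactly, including case.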








References





• JDL attributes specification for WMProxy
  - https://edms.cern.ch/document/590869/1

• WMProxy quickstart
  - http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/wmproxy_client_quickstart.shtml

• WMS user guides
  - https://edms.cern.ch/document/572489/1










Questions…








