DataGrid

JDL ATTRIBUTES









Document identifier: DataGrid-01-TEN-0142-0_1



Date: 18/01/2012



Work package: WP1



Partner: Datamat SpA







Document status







Deliverable identifier:









Abstract: This note provides the description of JDL attributes supported by the EDG WMS

release-2 software.









IST-2000-25182 PUBLIC 1 / 31










Delivery Slip

             Name             Partner      Date        Signature
From         Fabrizio Pacini  Datamat SpA  04/09/2003
Verified by  Stefano Beco     Datamat SpA  04/09/2003
Approved by









Document Log

Issue  Date        Comment      Author
0_0    16/06/2003  First issue  Fabrizio Pacini
0_1    04/09/2003               Fabrizio Pacini


















Document Change Record

Issue  Item            Reason for Change
0_1    General Update  - Update Std/In/Out/Err attributes definition
                       - Added JDL examples









Files

Software Products     User files
Word 2000             DataGrid-01-TEN-0142-0_1.doc
Acrobat Exchange 5.0  DataGrid-01-TEN-0142-0_1.pdf
















CONTENT

1. INTRODUCTION
1.1. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS
1.2. DOCUMENT EVOLUTION PROCEDURE
1.3. TERMINOLOGY
2. ATTRIBUTES DESCRIPTION
2.1. TYPE
2.2. JOBTYPE
2.3. EXECUTABLE
2.4. ARGUMENTS
2.5. STDINPUT
2.6. STDOUTPUT
2.7. STDERROR
2.8. INPUTSANDBOX
2.9. OUTPUTSANDBOX
2.10. ENVIRONMENT
2.11. INPUTDATA
2.12. DATAACCESSPROTOCOL
2.13. OUTPUTSE
2.14. OUTPUTDATA
2.14.1. OutputFile
2.14.2. StorageElement
2.14.3. LogicalFileName
2.15. VIRTUALORGANISATION
2.16. RETRYCOUNT
2.17. MYPROXYSERVER
2.18. HLRLOCATION
2.19. NODENUMBER
2.20. JOBSTEPS
2.21. CURRENTSTEP
2.22. LISTENERPORT
2.23. REQUIREMENTS
2.24. RANK
2.25. FUZZYRANK
3. SPECIAL JDL EXPRESSIONS
3.1. GANG-MATCHING
3.2. GETACCESSCOST FUNCTION
4. JDL EXAMPLES
4.1. JOB WITHOUT DATA REQUIREMENTS
4.2. JOB WITH OUTPUT DATA
4.3. JOB WITH INPUT AND OUTPUT DATA
4.4. INTERACTIVE JOB
4.5. CHECKPOINTABLE JOB
4.6. MPI JOB


















1. INTRODUCTION

This document describes all JDL attributes supported by release 2 of the EDG WMS and serves as a guide for users building job descriptions.





1.1. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS

Applicable documents

[A1] Definition of the architecture, technical plan and evaluation criteria for the resource

co-allocation framework and mechanisms for parallel job partitioning

(http://www.infn.it/workload-grid/docs/DataGrid-01-D1.4-0127-1_0.{doc, pdf})





[A2] DataGrid Accounting System - Architecture v 1.0

(http://www.infn.it/workload-grid/docs/DataGrid-01-TED-0126-1_0.pdf)

[A3] The Glue CE Schema

(http://www.cnaf.infn.it/~sergio/datatag/glue/v11/CE/index.htm)

[A4] Job Description Language HowTo – DataGrid-01-TEN-0102-02 – 17/12/2001

(http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-0_2.pdf)







Reference documents

[R1] The Resource Broker Info file – DataGrid-01-TEN-0135-0_0

(http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0135-0_0.{doc,pdf})

[R2] LB-API Reference Document – DataGrid-01-TED-0139-0_0

(http://lindir.ics.muni.cz/dg_public/lb_api.pdf)

[R3] Job Partitioning and Checkpointing – DataGrid-01-TED-0119-0_3

(https://edms.cern.ch/file/347730/1/DataGrid-01-TED-0119-0_3.pdf)

[R4] "Gang-Matching in EDG WMS" - DataGrid-01-TEN-014X-0_0

(To be issued)

[R5] Design of a Replica Optimisation Framework

(https://edms.cern.ch/file/337977/1.7.2/wp2_replicaopt_api.ps)


















1.2. DOCUMENT EVOLUTION PROCEDURE

The content of this document is subject to modification in response to the following events:
- Comments received from DataGrid project members,
- Changes, evolutions or additions to the JDL.





1.3. TERMINOLOGY

Definitions

Condor   Condor is a High Throughput Computing (HTC) environment that can manage very large collections of distributively owned workstations.
Globus   The Globus Toolkit is a set of software tools and libraries aimed at the building of computational grids and grid-based applications.



Glossary

class-ad Classified advertisement

CE Computing Element

CLI Command Line Interface

DGAS Datagrid Grid Accounting Service

EDG European DataGrid

FQDN Fully Qualified Domain Name

GIS Grid Information Service, aka MDS

GSI Grid Security Infrastructure

GUI Graphical User Interface

HLR Home Location Register

IS Information Service

job-ad Class-ad describing a job

JA Job Adapter

JC Job Controller

JDL Job Description Language

LB Logging and Bookkeeping Service

LM Log Monitor

LRMS Local Resource Management System

MDS Metacomputing Directory Service, aka GIS

MPI Message Passing Interface

NS Network Server

OS Operating System

PA Price Authority












PID Process Identifier



PM Project Month

RB Resource Broker

SE Storage Element

SI00 Spec Int 2000

SMP Symmetric Multi Processor

TBC To Be Confirmed

TBD To Be Defined

UI User Interface

VO Virtual Organisation

WM Workload Manager

WMS Workload Management System

WP Work Package


















2. ATTRIBUTES DESCRIPTION

The JDL is a fully extensible language, so the user may use any attribute in the description of a job without incurring errors from the JDL parser. However, only a certain set of attributes, referred to as "supported attributes" from now on, is taken into account by the Workload Management System components in order to schedule and submit a job.

JDL attributes represent job-specific information and specify actions that the WMS has to perform to schedule the job.

Some of these attributes are provided by the user when editing the job description file, while others (needed by the underlying WMS components) are automatically inserted by the UI before the job is submitted.

A subset of the attributes inserted by the user is mandatory, i.e. necessary for the WMS to handle the job appropriately. Mandatory attributes fall into two categories:
- Mandatory: the job cannot be submitted if these attributes are missing.
- Mandatory with default value: the UI provides a default value for these attributes if they are missing from the job description.

The following sections provide the complete list of the job JDL attributes supported by the WMS, together with the format and rules to follow when adding them to the job description.

It is worth recalling that the requirements and rank expressions (see 2.23 and 2.24), which are evaluated by the RB during the matchmaking process, can include attributes describing the CEs in the IS (attributes prefixed with "other."), which are reported and described in [A3].

Before starting with the detailed attribute description, recall that a job description is composed of entries having the format attribute = expression, each terminated by a semicolon. The whole description has to be enclosed in square brackets, i.e. [ ]. The terminating semicolon is optional for the last attribute before the closing square bracket ].

Attribute expressions can span several lines, provided the semicolon is put only at the end of the whole expression. Comments must be preceded by a sharp character (#) or follow the C++ syntax, i.e. a double slash (//) at the beginning of each line, or a block delimited by "/*" and "*/".
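The syntax rules above can be illustrated with a small Python sketch that renders a dictionary as a job description. It is only an illustration under stated assumptions: the helper names (Expr, format_value, to_jdl) are hypothetical and are not part of the EDG WMS software, and only strings, numbers, Booleans and lists are handled.

```python
class Expr(str):
    """Marks a string as a raw ClassAd expression that must not be quoted."""


def format_value(value):
    """Render a Python value following the JDL syntax rules described above."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(format_value(v) for v in value) + "}"
    # Plain strings are double-quoted; embedded quotes get a backslash.
    return '"' + str(value).replace('"', '\\"') + '"'


def to_jdl(attributes):
    """Build the description: one 'attribute = expression;' entry per line, inside [ ]."""
    lines = ["["]
    for name, value in attributes.items():
        lines.append(f"  {name} = {format_value(value)};")
    lines.append("]")
    return "\n".join(lines)


# Illustrative job; the values are taken from examples used in this document.
job = {
    "Type": "Job",
    "Executable": "cms_sim.exe",
    "InputSandbox": ["/home/edguser/sim/cms_sim.exe"],
    "RetryCount": 3,
    "Requirements": Expr('other.GlueCEStateStatus == "Production"'),
}
jdl_text = to_jdl(job)
print(jdl_text)
```

Every entry is written with its terminating semicolon, including the last one, which the rules above allow but do not require.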


















2.1. TYPE

This is a string representing the type of the request described by the JDL, e.g.

Type = "Job";

Possible values are:
- Job
- DAG (not supported in rel 2.0)
- Reservation (not supported in rel 2.0)
- Co-allocation (not supported in rel 2.0)

For release 2.0 only "Job" is supported as request type. The value of this attribute is case insensitive.
- Mandatory: Yes
- Default: "Job"





2.2. JOBTYPE

This is a string or a list of strings representing the type of the job described by the JDL, e.g.:

JobType = "Interactive";

or

JobType = {"Checkpointable", "MPICH"};

Possible values are:
- Normal
- Interactive
- Checkpointable
- MPICH
- Partitionable (not supported in rel 2.0)
- Checkpointable, Interactive
- Checkpointable, MPICH

This attribute only makes sense when the Type attribute equals "Job". The value of this attribute is case insensitive.
- Mandatory: Yes
- Default: "Normal"
















2.3. EXECUTABLE

This is a string representing the executable/command name.
The user can specify an executable that already resides on the remote CE; in this case the absolute path of the file, possibly including environment variables, should be specified, e.g.:

Executable = "/usr/local/java/j2sdk1.4.0_01/bin/java";

The other possibility is to provide a local executable, which will be staged from the UI node to the Computing Element WN. In this case only the file name has to be specified as the executable. The absolute path of the executable on the local file system should then be listed in the InputSandbox attribute expression so that it is transferred, e.g.:

Executable = "cms_sim.exe";
InputSandbox = {"/home/edguser/sim/cms_sim.exe", ……… };

Note that if the job needs command line arguments for its execution, they have to be specified through the Arguments attribute.
- Mandatory: Yes
- Default: No





2.4. ARGUMENTS

This is a string containing all the job command line arguments.
E.g. an executable sum that has to be started as:

$ sum N1 N2 -out result.out

is described by:

Executable = "sum";
Arguments = "N1 N2 -out result.out";

If you want to specify a quoted string inside the Arguments then you have to escape the quotes with the \ character. E.g. when describing a job like:

$ grep -i "my name" *.txt

you will have to specify:

Executable = "/bin/grep";
Arguments = "-i \"my name\" *.txt";

Similarly, if the job takes as argument a string containing a special character (e.g. the job is the tail command issued on a file whose name contains the ampersand character, say file1&file2), since on the shell line you would have to write:

$ tail -f file1\&file2

in the JDL you will have to write:

Executable = "/usr/bin/tail";
Arguments = "-f file1\\\&file2";

i.e. a \ for each special character.
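The two escaping examples above can be reproduced with the following Python sketch. It is a reading of those examples, not a statement of the JDL parser rules: the function name and the set of characters treated as special are assumptions made for illustration.

```python
# Characters treated as shell-special in this sketch; the set is an assumption.
SPECIAL_CHARS = set("&|<>")


def shell_args_to_jdl(shell_args):
    """Turn an argument string, written as on the shell line, into a JDL string."""
    out = []
    for ch in shell_args:
        if ch == '"' or ch == "\\" or ch in SPECIAL_CHARS:
            out.append("\\")  # one extra backslash in front of the character
        out.append(ch)
    return '"' + "".join(out) + '"'


# The grep example: the quotes around "my name" are escaped.
print("Arguments = " + shell_args_to_jdl('-i "my name" *.txt') + ";")
# The tail example: the shell form file1\&file2 ends up with three backslashes.
print("Arguments = " + shell_args_to_jdl("-f file1\\&file2") + ";")
```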

In general, special characters such as &, |, > [...]

[...] .out") with the results of the operation that is put automatically in the OutputSandbox attribute list by the UI and can then be retrieved by the user.
- Mandatory: No
- Default: No









2.15. VIRTUALORGANISATION

This is a string representing the name of the VO the submitting user is currently working for. If the edg-job-submit and edg-job-list-match commands are issued with the --vo option, the value of this attribute is overwritten with the VO name specified on the command line. An example for this attribute follows:

VirtualOrganisation = "atlas";

The value of this attribute is case insensitive.
- Mandatory: Yes
- Default: The value of the parameter DefaultVo in the UI configuration file $EDG_WL_LOCATION/etc/edg_wl_ui_cmd_var.conf, if any. No default otherwise.









2.16. RETRYCOUNT

This is an integer representing the maximum number of job re-submissions to be performed in case of failure due to some grid component (i.e. not to the job itself).
RetryCount has to be a number equal to or greater than 0. The actual number of submission retries for a job is the minimum between RetryCount itself and the value of the MaxRetryCount parameter in the WM configuration file (the default for MaxRetryCount is 10).
An example for this attribute follows:

RetryCount = 3;

For example, to disable the job re-submission mechanism it suffices to specify:

RetryCount = 0;
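The rule for the actual number of retries can be written as a short Python sketch. The function name is hypothetical; only the minimum rule and the default of 10 for MaxRetryCount come from the text above.

```python
DEFAULT_MAX_RETRY_COUNT = 10  # default of MaxRetryCount in the WM configuration


def effective_retries(retry_count, max_retry_count=DEFAULT_MAX_RETRY_COUNT):
    """Return the number of re-submissions actually attempted for a job."""
    if retry_count < 0:
        raise ValueError("RetryCount has to be equal to or greater than 0")
    # The WM never retries more often than MaxRetryCount allows.
    return min(retry_count, max_retry_count)


print(effective_retries(3))   # RetryCount below the WM limit
print(effective_retries(25))  # RetryCount capped by MaxRetryCount
```

With the default WM configuration, a RetryCount of 25 is therefore capped at 10.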





- Mandatory: No
- Default: The value of the parameter RetryCount in the UI configuration file $EDG_WL_LOCATION/etc/edg_wl_ui_cmd_var.conf, if any. No default otherwise.









2.17. MYPROXYSERVER

This is a string representing the address (host name) of the MyProxy server where the user has registered her/his long-term proxy certificate.
To perform this registration by means of the myproxy-init command, the user has to specify, either through the -s option or the MYPROXY_SERVER environment variable, the host name of the MyProxy server where the proxy certificate is to be stored. The same host name should be specified as the value of the MyProxyServer attribute in the JDL.
The presence of this attribute in the JDL triggers the WMS proxy renewal mechanism, which is very useful when submitting long-running jobs: it prevents a job from failing because it outlived the validity of the initial proxy used for the submission.
Proxy renewal can be enabled by default through the UI configuration by adding the MyProxyServer parameter to the $EDG_WL_LOCATION/etc//edg_wl_ui.conf file. If present, it makes the UI automatically add the MyProxyServer JDL attribute to the job description (if not specified by the user).

An example of the JDL setting is provided hereafter:

MyProxyServer = "skurut.cesnet.cz";

Note that the port number must not be provided.
- Mandatory: No
- Default: The value of the parameter MyProxyServer in the UI configuration file $EDG_WL_LOCATION/etc//edg_wl_ui.conf, if any. No default otherwise.





2.18. HLRLOCATION

This is a string representing the Home Location Register address in the format

<host>:<port>:<X.509 certificate subject>

The HLR is the service responsible for managing the economic transactions and the accounts of users and resources. The presence of the HLRLocation attribute in the JDL enables accounting in the WMS, i.e.:
- after the job submission the UI contacts the user's HLR and authorizes the payment of that job;
- on the CE, while the job runs, a sensor monitors the resource usage, and when the job is done those data (usage records) are sent to the HLR;
- the HLR computes the job cost according to the usage records and to the resource price, and then debits the user account.
WMS accounting can be enabled by default through the UI configuration by adding the HLRLocation parameter to the $EDG_WL_LOCATION/etc//edg_wl_ui.conf file. If present, it makes the UI automatically add the HLRLocation JDL attribute to the job description (if not specified by the user).
An example of the JDL setting is provided hereafter:

HLRLocation = "lilith.to.infn.it:56568:/O=CESNET/O=Masaryk University/CN=Miroslav Ruda";

- Mandatory: No
- Default: The value of the parameter HLRLocation in the UI configuration file $EDG_WL_LOCATION/etc//edg_wl_ui.conf, if any. No default otherwise.
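As an illustration of the HLRLocation value, the following Python sketch splits the example string into its three colon-separated fields. The field names (host, port, subject) and the function name are assumptions derived from the example value, not names defined by the WMS.

```python
from typing import NamedTuple


class HLRLocation(NamedTuple):
    host: str
    port: int
    subject: str


def parse_hlr_location(value):
    """Split an HLRLocation string into host, port and the remaining third field."""
    # Split on the first two colons only, so colons in the third field are kept.
    parts = value.split(":", 2)
    if len(parts) != 3:
        raise ValueError("expected three colon-separated fields")
    host, port, subject = parts
    return HLRLocation(host, int(port), subject)


hlr = parse_hlr_location(
    "lilith.to.infn.it:56568:/O=CESNET/O=Masaryk University/CN=Miroslav Ruda"
)
print(hlr.host, hlr.port, hlr.subject)
```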









2.19. NODENUMBER

This is an integer greater than 1 specifying the number of nodes needed for an MPI job. This attribute is only allowed if the job type is MPICH (see 2.2). An example of the JDL setting is provided hereafter:

NodeNumber = 5;

The RB uses this attribute during the matchmaking to select those CEs having a number of CPUs equal to or greater than the one specified in NodeNumber.
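The selection performed on NodeNumber can be sketched in Python as a simple filter. The CE records are hypothetical dictionaries with made-up "id" and "cpus" keys; the real RB works on the CE classads built from the IS.

```python
def ces_with_enough_cpus(ces, node_number):
    """Keep the CEs whose CPU count is equal to or greater than NodeNumber."""
    if node_number <= 1:
        raise ValueError("NodeNumber has to be an integer greater than 1")
    return [ce for ce in ces if ce["cpus"] >= node_number]


# Hypothetical CEs, for illustration only.
candidate_ces = [
    {"id": "ce-a", "cpus": 4},
    {"id": "ce-b", "cpus": 5},
    {"id": "ce-c", "cpus": 16},
]
selected = ces_with_enough_cpus(candidate_ces, node_number=5)
print([ce["id"] for ce in selected])
```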





- Mandatory: Yes (if the job type is MPICH)
- Default: No









2.20. JOBSTEPS

This can be either an integer representing the number of steps for a checkpointable job, e.g.:

JobSteps = 100000;

or a list of strings representing labels associated with the steps of a checkpointable job, e.g.:

JobSteps = {"rawdata", "d0", "d1", "d2", "gomos"};

As described in [R3], a checkpointable application can be seen as composed of a set of sequential steps, where for example a step can represent the processing of a file, the analysis of an HEP event, etc. The various steps can be represented by a main stepper set of iterations, and it is usually worth saving the state of the job after each step.
The content of the main stepper can be defined through the JDL attribute JobSteps; see [R3] for details.
This attribute can only be set for checkpointable jobs.
- Mandatory: No
- Default: No


















2.21. CURRENTSTEP

This is an integer equal to or greater than 0 indicating the step number to be taken as the initial one when submitting a checkpointable job (see [R3] for details). If JobSteps is a list of labels, then CurrentStep indicates the position of the label in the list, counting from 0. E.g.

CurrentStep = 2;

for the second example of section 2.20 would indicate the step labelled "d1" as the first step.
This attribute can only be set for checkpointable jobs. If not provided by the user, it is set to 0 automatically by the UI.
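The relation between JobSteps and CurrentStep can be sketched in Python as follows. The function name is hypothetical, and the zero-based position is inferred from the example above, where the value 2 selects "d1".

```python
def initial_step(job_steps, current_step=0):
    """Return the step a checkpointable job starts from."""
    if current_step < 0:
        raise ValueError("CurrentStep has to be equal to or greater than 0")
    if isinstance(job_steps, int):
        # JobSteps given as a number of steps: CurrentStep is the step number.
        return current_step
    # JobSteps given as a list of labels: CurrentStep is a zero-based position.
    return job_steps[current_step]


labels = ["rawdata", "d0", "d1", "d2", "gomos"]
print(initial_step(labels, 2))
print(initial_step(100000))
```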





- Mandatory: Yes (for checkpointable jobs)
- Default: 0









2.22. LISTENERPORT

This is an integer (>0) that represents the port on which the Condor grid_console_shadow process started by the UI listens for the job standard streams. E.g.:

ListenerPort = 44000;

This attribute can only be included in the JDL for interactive jobs (see 2.2).
If this attribute is not included in the JDL, the listener port is the one automatically assigned by the OS (suggested choice).
- Mandatory: No
- Default: No (the port is assigned dynamically by the OS)









2.23. REQUIREMENTS

This is a Boolean ClassAd expression that uses C-like operators. It represents job requirements on resources. The Requirements expression can contain attributes that describe the CE in the IS and are hence prefixed with "other.". All these attributes are reported in the Glue Schema for the CE (see [A3]).
To have a job scheduled to run on a given CE, the Requirements expression must evaluate to true on that CE. The evaluation of this expression is performed by the RB during the matchmaking phase. An example of a requirements expression follows:

Requirements = other.GlueCEInfoLRMSType == "PBS" &&
               other.GlueCEInfoTotalCPUs > 2 &&
               Member("IDL1.7", other.GlueHostApplicationSoftwareRunTimeEnvironment);

The above expression requires a CE whose local resource manager is PBS, having more than 2 CPUs and the IDL software version 1.7 already installed.

The Requirements attribute is mandatory in the JDL. If not specified by the user, the requirements expression is automatically set by the UI to:

Requirements = other.GlueCEStateStatus == "Production";

This default value is configurable by means of the $EDG_WL_LOCATION/etc/edg_wl_ui_cmd_var.conf configuration file.
If the user has instead provided an expression for the Requirements attribute in the JDL, the one specified in the configuration file is combined with it through a logical AND. E.g. if in the JDL file the user has specified:

Requirements = other.GlueCEInfoLRMSType == "PBS";

then the job description that is passed to the NS contains:

Requirements = (other.GlueCEInfoLRMSType == "PBS") &&
               (other.GlueCEStateStatus == "Production");

Setting the Requirements default to TRUE in the configuration file would therefore result in:

Requirements = (other.GlueCEInfoLRMSType == "PBS") && TRUE;

which has no impact on the evaluation of the job requirements.
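The way the configured default is combined with a user expression can be sketched in Python. This is a minimal illustration working on expression strings; the function name is hypothetical and the UI may format the resulting expression differently.

```python
DEFAULT_REQUIREMENTS = 'other.GlueCEStateStatus == "Production"'


def final_requirements(user_expr=None, default_expr=DEFAULT_REQUIREMENTS):
    """Return the Requirements expression passed on with the job description."""
    if not user_expr:
        # No user expression: the configured default is used as is.
        return default_expr
    # Otherwise the default is ANDed to the user expression.
    return f"({user_expr}) && ({default_expr})"


merged = final_requirements('other.GlueCEInfoLRMSType == "PBS"')
print("Requirements = " + merged + ";")
```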





- Mandatory: Yes
- Default: other.GlueCEStateStatus == "Production"









2.24. RANK

This is a ClassAd floating-point expression that states how to rank the CEs that have already met the Requirements expression. Essentially, rank expresses a preference: a higher numeric value means a better rank, and the RB assigns the job to the CE with the highest rank. The Rank expression can contain attributes that describe the CE in the IS and are hence prefixed with "other.". All these attributes are reported in the Glue Schema for the CE (see [A3]).
The evaluation of the rank expression is performed by the RB during the matchmaking phase. An example of a rank expression follows:

Rank = other.GlueCEPolicyMaxRunningJobs - other.GlueCEStateRunningJobs;

With the above Rank, the preferred CEs are the ones having the greatest number of free slots for running jobs.
The Rank attribute is mandatory in the JDL. If not specified by the user, the ranking expression is automatically set by the UI to:

Rank = other.GlueCEStateFreeCPUs;

for MPICH jobs, and to:

Rank = - other.GlueCEStateEstimatedResponseTime;

for all other job types. This default value is configurable by means of the $EDG_WL_LOCATION/etc/edg_wl_ui_cmd_var.conf configuration file.
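The ranking rule (the CE with the highest rank value wins) can be sketched in Python using the free-slots example above. The CE records are hypothetical dictionaries whose keys reuse the Glue attribute names; the "id" key and the function names are made up for illustration.

```python
def free_slots(ce):
    """Rank from the example: MaxRunningJobs minus RunningJobs."""
    return ce["GlueCEPolicyMaxRunningJobs"] - ce["GlueCEStateRunningJobs"]


def best_ce(ces, rank=free_slots):
    """Return the CE with the highest rank value."""
    return max(ces, key=rank)


# Hypothetical CEs that already satisfy the Requirements expression.
matching_ces = [
    {"id": "ce-a", "GlueCEPolicyMaxRunningJobs": 20, "GlueCEStateRunningJobs": 18},
    {"id": "ce-b", "GlueCEPolicyMaxRunningJobs": 10, "GlueCEStateRunningJobs": 1},
]
print(best_ce(matching_ces)["id"])
```

Here ce-b is preferred because it has 9 free slots against 2 for ce-a, even though ce-a is the larger resource.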





- Mandatory: Yes
- Default: other.GlueCEStateFreeCPUs (for MPICH jobs); -other.GlueCEStateEstimatedResponseTime (for other job types)









2.25. FUZZYRANK

This is a Boolean attribute that enables fuzziness in the ranking computation. In other words, if this attribute is set to TRUE, the matchmaking algorithm adopts a stochastic selection criterion while searching for the best matching CE. E.g. when specifying:

FuzzyRank = true;

in the submitted JDL, the rank value associated with each matching CE represents the probability that the CE is selected as the best matching one: the higher the rank value, the higher the probability of being selected.
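A stochastic selection of this kind can be sketched in Python as a weighted random choice. This is not the RB algorithm, which is not described here: the sketch assumes non-negative rank values, selects with probability proportional to the rank, and uses a hypothetical function name.

```python
import random


def fuzzy_select(ranked_ces, rng=None):
    """Pick one CE at random, with probability proportional to its rank value."""
    rng = rng or random.Random()
    ces = [ce for ce, _ in ranked_ces]
    weights = [rank for _, rank in ranked_ces]
    if any(w < 0 for w in weights):
        raise ValueError("this sketch assumes non-negative rank values")
    if sum(weights) == 0:
        # All ranks are zero: fall back to a uniform choice.
        return rng.choice(ces)
    return rng.choices(ces, weights=weights, k=1)[0]


# Hypothetical (CE id, rank) pairs: ce-b is three times as likely as ce-a.
ranked = [("ce-a", 1.0), ("ce-b", 3.0)]
print(fuzzy_select(ranked, random.Random(42)))
```

Unlike the deterministic case, repeated submissions of the same job may land on different CEs.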





- Mandatory: No
- Default: FALSE


















3. SPECIAL JDL EXPRESSIONS

The next sections briefly describe how the resource discovery and selection process can be driven by means of special expressions for the Requirements and Rank attributes.



3.1. GANG-MATCHING

Matchmaking mainly occurs as a two-step process. In step 1, the entities requiring matchmaking services (i.e. servers and customers) express their characteristics, requirements and preferences to a matchmaker in classified advertisements. In step 2, the matchmaker employs a very generic matchmaking algorithm to evaluate the requirements and rank of the involved entities, where attributes of candidate classads are accessed via the pseudo-attribute other. When the RB performs the matchmaking for scheduling a job, the involved entities are the job (whose classad has been provided by the user) and the CE (whose classad is built by the RB with the information from the IS).
Consider for example a job that requires a CE and a given amount of free space on an SE to run successfully. The matchmaking solution to this problem requires three participants in the match (i.e. job, CE and SE), which cannot be accommodated by conventional (bilateral) matchmaking. The gangmatching feature of the classads library provides a multilateral matchmaking formalism to address this deficiency.
To exploit this extension of the classads library, it suffices to include the appropriate classads built-in functions in the requirements expression.
A useful example, as already mentioned, is the usage of gangmatching to require a certain amount of free space on an SE close to the execution CE. This can be achieved by specifying the job Requirements expression as follows:

Requirements = anyMatch(other.storage.CloseSEs,
                        target.GlueSAStateAvailableSpace > 200);

This makes the RB find the CEs having a close SE with more than 200 MB of free space available for the VO the user belongs to.
The newly supported classads built-in functions are:
- anyMatch()
- whichMatch()
- allMatch()
Information and details about gangmatching and the usage of these functions are provided in document [R4].
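The effect of the anyMatch() expression above can be illustrated with a Python sketch that keeps the CEs having at least one close SE with enough free space. The data layout (a "CloseSEs" list of dictionaries) and the function names are assumptions for illustration; the real evaluation is done by the classads library on the CE and SE classads.

```python
def any_match(candidates, predicate):
    """True if at least one candidate satisfies the predicate."""
    return any(predicate(candidate) for candidate in candidates)


def has_close_se_with_space(ce, min_space=200):
    """Mirror of: anyMatch(other.storage.CloseSEs, target.GlueSAStateAvailableSpace > 200)."""
    return any_match(
        ce["CloseSEs"],
        lambda se: se["GlueSAStateAvailableSpace"] > min_space,
    )


# Hypothetical CEs with their close SEs.
grid_ces = [
    {"id": "ce-a", "CloseSEs": [{"GlueSAStateAvailableSpace": 50}]},
    {"id": "ce-b", "CloseSEs": [{"GlueSAStateAvailableSpace": 120},
                                {"GlueSAStateAvailableSpace": 900}]},
    {"id": "ce-c", "CloseSEs": []},
]
suitable = [ce["id"] for ce in grid_ces if has_close_se_with_space(ce)]
print(suitable)
```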









3.2. GETACCESSCOST FUNCTION

When data requirements (i.e. the InputData and DataAccessProtocol attributes) are specified

in the JDL the RB, before the actual match making, performs a pre-match processing to find

out those CEs satisfying user authorisation requirements and classify them according to the

number of input files stored in storage element(s) which is (are) close to the CE itself and

speak at least one of the protocols specified in the DataAccessProtocol JDL attribute. Then it



IST-2000-25182 PUBLIC 25 / 31

Doc. Identifier:

DataGrid-01-TEN-0142-0_1

JDL ATTRIBUTES

Date: 18/01/2012







performs the Requirements checking phase starting from the first class until one or more

suitable CEs are found. If needed performs the Rank evaluation phase to choose the best

CE.

There is also another path that the RB can follow to select the best CE for running a job

when data requirements have been specified. This is allowed by the Replica Manager

getAccessCost function that given a CE and a set of LFNs (and/or GUIDs), provides the total

cost (time) for accessing them (see [R5] for details). Using the getAccesCost on all the CEs

satisfying the job requirements and on the InputData specified by the user in the JDL, the RB

ranks the suitable CEs and chooses the “best” one.

The mentioned mechanism is triggered by the following ranking expression:



Rank = other.DataAccessCost;





Indeed such Rank makes the RB, skip the classification procedure described at the

beginning of this section, perform the requirements checking phase and rank resources

using the getAccessCost outcomes, i.e. select for submission the CE having the lowest cost

for accessing input data.

Note that the “other.DataAccessCost” expression cannot be combined with any other
ranking expression.
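The selection performed when Rank = other.DataAccessCost is used can be pictured with the minimal sketch below. It is an illustration only: choose_ce_by_access_cost, the CE dictionaries and the cost table are hypothetical, and the get_access_cost callable merely stands in for the Replica Manager getAccessCost function, whose real interface is described in [R5].

```python
from typing import Any, Callable, Dict, List, Optional

CE = Dict[str, Any]  # hypothetical description of a computing element


def choose_ce_by_access_cost(
    ces: List[CE],
    input_data: List[str],
    requirements: Callable[[CE], bool],
    get_access_cost: Callable[[str, List[str]], float],
) -> Optional[str]:
    """Keep the CEs that satisfy the requirements, then return the id of
    the one with the lowest cost for accessing the input data."""
    suitable = [ce for ce in ces if requirements(ce)]
    if not suitable:
        return None
    costs = {ce["id"]: get_access_cost(ce["id"], input_data) for ce in suitable}
    return min(costs, key=costs.get)


# Made-up costs (time) returned by the stand-in cost function.
cost_table = {"ce1": 12.0, "ce2": 3.5, "ce3": 1.0}


def fake_get_access_cost(ce_id: str, files: List[str]) -> float:
    """Stand-in for the Replica Manager call; ignores the file list."""
    return cost_table[ce_id]


ces = [
    {"id": "ce1", "free_cpus": 2},
    {"id": "ce2", "free_cpus": 8},
    {"id": "ce3", "free_cpus": 0},  # cheapest, but fails the requirements
]

best = choose_ce_by_access_cost(
    ces,
    ["lfn:mydatafile1", "lfn:mydatafile2"],
    lambda ce: ce["free_cpus"] >= 2,  # stand-in for the Requirements expression
    fake_get_access_cost,
)
```

Note that no per-class grouping takes place here: the requirements are checked first, and the access cost alone decides among the remaining CEs.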


















4. JDL EXAMPLES

The following sections report simple examples of JDL descriptions for different types of
jobs.





4.1. JOB WITHOUT DATA REQUIREMENTS

[
  Type = "job";
  JobType = "normal";
  Executable = "script.sh";
  Arguments = "60";
  StdOutput = "sim.out";
  StdError = "sim.err";
  MyProxyServer = "skurut.cesnet.cz";
  OutputSandbox = {
    "sim.err",
    "sim.out"
  };
  // This attribute triggers accounting
  HLRLocation = "lilith.to.infn.it:56568:/C=IT/O=INFN/OU=Personal Certificate/L=Torino/CN=Andrea Guarise/Email=A.Guarise@to.infn.it";
  InputSandbox = {
    "/home/fpacini/GUI/sbin/script.sh"
  };
  rank = (other.GlueCEPolicyMaxRunningJobs - other.GlueCEStateRunningJobs);
  // This is the default requirements expression
  requirements = other.GlueCEStateStatus == "Production";
]



4.2. JOB WITH OUTPUT DATA

[
  Type = "job";
  JobType = "normal";
  VirtualOrganisation = "cms";
  Executable = "test.sh";
  Arguments = "1 20000 sim1";
  StdInput = "file2";
  StdOutput = "sim.out";
  StdError = "sim.err";
  OutputSandbox = {
    "sim.out",
    "sim.err"
  };
  RetryCount = 2;
  OutputData = {
    [
      // No StorageElement is specified: a close SE is used
      OutputFile = "dataset1.out";
      LogicalFileName = "lfn:myoutdata.1"
    ],
    [
      OutputFile = "dataset2.out";
      LogicalFileName = "lfn:myoutdata.2"
    ]
  };
  InputSandbox = {
    "/home/mytest/JNI/test.sh",
    "/home/mytest/DATA/file2",
    "/home/mytest/DATA/sim.dat"
  };
  Environment = "SIM_ROOT=/usr/local/";
  rank = other.GlueHostMainMemoryRAMSize;
  // The job is submitted to a CE having a close SE with at least 5 GB free
  requirements = anyMatch( other.storage.CloseSEs,
                           target.GlueSAStateAvailableSpace > 5120);
]







4.3. JOB WITH INPUT AND OUTPUT DATA

[
  Type = "job";
  // JobType is not mandatory. If not specified, “normal” is the default
  JobType = "normal";
  VirtualOrganisation = "cms";
  Executable = "test.sh";
  Arguments = "1 20000 sim1";
  StdInput = "file2";
  StdOutput = "sim.out";
  StdError = "sim.err";
  OutputSandbox = {"sim.out", "sim.err"};
  // Disable job re-submission in case of failure
  RetryCount = 0;
  InputData = {
    "lfn:mydatafile1",
    "lfn:mydatafile2",
    "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"
  };
  DataAccessProtocol = {"gridftp", "file"};
  OutputData = {
    [
      OutputFile = "dataset1.out";
      StorageElement = "grid011.pd.infn.it";
      LogicalFileName = "lfn:myoutdata.1"
    ],
    [
      OutputFile = "dataset2.out";
      LogicalFileName = "lfn:myoutdata.2"
    ],
    [
      OutputFile = "dataset3.out"
    ],
    [
      OutputFile = "dataset4.out";
      StorageElement = "grid001.ct.infn.it"
    ]
  };
  OutputSE = "grid011.pd.infn.it";
  InputSandbox = {
    "/home/fpacini/JNI/test.sh",
    "/home/fpacini/DATA/file2",
    "/home/fpacini/DATA/sim.dat",
    "/home/fpacini/HandsOn-0409/WP1testA"
  };
  // Ranking is done according to the cost for accessing InputData
  rank = other.DataAccessCost;
  requirements = (other.GlueCEStateFreeCPUs >= 2) &&
                 (other.GlueCEInfoLRMSType == "lsf")
]







4.4. INTERACTIVE JOB

[
  Type = "job";
  JobType = "interactive";
  VirtualOrganisation = "my_vo";
  Executable = "scriptint.sh";
  RetryCount = 1;
  // grid_console_shadow listens on this port. If not specified, it is
  // assigned by the OS
  ListenerPort = 6000;
  FuzzyRank = true;
  InputSandbox = {
    "/home/fpacini/JDL2.0/fox/scriptint.sh",
    "/home/fpacini/JDL2.0/fox/cpi",
    "/home/fpacini/DATA/sim.dat"
  };
  requirements = (other.GlueHostOperatingSystemRelease == "LINUX") &&
                 (other.GlueHostMainMemoryRAMSize >= 128)
                 // This is needed for interactive jobs. You do not need to
                 // specify it: it is added automatically by the UI
                 && (other.GlueHostNetworkAdapterOutboundIP);
  rank = other.GlueHostBenchmarkSF00
]







4.5. CHECKPOINTABLE JOB

[
  Type = "job";
  JobType = "checkpointable";
  VirtualOrganisation = "my_vo";
  // This is the total number of steps for the job. Not mandatory
  // if your job already knows it
  JobSteps = 10000000;
  CurrentStep = 1;
  Executable = "hsum";
  Arguments = "200000 gsiftp://lxde01.pd.infn.it/tmp/root_test/";
  StdOutput = "sim.out";
  StdError = "sim.err";
  InputData = {
    "lfn:wp1-test-file-01-lfn",
    "lfn:wp1-test-file-02-lfn",
    "lfn:wp1-test-file-04-lfn"
  };
  DataAccessProtocol = {"file", "rfio"};
  OutputSandbox = {
    "sim.err",
    "sim.out"
  };
  RetryCount = 3;
  InputSandbox = {
    "/home/fpacini/GUI/sbin/hsum"
  };
  // This is the default rank expression
  rank = -other.GlueCEStateEstimatedResponseTime;
  // The semicolon “;” can be omitted after the last attribute specification
  requirements = other.GlueCEInfoLRMSType == "pbs"
]







4.6. MPI JOB

[
  Type = "job";
  JobType = "mpich";
  VirtualOrganisation = "my_vo";
  // This is the minimum number of CPUs needed by the job
  NodeNumber = 6;
  Executable = "cpi";
  StdOutput = "sim.out";
  StdError = "sim.err";
  OutputSandbox = {
    "sim.err",
    "sim.out"
  };
  // This attribute triggers the proxy-renewal mechanism
  MyProxyServer = "skurut.cesnet.cz";
  RetryCount = 3;
  InputSandbox = {
    "/home/fpacini/JDL2.0/fox/cpi"
  };
  requirements = other.GlueHostNetworkAdapterOutboundIP &&
                 Member("IDL2.0", other.GlueHostApplicationSoftwareRunTimeEnvironment);
  rank = other.GlueCEStateFreeCPUs;
]








