Enabling Grids for E-sciencE
An overview of grid middleware
and gLite
www.eu-egee.org
EGEE-II INFSO-RI-031688
Outline
1. An overview of grid middleware
2. Introduction of gLite
3. Job management services of gLite
A Grid
• Grid
– Many machines
– Across many locations and administrative domains
• Grid middleware runs on each machine
– High-performance computing
– High-capacity storage
– Meets the needs of scientific computing
• Grids trust VOs
– Users join VOs
– A virtual organisation contributes resources and negotiates access
• Additional services also enable the grid
– Operation
– Dissemination
A Virtual Organisation (VO) is an entity that corresponds to an organisation or group of people who wish to share computing, data or software resources.
Authentication, Authorisation (AA)
Users in many locations and organisations
Access services (“user interface”): log on, upload credentials, run middleware commands
GRID SERVICES
Built on the Grid Security Infrastructure: encryption and data integrity, authentication and authorisation
“Gate keeping”: authenticate users and grant permissions
Resources in many locations and organisations
System software
– Operating system
– File system: NFS, …
– Local scheduler: PBS, Condor, LSF, …
Hardware
– Computing clusters, …
– Network resources
– Data storage: HPSS, CASTOR, …
Basic job Management
Users: “How do I run a job on a compute element (CE)?” (CE = batch queue)
Tools for:
• Submitting jobs to a CE
• Monitoring jobs
• Getting outputs
• Transferring files to a CE
• Transferring files between CE and SE
Resources: compute elements, network resources, data storage
Information service (IS)
Users: “How do I know which CE could run my job? Which is free?”
Information Service (IS):
• Resources such as CEs and SEs report their status to the IS
• Grid services query the IS before running jobs
Resources: compute elements, network resources, data storage
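The publish-and-query pattern described on this slide can be sketched in a few lines of Python. This is a toy in-memory model, not the gLite information system: the class names, the `free_slots` field and the host names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourceStatus:
    """Status record a resource reports (fields are illustrative assumptions)."""
    name: str
    kind: str        # "CE" or "SE"
    free_slots: int  # free job slots (CE) or free storage units (SE)


class InformationService:
    """Toy in-memory IS: resources publish status, grid services query it."""

    def __init__(self):
        self._status = {}

    def publish(self, status: ResourceStatus) -> None:
        # A later report from the same resource replaces the earlier one.
        self._status[status.name] = status

    def free_ces(self):
        # Answer "which CE is free?" with the names of CEs that have free slots.
        return sorted(
            s.name for s in self._status.values()
            if s.kind == "CE" and s.free_slots > 0
        )


info = InformationService()
info.publish(ResourceStatus("ce01.example.org", "CE", 0))
info.publish(ResourceStatus("ce02.example.org", "CE", 12))
info.publish(ResourceStatus("se01.example.org", "SE", 500))
print(info.free_ces())
```

Only the CE with free slots is returned; the busy CE and the SE are filtered out.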
File management
Users: “We’ve terabytes of data in files.” “My data are in files, and I’ve terabytes.” “Our data are in …”
File management: storage, transfer, replication management
Resources: compute elements, network resources, data storage
Main components
User Interface (UI): The place where users access the Grid
Information System (IS): Collects information about the resources (characteristics and status of CEs and SEs)
Resource Broker (RB): Matches the user requirements with the available
resources on the Grid
Computing Element (CE): A batch queue on a site’s computers where
the user’s job is executed
Storage Element (SE): provides (large-scale) storage for files
Current production middleware
[Figure: job flow in the current production middleware. Components: “User interface”, Resource Broker (WorkLoad Mgr.), Information Service, Replica Catalogue, Logging & Book-keeping, Computing Element, Storage Element. Labelled flows: input “sandbox”, output “sandbox”, DataSets info, job submit event, job query, job status, publish, authorisation & authentication.]
“gLite 3.0” the current middleware
• Being deployed on EGEE production Grid now
• Runs on various Linux releases
– “Scientific Linux” most common
– Ports to other Operating Systems in progress
• History
– Over the last 2 years, new services were created in releases of new middleware, up to gLite 1.5, which has been in pre-production use
– A subset of these is deployed with some of the previous middleware (LCG 2.7)
• gLite 3.0 brings together
– All components already in LCG 2.7.0, plus upgrades (this already includes new versions of VOMS, R-GMA and FTS)
– The Workload Management System (with LB, CE, UI) of gLite 1.5.0
gLite Grid Middleware Services
• Access: CLI, API
• Security Services: Authentication, Authorization, Auditing
• Information & Monitoring Services: Information & Monitoring, Application Monitoring
• Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
• Workload Mgmt Services: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
• Connectivity
http://gridportal.hep.ph.ic.ac.uk/rtm
14:00 on 17 Jan 2007
gLite Job Management Services
WMS’s Architecture
Job management requests (submission, cancellation) are expressed via a Job Description Language (JDL).
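To give a feel for what a job description looks like, the following Python sketch renders a small dictionary as a JDL-style record of `Name = value;` pairs. The attribute names and values used here (`Executable`, `StdOutput`, `StdError`, `OutputSandbox`) are assumptions based on commonly seen gLite examples, since the slide does not list any, and the helper `to_jdl` is hypothetical. It only handles string and list values, not JDL expressions.

```python
def to_jdl(attributes):
    """Render a dict as a JDL-style record: [ Name = value; ... ]."""
    def fmt(value):
        # Lists become {"a", "b"}; everything else is emitted as a quoted string.
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(fmt(v) for v in value) + "}"
        return '"' + str(value) + '"'

    lines = ["["]
    for name, value in attributes.items():
        lines.append(f"  {name} = {fmt(value)};")
    lines.append("]")
    return "\n".join(lines)


# Illustrative job: run a command and bring back its output files.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}
jdl_text = to_jdl(job)
print(jdl_text)
```

The printed text is what a user would save to a file and hand to the submission tools on the User Interface.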
WMS’s Architecture
Keeps submission requests.
If there is no matching resource available, requests are kept for a while, waiting to be dispatched.
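The behaviour described here, holding a request until a matching resource appears, can be sketched as a queue that is re-scanned on each pass. This is a simplified model under stated assumptions: the function names and the CPU-based matcher are invented for illustration, and the expiry of requests that have waited "for a while" is not modelled.

```python
from collections import deque


def dispatch_pending(queue, find_match):
    """Try each queued request once; keep those with no matching resource."""
    dispatched = []
    for _ in range(len(queue)):
        request = queue.popleft()
        resource = find_match(request)
        if resource is None:
            queue.append(request)  # no match yet: keep it for a later pass
        else:
            dispatched.append((request, resource))
    return dispatched


# Hypothetical matcher: only requests needing at most 4 CPUs can be placed now.
def find_match(request):
    return "ce01" if request["cpus"] <= 4 else None


pending = deque([{"id": "job-a", "cpus": 2}, {"id": "job-b", "cpus": 16}])
sent = dispatch_pending(pending, find_match)
print([r["id"] for r, _ in sent], [r["id"] for r in pending])
```

After one pass the small job has been dispatched and the large one is still queued, ready to be retried when resource information changes.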
WMS’s Architecture
Repository of resource information.
Updated via notifications and/or active polling of sources.
Provides the matchmaker with the information it needs to decide the best resources for a request.
WMS’s Architecture
Finds an appropriate CE or resource for a job request according to the information from the ISM, taking into account job preferences, resource status and policies on resources.
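Matchmaking of this kind is usually a two-step process: filter the resources that satisfy the job's requirements, then order the survivors by the job's preference. The sketch below assumes that reading; the resource fields, the CE names and the use of Python callables for requirements and rank are all illustrative choices, not the gLite implementation.

```python
def matchmake(job, resources):
    """Return CE names that satisfy the job's requirements, best rank first."""
    suitable = [r for r in resources if job["requirements"](r)]
    ranked = sorted(suitable, key=job["rank"], reverse=True)
    return [r["name"] for r in ranked]


# Hypothetical snapshot of resource information, as the ISM might hold it.
ism_snapshot = [
    {"name": "ce-a", "free_cpus": 10, "max_wall_minutes": 240},
    {"name": "ce-b", "free_cpus": 40, "max_wall_minutes": 720},
    {"name": "ce-c", "free_cpus": 0, "max_wall_minutes": 720},
]

# The job needs a free CPU and at least 120 minutes; it prefers more free CPUs.
job = {
    "requirements": lambda r: r["free_cpus"] > 0 and r["max_wall_minutes"] >= 120,
    "rank": lambda r: r["free_cpus"],
}

print(matchmake(job, ism_snapshot))
```

Here `ce-c` is rejected because it has no free CPUs, and `ce-b` is ranked ahead of `ce-a` because it has more of them.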
WMS’s Architecture
Performs the actual job submission and monitoring.
Normally this is Condor.
WMS’s Architecture
The Computing Element is the place where your jobs run.
WMS components (1)
WMS components handling the job during its lifetime and
performing the submission
• Network Server (NS)
– is responsible for
accepting incoming requests from the UI
authenticating the user
obtaining a delegated full proxy from the user proxy
enqueuing the job to the Workload Manager
• Workload Manager (WM)
– is responsible for
calling the Matchmaker to find the resource which best matches the job requirements
interacting with the Information System and File Catalog
calculating the ranking of all the matched resources
• Information Supermarket (ISM)
– basically consists of a repository of resource information that is available in read-only mode to the matchmaking engine
WMS components (2)
WMS components handling the job during its lifetime and
performing the submission
• Job Adapter
– is responsible for
making the final touches to the JDL expression for a job, before it is passed to CondorC for the actual submission
creating the job wrapper script that creates the appropriate execution environment on the CE worker node
• transfer of the input and of the output sandboxes
• Job Controller (JC)
– is responsible for
converting the Condor submit file into a ClassAd
handing over the job to CondorC
• CondorC
– is responsible for
performing the actual job management operations
• job submission, job removal
• Log Monitor
– is responsible for
watching the CondorC log file
intercepting interesting events concerning active jobs
• events affecting the job state machine
triggering appropriate actions
CE’s Architecture
A Computing Element is built on a homogeneous farm of computing nodes (called Worker Nodes).
There are also many components inside the CE, such as the gatekeeper, globus-jobmanager, ..
CE’s Architecture
Gatekeeper
Grants access to the CE and maps the grid user to a local user ID.
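The mapping step can be illustrated with a small lookup table from certificate subject to local account. The slide does not say how the mapping is stored, so the file layout below (a quoted subject followed by an account name, loosely modelled on a grid-mapfile) is an assumption, and the subjects and account names are made up.

```python
import shlex


def parse_mapfile(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, account = shlex.split(line)
        mapping[subject] = account
    return mapping


def map_user(mapping, subject):
    """Return the local account for a grid user, or refuse access."""
    try:
        return mapping[subject]
    except KeyError:
        raise PermissionError(f"no local mapping for {subject}") from None


# Made-up entries for illustration only.
MAPFILE = '''
# subject                            local account
"/C=XX/O=Example/CN=Alice Example" alice01
"/C=XX/O=Example/CN=Bob Example"   bob02
'''

mapping = parse_mapfile(MAPFILE)
print(map_user(mapping, "/C=XX/O=Example/CN=Alice Example"))
```

A user whose subject is not listed gets a `PermissionError`, which mirrors the gatekeeper refusing access to the CE.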
CE’s Architecture
Batch System
A cluster of compute nodes controlled by a head node. It handles the job execution.
Example: Torque (OpenPBS), PBS
A typical case of glite-enabled grid
• Many CEs in a gLite-enabled grid
• A few WMSs coordinate the CEs and broker jobs to the proper CEs.
Computing Element Components
Gatekeeper
• Grants access to the CE: authenticates users and maps them to local accounts.
• Forks the globus-jobmanager.
globus-jobmanager
• Forks Condor-C (in the CE) to help submit jobs to batch systems.
BLAHPD (Batch Local ASCII Helper Protocol Daemon)
• Offers a single interface for Condor-C (in the CE) to submit jobs to different batch systems.
• BLAHPD commands are used by Condor-C (in the CE) to submit jobs to the batch system.
Batch System
• Handles the job execution on the available local worker nodes.
• The batch system consists of:
- Torque (formerly known as OpenPBS) resource manager.
- Maui job scheduler.
– A cluster MUST be homogeneous.
Worker nodes
• The hosts that execute the jobs.
• Also responsible for downloading and uploading jobs’ data from or to the WMS or SE.
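The idea of one interface in front of several batch systems, which is the role the slide gives to the helper daemon, can be sketched as a set of adapters behind a single entry point. This is not the daemon's real protocol: the class names are hypothetical, the command lines are illustrative, and nothing is executed.

```python
class BatchSystem:
    """Common interface every batch-system adapter implements."""

    def submit(self, script_path):
        raise NotImplementedError


class TorqueAdapter(BatchSystem):
    def submit(self, script_path):
        # Command line that would be run; not executed in this sketch.
        return ["qsub", script_path]


class LsfAdapter(BatchSystem):
    def submit(self, script_path):
        return ["bsub", script_path]


ADAPTERS = {"torque": TorqueAdapter(), "lsf": LsfAdapter()}


def submit_command(batch_system, script_path):
    """Single entry point: the caller never needs batch-specific details."""
    return ADAPTERS[batch_system].submit(script_path)


print(submit_command("torque", "job.sh"))
print(submit_command("lsf", "job.sh"))
```

Supporting another batch system then means adding one adapter, with no change to the caller.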
Job State Machine
Job State Machine (1/9)
Submitted: the job has been entered by the user at the User Interface but not yet transferred to the Network Server for processing.
Job State Machine (2/9)
Waiting: the job has been accepted by the NS and is waiting for Workload Manager processing, or is being processed by WM Helper modules.
Job State Machine (3/9)
Ready: the job has been processed by the WM and its Helper modules (a CE has been found) but not yet transferred to the CE (local batch system queue) via the JC and CondorC.
Job State Machine (4/9)
Scheduled: the job is waiting in the queue on the CE.
Job State Machine (5/9)
Running: the job is running on the CE’s queuing system (on one of the worker nodes).
Job State Machine (6/9)
Done: the job has exited or is considered to be in a terminal state by CondorC (e.g., submission to the CE has failed in an unrecoverable way).
Job State Machine (7/9)
Aborted: job processing was aborted by the WMS (waiting in the WM queue or CE for too long, over-use of quotas, expiration of user credentials).
Job State Machine (8/9)
Cancelled: the job has been successfully cancelled on user request.
Job State Machine (9/9)
Cleared: the output sandbox was transferred to the user or removed due to a timeout.
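The nine states can be put together as a small state machine. The state names come from the slides, but the transition table is an assumption, because the state diagram itself is not reproduced in the text: the sketch takes the order in which the states are presented as the main path and lets any job that has not yet finished be aborted or cancelled.

```python
MAIN_PATH = ["Submitted", "Waiting", "Ready", "Scheduled", "Running", "Done", "Cleared"]

# Assumed transitions: each state on the main path leads to the next one...
TRANSITIONS = {a: {b} for a, b in zip(MAIN_PATH, MAIN_PATH[1:])}
TRANSITIONS["Cleared"] = set()
# ...and any job not yet finished may also be aborted or cancelled.
for state in MAIN_PATH[:5]:
    TRANSITIONS[state] |= {"Aborted", "Cancelled"}
TRANSITIONS["Aborted"] = set()
TRANSITIONS["Cancelled"] = set()


class Job:
    def __init__(self):
        self.state = "Submitted"

    def move_to(self, new_state):
        # Reject any change that the transition table does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state


job = Job()
for state in MAIN_PATH[1:]:
    job.move_to(state)
print(job.state)
```

A job that tries to skip a step, for example going from Submitted straight to Running, is rejected with a `ValueError`.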
Further information
• EGEE www.eu-egee.org
• gLite http://www.glite.org/
• LCG http://lcg.web.cern.ch/LCG/
• Open Grid Forum http://www.gridforum.org/
• Globus Alliance http://www.globus.org/
• VDT http://www.cs.wisc.edu/vdt/