Dataflows in SRB using SDSC Matrix
Arun Jagatheesan, Architect & Team Lead, SDSC Matrix, San Diego Supercomputer Center
10th Annual NPACI/SDSC Summer Computing Institute, August 23-27, 2004, San Diego, California, USA

San Diego Supercomputer Center University of Florida Grid Physics Network (GriPhyN)

Talk Outline
• Introduction to Gridflows
• Introduction to SDSC Matrix Project
• Data Grid Language
• Architecture of SDSC Matrix
• Matrix Usage
• What can you do for Matrix?


Acknowledgement
• Jonathan Weinberg
• Daniel Moore
• Allen Ding
• Reena Mathew
• Erik Vandekieft
• SRB Team
• You! (Hey, your name can be here.)

SDSC SRB, NSF GriPhyN, NSF SCEC, DoE Portals Project

Gridflows (Grid Workflow)
• Automation of an execution pipeline
• Data and/or tasks processed by multiple autonomous grid resources
• According to a set of procedural rules
• Confluence of multiple autonomous administrative domains

• GridFlow Execution Servers
• Are themselves from autonomous administrative domains
• P2P (distributed) control


SDSC Matrix Project
• CS Research & Development
• Gridflow Description, Data Grid Administration Rules
• Gridflow P2P protocols for Gridflow Server Communication

• Development
• SRB Data Grid Web Services
• SRB Datagrid flow automation and provenance

• Theory → Practice
• Help in customized development and deployment of gridflow concepts in scientific / grid applications
• Visibility, and assistance with standardization efforts at GGF

Advantages from SRB Perspective
• Reduces the Client-Server Communication
• The whole execution logic is sent to the server
• Fewer WAN messages
• Our experiments show a significant increase in performance

• Datagrid Information Lifecycle Management
• Autonomic: “Move data at 9:00 PM on weekdays and on weekends”
• Data Grid Administration

• Power-users and Sophisticated Users
• Data Grid Administrator (rules to manage the data grid)
• Scientist or Librarian (visualized dataflow programming)
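
As rough arithmetic for the first advantage above, the sketch below counts WAN round trips when a client drives every step itself versus when it ships the whole execution logic to the server once. The function names and the one-round-trip-per-step cost model are assumptions made for illustration; they are not measurements from Matrix.

```python
def client_driven_round_trips(num_steps: int) -> int:
    """Client issues one request per step and waits for each reply."""
    return num_steps


def server_driven_round_trips(num_steps: int) -> int:
    """Client sends the whole flow once; the server runs every step locally."""
    return 1 if num_steps > 0 else 0


def wan_time_saved(num_steps: int, round_trip_seconds: float) -> float:
    """Estimated WAN latency avoided by sending the execution logic once."""
    saved = client_driven_round_trips(num_steps) - server_driven_round_trips(num_steps)
    return saved * round_trip_seconds
```

Under this simplified model, a 20-step flow over a link with a 0.5 s round trip avoids 19 round trips, or about 9.5 s of pure latency.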

What do they want?
We know the business (scientific) process

Cyberinfrastructure is all we care about (why bother about atoms or DNA?)

What do they want?
Use DGL to describe your process logic with abstract references to datagrid infrastructure dependencies

Why a Gridflow Language?
• Infrastructure independent description
• Abstract references to hardware and cyberinfrastructure

• Description of execution flow logic
• Separate the execution flow logic from application logic
• E.g., MonteCarlo is an application; executing it 10 times, or until a variable becomes zero, is execution logic
• Procedural rules associated with execution flow

• Provenance
• What happened, when, who, how …? (and querying)
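
The split between application logic and execution logic described above can be sketched in a few lines. Everything here is hypothetical: `monte_carlo_step` stands in for an application, while `run_n_times` and `run_until_zero` stand in for execution logic that a gridflow language would express declaratively.

```python
import random
from typing import Callable, List


def monte_carlo_step(rng: random.Random) -> float:
    """Application logic: one hypothetical Monte Carlo sample."""
    return rng.random()


def run_n_times(task: Callable[[], float], n: int) -> List[float]:
    """Execution logic: repeat a task a fixed number of times."""
    return [task() for _ in range(n)]


def run_until_zero(task: Callable[[int], int], start: int, max_iterations: int = 1000) -> int:
    """Execution logic: re-run a task until its variable becomes zero.

    Returns the number of iterations performed.
    """
    value, iterations = start, 0
    while value != 0 and iterations < max_iterations:
        value = task(value)
        iterations += 1
    return iterations


rng = random.Random(42)
samples = run_n_times(lambda: monte_carlo_step(rng), 10)
countdown_runs = run_until_zero(lambda v: v - 1, start=3)
```

The application function never learns how often it is invoked, so the repetition policy can change without touching it.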


Gridflow Language Requirements
• High level Abstract descriptions
• Abstract description of cyberinfrastructure dependencies

• Simple yet flexible
• Flexible to describe complex requirements (no brute force)

• Gridflow dependency patterns
• Based on execution structure and data semantics
• (Parallel, sequential, fork-new), (milestones, for-each, switch-case), ...

• Asynchronous execution
• For long-running requests

• Querying using existing standard
• XQuery
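
A few of the dependency patterns named above can be sketched as small combinators. This is only an illustration of the patterns themselves; the function names are assumptions and do not correspond to DGL elements or to the Matrix API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def sequential(steps: List[Callable[[], object]]) -> list:
    """Run steps one after another, in order."""
    return [step() for step in steps]


def parallel(steps: List[Callable[[], object]]) -> list:
    """Run steps concurrently; results are returned in submission order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(step) for step in steps]
        return [future.result() for future in futures]


def for_each(items: Iterable, body: Callable[[object], object]) -> list:
    """Apply the same body to every item of a collection."""
    return [body(item) for item in items]


def switch_case(value, cases: Dict[object, Callable[[], object]],
                default: Callable[[], object]):
    """Run the branch registered for value, or the default branch."""
    return cases.get(value, default)()
```

Because each combinator takes and returns plain callables and lists, patterns can be nested, for example a `for_each` whose body runs a `parallel` group.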

Gridflow Language Requirements
• Process meta data and annotations
• Runtime definition, update and querying of meta-data

• Runtime Management of Gridflows
• Stop gridflow at run time

• Partitioning
• Facility in language to divide a gridflow request to multiple requests (Excellent Research Topic)

• Import descriptions
• Refer to other gridflows in execution

Data Grid Language (DGL)
• XML based gridflow description
• Describes execution flow logic

• ECA-based rule description for execution
• ECA = Event, Condition, Action

• Querying of Status of Gridflow
• XQuery / Simple query of a Gridflow Execution

• Scoped variables and gridflow patterns
• For control of execution flow logic
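
To make the Event, Condition, Action idea concrete, here is a minimal sketch of an ECA rule and a dispatcher. The class, the `fire` function, the event name, and the context keys are all hypothetical; the real DGL rule syntax is XML and is not shown in these slides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class EcaRule:
    event: str                          # which event the rule listens for
    condition: Callable[[dict], bool]   # must hold for the action to run
    action: Callable[[dict], str]       # what to do when it holds


def fire(rules: List[EcaRule], event: str, context: dict) -> List[str]:
    """Run the action of every rule whose event matches and whose condition holds."""
    return [rule.action(context)
            for rule in rules
            if rule.event == event and rule.condition(context)]


# Hypothetical rule: on a clock tick, if it is 9 PM, move the data.
move_at_nine = EcaRule(
    event="clock-tick",
    condition=lambda ctx: ctx["now"].hour == 21,
    action=lambda ctx: f"move {ctx['collection']}",
)
```

Calling `fire([move_at_nine], "clock-tick", {...})` with a 9 PM timestamp returns the action's result; at any other hour, or for any other event, it returns an empty list.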


DGL Requests
• Data Grid Flow
• An XML Structure that describes the execution logic, associated procedural rules and grid environment variables

• Status Query
• An XML Structure used to query the execution status of any gridflow or sub-flow at any level of granularity

• A DGL or Matrix client sends either of these to the Matrix Server
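
Querying status "at any level" can be pictured as walking a tree of flows and sub-flows. The sketch below assumes a hypothetical in-memory `FlowStatus` tree with made-up flow names and states; the actual status query is an XML structure answered by the Matrix Server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FlowStatus:
    name: str
    state: str
    children: Dict[str, "FlowStatus"] = field(default_factory=dict)


def query_status(root: FlowStatus, path: List[str]) -> Optional[str]:
    """Walk down sub-flows by name; an empty path asks about the whole flow."""
    node = root
    for name in path:
        node = node.children.get(name)
        if node is None:
            return None  # no such sub-flow
    return node.state


# Hypothetical flow with two sub-flows, one of which has its own sub-flows.
flow = FlowStatus("ingest", "running", {
    "copy": FlowStatus("copy", "completed"),
    "replicate": FlowStatus("replicate", "running", {
        "site-a": FlowStatus("site-a", "completed"),
        "site-b": FlowStatus("site-b", "pending"),
    }),
})
```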

Data Grid Request
Annotations about the Data Grid Request

Can be either a Flow or a Status Query


Grid User
<GridUser>
  <userID>Matrix-demo</userID>
  <organization>
    <organizationName>sdsc</organizationName>
  </organization>
  <challenge-Response>******</challenge-Response>
  <homeDirectory>/home/Matrixdemo.sdsc</homeDirectory>
  <defaultStorageResource>sdscunix</defaultStorageResource>
  <phoneNumber>0</phoneNumber>
  <e-mail>arun@sdsc.edu</e-mail>
</GridUser>
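
As a small illustration of how a client might read such a fragment, the sketch below parses a trimmed copy of the GridUser element with Python's standard `xml.etree.ElementTree`. The element names come from the slide; the function name and the dictionary keys are assumptions, and this is not the Matrix client API.

```python
import xml.etree.ElementTree as ET

GRID_USER_XML = """
<GridUser>
  <userID>Matrix-demo</userID>
  <organization>
    <organizationName>sdsc</organizationName>
  </organization>
  <homeDirectory>/home/Matrixdemo.sdsc</homeDirectory>
  <defaultStorageResource>sdscunix</defaultStorageResource>
</GridUser>
"""


def read_grid_user(xml_text: str) -> dict:
    """Extract a few fields from a GridUser element."""
    root = ET.fromstring(xml_text.strip())
    return {
        "user_id": root.findtext("userID"),
        "organization": root.findtext("organization/organizationName"),
        "home_directory": root.findtext("homeDirectory"),
        "default_resource": root.findtext("defaultStorageResource"),
    }


user = read_grid_user(GRID_USER_XML)
```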

Grid Ticket


VO Info


Flow
• Scoped variables that can control the flow
• Logic used by the sub-members
• Sub-members that are the real execution statements
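
One way to picture the three parts of a Flow is a nested structure in which a sub-member resolves a variable in its own scope first and then in the enclosing flows. The `Flow` class, its field names, and the example flows below are hypothetical and only sketch the scoping idea; they are not the DGL schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Flow:
    name: str
    logic: str = "sequential"                         # how sub-members are run
    variables: Dict[str, object] = field(default_factory=dict)
    members: List["Flow"] = field(default_factory=list)
    parent: Optional["Flow"] = field(default=None, repr=False)

    def add(self, child: "Flow") -> "Flow":
        """Attach a sub-member and remember its enclosing flow."""
        child.parent = self
        self.members.append(child)
        return child

    def lookup(self, var: str):
        """Resolve a variable in this flow, then in enclosing flows."""
        scope = self
        while scope is not None:
            if var in scope.variables:
                return scope.variables[var]
            scope = scope.parent
        raise KeyError(var)


outer = Flow("outer", logic="for-each", variables={"count": 10})
inner = outer.add(Flow("inner", variables={"i": 0}))
```

Here `inner.lookup("count")` finds the value defined on `outer`, while `outer.lookup("i")` raises `KeyError` because inner scopes are not visible from outside.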


Matrix Gridflow Server Architecture
• JAXM Wrapper
• WSDL Description
• SOAP Service for Matrix Clients
• Event Publish Subscribe, Notification
• JMS Messaging Interface
• Matrix Data Grid Request Processor
• Sangam P2P Gridflow Broker and Protocols
• Transaction Handler
• Flow Handler and Execution Manager
• Status Query Handler
• XQuery Processor
• ECA Rules Handler
• Workflow Query Processor
• Gridflow Metadata Manager
• Matrix Agent Abstraction
• Persistence (Store) Abstraction: JDBC, In-Memory Store
• SDSC SRB Agents
• Other SDSC Data Services
• Agents for Java, WSDL and other grid executables
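
The persistence abstraction in the architecture suggests an interface with interchangeable back ends, such as JDBC or an in-memory store. The sketch below shows that general shape with an abstract base class and an in-memory implementation; the class and method names are assumptions and are not taken from the Matrix source.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class FlowStore(ABC):
    """Hypothetical store interface the rest of the server would code against."""

    @abstractmethod
    def save(self, flow_id: str, document: str) -> None:
        ...

    @abstractmethod
    def load(self, flow_id: str) -> Optional[str]:
        ...


class InMemoryStore(FlowStore):
    """Keeps documents in a dictionary; contents are lost when the process exits."""

    def __init__(self) -> None:
        self._documents: Dict[str, str] = {}

    def save(self, flow_id: str, document: str) -> None:
        self._documents[flow_id] = document

    def load(self, flow_id: str) -> Optional[str]:
        return self._documents.get(flow_id)
```

A database-backed store would implement the same two methods, so the handlers above it would not need to change.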


Using XML-Editor
• Only XML (DGL) file required
• All that is needed is a DGL file that has to be sent to the server

• Use XML Editor to make DGL file
• XMLSpy® could be used

• Send it to the Matrix Server
• Use the Java Program DGLSender.java
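
The slides do not show DGLSender.java or the wire format the Matrix Server expects, so the following is only a generic sketch of the "check the XML, then send it" step: it verifies that a DGL document is well-formed and prepares an HTTP POST without sending it. The function name, the server URL, the content type, and the `DataGridRequest` placeholder element are all assumptions.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_dgl_request(dgl_text: str, server_url: str) -> urllib.request.Request:
    """Validate well-formedness and prepare (but do not send) a POST request."""
    ET.fromstring(dgl_text)  # raises ET.ParseError if the XML is not well-formed
    return urllib.request.Request(
        server_url,
        data=dgl_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Placeholder document and endpoint, for illustration only.
request = build_dgl_request("<DataGridRequest/>", "http://localhost:8080/matrix")
```

Passing the prepared request to `urllib.request.urlopen` would actually send it; that step is left out here because the real endpoint and message envelope are not specified.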


Using Java API
• Download our Matrix Java Client
• Programmatically create a request

• Use it in your Java program to interact with the grid and develop a local application
• http://www.npaci.edu/DICE/SRB/matrix/Software/index.html

Using WSDL
• Use the WSDL to create a SOAP-based client in any programming language of your preference

Using DG-Modeler
• GUI for dataflow programming


Gridflow Process I

End User using DGBuilder

Gridflow Description Data Grid Language


Gridflow Process II

Abstract Gridflow using Data Grid Language

Planner

Concrete Gridflow Using Data Grid Language
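
One plausible reading of the planner step is that it binds abstract resource references to concrete ones. The sketch below shows that substitution on a list of steps; the step layout, the `plan` function, and the abstract resource name are hypothetical (only `sdscunix` appears elsewhere in these slides), and a real planner would likely weigh much more than a lookup table.

```python
from typing import Dict, List


def plan(abstract_steps: List[dict], bindings: Dict[str, str]) -> List[dict]:
    """Return a concrete copy of the steps with every resource reference bound."""
    concrete = []
    for step in abstract_steps:
        resource = step["resource"]
        if resource not in bindings:
            raise LookupError(f"no concrete resource for {resource!r}")
        concrete.append({**step, "resource": bindings[resource]})
    return concrete


abstract_flow = [
    {"action": "ingest", "resource": "any-unix-storage"},
    {"action": "replicate", "resource": "any-unix-storage"},
]
concrete_flow = plan(abstract_flow, {"any-unix-storage": "sdscunix"})
```

The abstract description is left unmodified, so the same gridflow can be planned again against a different set of bindings.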


Gridflow Process III

Gridflow Processor

Concrete Gridflow Using Data Grid Language

Gridflow P2P Network

Got ideas or suggestions? Contact:
SDSC Matrix project, arun@sdsc.edu
Google keyword: SDSC Gridflow