Exploration of Embedded System Architectures
The Globus Project
Adam Belloum
Computer Architecture & Parallel Systems group
University of Amsterdam
adam@science.uva.nl
Globus requirements
• Security
• Global name space
• Fault tolerance
• Accommodating heterogeneity
• Binary management and application provisioning
• Multilanguage support
• Persistence
• Extensibility
• Site autonomy
• Complexity management
Globus design principles
• Provide a toolkit
– from which users can pick and choose
• Focus on low-level functionality,
– facilitating high-level tools (general usability)
• Use standards whenever possible
– for both interfaces and implementations
• Emphasize the identification and definition of
– protocols and services first,
– and APIs and software development kits next
• Provide open-source community development
• Provide immediate usefulness
• Do not provide a virtual machine abstraction
Globus architectural details
• Globus started out with the bottom-up
premise that:
– a grid must be constructed as a set of tools
developed from user requirements.
• This architecture is based on composing tools
from a kit.
– much of the initial design time was spent
determining the user requirements for which grid
tools could be built.
Globus architectural details
• Resource manager to start jobs
– assuming users had procured accounts beforehand on
all of the machines on which they could possibly run
• Tool and API for transferring files from one
machine to another
– used for binary and data transfer
• Tools for procuring credentials and certificates
• Service for collecting resource information about
machines on a grid.
Globus architectural details
• The designers of Globus believe that
– New services/tools must be added to the existing
set such that users can combine available tools to
get the work done.
• Much of the later development in Globus has
been directed
– at composing these tools in order to achieve a
specific goal.
Globus architectural details
• The communications module provides network-
aware communications messaging capabilities.
– The implementation of the communications module
in the Globus Toolkit was called Nexus.
• The resource location and allocation module
provides mechanisms for
– expressing application resource requirements
– identifying resources that meet the requirements
– scheduling resources after they have been located.
Globus architectural details
• The authentication module provides means to verify the
identity of both humans and resources.
– GSSAPI hides the underlying authentication technique: Kerberos
(centralized authentication system) or SSL (PKI)
• The information service module provides a uniform
mechanism for obtaining information about the meta-system.
– Known as the Metacomputing Directory Service (MDS), which
builds on the LDAP API.
• The data access module is responsible for providing
remote access to persistent storage, such as files.
Globus architectural details
• The grid fabric layer provides the resources to
which grid protocols mediate access.
• The connectivity layer defines the core
communication and authentication protocols
required for grid-specific network transactions.
– this layer includes the Grid Security Infrastructure
(GSI).
Globus architectural details
• The resource layer, which defines protocols for
– secure negotiation, initiation, monitoring, control, accounting,
and payment of sharing operations on individual resources
– a Grid Resource Information Protocol (GRIP), a resource
information protocol;
– the Grid Resource Registration Protocol (GRRP), used to
register resources with the Grid Index Information Servers;
– the Grid Resource Access and Management (GRAM) protocol,
used to allocate and monitor resources; and GridFTP, which is
used for data access.
• The collective layer is used to coordinate access to
multiple resources, which, in terms of the GT, refers to
MetaComputing Directory Service (MDS), supported by
GRRP and GRIP.
• Finally, grid applications are at the very top
Globus architectural details
• With respect to grids, a bottom-up approach tends to
– result in early successes simply because the approach targets
immediate user requirements.
– the implementation of a new service or tool can be quick, since
much of the complexity of the underlying substrate is abstracted
away.
• The risk is that this approach may not
– scale as the number of tools or services increases, since an
increasing number of pairwise protocols are necessary to
ensure that the tools compose seamlessly.
– accommodate changing requirements
Layered Architecture
• Applications
• High-level Services and Tools
– GlobusView, Testbed Status, DUROC, MPI, MPI-IO, CC++, Nimrod/G, globusrun
• Core Services
– Nexus, GRAM, Metacomputing Directory Service, Globus Security Interface,
Heartbeat Monitor, Gloperf, GASS
• Local Services
– Condor, MPI, TCP, UDP, LSF, Easy, NQE, AIX, Irix, Solaris
Security in Globus
Overview of Security Standards in the Grid, CSE 225 High Performance and
Computational Grids Spring 2000 kwalsh@ucsd.edu
Globus Security Requirements
• Single sign-on
• Protection of credentials
• Interoperability with local security solutions
• Exportability
• Uniform credentials/certification infrastructure
• Support for secure group communication
• Support for multiple implementations
Globus Security Assumptions
• Grid consists of multiple trust domains
• Resource pool & users are large and dynamic
• Interoperate with local security solutions
– local security policies differ
• Authentication exportable
– cannot directly or indirectly require use of bulk privacy
• Uniform credentials/certification
– a user will be associated differently with each site it has access to
– processes used in a computation are dynamic access
control
Globus Security Infrastructure (GSI)
• Provides authentication and data integrity
– Data signing (not encryption) services for
Unix/Windows client/server programs
• Utilizes an X.509 PKI
• GSI library is layered on top of SSLeay
• Performs X.509 certificate handling & SSL
protocol.
Security in Globus (8)
[Figure: GSI single sign-on. The user's credential is assigned to "user proxies" via a "grid-id" (the Globus credential). User proxy and resources authenticate mutually. At each site (Site 1, Site 2), GRAM maps the Globus credential to local ids, and processes use authenticated interprocess communication. Within GSI, GSSAPI covers multiple low-level mechanisms: Kerberos (ticket) and public key (certificate).]
Security in Globus Standards
• Standards subscribed to:
– Generic Security Services (GSS) RFC 2078
– Secure Socket Layer (SSL)
– Public Key Cryptography based on X.509 certificates
– Kerberos
Security in Globus (2)
– Kerberos
SSL
Technology X.509
Standards SSH PGP PKI Kerberos DCE IPSec VPN
Security
Requirements
Authentication x x x x x x
Authorization x x x x x x x
Assurance x x x x x x
Accounting x x
Audit x x
Integrity x x x x x x
Confidentiality x x x x x x x
Resource Management
Introduction
• The GT resource management components
include a set of service components:
– Globus Resource Allocation Manager
• GRAM.
– Dynamically Updated Request Online Component
• DUROC
– Globus Architecture for Reservation and Allocation
• GARA
What is the role of the GRAM
• GRAM is designed to provide
– a single common protocol and API for requesting and
using remote system resources,
– by providing a uniform, flexible interface to local job
scheduling systems.
• GRAM provides a simple authorization mechanism
based on GSI identities and a mechanism to map
GSI identities to local user accounts.
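The identity-to-account mapping described above can be pictured as a lookup table. The snippet below is a minimal sketch, assuming a grid-mapfile-style text format (a quoted certificate subject followed by a local account name); the subjects, accounts, and function names are made up for illustration and are not taken from the toolkit.

```python
import shlex

def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, account = shlex.split(line)[:2]
        mapping[subject] = account
    return mapping

def authorize(mapping, subject):
    """Return the local account for a GSI identity, or None if it is not listed."""
    return mapping.get(subject)

# Hypothetical entries, for illustration only.
EXAMPLE = '''
# subject                              local account
"/O=Grid/O=Example/CN=Alice Example" alice
"/O=Grid/O=Example/CN=Bob Example" bob01
'''

gridmap = parse_gridmap(EXAMPLE)
print(authorize(gridmap, "/O=Grid/O=Example/CN=Alice Example"))  # alice
print(authorize(gridmap, "/O=Grid/CN=Unknown"))                  # None
```

A subject that is absent from the table is simply not authorized, which is the "simple authorization mechanism" in its most reduced form.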
The main feature of the GRAM
• GRAM reduces the number of mechanisms
required for using remote resources.
• This capability is consistent with the
"hourglass" role played by most of the
GT’s components:
– GRAM is the neck of the hourglass, with
applications and higher-level services
(resource brokers or metaschedulers) above
it and local control and access mechanisms
below it.
– Both sides need work only with GRAM, so
the number of interactions, APIs, and
protocols that need to be used are greatly
reduced.
http://www.globus.org/mds/
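The reduction in interactions claimed for the hourglass can be made concrete with a little arithmetic. The sketch below assumes a simplified model that is not in the slides: without a neck, every higher-level tool integrates with every local mechanism directly; with GRAM as the neck, each side integrates only with GRAM.

```python
def direct_integrations(num_tools, num_local_systems):
    """Every higher-level tool talks to every local mechanism directly."""
    return num_tools * num_local_systems

def hourglass_integrations(num_tools, num_local_systems):
    """Every tool and every local mechanism talks only to the neck (GRAM)."""
    return num_tools + num_local_systems

for tools, local_systems in [(3, 4), (10, 6)]:
    print(tools, local_systems,
          direct_integrations(tools, local_systems),
          hourglass_integrations(tools, local_systems))
# prints:
# 3 4 12 7
# 10 6 60 16
```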
GRAM Components (GT2)
[Figure: GRAM components in GT2. The client uses MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server). Across the site boundary, it uses GRAM client API calls to request resource allocation and process creation. The Gatekeeper, using the Globus Security Infrastructure, creates a Job Manager, which parses the request with the RSL Library, requests the Local Resource Manager to allocate and create processes, and monitors and controls those processes. State changes are reported through GRAM client API state change callbacks, and the Grid Resource Info Server queries the current status of the resource.]
Resource Specification Language
• Much of the power of GRAM is in the RSL
• Common language for specifying job requests
– GRAM service translates this common language into
scheduler specific language
• GRAM service constrains RSL to a conjunction of
(attribute=value) pairs
– E.g. &(executable="/bin/ls")(arguments="-l")
• GRAM service understands a well defined set of
attributes
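The "conjunction of (attribute=value) pairs" form can be illustrated with a few lines of Python. This is a rough sketch rather than a real RSL implementation: `to_rsl` and `from_rsl` are hypothetical helpers, every value is written as a quoted string, and embedded quotes are rejected instead of being escaped.

```python
import re

# Matches one (name="value") pair; values may not contain double quotes here.
_PAIR = re.compile(r'\((\w+)="([^"]*)"\)')

def to_rsl(attributes):
    """Render a dict as a conjunction: &(name="value")(name="value")..."""
    parts = []
    for name, value in attributes.items():
        value = str(value)
        if '"' in value:
            raise ValueError("this sketch does not handle embedded quotes")
        parts.append(f'({name}="{value}")')
    return "&" + "".join(parts)

def from_rsl(text):
    """Parse the simple form produced by to_rsl back into a dict."""
    if not text.startswith("&"):
        raise ValueError("expected a conjunction starting with '&'")
    return dict(_PAIR.findall(text[1:]))

request = to_rsl({"executable": "/bin/ls", "arguments": "-l"})
print(request)            # &(executable="/bin/ls")(arguments="-l")
print(from_rsl(request))  # {'executable': '/bin/ls', 'arguments': '-l'}
```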
A Co-allocation Multirequest
+( & (resourceManagerContact=
*** “flash.isi.edu:2119/jobmanager-
lsf:/O=Grid/…/CN=host/flash.isi.edu”)
(count=1)
Different resource
(label="subjob A") managers
(executable= my_app1)
)
Different ( & (resourceManagerContact=
counts
***“sp139.sdsc.edu:2119:/O=Grid/…/CN=host/sp097.sdsc.edu")
(count=2)
(label="subjob B") Different executables
(executable=my_app2)
)
DUROC: Dynamically Updated Request
Online Component
• Simultaneous allocation of a resource set
– Handled via optimistic co-allocation based on free
nodes or queue prediction
– And advance reservations
• In the GT2:
globusrun co-allocates specific multi-requests
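Job Manager Files
[Figure: files handled by the job manager. The client submits a job to the Gatekeeper, which hands it to the Jobmanager; the Jobmanager runs the JOB and reports job monitoring status (also shown next to GRIS). Labelled items: X509_USER_PROXY (UP), GASS_CACHE, the scheduler description (Exe=x, Args=y, Env=z), the staged EXE, staged stdin, stdout, and stderr.]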
GRAM in the GT3
• GT3 GRAM integrates with the various
metaschedulers and resource brokers.
• GRAM does not provide accounting and billing
features.
– It is assumed that these features, if needed, are
supplied by local management mechanisms such as a
queuing system or scheduler.
GRAM in the GT3
• GRAM allows users to run jobs remotely
– using a set of WSDL/OGSI client interfaces for
submitting, monitoring, and terminating a job.
• Job requests
– are written by the user in the Resource Specification
Language
– and processed by the ManagedJobService as part of the
job request.
[Figure: GT3 GRAM architecture, with arrows numbered 1 to 16 giving the order of interactions. The Master Hosting Environment (MHE) contains the Master Manager Factory Service (Master), the Service Data Aggregator, the Virtual Host Environment Redirector, and the Resource Information Provider Service (RPS). Start UHE (setuid) and Launch UHE, using the gridmap, bring up the User Hosting Environment (UHE), which contains the Managed Job Factory Service (MJFS), the Managed Job Service (MJS), the File Stream Factory Service (stdout), the File Stream Service (FSS), and the Grid Resource Identity Mapper (GRIM). The MJS works with the Scheduling System on the Host System.]
http:www-unix.globus.org/developper/resource-management.html
Globus Information Service
GT3 Information Services
• What is GT3 Information Services?
– Grid service which provides information about Grid
resources
– modular Java component framework for OGSA
• that service developers can use to implement various information
management solutions for GT3-compatible OGSA Services
and Service Data
GT2 vs. GT3 Information Services
• Components
– GT2: MDS; GRAM Reporter
– GT3: Index Service, Service Data Providers and Aggregators,
Query and Notification Framework; Resource Information Provider Service
• Data Format
– GT2: LDIF
– GT3: XML
• Data Source
– GT2: GLUE providers
– GT3: GLUE providers, Service Instance
• Query Mechanisms
– GT2: LDAP
– GT3: XPath, XQuery
Service Data
• A Grid service instance maintains a set of service data
elements (SDE)
– Declared via an extended XSD element declaration, placed
in a WSDL portType
– Includes basic introspection information, interface-specific
data, and application state
• Pull and push models for information query
– GridService::FindServiceData operation
• Pull: queries this information via extensible query language
– NotificationSource::Subscribe
• Push: Subscribe to notification of changes to information
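The difference between the pull and push models can be sketched with a toy in-memory class. This is only an illustration of the two access patterns; `ServiceDataSet`, its methods, and the `cpu-load` element name are hypothetical stand-ins and not the OGSI interfaces themselves.

```python
class ServiceDataSet:
    """Toy stand-in for the service data elements of one service instance."""

    def __init__(self):
        self._elements = {}     # SDE name -> current value
        self._subscribers = {}  # SDE name -> list of callbacks

    def find_service_data(self, name):
        """Pull: the client asks for the current value when it wants it."""
        return self._elements.get(name)

    def subscribe(self, name, callback):
        """Push: the client registers once and is told about every change."""
        self._subscribers.setdefault(name, []).append(callback)

    def set(self, name, value):
        """Update an element and notify all subscribers of that element."""
        self._elements[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)

sds = ServiceDataSet()
received = []
sds.subscribe("cpu-load", lambda name, value: received.append((name, value)))
sds.set("cpu-load", 0.42)

print(sds.find_service_data("cpu-load"))  # pull: 0.42
print(received)                           # push: [('cpu-load', 0.42)]
```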
Why Service Data?
• Discovery often requires instance-specific,
perhaps dynamic information
• Service data offers a general solution
– Every service must support some common service data,
and may support any additional service data desired
– Not just meta-data, but also instance state
• Part of GT MDS-2 model contained in OGSI
– Defines standard data model, and query
– Complements soft-state registration and notification
OGSI ServiceData Model
• ServiceData for self-description
– Model service with properties!
• Fine-grained view of resource functionality
• Data scoped by service instance
– Domain-dependent state
• Service discovery/monitoring information
• Stateful properties of service, e.g. cpu-load
Components
Basic Components
• Service Data Provider Components
• Service Data Aggregation Components
• Registry Components
Service Data Provider
Components
• Service Data Provider components
– provide a standard mechanism for dynamic generation
of service data via external programs
• External provider programs
– can be the core providers that are part of GT3 or
– can be user-created, custom providers
In Detail: Service Data Providers
• Service Data Provider interfaces are designed to
support execution in either
– synchronous (“pull”) mode
– asynchronous (“push”) mode
• A valid provider
– is composed of any Java class which implements at
least one of three predefined Java Interfaces
– generates a compatible form of XML output as the
result of its execution
In Detail: Service Data Providers
• Provider Interfaces
– SimpleDataProvider
• synchronous provider which produces XML output in the form of a
Java OutputStream
– DOMDataProvider
• synchronous extension of SimpleDataProvider which can also
produce XML output in the form of a Java org.w3c.dom.Document
– AsyncDataProvider
• asynchronous version of SimpleDataProvider that sends its output to
the specified callback Object; the provider implementer and the provider
caller are assumed to have agreed on the callback interface at
compile time
In Detail: Provider Interfaces
• SimpleDataProvider
– Basic interface which all service data providers must
implement
In Detail: Provider Interfaces
• DOMDataProvider
– Generic interface for XML service data providers that
are capable of emitting an org.w3c.dom.Document
object at runtime
In Detail: Provider Interfaces
• AsyncDataProvider
– Asynchronous version of provider interface
In Detail: GT3 Providers
• AsyncDocumentProvider
– An asynchronous version of a generic XML document
provider
• ScriptExecutionProvider
– ServiceDataProvider that provides a generic way to
execute scripts which produce XML documents
• HostScriptProvider
– Constructs Host service data from the output of
multiple scripts
In Detail: GT3 Providers
• ForkInfoProvider
– ServiceDataProvider which monitors local system PIDs
• PBSInfoProvider
– ServiceDataProvider which queries PBS for queue
information
• SimpleSystemInformationProvider
– Basic MDS GRIS-style sensor which emits system
information in XML, with state managed directly as an
XML Document using JDOM - JDK 1.3 compatible
In Detail: Provider Manager
• Provider execution
– is handled by the ServiceDataProviderManager class, which
schedules and manages provider execution as Java TimerTasks
• ServiceDataProviderManager
– uses an XML-based configuration file to load and link installed
Service Data Providers during runtime through standard Java
reflection methods
• Configuration file
– $GLOBUS_LOCATION/etc/indexservice.providers
– $GLOBUS_LOCATION/etc/rips.providers
In Detail: Provider Manager
• Configuration entry for the provider in a
configuration file
– enables your provider for execution by the Provider
Manager
– publishes the existence of your provider to clients
• Required attribute in the configuration entry
– the “class” attribute, which is simply the fully qualified
Java class name
In Detail: Custom Data Handlers
• The default data processing behavior of the Provider Manager
– take the logical XML document result of a provider’s execution
– wrap it in a new SDE
– and then add it to the Service’s ServiceDataSet
• We can override the default data processing logic in the
Provider Manager
– by specifying the “handler” attribute in the Provider’s configuration file
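The default processing path and the handler override can be sketched as follows. This is a simplified Python illustration, not the GT3 Java code: the configuration is modelled as a dict with the "class" and "handler" attributes mentioned above, and `run_provider`, `default_handler`, `host_provider`, and `load_only_handler` are hypothetical names.

```python
import xml.etree.ElementTree as ET

def default_handler(service_data_set, provider_name, xml_text):
    """Default behaviour: wrap the provider's XML result in a new SDE and add it."""
    document = ET.fromstring(xml_text)
    service_data_set[provider_name] = {"name": provider_name, "value": document}

def run_provider(service_data_set, config, provider):
    """Run one provider and route its output through the configured handler."""
    handler = config.get("handler", default_handler)
    handler(service_data_set, config["class"], provider())

def host_provider():
    """Hypothetical provider that emits a small XML document."""
    return "<Host><Load>0.5</Load></Host>"

def load_only_handler(service_data_set, provider_name, xml_text):
    """Custom handler: keep only the load figure instead of the whole document."""
    service_data_set["load"] = float(ET.fromstring(xml_text).findtext("Load"))

sde_set = {}
run_provider(sde_set, {"class": "example.HostProvider"}, host_provider)
run_provider(sde_set,
             {"class": "example.HostProvider", "handler": load_only_handler},
             host_provider)

print(sde_set["example.HostProvider"]["value"].tag)  # Host
print(sde_set["load"])                               # 0.5
```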
In Detail: Mechanisms
[Figure: a service or user calls enumProvider and executeProvider on the Provider Manager, which runs information providers (SimpleDataProvider, DomDataProvider, AsyncDataProvider) and can pass their results to a Custom Data Handler.]
Service Data Aggregation
Components
• ServiceDataAggregator components
– provide a reusable mechanism for handling subscription,
notification, and updating of
• locally stored copies of service data which is generated by other
services
• By using the ServiceDataAggregator class in your service
code
– service data from both locally executing information providers and
other OGSA service instances can be aggregated into any given
service
In Detail: Service Data Aggregator
• ServiceDataAggregator component
– is used to perform server-side notification subscription
management
• Key additional feature of ServiceDataAggregator
– notification data that is processed by the
deliverNotification() function
• is actually copied and stored locally as a SDE, which includes
creation timestamp, TTL and source metadata
In Detail: Service Data Aggregator
• Aggregated SDEs are
– organized by SDE QName
– stored in an array which is returned as an
org.gridforum.ogsa.ServiceDataSetType to FindServiceData name
queries
• Originator field (GSH type) of the OGSA ServiceDataType
– is used as the “primary key”
• to differentiate like-named entries from each other
• to identify the source of the data itself
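The bookkeeping described on these slides (a local copy per notification, organised by SDE QName, with the originator acting as primary key) can be sketched like this. The class and method names are hypothetical, the handles are invented, and dropping entries older than their TTL at query time is an assumption of this sketch rather than documented behaviour.

```python
import time

class Aggregator:
    """Toy aggregator: stores notified values as local copies with metadata."""

    def __init__(self):
        self._by_qname = {}  # SDE QName -> {originator: entry}

    def deliver_notification(self, qname, originator, value, ttl_seconds, now=None):
        """Copy the notified value locally with timestamp, TTL and source."""
        now = time.time() if now is None else now
        entry = {"value": value, "created": now,
                 "ttl": ttl_seconds, "source": originator}
        # The originator is the key, so like-named entries from different
        # services stay separate and a repeat from one service replaces its own.
        self._by_qname.setdefault(qname, {})[originator] = entry

    def find_service_data(self, qname, now=None):
        """Return the stored entries for a QName that are still within their TTL."""
        now = time.time() if now is None else now
        return [e for e in self._by_qname.get(qname, {}).values()
                if now - e["created"] <= e["ttl"]]

agg = Aggregator()
agg.deliver_notification("Host", "gsh://siteA/host", {"load": 0.2}, 60, now=100.0)
agg.deliver_notification("Host", "gsh://siteB/host", {"load": 0.9}, 60, now=101.0)
agg.deliver_notification("Host", "gsh://siteA/host", {"load": 0.3}, 60, now=130.0)

for entry in agg.find_service_data("Host", now=140.0):
    print(entry["source"], entry["value"])
```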
In Detail: AggregatorPortType
• addSubscription and removeSubscription
In Detail: Mechanisms
[Figure: a service or user calls addSubscription and removeSubscription on the Aggregator; Grid services send it deliverNotification messages.]
Registry Components
• Registry components
– maintain a set of available peer Grid Service Handles
– provides soft-state cataloging of a set of Grid Services
• i.e., the registry of services is periodically updated with
existence notification messages and any existing entries which
fail to refresh within the timeout period are eventually expired
• Registries
– can be used to support query or other operations that
may apply to one or more services in a set
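Soft-state cataloguing can be sketched in a few lines: each registration is only a timestamped refresh, and entries that are not refreshed within the timeout disappear. The class below is a hypothetical illustration with explicit clock values so the behaviour is easy to follow; the handles are invented.

```python
class SoftStateRegistry:
    """Toy registry whose entries expire unless they are refreshed."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._last_seen = {}  # service handle -> time of last refresh

    def register_service(self, handle, now):
        """Register or refresh: both just record the time the message arrived."""
        self._last_seen[handle] = now

    def unregister_service(self, handle):
        """Explicit removal; unknown handles are ignored."""
        self._last_seen.pop(handle, None)

    def expire(self, now):
        """Drop entries that failed to refresh within the timeout period."""
        stale = [h for h, seen in self._last_seen.items()
                 if now - seen > self.timeout]
        for handle in stale:
            del self._last_seen[handle]
        return stale

    def handles(self):
        return sorted(self._last_seen)

registry = SoftStateRegistry(timeout_seconds=30)
registry.register_service("gsh://siteA/job-service", now=0)
registry.register_service("gsh://siteB/job-service", now=0)
registry.register_service("gsh://siteA/job-service", now=25)  # refresh

print(registry.expire(now=40))  # ['gsh://siteB/job-service']
print(registry.handles())       # ['gsh://siteA/job-service']
```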
In Detail: RegistrationPortType
• registerService and unregisterService
In Detail: Mechanisms
[Figure: Grid services call registerService and unregisterService on the Registry.]
Logical structure of the Index Service
[Figure: logical structure of the Index Service. At the collective layer, the Index Service combines a Registry mechanism, an Aggregator mechanism (caching here), and a Provider mechanism running Java providers, and exposes GSHs and service data to the user. At the resource layer, Grid services, each with its own service data, send existence notification messages and notification messages up to the Index Service.]
Globus Data Management services
Reliable File Transfer Service
Overview
• The Reliable File Transfer Service (RFT) is an OGSA-based
service that provides interfaces for:
– Controlling and monitoring 3rd party file transfers using GridFTP
servers.
– The client controlling the transfer is hosted inside of a grid service
so it can be managed using the soft state model and queried using
the ServiceData interfaces available to all grid services.
• It is essentially a reliable and recoverable version of the
GT2 globus-url-copy tool and more.
Prerequisites and Dependencies
• The Prerequisites to RFT are:
– GridFTP Server with a Host Certificate
– PostgreSQL
• PostgreSQL is used to store the state of the transfer to
allow for restart after failures.
– The interface to PostgreSQL is JDBC so any DBMS that supports
JDBC can be used.
Note: GT3 used PostgreSQL version 7.3.2 for testing, and the instructions
provided to set up the database apply to that version.
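Why a database is a prerequisite becomes clearer with a small sketch of persisted transfer state. The code below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL over JDBC, and the table layout, function names, and URLs are assumptions made for this example, not the RFT schema.

```python
import sqlite3

def open_state(path=":memory:"):
    """Open (or create) the table that records per-transfer progress."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS transfer ("
        "id INTEGER PRIMARY KEY, source TEXT, dest TEXT, "
        "bytes_done INTEGER NOT NULL DEFAULT 0, "
        "finished INTEGER NOT NULL DEFAULT 0)")
    return db

def add_transfer(db, source, dest):
    cur = db.execute("INSERT INTO transfer (source, dest) VALUES (?, ?)",
                     (source, dest))
    db.commit()
    return cur.lastrowid

def record_progress(db, transfer_id, bytes_done, finished=False):
    """Persist a restart point so the transfer can resume after a failure."""
    db.execute("UPDATE transfer SET bytes_done = ?, finished = ? WHERE id = ?",
               (bytes_done, int(finished), transfer_id))
    db.commit()

def pending_transfers(db):
    """What a restarted service would read to decide where to resume."""
    return db.execute(
        "SELECT id, source, dest, bytes_done FROM transfer "
        "WHERE finished = 0 ORDER BY id").fetchall()

db = open_state()
tid = add_transfer(db, "gsiftp://src.example.org/data/a",
                   "gsiftp://dst.example.org/data/a")
record_progress(db, tid, 4096)
print(pending_transfers(db))
```

After a crash, a restarted service would read the pending rows and resume each transfer from its recorded offset instead of starting over.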
Prerequisites and Dependencies
• GridFTP performs the actual file transfer.
• The GridFTP server can only be run on Unix or Linux.
• There are 2 ways to get GridFTP:
– Packaged with the core GT3 Final installation
– As part of the Globus Toolkit 2.4 distribution
Prerequisites and Dependencies
1. PostgreSQL Setup
2. Configure and Run a GridFTP Server
3. RFT Grid Service Setup
4. Build the GAR from Source Distribution
www-unix.globus.org/toolkit/reliable_transfer.html
Service Data Elements for RFT
• Version: version of RFT.
• FileTransferProgress: SDE that denotes the percentage of the file that has been transferred
• FileTransferRestartMarker: SDE for the last restart marker for a particular
transfer
• FileTransferJobStatusElement: SDE for status of a particular transfer
• FileTransferStatusElement: SDE that denotes the status of all the transfers in the
request
• GridFTPRestartMarkerElement: SDE of Restart marker of the transfer
• GridFTPPerfMarkerElement: SDE of Performance Marker of the transfer
The Replica Location Service
The Replica Location Service
• The replica location service (RLS) maintains and provides
access to mapping information from logical names for data
items to target names.
• The distributed RLS is intended to replace the centralized
Globus replica catalog available in earlier releases of
GT2.x.
• The distributed RLS provides higher performance,
reliability and scalability.
Replica Location Service
• Replication of data items can reduce access
latency, improve data locality, and increase
robustness, scalability and performance for
distributed applications.
• An RLS typically does not operate in isolation, but
functions as one component of a data grid
architecture.
Replica Location Service
• Consistent local state maintained in Local Replica Catalogs
(LRCs).
– Local catalogs maintain mappings between arbitrary logical file
names (LFNs) and the physical file names (PFNs) associated with
those LFNs on its storage system(s).
• Collective state with relaxed consistency maintained in
Replica Location Indices (RLIs).
– Each RLI contains a set of mappings from LFNs to LRCs. A
variety of index structures can be defined with different
performance characteristics, simply by varying the number of RLIs
and amount of redundancy and partitioning among the RLIs.
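The two levels of mapping can be sketched with plain dictionaries: an LRC maps LFNs to PFNs, and an RLI maps LFNs to the LRCs that know about them. The classes, catalog names, and file names below are hypothetical and ignore networking, consistency, and partitioning.

```python
class LocalReplicaCatalog:
    """Maps logical file names (LFNs) to physical file names (PFNs) at one site."""

    def __init__(self, name):
        self.name = name
        self._pfns = {}  # LFN -> set of PFNs

    def add(self, lfn, pfn):
        self._pfns.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        return sorted(self._pfns.get(lfn, ()))

    def lfns(self):
        return set(self._pfns)

class ReplicaLocationIndex:
    """Maps LFNs to the LRCs that hold mappings for them."""

    def __init__(self):
        self._lrcs = {}  # LFN -> {LRC name: LRC}

    def update_from(self, lrc):
        """Record which LFNs this LRC currently knows about."""
        for lfn in lrc.lfns():
            self._lrcs.setdefault(lfn, {})[lrc.name] = lrc

    def find_replicas(self, lfn):
        """Two-step lookup: the index names the LRCs, the LRCs give the PFNs."""
        pfns = []
        for name in sorted(self._lrcs.get(lfn, {})):
            pfns.extend(self._lrcs[lfn][name].lookup(lfn))
        return pfns

site_a = LocalReplicaCatalog("lrc-a")
site_b = LocalReplicaCatalog("lrc-b")
site_a.add("run42.dat", "gsiftp://a.example.org/store/run42.dat")
site_b.add("run42.dat", "gsiftp://b.example.org/disk1/run42.dat")

index = ReplicaLocationIndex()
index.update_from(site_a)
index.update_from(site_b)
print(index.find_replicas("run42.dat"))
```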
Replica Location Service
• Soft state maintenance of RLI state.
– LRCs send information about their state to RLIs using soft state
protocols. State information in RLIs times out and must be periodically
refreshed by soft state updates.
• Compression of state updates.
– Optional compression uses Bloom filters to summarize the content of an LRC
before sending a soft state update to an RLI node.
• Membership and partitioning information maintenance.
– The current RLS implementation maintains static information about the
LRCs and RLIs participating in the distributed system.
– As new implementations of the RLS are developed, they will use OGSA
mechanisms for registration of services and for service lifetime management.
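The compression idea mentioned above, summarizing the names in an LRC with a Bloom filter, can be sketched as follows. This is a generic Bloom filter, not the RLS implementation; the bit-array size, the number of hash functions, and the use of SHA-256 are arbitrary choices made for the example.

```python
import hashlib

class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        """True means 'possibly present'; False means 'definitely absent'."""
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

# An LRC would add every LFN it holds and send only the bit array to the RLI.
summary = BloomFilter()
for lfn in ["run41.dat", "run42.dat", "run43.dat"]:
    summary.add(lfn)

print(summary.might_contain("run42.dat"))  # True
```

The trade-off is that an index built from such summaries can occasionally point to an LRC that does not hold the name, so a lookup at the LRC is still needed to confirm.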
Relationship to Earlier Globus Replica
Management Software
• The RLS is intended to replace replica management tools available in GT2.X,
including:
– the Replica Catalog API
– the Replica Management API.
• The RLS differs from these earlier components in several important ways.
– As a distributed system, the RLS is designed to provide reliability by avoiding
single points of failure, load balancing, performance and scalability.
– The RLS implementation is based on open source relational database technology.
– The RLS separates replication information from other types of metadata.
• The RLS does not include information about logical collections, but assumes such
information is stored in a separate metadata service.
The GridFTP Protocol and
Software
What is GridFTP ?
• GridFTP is a high-performance, secure, reliable
data transfer protocol optimized for high-
bandwidth wide-area networks.
• The GridFTP protocol is based on FTP, the
highly-popular Internet file transfer protocol.
Protocol Features
• GSI security on control and data channels
• Multiple data channels for parallel transfers
• Partial file transfers
• Third-party (direct server-to-server) transfers
• Authenticated data channels
• Reusable data channels
• Command pipelining
Protocol Features
• Grid Security Infrastructure (GSI) and Kerberos support:
– Robust and flexible authentication, integrity, and confidentiality features are
critical when transferring or accessing files.
• Third-party control of data transfer:
– In order to manage large data sets for large distributed communities, it is necessary
to provide third-party control of transfers between storage servers.
• Parallel data transfer:
– On wide-area links, using multiple TCP streams can improve aggregate bandwidth
over using a single TCP stream.
• Striped data transfer:
– Partitioning data across multiple servers can further improve aggregate bandwidth.
GridFTP supports striped data transfers through extensions defined in the Grid
Forum draft.
Protocol Features
• Partial file transfer:
– GridFTP introduces new FTP commands to support transfers of regions of a file.
• Support for reliable data transfer:
– Reliable transfer is important for many applications that manage data. Fault
recovery methods for handling transient network failures, server outages, etc., are
needed
• Manual control of TCP buffer size:
– This is a critical parameter for achieving maximum bandwidth with TCP/IP. The
protocol also has support for automatic buffer size tuning
• Integrated Instrumentation:
– The protocol calls for restart and performance markers to be sent back. It is not
specified how often, and this is something we intend to address shortly.
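Partial file transfer and parallel streams both rest on describing a file as a set of byte regions. The helper below is a small sketch of that arithmetic only; it does not speak the GridFTP protocol, and the function name and the even-split policy are assumptions made for illustration.

```python
def split_into_ranges(file_size, num_streams):
    """Split [0, file_size) into contiguous (offset, length) regions, one per stream."""
    if file_size < 0 or num_streams < 1:
        raise ValueError("file_size must be >= 0 and num_streams >= 1")
    base, extra = divmod(file_size, num_streams)
    ranges, offset = [], 0
    for i in range(num_streams):
        # The first `extra` regions take one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # fewer bytes than streams: leave the spare streams idle
        ranges.append((offset, length))
        offset += length
    return ranges

print(split_into_ranges(10, 3))  # [(0, 4), (4, 3), (7, 3)]
print(split_into_ranges(2, 4))   # [(0, 1), (1, 1)]
```

Each region could then be requested as a partial transfer on its own stream, and a restart after a failure only needs the regions that were not completed.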
What Does “GridFTP” Mean?
• GridFTP Protocol:
– This refers to the wire protocol used and is defined by a draft technical
specification submitted to the Global Grid Forum.
• The Globus Toolkit V2.0 GridFTP Server (GT2GridFTP):
– This system is the widely used open source wuftpd FTP server code base
extended to support the GridFTP protocol extensions.
– GT2GridFTP is distributed with the Globus Toolkit.
• The GridFTP family of tools: the term “GridFTP” is used to refer to
the entire family of GridFTP tools distributed with the Globus Toolkit:
The GridFTP server, client tools, client library, control library, etc.
Implementation
• The Globus implementation of the GridFTP protocol takes the form of two
APIs and corresponding libraries:
– globus_ftp_control
– globus_ftp_client.
• Besides supporting the protocol features described above, the APIs also
include interfaces for adding software "plug-ins".
• In addition to Globus software libraries, we have also implemented
– an API/library (globus_gass_copy)
– a command-line tool (globus-url-copy) that integrates GridFTP, HTTP, and local
file I/O to enable secure transfers using any combination of these protocols.
• Globus has adapted a popular FTP server package (Washington University's
wu-ftpd) to support a majority of the GridFTP protocol features (GSI security,
parallel transfer, third-party transfer, partial file transfer).
Availability of the GridFTP
• Our data grid software is currently available to the
public as components of the Globus Toolkit 2.0
release.
• Prior to GT2.X release, the software was tested
and evaluated for more than a year by several
external project teams who are using our
technologies to build data grids for their own use.
GASS: Global Access to Secondary
Storage
Requirement for Grid I/O service
• Uniform data access
• Diverse data source
• Dynamic resource set
• Support for streaming I/O
• Little or no program modification
• Support for programmer-direct performance
optimization
Joseph Bester et al. “GASS: A Data Movement and Access Service for Wide Area Computing Systems”
GASS Architecture
• Common Grid File Access Patterns
• Default Data Movement Strategies
• Specialized Data Movement Strategies
• GASS Operation
• Integration with the Globus Toolkit
Common Grid File Access
Patterns
• Read-only access
• Write-shared access
• Append-only access
• Unrestricted read/write
[Figure: six file access patterns: read-only access to constant data (read entire file); write access to an entire file with multiple writers (last writer wins); append-only access with multiple writers; concurrent write and read access; concurrent write access to the same file; read-only access to part of the file.]
Default Data Movement
Strategies
• GASS addresses bandwidth management
issues by providing a file cache: a “local”
secondary storage
• By default, data is moved into and out of this
cache when files are opened and closed
[Figure: processes work against local caches, which exchange data with a GASS server, an HTTP server, an FTP server, and an HPSS server.]
GASS Operation
• Grid applications access remote files using GASS by
opening and closing the files with specialized open
and close calls
– globus_gass_open()
– globus_gass_fopen()
– globus_gass_close()
– globus_gass_fclose()
Note: the GASS open and close calls act like their
standard Unix I/O counterparts, except that a URL
rather than a file name is used to specify the location
of the file data.
Integration With Globus Toolkit
• The availability of GASS services has made it
straightforward to extend the GRAM API:
– Allow both executables and standard input,
output, and error streams to be named by URLs
– GASS mechanisms are used to fetch
• URL-named executable into the cache.
• standard input, and to redirect standard output and error.