The Globus Project
Adam Belloum
Computer Architecture & Parallel Systems group
University of Amsterdam
email@example.com

Globus requirements
• Security
• Global name space
• Fault tolerance
• Accommodating heterogeneity
• Binary management and application provisioning
• Multilanguage support
• Persistence
• Extensibility
• Site autonomy
• Complexity management

Globus design principles
• Provide a toolkit
  – from which users can pick and choose
• Focus on low-level functionality,
  – facilitating high-level tools (general usability)
• Use standards whenever possible
  – for both interfaces and implementations
• Emphasize the identification and definition of
  – protocols and services first,
  – and APIs and software development kits next
• Provide open-source community development
• Provide immediate usefulness
• Do not provide a virtual machine abstraction

Globus architectural details
• Globus started out with the bottom-up premise that:
  – a grid must be constructed as a set of tools developed from user requirements.
• This architecture is based on composing tools from a kit.
  – Much of the initial design time was spent determining the user requirements for which grid tools could be built.

Globus architectural details
• Resource manager to start jobs
  – assuming users had procured accounts beforehand on all of the machines on which they could possibly run
• Tool and API for transferring files from one machine to another
  – used for binary and data transfer
• Tools for procuring credentials and certificates
• Service for collecting resource information about machines on a grid

Globus architectural details
• The designers of Globus believe that
  – new services and tools must be added to the existing set such that users can combine available tools to get the work done.
• Much of the later development in Globus has been directed
  – at composing these tools in order to achieve a specific goal.
Globus architectural details
• The communications module provides network-aware communications messaging capabilities.
  – The implementation of the communications module in the Globus Toolkit was called Nexus.
• The resource location and allocation module provides mechanisms for
  – expressing application resource requirements
  – identifying resources that meet the requirements
  – scheduling resources after they have been located.

Globus architectural details
• The authentication module provides means to verify the identity of both humans and resources.
  – GSSAPI hides the underlying authentication technique: Kerberos (centralized authentication system) or SSL (PKI)
• The information service module provides a uniform mechanism for obtaining information about the meta-system.
  – Known as the Metacomputing Directory Service (MDS), which builds upon an LDAP API.
• The data access module is responsible for providing remote access to persistent storage, such as files.

Globus architectural details
• The grid fabric layer provides the resources to which grid protocols mediate access.
• The connectivity layer defines the core communication and authentication protocols required for grid-specific network transactions.
  – This layer includes the Grid Security Infrastructure (GSI).

Globus architectural details
• The resource layer defines protocols for
  – secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources
  – the Grid Resource Information Protocol (GRIP), a resource information protocol;
  – the Grid Resource Registration Protocol (GRRP), used to register resources with the Grid Index Information Servers;
  – the Grid Resource Access and Management (GRAM) protocol, used to allocate and monitor resources; and GridFTP, which is used for data access.
• The collective layer coordinates access to multiple resources; in terms of the GT, this refers to the Metacomputing Directory Service (MDS), supported by GRRP and GRIP.
• Finally, grid applications are at the very top.

Globus architectural details
• With respect to grids, a bottom-up approach tends to
  – result in early successes, simply because the approach targets immediate user requirements.
  – make the implementation of a new service or tool quick, since much of the complexity of the underlying substrate is abstracted away.
• The risk is that this approach may not
  – scale as the number of tools or services increases, since an increasing number of pairwise protocols are necessary to ensure that the tools compose seamlessly.
  – accommodate changing requirements.

Layered Architecture
[Figure: layered architecture, listed from top to bottom]
• Applications
• High-level Services and Tools: GlobusView, Testbed Status, DUROC, MPI, MPI-IO, CC++, Nimrod/G, globusrun
• Core Services: Nexus, GRAM, Metacomputing Directory Service, Globus Security Interface, Heartbeat Monitor, Gloperf, GASS
• Local Services: Condor, MPI, TCP, UDP, LSF, Easy, NQE, AIX, Irix, Solaris

Security in Globus
Overview of Security Standards in the Grid, CSE 225 High Performance and Computational Grids, Spring 2000
firstname.lastname@example.org

Globus Security Requirements
• Single sign-on
• Protection of credentials
• Interoperability with local security solutions
• Exportability
• Uniform credentials/certification infrastructure
• Support for secure group communication
• Support for multiple implementations

Globus Security Assumptions
• Grid consists of multiple trust domains
• Resource pool & users are large and dynamic
• Interoperate with local security solutions
  – local security policies differ
• Authentication exportable
  – cannot directly or indirectly require use of bulk privacy
• Uniform credentials/certification
  – a user will be associated differently with each site it has access to
  – processes used in a computation are dynamic access control

Globus Security Infrastructure (GSI)
• Provides authentication and data integrity
  – Data signing (not encryption) services for Unix/Windows client/server programs
• Utilizes an X.509 PKI
• The GSI library is layered on top of SSLeay
• Performs X.509 certificate handling and the SSL protocol.

Security in Globus (8)
[Figure: GSI credential flow between a user and two sites. Labels: single sign-on via “grid-id”; assignment of credentials to “user proxies”; mutual user-resource authentication; mapping to local ids; authenticated interprocess communication; GSSAPI over multiple low-level mechanisms (Kerberos ticket, public-key certificate).]

Security in Globus Standards
• Standards subscribed to:
  – Generic Security Services (GSS) RFC 2078
  – Secure Socket Layer (SSL)
  – Public Key Cryptography based on X.509 certificates
  – Kerberos

Security in Globus (2)
[Table: technology standards (X.509, SSL, SSH, PGP, PKI, Kerberos, DCE, IPSec, VPN) marked against security requirements (authentication, authorization, assurance, accounting, audit, integrity, confidentiality).]

Resource Management Introduction
• The GT resource management components include a set of service components:
  – Globus Resource Allocation Manager (GRAM)
  – Dynamically Updated Request Online Component (DUROC)
  – Globus Architecture for Reservation and Allocation (GARA)

What is the role of the GRAM
• GRAM is designed to provide
  – a single common protocol and API for requesting and using remote system resources,
  – by providing a uniform, flexible interface to local job scheduling systems.
• GRAM provides a simple authorization mechanism based on GSI identities and a mechanism to map GSI identities to local user accounts.

The main feature of the GRAM
• GRAM reduces the number of mechanisms required for using remote resources.
• This capability is consistent with the "hourglass" role played by most of the GT’s components:
  – GRAM is the neck of the hourglass, with applications and higher-level services (resource brokers or metaschedulers) above it and local control and access mechanisms below it.
  – Both sides need work only with GRAM, so the number of interactions, APIs, and protocols that need to be used is greatly reduced.
http://www.globus.org/mds/

GRAM Components (GT2)
[Figure: the client uses MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server), and GRAM client API calls to request resource allocation and process creation, with GRAM client API state change callbacks in return. Inside the site boundary, behind the Globus Security Infrastructure, the Gatekeeper creates a Job Manager; the Job Manager parses the request with the RSL Library, asks the Local Resource Manager to allocate and create processes, and monitors and controls them. The Grid Resource Info Server queries the current status of the resource.]

Resource Specification Language
• Much of the power of GRAM is in the RSL
• Common language for specifying job requests
  – GRAM service translates this common language into scheduler-specific language
• GRAM service constrains RSL to a conjunction of (attribute=value) pairs
  – E.g. &(executable="/bin/ls")(arguments="-l")
• GRAM service understands a well-defined set of attributes

A Co-allocation Multirequest
  +( & (resourceManagerContact="flash.isi.edu:2119/jobmanager-lsf:/O=Grid/…/CN=host/flash.isi.edu")
       (count=1)
       (label="subjob A")
       (executable=my_app1)
   )
   ( & (resourceManagerContact="sp139.sdsc.edu:2119:/O=Grid/…/CN=host/sp097.sdsc.edu")
       (count=2)
       (label="subjob B")
       (executable=my_app2)
   )
• The two subjobs use different resource managers, different counts, and different executables.

DUROC: Dynamically Updated Request Online Component
• Simultaneous allocation of a resource set
  – Handled via optimistic co-allocation based on free nodes or queue prediction
  – And advance reservations
• In the GT2: globusrun co-allocates specific multi-requests

Job Manager Files
[Figure: a client submits a job through the Gatekeeper to the Jobmanager, and job monitoring status is available through GRIS. Labels for the job manager files: X509_USER_PROXY (UP), GASS_CACHE, scheduler description (Exe=x, Args=y, Env=z), staged EXE, staged stdin, stdout, stderr.]

GRAM in the GT3
• GT3 GRAM integrates with the various metaschedulers and resource brokers.
• GRAM does not provide accounting and billing features.
  – It is assumed that these features, if needed, are supplied by local management mechanisms such as a queuing system or scheduler.

GRAM in the GT3
• GRAM allows jobs to be run remotely
  – using a set of WSDL/OGSI client interfaces for submitting, monitoring, and terminating a job.
• Job requests
  – are written by the user in the Resource Specification Language
  – and processed by the ManagedJobService as part of the job request.
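The RSL slides describe a job request as a conjunction of (attribute=value) pairs, and a co-allocation multirequest as a "+" over several such conjunctions. A minimal Python sketch of assembling those strings follows. The function names are hypothetical and are not part of any Globus API; the sketch always quotes string values, leaves integers bare, and does not attempt RSL quote escaping.

```python
def rsl_relation(attribute, value):
    """Render a single (attribute=value) relation."""
    if isinstance(value, int):
        return f"({attribute}={value})"
    text = str(value)
    if '"' in text:
        # Escaping embedded quotes is out of scope for this sketch.
        raise ValueError("values containing double quotes are not handled here")
    return f'({attribute}="{text}")'


def rsl_conjunction(attributes):
    """Render one job request: '&' followed by its relations, in dict order."""
    return "&" + "".join(rsl_relation(k, v) for k, v in attributes.items())


def rsl_multirequest(subjobs):
    """Render a co-allocation multirequest: '+' over parenthesised conjunctions."""
    return "+" + "".join("(" + rsl_conjunction(job) + ")" for job in subjobs)


if __name__ == "__main__":
    print(rsl_conjunction({"executable": "/bin/ls", "arguments": "-l"}))
    print(rsl_multirequest([
        {"count": 1, "label": "subjob A", "executable": "my_app1"},
        {"count": 2, "label": "subjob B", "executable": "my_app2"},
    ]))
```

Building the request from a dictionary keeps each subjob's attributes together, which mirrors how the multirequest on the slide varies the resource manager, count, and executable per subjob.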
[Figure: GT3 resource management components with numbered interaction steps (1 to 16). Labels: Master Hosting Environment (MHE); Master Manager Factory Service (Master); Service Data Aggregator; Virtual Host Environment Redirector; Resource Information Provider Service (RPS); Launch UHE; Start UHE (setuid); gridmap; User Hosting Environment (UHE); Managed Job Factory Service (MJFS); Managed Job Service (MJS); File Stream Factory Service (stdout); File Stream Service (FSS); Grid Resource Identify Mapper (GRIM); Scheduling System; Host System.]
http:www-unix.globus.org/developper/resource-management.html

Globus Information Service

GT3 Information Services
• What is GT3 Information Services?
  – Grid service which provides information about Grid resources
  – Modular Java component framework for OGSA
    • service developers can use it to implement various information management solutions for GT3-compatible OGSA Services and Service Data

GT2 vs. GT3 Information Services
• Components
  – GT2: MDS. GT3: Index Service, Service Data Providers and Aggregators, Query and Notification Framework
  – GT2: GRAM Reporter. GT3: Resource Information Provider Service
• Data Format
  – GT2: LDIF. GT3: XML
• Data Source
  – GT2: GLUE providers. GT3: GLUE providers, Service Instance
• Query Mechanisms
  – GT2: LDAP. GT3: XPath, XQuery

Service Data
• A Grid service instance maintains a set of service data elements (SDE)
  – Declared via an extended XSD element declaration, placed in a WSDL portType
  – Includes basic introspection information, interface-specific data, and application state
• Pull and push models for information query
  – GridService::FindServiceData operation
    • Pull: queries this information via extensible query language
  – NotificationSource::Subscribe
    • Push: subscribe to notification of changes to information

Why Service Data?
• Discovery often requires instance-specific, perhaps dynamic information
• Service data offers a general solution
  – Every service must support some common service data, and may support any additional service data desired
  – Not just meta-data, but also instance state
• Part of GT MDS-2 model contained in OGSI
  – Defines standard data model, and query
  – Complements soft-state registration and notification

OGSI ServiceData Model
• ServiceData for self-description
  – Model service with properties!
    • Fine-grained view of resource functionality
    • Data scoped by service instance
  – Domain-dependent state
    • Service discovery/monitoring information
    • Stateful properties of service, e.g. cpu-load

Components

Basic Components
• Service Data Provider Components
• Service Data Aggregation Components
• Registry Components

Service Data Provider Components
• Service Data Provider components
  – provide a standard mechanism for dynamic generation of service data via external programs
• External provider programs
  – can be the core providers that are part of GT3 or
  – can be user-created, custom providers

In Detail: Service Data Providers
• Service Data Provider interfaces are designed to support execution in either
  – synchronous (“pull”) mode
  – asynchronous (“push”) mode
• A valid provider
  – is composed of any Java class which implements at least one of three predefined Java interfaces
  – generates a compatible form of XML output as the result of its execution

In Detail: Service Data Providers
• Provider Interfaces
  – SimpleDataProvider
    • synchronous provider which produces XML output in the form of a Java OutputStream
  – DOMDataProvider
    • synchronous extension of SimpleDataProvider which can also produce XML output in the form of a Java org.w3c.dom.Document
  – AsyncDataProvider
    • asynchronous version of SimpleDataProvider that sends its output to the specified callback Object; the provider implementer and the provider caller are assumed to have agreed on the callback interface at compile time

In Detail: Provider Interfaces
• SimpleDataProvider
  – Basic interface which all service data providers must implement

In Detail: Provider Interfaces
• DOMDataProvider
  – Generic interface for XML service data providers that are capable of emitting an org.w3c.dom.Document object at runtime

In Detail: Provider Interfaces
• AsyncDataProvider
  – Asynchronous version of provider interface

In Detail: GT3 Providers
• AsyncDocumentProvider
  – An asynchronous version of a generic XML document provider
• ScriptExecutionProvider
  – ServiceDataProvider that provides a generic way to execute scripts which produce XML documents
• HostScriptProvider
  – Constructs Host service data from the output of multiple scripts

In Detail: GT3 Providers
• ForkInfoProvider
  – ServiceDataProvider which monitors local system PIDs
• PBSInfoProvider
  – ServiceDataProvider which queries PBS for queue information
• SimpleSystemInformationProvider
  – Basic MDS GRIS-style sensor which emits system information in XML, with state managed directly as an XML Document using JDOM (JDK 1.3 compatible)

In Detail: Provider Manager
• Provider execution
  – is handled by the ServiceDataProviderManager class, which schedules and manages provider execution as Java TimerTasks
• ServiceDataProviderManager
  – uses an XML-based configuration file to load and link installed Service Data Providers during runtime through standard Java reflection methods
• Configuration file
  – $GLOBUS_LOCATION/etc/indexservice.providers
  – $GLOBUS_LOCATION/etc/rips.providers

In Detail: Provider Manager
• Configuration entry for the provider in a configuration file
  – enables your provider for execution by the Provider Manager
  – publishes the existence of your provider to clients
• Required attribute in the configuration entry
  – the “class” attribute, which is simply the fully qualified Java class name

In Detail: Custom Data Handlers
• The default data processing behavior of the Provider Manager
  – take the logical XML document result of a
    provider’s execution
  – wrap it in a new SDE
  – and then add it to the Service’s ServiceDataSet
• We can override the default data processing logic in the Provider Manager
  – by specifying the “handler” attribute in the Provider’s configuration file

In Detail: Mechanisms
[Figure: a service or user calls enumProvider and executeProvider on the Provider Manager, which runs information providers (SimpleDataProvider, DOMDataProvider, AsyncDataProvider) and passes results to a Custom Data Handler.]

Basic Components
• Service Data Provider Components
• Service Data Aggregation Components
• Registry Components

Service Data Aggregation Components
• ServiceDataAggregator components
  – provide a reusable mechanism for handling subscription, notification, and updating of
    • locally stored copies of service data which is generated by other services
• By using the ServiceDataAggregator class in your service code
  – service data from both locally executing information providers and other OGSA service instances can be aggregated into any given service

In Detail: Service Data Aggregator
• ServiceDataAggregator component
  – is used to perform server-side notification subscription management
• Key additional feature of ServiceDataAggregator
  – notification data that is processed by the deliverNotification() function
    • is actually copied and stored locally as an SDE, which includes creation timestamp, TTL and source metadata

In Detail: Service Data Aggregator
• Aggregated SDEs are
  – organized by SDE QName
  – stored in an array which is returned as an org.gridforum.ogsa.ServiceDataSetType to FindServiceData name queries
• Originator field (GSH type) of the OGSA ServiceDataType
  – is used as the “primary key”
    • to differentiate like-named entries from each other
    • to identify the source of the data itself

In Detail: AggregatorPortType
• addSubscription and removeSubscription

In Detail: Mechanisms
[Figure: a service or user calls addSubscription and removeSubscription on the Aggregator; Grid Services call deliverNotification on it.]

Basic Components
• Service Data Provider Components
• Service Data Aggregation Components
• Registry Components

Registry Components
• Registry components
  – maintain a set of available peer Grid Service Handles
  – provide soft-state cataloging of a set of Grid Services
    • i.e., the registry of services is periodically updated with existence notification messages, and any existing entries which fail to refresh within the timeout period are eventually expired
• Registries
  – can be used to support query or other operations that may apply to one or more services in a set

In Detail: RegistrationPortType
• registerService and unregisterService

In Detail: Mechanisms
[Figure: Grid Services call registerService and unregisterService on the Registry.]

Logical structure of the Index Service
[Figure: the Index Service sits in the collective layer and returns service data and GSHs to the user. Labels inside the Index Service: Aggregator Mechanism (caching here), Registry Mechanism, Provider Mechanism, Java Provider. Grid Services in the resource layer send notification messages carrying service data and existence notification messages to the Index Service.]

Globus Data Management services

Reliable File Transfer Service Overview
• The Reliable File Transfer Service (RFT) is an OGSA-based service that provides interfaces for:
  – Controlling and monitoring 3rd party file transfers using GridFTP servers.
  – The client controlling the transfer is hosted inside of a grid service so it can be managed using the soft state model and queried using the ServiceData interfaces available to all grid services.
• It is essentially a reliable and recoverable version of the GT2 globus-url-copy tool and more.

Prerequisites and Dependencies
• The prerequisites for RFT are:
  – GridFTP Server with a Host Certificate
  – PostgreSQL
    • PostgreSQL is used to store the state of the transfer to allow for restart after failures.
  – The interface to PostgreSQL is JDBC, so any DBMS that supports JDBC can be used.
Note: GT3 used PostgreSQL version 7.3.2 for testing, and the instructions provided to set up the database apply to that version.
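The idea behind the PostgreSQL prerequisite, storing transfer state so that a transfer can be restarted after a failure, can be sketched in a few lines. The example below is an illustration only: it swaps in Python's built-in sqlite3 module for PostgreSQL/JDBC, and the table layout, column names, and function names are hypothetical rather than RFT's actual schema. The URLs use hypothetical hosts.

```python
import sqlite3


def open_state_db(path=":memory:"):
    """Open (or create) the transfer-state database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS transfer ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " destination TEXT NOT NULL,"
        " restart_marker INTEGER NOT NULL DEFAULT 0,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    db.commit()
    return db


def add_transfer(db, source, destination):
    """Record a requested transfer and return its id."""
    cur = db.execute(
        "INSERT INTO transfer (source, destination) VALUES (?, ?)",
        (source, destination),
    )
    db.commit()
    return cur.lastrowid


def record_progress(db, transfer_id, restart_marker):
    """Persist the last byte offset known to be safely transferred."""
    db.execute(
        "UPDATE transfer SET restart_marker = ?, status = 'active' WHERE id = ?",
        (restart_marker, transfer_id),
    )
    db.commit()


def mark_done(db, transfer_id):
    """Mark a transfer as finished so it is not restarted."""
    db.execute("UPDATE transfer SET status = 'done' WHERE id = ?", (transfer_id,))
    db.commit()


def pending_restarts(db):
    """After a failure, list unfinished transfers and where to resume them."""
    rows = db.execute(
        "SELECT id, source, destination, restart_marker FROM transfer"
        " WHERE status != 'done' ORDER BY id"
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = open_state_db()
    first = add_transfer(db, "gsiftp://src.example.org/a", "gsiftp://dst.example.org/a")
    second = add_transfer(db, "gsiftp://src.example.org/b", "gsiftp://dst.example.org/b")
    record_progress(db, first, 4096)
    mark_done(db, second)
    print(pending_restarts(db))
```

Passing a file path instead of ":memory:" makes the state survive a process restart, which is the property the slide attributes to the database.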
Prerequisites and Dependencies
• GridFTP performs the actual file transfer.
• The GridFTP server can only be run on Unix or Linux.
• There are 2 ways to get GridFTP:
  – Packaged with the core GT3 Final installation
  – As part of the Globus Toolkit 2.4 distribution

Prerequisites and Dependencies
1. PostgreSQL Setup
2. Configure and Run a GridFTP Server
3. RFT Grid Service Setup
4. Build the GAR from Source Distribution
www-unix.globus.org/toolkit/reliable_transfer.html

Service Data Elements for RFT
• Version: version of RFT.
• FileTransferProgress: SDE that denotes the percentage of the file that has been transferred
• FileTransferRestartMarker: SDE for the last restart marker for a particular transfer
• FileTransferJobStatusElement: SDE for the status of a particular transfer
• FileTransferStatusElement: SDE that denotes the status of all the transfers in the request
• GridFTPRestartMarkerElement: SDE for the restart marker of the transfer
• GridFTPPerfMarkerElement: SDE for the performance marker of the transfer

The Replica Location Service

The Replica Location Service
• The replica location service (RLS) maintains and provides access to mapping information from logical names for data items to target names.
• The distributed RLS is intended to replace the centralized Globus replica catalog available in earlier releases of GT2.x.
• The distributed RLS provides higher performance, reliability and scalability.

Replica Location Service
• Replication of data items can reduce access latency, improve data locality, and increase robustness, scalability and performance for distributed applications.
• An RLS typically does not operate in isolation, but functions as one component of a data grid architecture.

Replica Location Service
• Consistent local state maintained in Local Replica Catalogs (LRCs).
  – Local catalogs maintain mappings between arbitrary logical file names (LFNs) and the physical file names (PFNs) associated with those LFNs on their storage system(s).
• Collective state with relaxed consistency maintained in Replica Location Indices (RLIs).
  – Each RLI contains a set of mappings from LFNs to LRCs. A variety of index structures can be defined with different performance characteristics, simply by varying the number of RLIs and the amount of redundancy and partitioning among the RLIs.

Replica Location Service
• Soft state maintenance of RLI state.
  – LRCs send information about their state to RLIs using soft state protocols. State information in RLIs times out and must be periodically refreshed by soft state updates.
• Compression of state updates.
  – Optional compression uses Bloom filters to summarize the content of an LRC before sending a soft state update to an RLI node.
• Membership and partitioning information maintenance.
  – The current RLS implementation maintains static information about the LRCs and RLIs participating in the distributed system.
  – As new implementations of the RLS are developed, they will use OGSA mechanisms for registration of services and for service lifetime management.

Relationship to Earlier Globus Replica Management Software
• The RLS is intended to replace replica management tools available in GT2.X, including:
  – the Replica Catalog API
  – the Replica Management API.
• The RLS differs from these earlier components in several important ways.
  – As a distributed system, the RLS is designed to provide reliability by avoiding single points of failure, as well as load balancing, performance and scalability.
  – The RLS implementation is based on open source relational database technology.
  – The RLS separates replication information from other types of metadata.
    • The RLS does not include information about logical collections, but assumes such information is stored in a separate metadata service.

The GridFTP Protocol and Software

What is GridFTP?
• GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks.
• The GridFTP protocol is based on FTP, the highly popular Internet file transfer protocol.

Protocol Features
• GSI security on control and data channels
• Multiple data channels for parallel transfers
• Partial file transfers
• Third-party (direct server-to-server) transfers
• Authenticated data channels
• Reusable data channels
• Command pipelining

Protocol Features
• Grid Security Infrastructure (GSI) and Kerberos support:
  – Robust and flexible authentication, integrity, and confidentiality features are critical when transferring or accessing files.
• Third-party control of data transfer:
  – In order to manage large data sets for large distributed communities, it is necessary to provide third-party control of transfers between storage servers.
• Parallel data transfer:
  – On wide-area links, using multiple TCP streams can improve aggregate bandwidth over using a single TCP stream.
• Striped data transfer:
  – Partitioning data across multiple servers can further improve aggregate bandwidth. GridFTP supports striped data transfers through extensions defined in the Grid Forum draft.

Protocol Features
• Partial file transfer:
  – GridFTP introduces new FTP commands to support transfers of regions of a file.
• Support for reliable data transfer:
  – Reliable transfer is important for many applications that manage data. Fault recovery methods for handling transient network failures, server outages, etc., are needed.
• Manual control of TCP buffer size:
  – This is a critical parameter for achieving maximum bandwidth with TCP/IP. The protocol also has support for automatic buffer size tuning.
• Integrated instrumentation:
  – The protocol calls for restart and performance markers to be sent back. It is not specified how often, and this is something we intend to address shortly.

What Does “GridFTP” Mean?
• GridFTP Protocol:
  – This refers to the wire protocol used and is defined by a draft technical specification submitted to the Global Grid Forum.
• The Globus Toolkit V2.0 GridFTP Server (GT2GridFTP):
  – This system is the widely used open source wuftpd FTP server code base extended to support the GridFTP protocol extensions.
  – GT2GridFTP is distributed with the Globus Toolkit.
• The GridFTP family of tools: the term “GridFTP” is used to refer to the entire family of GridFTP tools distributed with the Globus Toolkit: the GridFTP server, client tools, client library, control library, etc.

Implementation
• The Globus implementation of the GridFTP protocol takes the form of two APIs and corresponding libraries:
  – globus_ftp_control
  – globus_ftp_client.
• Besides supporting the protocol features described above, the APIs also include interfaces for adding software "plug-ins".
• In addition to Globus software libraries, we have also implemented
  – an API/library (globus_gass_copy)
  – a command-line tool (globus-url-copy) that integrates GridFTP, HTTP, and local file I/O to enable secure transfers using any combination of these protocols.
• Globus has adapted a popular FTP server package (Washington University's wu-ftpd) to support a majority of the GridFTP protocol features (GSI security, parallel transfer, third-party transfer, partial file transfer).

Availability of the GridFTP
• Our data grid software is currently available to the public as components of the Globus Toolkit 2.0 release.
• Prior to the GT2.X release, the software was tested and evaluated for more than a year by several external project teams who are using our technologies to build data grids for their own use.

GASS: Global Access to Secondary Storage

Requirement for Grid I/O service
• Uniform data access
• Diverse data source
• Dynamic resource set
• Support for streaming I/O
• Little or no program modification
• Support for programmer-directed performance optimization
Joseph Bester et al. “GASS: A Data Movement and Access Service for Wide Area Computing”

GASS Architecture
• Common Grid File Access Patterns
• Default Data Movement Strategies
• Specialized Data Movement Strategies
• GASS Operation
• Integration with the Globus Toolkit

Common Grid File Access Patterns
• Read-only access
• Write-shared access
• Append-only access
• Unrestricted read/write

[Figure: the four access patterns]
• Read-only access to constant data; the entire file is read
• Write access to the entire file; multiple writers, last writer wins
• Append-only access; multiple writers
• Concurrent write and read access, concurrent write access to the same file, read-only access to part of the file

Default Data Movement Strategies
• GASS addresses bandwidth management issues by providing a file cache: a “local” secondary storage
• By default, data is moved into and out of this cache when files are opened and closed

[Figure: processes access files through local caches, which exchange data with a GASS server, an HTTP server, an FTP server, and an HPSS server.]

GASS Operation
• Grid applications access remote files using GASS by opening and closing the files with specialized open and close calls
  – globus_gass_open()
  – globus_gass_fopen()
  – globus_gass_close()
  – globus_gass_fclose()
Note: the GASS open and close calls act like their standard Unix I/O counterparts, except that a URL rather than a file name is used to specify the location of the file data.

Integration With Globus Toolkit
• The availability of GASS services has made it straightforward to extend the GRAM API:
  – Allow both executables and standard input, output, and error streams to be named by URLs
  – GASS mechanisms are used to fetch
    • the URL-named executable into the cache
    • standard input, and to redirect standard output and error.
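The default GASS data movement strategy, moving data into a local cache when a URL-named file is opened and out of it when the file is closed, can be sketched in Python. This is an illustration of the pattern only, not the Globus C API: the class name CachedFile and the demo function are hypothetical, the write-back path is limited to file:// URLs, and the cache is never invalidated.

```python
import hashlib
import os
import pathlib
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


class CachedFile:
    """Open a URL-named file through a local cache ('r' or whole-file 'w')."""

    def __init__(self, url, mode, cache_dir):
        if mode not in ("r", "w"):
            raise ValueError("this sketch supports only 'r' and 'w'")
        if mode == "w" and urlparse(url).scheme != "file":
            raise ValueError("this sketch writes back to file:// URLs only")
        self.url = url
        self.mode = mode
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        self.cache_path = os.path.join(cache_dir, name)
        if mode == "r" and not os.path.exists(self.cache_path):
            # Open for reading: move the remote data into the cache first.
            with urllib.request.urlopen(url) as remote, \
                    open(self.cache_path, "wb") as local:
                shutil.copyfileobj(remote, local)
        self.handle = open(self.cache_path, mode)

    def close(self):
        self.handle.close()
        if self.mode == "w":
            # Close after writing: move the cached data out to its target.
            target = urllib.request.url2pathname(urlparse(self.url).path)
            shutil.copyfile(self.cache_path, target)


def demo():
    """Read one file and write another through a temporary cache."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = os.path.join(tmp, "cache")
        os.mkdir(cache)
        source = pathlib.Path(tmp, "input.txt")
        source.write_text("hello grid")

        reader = CachedFile(source.as_uri(), "r", cache)
        text = reader.handle.read()
        reader.close()

        output = pathlib.Path(tmp, "output.txt")
        writer = CachedFile(output.as_uri(), "w", cache)
        writer.handle.write("result")
        visible_before_close = output.exists()
        writer.close()
        return text, visible_before_close, output.read_text()


if __name__ == "__main__":
    print(demo())
```

Deferring the write-back to close() is what gives the "last writer wins" behaviour listed for whole-file writes: the target only changes when a writer closes its copy.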