									WP2: Data Management


         Paul Millar

 eScience All Hands Meeting

    September 2-4 2003
                  Introduction

       EDG (European DataGrid) is in the third and final year of the project.

       1st Generation of tools provided a very good base for input.

       2nd Generation designed for modularity and to allow evolution.
       Java based services (using either Tomcat or Oracle 9iAS)
       Interface design defined in WSDL
       Client stubs for Java, C/C++ using AXIS and gSOAP
       Persistent service data is stored in MySQL or Oracle.
       Replication service framework (RM, RLS, RMC, ROS)
       Java Security Package


eScience -All Hands Meeting 2-4 September 2003
                   Replication Service Framework

       [Figure: block diagram of the replication service framework. Labelled
        components: User, Processing, Security, Collections, Sessions,
        Subscriptions, Consistency, Replica Management Service, Core, Replica
        Location Service, Meta Data Catalogue, Transport, Optimisation, Replica
        Selection, Access History, Replication Init.]

                  Interaction with services

       Internal
          • Replica Location Service (RLS)
          • Replica Metadata Catalogue (RMC)
          • Replica Optimisation Service (ROS)

       External
          • Relational Grid Monitoring Architecture (R-GMA)
          • Globus C-based libraries, as well as CoG
          • EDG network monitoring services.
          • EDG-SE services.




                  Replica Location Service

       Maintains a (possibly distributed) catalogue of files: 1 file maps to
        potentially many replicas.
       Need to keep track of file locations and keep them consistently updated.
       RLS stores one-to-many relations between GUIDs and Physical File
        Names (PFNs).
       Two-level design: LRC (Local Replica Catalogue) and RLI (Replica
        Location Index). An LRC contains GUID to PFN mappings. An RLI contains
        GUID to LRC mappings.
       RLS will operate with just an LRC; this is how EDG 2.0 operates.
       LRCs publish Bloom filter objects: a compact way of representing a
        set. May give false positives, but never false negatives.
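
       The two-level lookup and the Bloom filter summary can be sketched as
        follows. This is a minimal illustration in Python, not the EDG
        implementation (which is Java based): the class names, the filter
        size, the number of hashes and the use of SHA-256 are all assumptions
        made for the example.

```python
import hashlib


class BloomFilter:
    """Compact, lossy representation of a set of strings."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # assumed size, chosen for illustration
        self.num_hashes = num_hashes  # assumed number of hash functions
        self.bits = 0                 # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a single hash function.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class LocalReplicaCatalogue:
    """Hypothetical LRC: GUID -> set of PFNs known at one site."""

    def __init__(self, site):
        self.site = site
        self.replicas = {}

    def register(self, guid, pfn):
        self.replicas.setdefault(guid, set()).add(pfn)

    def publish(self):
        # Summarise the GUIDs held here as a Bloom filter for the index.
        summary = BloomFilter()
        for guid in self.replicas:
            summary.add(guid)
        return summary


class ReplicaLocationIndex:
    """Hypothetical RLI: keeps one published filter per LRC."""

    def __init__(self):
        self.filters = {}

    def update(self, lrc):
        self.filters[lrc] = lrc.publish()

    def candidates(self, guid):
        # LRCs that may hold the GUID (never misses one that does).
        return [lrc for lrc, summary in self.filters.items() if guid in summary]


def locate(guid, rli):
    """Ask only the candidate LRCs; each LRC gives the authoritative answer."""
    pfns = set()
    for lrc in rli.candidates(guid):
        pfns |= lrc.replicas.get(guid, set())
    return pfns
```

       A false positive only costs one wasted query to an LRC that turns out
        not to hold the GUID; because there are no false negatives, no replica
        is ever missed.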



                   RLS Demo at SC2002




                  Replica MetaData Catalogue

       RLS provides GUID to PFN mapping, but GUID isn't “user friendly”.
       RMC provides metadata on a per GUID basis.
       One such item of metadata is the Logical File Name (LFN).
       A GUID may have many LFNs associated with it.
       RMC is also capable of storing other metadata, such as file size,
        date of creation, owner...
       User-defined metadata can also be stored, and searched against.
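
       A minimal sketch of per-GUID metadata, LFN aliases and attribute
        search, written in Python for illustration. The class, its method
        names and the attribute names are hypothetical, and the rule that one
        LFN names exactly one GUID is an assumption of the sketch; none of
        this is the RMC interface.

```python
from collections import defaultdict


class ReplicaMetadataCatalogue:
    """Hypothetical in-memory RMC: metadata is stored per GUID."""

    def __init__(self):
        self.lfn_to_guid = {}              # assumption: an LFN names one GUID
        self.metadata = defaultdict(dict)  # GUID -> {attribute: value}

    def add_alias(self, lfn, guid):
        # A GUID may collect many LFNs, but an LFN may not be reused.
        if self.lfn_to_guid.get(lfn, guid) != guid:
            raise ValueError(f"{lfn} already names another GUID")
        self.lfn_to_guid[lfn] = guid

    def aliases(self, guid):
        return sorted(l for l, g in self.lfn_to_guid.items() if g == guid)

    def set_attribute(self, guid, name, value):
        # Covers built-in items (size, owner) and user-defined ones alike.
        self.metadata[guid][name] = value

    def search(self, **wanted):
        # GUIDs whose metadata matches every requested attribute exactly.
        return sorted(
            guid
            for guid, attrs in self.metadata.items()
            if all(attrs.get(key) == value for key, value in wanted.items())
        )


rmc = ReplicaMetadataCatalogue()
guid = "1223423-ASSDF4-11223-35465464"
rmc.add_alias("LFN1", guid)
rmc.add_alias("LFN2", guid)
rmc.set_attribute(guid, "owner", "pmillar")      # made-up value
rmc.set_attribute(guid, "run_type", "cosmics")   # made-up user-defined item
```

       With this in place, rmc.lfn_to_guid["LFN1"] gives the GUID, which can
        then be handed to the RLS to obtain the PFNs.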




                       LFNs, PFNs, GUIDs

       [Figure: LFN1, LFN2 and LFN3, held in the Replica Meta-data Catalogue,
        all name one GUID (1223423-ASSDF4-11223-35465464). The Replica Location
        Service maps that GUID to PFN1 (Glasgow), PFN2 (CERN) and PFN3 (Lyon).]



                  Replica Optimisation Service

       In early TB1 (Testbed 1), getBestFile was absent.
       Now available: selects the “best” replica of several available.
       Light-weight web service gathers information from network
        monitoring service and Storage Element services.
       Resource Broker (meta-Scheduler) decides on which CE a job will
        run.
       ROS treats files mentioned in the JDL as hints, returning an “access
        cost” for a given array of potential CEs, allowing the RB to rank
        them based on availability of data.
       The most research-oriented task.
       OptorSim was developed to test replica optimisation ideas.
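
       The access-cost idea can be sketched as below. This is an
        illustration only: the function signatures, the cost table and the
        rule "cost of a job = sum over its files of the cheapest replica" are
        assumptions, not the ROS algorithm. Only the name getBestFile comes
        from the slides.

```python
def access_cost(ce, guids, replicas, transfer_cost):
    """Estimated cost of getting every file to one CE.

    replicas:      GUID -> list of sites holding a replica (hypothetical)
    transfer_cost: (site, ce) -> estimated cost, e.g. seconds (hypothetical)
    """
    total = 0.0
    for guid in guids:
        # For each file, assume the cheapest replica would be used.
        total += min(transfer_cost[(site, ce)] for site in replicas[guid])
    return total


def rank_ces(ces, guids, replicas, transfer_cost):
    """Order candidate CEs from cheapest to most expensive data access."""
    return sorted(
        ces, key=lambda ce: access_cost(ce, guids, replicas, transfer_cost)
    )


def get_best_file(ce, guid, replicas, transfer_cost):
    """Pick the replica site with the lowest estimated cost for this CE."""
    return min(replicas[guid], key=lambda site: transfer_cost[(site, ce)])


# Made-up example data: two files, three storage sites, two candidate CEs.
replicas = {"g1": ["Glasgow", "CERN"], "g2": ["Lyon"]}
transfer_cost = {
    ("Glasgow", "ce-gla"): 1, ("CERN", "ce-gla"): 10, ("Lyon", "ce-gla"): 20,
    ("Glasgow", "ce-cern"): 10, ("CERN", "ce-cern"): 1, ("Lyon", "ce-cern"): 5,
}
ranking = rank_ces(["ce-gla", "ce-cern"], ["g1", "g2"], replicas, transfer_cost)
```

       In a real deployment the cost table would be filled from the network
        monitoring and Storage Element services rather than hard-coded.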


                  Security

       Provided by separate Java package.
       Covers
          • Authentication
          • Coarse-grain authorisation

       Aim to be as flexible as possible.
       Investigating collaboration with Liberty Alliance – a consortium
        developing standards and solutions for federated identity.




                  Authentication

       Extends normal Java SSL.
       Mutual authentication in SSL happens by exchanging public
        certificates signed by mutually trusted CAs, and by cryptographic
        challenges.
          • Uses proxy certificates.
          • Accepts GSI proxies as the authentication method
          • Supports GSI proxy loading and reloading
          • Supports OpenSSL certificate-private key loading
          • Supports CRLs with periodic reloading
          • Integrates with Tomcat and Jakarta AXIS SOAP framework

       A proxy does not have to be signed by a CA, but its DN has to start
        with the DN of the user's certificate.
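
       The naming rule alone can be sketched as a prefix check on the
        subject names. The function name and the use of slash-separated
        one-line DN strings are assumptions made for the example. A real
        verifier must also check the signature chain, validity periods and
        CRLs, which this sketch deliberately leaves out.

```python
def proxy_matches_user(proxy_subject, user_subject):
    """True if the proxy DN is the user's DN plus at least one more component.

    Both arguments are assumed to be one-line DNs such as
    "/O=Grid/OU=Example/CN=Jane Doe".
    """
    # Compare whole components: append "/" so that "CN=Jane Doe" does not
    # match a different user called "CN=Jane Doeman".
    prefix = user_subject.rstrip("/") + "/"
    return proxy_subject.startswith(prefix) and len(proxy_subject) > len(prefix)


user = "/O=Grid/OU=Example/CN=Jane Doe"  # made-up DN
ok = proxy_matches_user(user + "/CN=proxy", user)
```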



                  Coarse-grain authorisation

       Coarse-grain means the server decides what access to grant before
        the request is processed: role based.
       Modular design for client-server interaction. Modules for SOAP and
        HTTP web traffic are already written.
       Modular configuration. Currently configuration modules exist for
        XML and text file (the gridmap file).
       Integration work with the Virtual Organisation Membership Service
        (VOMS) is in progress. This allows authorisation on a per-VO basis,
        without gridmap files.
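
       A minimal sketch of a gridmap-driven, role-based decision taken
        before the request is processed. It assumes the common gridmap line
        layout of a quoted DN followed by a name, and treats that name as a
        role; the function names and the role names are hypothetical and this
        is not the EDG Java security package.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<quoted DN>" <name>  (assumed layout)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        dn, name = shlex.split(line)  # shlex keeps the quoted DN as one token
        mapping[dn] = name
    return mapping


def authorise(dn, mapping, allowed_roles):
    """Return the caller's role if it may proceed, else None.

    The decision uses only the authenticated DN, so it can be made before
    the request body is looked at.
    """
    role = mapping.get(dn)
    return role if role in allowed_roles else None


# Made-up gridmap content for illustration.
GRIDMAP = '''
# DN                      role
"/O=Grid/CN=Jane Doe"     admin
"/O=Grid/CN=John Roe"     reader
'''
gridmap = parse_gridmap(GRIDMAP)
```

       A VOMS-based module would replace parse_gridmap with a lookup of the
        caller's VO membership, leaving the authorise step unchanged.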




                  Conclusions

       The 2nd generation of data management services has been written
        based on the Web-services paradigm.
       We have chosen an extensible service framework. This will allow
        the adoption of upcoming OGSA standards.
       Our choice of software is based on our aim of supporting both high-
        availability commercial products and standard Open-Source
        solutions.

       The 2nd Generation of WP2 software is currently being rolled out in
        production systems as part of the 2.0 release of EDG Software.
       Integration of additional services (such as full RLS and VOMS) is
        being scheduled.


