					The Semantic Grid
A Future e-Science Infrastructure
Presented by…

David De Roure
University of Southampton, UK

Structure of the talk
1.   Evolution of the Grid
2.   The Semantic Web
3.   The Semantic Grid
4.   The story so far
5.   Semantic Grid Projects
6.   Closing Remarks

                      Kyoto University November 2002   2
     The Evolution of the Grid

The Collaboratory Concept
• In 1989, William Wulf, then with the U.S. National
  Science Foundation, defined a collaboratory as

   "a center without walls, in which the
   nation's researchers can perform their
   research without regard to geographical
   location, interacting with colleagues,
   accessing instrumentation, sharing data
   and computational resources, and
   accessing information in digital libraries."

A short history of the Grid
• “Science as a team sport”
      Grand Challenge Problems of the 80s
• Gigabit Testbed program
      Focus on applications for the local to
       wide area
      [Figure: CASA Gigabit Testbed]
      Factoring via Network-Enabled
• I-Way at SC ’95
      First large-scale grid experiment
      Provided the basis for modern grid
       infrastructure efforts
(Fran Berman, SDSC)
Datagrid perspective
[Figure: tiered data grid, from the online system at CERN down to physicist workstations]
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• 1 TIPS is approximately 25,000 SpecInt95 equivalents
• Online System: ~100 MBytes/sec to the Offline Processor Farm (~20 TIPS)
• Tier 0: CERN Computer Centre, fed at ~100 MBytes/sec
• Tier 1: France, Germany and Italy Regional Centres and FermiLab (~4 TIPS),
  linked at ~622 Mbits/sec or Air Freight (deprecated)
• Tier 2: Tier2 Centres, including Caltech, at ~1 TIPS each, linked at ~622 Mbits/sec
• Institutes: ~0.25 TIPS, each with a physics data cache, linked at ~622 Mbits/sec
• Tier 4: Physicist workstations, ~1 MBytes/sec
• Physicists work on analysis “channels”. Each institute will have ~10 physicists
  working on one or more channels; data for these channels should be cached by the
  institute server
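
The Datagrid figures can be checked with a little arithmetic. The sketch below multiplies the quoted trigger rate by the quoted event size and converts the quoted link speed into the same unit. Only the three constants come from the slide; the function and variable names are illustrative, and all values are approximate.

```python
# Figures quoted on the Datagrid slide (all approximate).
TRIGGERS_PER_SECOND = 100      # "100 triggers per second"
EVENT_SIZE_MBYTES = 1.0        # "~1 MByte" per triggered event
LINK_MBITS_PER_SEC = 622       # "~622 Mbits/sec" links between tiers


def event_rate_mbytes_per_sec(triggers_per_second: float,
                              event_size_mbytes: float) -> float:
    """Sustained data rate produced by the trigger, in MBytes/sec."""
    return triggers_per_second * event_size_mbytes


def mbits_to_mbytes(mbits_per_sec: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbits_per_sec / 8


trigger_rate = event_rate_mbytes_per_sec(TRIGGERS_PER_SECOND, EVENT_SIZE_MBYTES)
link_rate = mbits_to_mbytes(LINK_MBITS_PER_SEC)

print(f"trigger output: ~{trigger_rate:.0f} MBytes/sec")
print(f"one 622 Mbits/sec link: ~{link_rate:.2f} MBytes/sec")
```

The first figure matches the ~100 MBytes/sec shown on the slide. The second shows that a single 622 Mbits/sec link carries somewhat less than the full trigger output.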
1995 – 2000+: Grid Computing
• “Grid book” gave a comprehensive view
  of the state of the art
• Important infrastructure and middleware
  efforts initiated
      Globus, Legion, Condor, SRB, etc.
• 2000+: Beginnings of a Global Grid
      Evolution of the Global Grid Forum
      Some projects evolving to de facto
       standards (e.g. Globus, Condor)

The Grid Problem
Resource sharing & coordinated problem solving
in dynamic, multi-institutional virtual organizations

                                      Foster, Kesselman, Tuecke

Open Grid Services Architecture
• Anatomy vs Physiology
• Present Grid Architecture is a
  services architecture
• Implemented using Web Services
• OGSA will provide
     Naming / Authorization / Security /
     Higher level services: Workflow,
      Transactions, Data Mining, Knowledge
• Exploiting Synergy: Commercial
  Internet with Grid Services
UNiform Interface to COmputing REsources
 UNICORE is a vertically integrated Grid environment offering
 seamless, secure and intuitive access to distributed computing

• ‘e-Science is about global collaboration in key areas
  of science, and the next generation of infrastructure
  that will enable it.’

• ‘e-Science will change the dynamic of the way science
  is undertaken.’
                                 John Taylor, DG of UK OST

•    ‘[The Grid] intends to make access to computing
    power, scientific data repositories and experimental
    facilities as easy as the Web makes access to
                                             Tony Blair, 2002

UK e-Science Initiative
• $180M Programme over 3 years
• $130M is for Grid Applications in all areas of science
  and engineering
      Particle Physics and Astronomy (PPARC)
       - $25M GridPP and $8M AstroGrid
      Engineering and Physical Sciences (EPSRC)
       - funding 6 projects at around $5M each
      Biology, Medical and Environmental Science
       - funding projects with total value of $35M
• $50M ‘Core Program’ to encourage development of
  generic ‘industrial strength’ Grid middleware

Some UK e-Science Projects
 •   GRIDPP (PPARC)
 •   ASTROGRID (PPARC)
 •   Comb-e-Chem (EPSRC)
 •   DAME (EPSRC)
 •   DiscoveryNet (EPSRC)
 •   GEODISE (EPSRC)
 •   myGrid (EPSRC)
 •   RealityGrid (EPSRC)
 •   (NERC)
 •   Oceanographic Grid (NERC)
 •   Molecular Environmental Grid (NERC)
 •   NERC DataGrid (NERC + OST-CP)
 •   Biomolecular Grid (BBSRC)
 •   Proteome Annotation Pipeline (BBSRC)
 •   High-Throughput Structural Biology (BBSRC)
 •   Global Biodiversity (BBSRC)
 •   Biology of Ageing (BBSRC + MRC)
 •   Sequence and Structure Data (MRC)
 •   Molecular Genetics (MRC)
 •   Cancer Management (MRC + PPARC)
 •   Clinical e-Science Framework (MRC)
 •   Neuroinformatics Modeling Tools (MRC)
 •   Interdisciplinary Research Collaborations ‘Grand Challenge’
        Advanced Knowledge
        Medical Images and Signals
        Equator
        DIRC (Dependability)
UK e-Science Grid

[Map: UK e-Science Grid sites at DL, Newcastle, Belfast, Manchester, Oxford,
Cambridge, RAL, Hinxton, Cardiff and London]
Access Grid

[Figure: Access Grid nodes]
Today’s Grid activities across the world

• The Grid has been about large scale computation
• But the applications are also about collaboration

• Middleware has provided some computational
• But we now need semantic interoperability

• The old problem was lots of different computers
• The new problem is lots of different projects!

              The Semantic Web

 “The Semantic Web is an extension of the current Web in which
 information is given a well-defined meaning, better enabling
 computers and people to work in cooperation. It is the idea of
 having data on the Web defined and linked in a way that it can be
 used for more effective discovery, automation, integration and
 reuse across various applications. The Web can reach its full
 potential if it becomes a place where data can be processed by
 automated tools as well as people”

                   From the W3C Semantic Web Activity statement

Resource Description Framework

Richer semantics


OWL Web Ontology Language
  “The World Wide Web as it is currently constituted
  resembles a poorly mapped geography. Our insight into
  the documents and capabilities available are based on
  keyword searches, abetted by clever use of document
  connectivity and usage patterns. The sheer mass of this
  data is unmanageable without powerful tool support. In
  order to map this terrain more precisely, computational
  agents require machine-readable descriptions of the
  content and capabilities of web accessible resources.
  These descriptions must be in addition to the human-
  readable versions of that information.”

                                                     The OWL Guide
SW Tools

• Semantic Web requires a metadata-enabled Web
• Where will the metadata come from?

• Semantic Web requires ontologies
• Where will the ontologies come from?

• What will motivate the generation of ontologies and

             The Semantic Grid

Grid vision
 "Grid computing has emerged as an important new field,
 distinguished from conventional distributed computing by its
 focus on large-scale resource sharing, innovative applications,
 and, in some cases, high-performance orientation...we review the
 "Grid problem", which we define as flexible, secure, coordinated
 resource sharing among dynamic collections of individuals,
 institutions, and resources - what we refer to as virtual

        From "The Anatomy of the Grid: Enabling Scalable Virtual
                Organizations" by Foster, Kesselman and Tuecke

[Figure: Classical Web and Classical Grid, placed along an axis labelled “More computation”]
Grid is metadata based middleware
[Figure: layered architecture of an Astronomy Sky Survey Data Grid]
1. Portals and Workbenches
2. Knowledge & Resource Management
   Concept space
3. Metadata View, Data View, Catalog Analysis, Bulk Data Analysis
   Standard APIs and Protocols
4. Grid Security, Replication, Scheduling
5. Information Discovery, Metadata delivery, Data Discovery, Data Delivery
   Standard Metadata format, Data model, Wire format
6. Catalog Mediator, Data mediator
   Catalog/Image Specific Access
7. Compute Resources, Derived Collections, Catalogs, Data Archives
For example…
• Annotations of results, workflows and database entries could be
  represented by RDF graphs using controlled vocabularies
  described in RDF Schema and OWL
• Personal notes can be XML documents annotated with metadata
  or RDF graphs linked to results or experimental plans
• Exporting results as RDF makes them available to be reasoned
• RDF graphs can be the “glue” that associates all the components
  (literature, notes, code, databases, intermediate results, sketches,
  images, workflows, the person doing the experiment, the lab they
  are in, the final paper)
• The provenance trails that keep a record of how a collection of
  services were orchestrated so they can be replicated or replayed,
  or act as evidence
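
The idea of RDF graphs as “glue” can be sketched with nothing more than the standard library. In the example below, plain Python tuples stand in for RDF triples (subject, predicate, object). Every identifier is hypothetical: the namespace, the resource names and the predicate names are invented for illustration and do not come from any project vocabulary. A real system would use an RDF toolkit and vocabularies described in RDF Schema or OWL.

```python
from collections import defaultdict

# Hypothetical namespace; not a real vocabulary.
EX = "http://example.org/escience/"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "result42", EX + "producedBy", EX + "workflow7"),
    (EX + "result42", EX + "annotatedWith", EX + "note3"),
    (EX + "workflow7", EX + "ranBy", EX + "alice"),
    (EX + "workflow7", EX + "usedService", EX + "alignmentService"),
    (EX + "alice", EX + "worksIn", EX + "lab1"),
}


def outgoing(graph, subject):
    """Return {predicate: sorted list of objects} for one subject."""
    edges = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            edges[p].append(o)
    return {p: sorted(objs) for p, objs in edges.items()}


def reachable(graph, start):
    """Every node reachable from start by following triples forward."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for s, _, o in graph:
            if s == node and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


# Everything the graph associates with one result: a simple provenance trail.
linked = reachable(triples, EX + "result42")
for node in sorted(linked):
    print(node)
```

Following the links from a single result reaches the workflow, the note, the service, the person and the lab, which is the sense in which the graph associates the components of an experiment.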

• Represent the syntactic data types of e-Science objects using
  XML Schema data types
• Represent domain ontologies for the semantic mediation between
  database schema, an application’s inputs and outputs, and
  workflow work items
• Represent domain ontologies and rules for parameters of
  machines or algorithms to reason over allowed configurations
• Use reasoning over execution plans, workflows and other
  combinations of services to ensure the semantic validity of the
• Use RDF as a common data model for merging results drawn
  from different resources or instruments
• Capture the structure of messages that are exchanged between
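
One reason a triple-based model suits merging is that each statement is self-describing, so combining two sources reduces to a set union. The sketch below is a simplification using Python tuples in place of RDF triples. The sample names, property names and values are made up for illustration and are not measurements from any project.

```python
# Statements from two hypothetical sources about the same sample.
# All names and values are invented for illustration.
from_instrument = {
    ("sample1", "meltingPoint", "412K"),
    ("sample1", "measuredBy", "instrumentA"),
}
from_database = {
    ("sample1", "meltingPoint", "412K"),       # same fact, recorded twice
    ("sample1", "crystalSystem", "monoclinic"),
}

# Merging is a set union: identical statements collapse into one.
merged = from_instrument | from_database


def describe(graph, subject):
    """Collect every property recorded for a subject, whatever its source."""
    return {p: o for s, p, o in graph if s == subject}


print(len(merged))
print(sorted(describe(merged, "sample1").items()))
```

This only works cleanly when both sources use the same identifiers for the same things, which is where shared ontologies come in.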

And more…
• At the data/computation layer: classification of computational
  and data resources, performance metrics, job control,
  management of physical and logical resources
• At the information layer: schema integration, workflow
  descriptions, provenance trail
• At the knowledge layer: problem solving selection, intelligent
• Governance of the Grid, for example access rights to
  databases, personal profiles and security groupings
• Charging infrastructure, computational economy, support for
  negotiation; e.g. through auction model

Semantic Grid
• There is currently a gap between grid computing
  endeavours and the vision of Grid computing in which
  there is a high degree of easy-to-use and seamless
  automation and in which there are flexible
  collaborations and computations on a global scale.
• To support the full richness of the grid computing
  vision we need to apply Semantic Web technologies to
  Grid middleware and applications, i.e. the Semantic Grid


[Figure: two-by-two diagram. Vertical axis: richer semantics. Horizontal axis:
more computation. Bottom row: Classical Web, Classical Grid. Top row: Semantic
Web, Semantic Grid. Source: Norman Paton]
                The Story So Far

The Semantic Grid Initiative
• Originally motivated by
      3 layer model
      Agent-based computing
• In 2001 we aimed to
      promote service-oriented architecture
      bridge Grid and Semantic Web communities
      clarify knowledge layer
      encourage holistic approach

                            Source: Keith Jeffery

Agent Technology:
A Canonical View

[Figure: canonical view of an agent-based system, labelled with interactions,
relationships and spheres of influence. Source: Jennings, CACM]
Technical Report of the National e-Science Centre
UKeS-2002-02, 2001.

Research Agenda for the
Semantic Grid:
A Future e-Science Infrastructure

David De Roure - Distributed Systems, Web
Nigel Shadbolt - Advanced Knowledge Technologies
Nick Jennings - Agent Based Computing
Mark Baker - Grid technologies

A Research Agenda aiming to move from the
current state-of-the-art in e-Science infrastructure
to the future infrastructure that is needed to
support the full richness of the e-Science vision.

The report
• Commissioned for UK e-Science Programme
• Draft distributed in July 2001, samizdat publication
  was influential
• Completed in December, and report now split into two
   • The Evolution of the Grid (De Roure, Baker, Jennings,
   • The Semantic Grid (De Roure, Jennings, Shadbolt)
• See also Semantic Web and Grid Computing (Goble,
  De Roure)

Grid -> Semantic Web community
WWW2002 Semantic Web Track
• Chairs: Carole Goble & Eric Miller
• 12 refereed papers, 2 panels, 2 workshops, lots of posters
• 50% developers day at SW track.

1st International Semantic Web Conference ISWC
• Chairs: Ian Horrocks & Jim Hendler
• 4 tutorials, 40 refereed papers

 Ian Foster at WWW2002
 Carl Kesselman at ISWC2002

 Plus panel at WWW2002…

WWW2002 Semantic Grid Panel
[Figure caption: Grid meets Web]
• What do grid computing and semantic web
  have in common? Where do they differ?
• Does the Grid need the Semantic Web?
• Does the Semantic Web need the Grid?
• Where do you think it's going in 50 years?
• What is the biggest challenge we must address
  to realise the semantic grid?
SW -> Grid Community
Global Grid Forum 5 in Edinburgh
• Semantic Grid BOF
• Ontologies and the Grid tutorial
• Semantic Web keynote

[Photos: Dave, Carole, Nigel]
             Semantic Grid RG Charter
 The goal of this RG is to realise the added value of
 Semantic Web technologies for Grid users and
 It will provide a forum to track Semantic Web
 community activities and advise the Grid community
 on the application of Semantic Web technologies in
 Grid applications and infrastructure, to identify case
 studies and share good practice.

When the scientist seeks answers to problems such as
the following, we wish to have the answer obtained by
automatic linking of databases and computational
resources by means of metadata, ontologies and
reasoning over both; i.e. Semantic Web and Grid

• "Correlate the new molecular structure with the existing structural
  databases; what are the likely physical properties of the crystal?"
• "Retrieve & align 2000nt 5' from every serine/threonine kinase in
  Fabaceae expressed exclusively in the root cortex whose expression
  increases 5x or more upon infection by Rhizobium but is not
  affected by osmotic or heavy-metal stresses & is <40%
  homologous in the active site to kinases known to be involved in
  cell-cycle regulation in any other species"
• “How many cows in Texas? And how many will there be if we
  increase land tax?” 

Projected Tasks
1. Track semantic web activities and inform the Grid community on
   what tools and ideas to use now and which to watch
2. Provide a forum to discuss and share best practice in 'semantic
   grid' projects
3. Create links with other RG and WG to both push Semantic Grid
   expertise and to offer a service of expertise. For example,
   participation in the proposed working group on scheduling
4. Operate a community web portal
5. Encourage engagement between the Grid and Semantic Web

        Semantic Grid Projects

Semantic Grid projects
 We are investigating Semantic Grids in
 e-Science projects:
    myGrid
    Comb-e-Chem
    Geodise
    CoAKTinG
    GRIA
Semantic Grid Aspects
• Comb-e-Chem – automation requires machine-
  processable descriptions
• myGrid – ontologies in bioinformatics domain, plus
  ontologies for service description
• Geodise – knowledge in a problem solving environment
• CoAKTinG – ontologies to augment collaboration and
  communicate events
• GRIA – descriptive and operational metadata,
  description of negotiations
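
As a rough illustration of why ontologies help with service description, the sketch below matches a request against advertised services by walking a small class hierarchy, so a request for a general kind of service also finds more specific ones. The hierarchy, the class names and the service names are all hypothetical and are not taken from myGrid or any other project listed above. A real deployment would express the hierarchy in an ontology language such as OWL and use a reasoner.

```python
# Hypothetical subclass hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "BlastService": "SequenceAlignmentService",
    "SequenceAlignmentService": "BioinformaticsService",
    "StructurePredictionService": "BioinformaticsService",
}

# Hypothetical advertised services: service name -> declared class.
SERVICES = {
    "svcA": "BlastService",
    "svcB": "StructurePredictionService",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def discover(required_class):
    """Names of services whose declared class satisfies the request."""
    return sorted(
        name for name, cls in SERVICES.items() if is_a(cls, required_class)
    )


print(discover("SequenceAlignmentService"))
print(discover("BioinformaticsService"))
```

A keyword match on the service names would find neither service; the class hierarchy is what lets the more general request succeed.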

Comb-e-Chem Project - Automation
[Figure: Comb-e-Chem automation, showing the X-Ray e-Lab and the Properties e-Lab]
    myGrid Project - bioinformatics
•  Imminent ‘deluge’ of genomics
• Highly heterogeneous
• Highly complex and inter-
• Convergence of data and
   literature archives
1. Database access from the Grid
2. Process enactment on the Grid
3. Personalisation services
4. Metadata services
   Grid Services + Ontologies

Geodise Project
[Figure: Geodise architecture. Labels: Engineer; Ontology for Engineering,
Computation, & Optimisation and Design Search; database; Traceability;
OPTIMISATION; OPTIONS; SERVICE; COMPUTATION (Globus, Condor, SRB);
Intelligent Application Manager; Intelligent Resource Provider;
CAD System (CADDS, IDEAS, ProE, CATIA, ICAD); Analysis and code (CFD, FEM, CEM);
Parallel machines; Clusters; Internet Resource Providers; Pay-per-use; Design]
Geodise will provide grid-based seamless access to an intelligent knowledge
repository, a state-of-the-art collection of optimisation and search tools,
industrial strength analysis codes, and distributed computing & data resources
CoAKTinG will provide tools to assist scientific
collaboration by integrating
   intelligent meeting spaces
   ontologically annotated media streams from online
   decision rationale and group memory capture
   meeting facilitation
   issue handling, planning and coordination support
   constraint satisfaction
   instant messaging/presence.
[Figure: architecture diagram. Labels: Integrity and Authentication;
Data Encryption; Data Decryption; Data Signature; Business Policy;
Negotiation Service; Capacity Estimation; Resourcing Policy;
Resource Interface; Resource Manager; Industrial applications]
                 Closing remarks

The Grid as a killer app for SW?
• Grid apps are a very good example of the type of application
  envisaged for the Semantic Web.
• Grid is a real application: the emphasis is on deployment and on
  high performance, and is on a large scale and has established
  communities of users.
• The Grid genuinely needs Semantic Web technologies.
• It will stress Semantic Web solutions
• It is self-contained, with a well-defined community who already
  work with common tools and standards.
• Aspects of the Semantic Web could be applications of grid
  computing, for example in search, data mining, translation and
  multimedia information retrieval.

Grid and Pervasive computing
• The Grid shares many distributed systems issues with pervasive
  (ubiquitous) computing (e.g. service discovery and composition)
• e-Science needs pervasive computing, e.g. the ‘smart lab’ and
  support for collaboration
• The Equator project investigates the convergence of the digital and
  physical worlds – the Grid is a digital world!

            Technical innovation in physical
            and digital life

[Figure: convergence of the digital and physical worlds. Labels: Limited Digital;
Shared Info Stores; Multi User Machines; Groupware Systems; PCS; Web and Virtual
Worlds; Mobile Devices; Wearables; Novel Displays; Increasingly Rich Digital
environments; Growing Presence of the Digital in the Physical World; Seamless
Meshing of Digital and Physical Interaction; Fully Converged Digital and
Environment]
 Grid-based Devices
 for everyday health
• Providing medical information
  onto the Grid
• Focus on combining medical
  information with motion
  information to provide
• Requires timely Grid
• Information reported remotely
  to mobile devices

• Middleware enables interoperable use of heterogeneous computer
• Grid applications involve a wide range of problem driven
  pioneering and provide challenges in Information, Knowledge and
  Collaboration as well as high performance computation
• Semantic Grid enables interoperable use of heterogeneous Grid
• Semantic Web technologies should be applied now for machine-
  processable descriptions and future semantic interoperability
• Need to
      Track Semantic Web developments e.g. OWL tools
      Investigate enhanced collaboration environments
• Aim is to accelerate the scientific process and not just scientific
  computation – this is the reward that will motivate SG
•   Mark Baker, University of Portsmouth
•   Carole Goble, University of Manchester
•   Nick Jennings, University of Southampton
•   Nigel Shadbolt, University of Southampton
•   Many colleagues in Grid and Semantic Web communities

