					The Semantic Grid
A Future e-Science Infrastructure
Presented by…

David De Roure
University of Southampton, UK



                                www.semanticgrid.org
                                dder@ecs.soton.ac.uk
Structure of the talk
1.   Evolution of the Grid
2.   The Semantic Web
3.   The Semantic Grid
4.   The story so far
5.   Semantic Grid Projects
6.   Closing Remarks




                      Kyoto University November 2002   2
     The Evolution of the Grid
The Collaboratory Concept
• In 1989, William Wulf, then with the U.S. National
  Science Foundation, defined a collaboratory as

   "a center without walls, in which the
   nation's researchers can perform their
   research without regard to geographical
   location, interacting with colleagues,
   accessing instrumentation, sharing data
   and computational resources, and
   accessing information in digital libraries."

A short history of the Grid
• “Science as a team sport”
      Grand Challenge Problems of the 80s
• Gigabit Testbed program
      Focus on applications for the local to wide area
• FAFNER
      Factoring via Network-Enabled Recursion
• I-Way at SC ’95
      First large-scale grid experiment
      Provided the basis for modern grid infrastructure efforts
[Figure: CASA Gigabit Testbed]
                                        (Fran Berman, SDSC)
 Datagrid perspective
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• 1 TIPS is approximately 25,000 SpecInt95 equivalents

[Figure: tiered data distribution from the detector to physicist workstations]
      Online System: ~PBytes/sec in, ~100 MBytes/sec out
      Offline Processor Farm: ~20 TIPS
      Tier 0: CERN Computer Centre, fed at ~100 MBytes/sec
      Tier 0 to Tier 1: ~622 Mbits/sec, or Air Freight (deprecated)
      Tier 1: France Regional Centre, Germany Regional Centre, Italy Regional Centre, FermiLab (~4 TIPS)
      Tier 1 to Tier 2: ~622 Mbits/sec
      Tier 2: Caltech and other Tier2 Centres, ~1 TIPS each
      Tier 2 to Institutes: ~622 Mbits/sec
      Institutes: ~0.25 TIPS each, with a physics data cache
      Tier 4: Physicist workstations, fed at ~1 MBytes/sec

• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server

      www.griphyn.org      www.ppdg.net      www.eu-datagrid.org
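
The rates quoted on this slide can be cross-checked with a little arithmetic. The sketch below uses only the three figures from the slide (25 ns bunch crossing interval, 100 triggers per second, ~1 MByte per event); the function and constant names are made up for illustration.

```python
# Approximate figures quoted on the slide.
BUNCH_CROSSING_INTERVAL_NS = 25   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100         # events kept by the trigger each second
EVENT_SIZE_MBYTES = 1             # size of one triggered event


def crossings_per_second(interval_ns: int) -> int:
    """Bunch crossings per second for a given interval in nanoseconds."""
    return 1_000_000_000 // interval_ns


def triggered_rate_mbytes_per_s(triggers_per_s: int, event_mbytes: int) -> int:
    """Data rate leaving the trigger, in MBytes per second."""
    return triggers_per_s * event_mbytes


def crossings_per_trigger(interval_ns: int, triggers_per_s: int) -> int:
    """How many bunch crossings occur for each event that is kept."""
    return crossings_per_second(interval_ns) // triggers_per_s


if __name__ == "__main__":
    print(crossings_per_second(BUNCH_CROSSING_INTERVAL_NS))
    print(triggered_rate_mbytes_per_s(TRIGGERS_PER_SECOND, EVENT_SIZE_MBYTES))
    print(crossings_per_trigger(BUNCH_CROSSING_INTERVAL_NS, TRIGGERS_PER_SECOND))
```

With these inputs the trigger output comes to 100 MBytes/sec, which agrees with the ~100 MBytes/sec link shown in the figure, and roughly one bunch crossing in 400,000 is kept.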
1995 – 2000+:                    Grid Computing
• “Grid book” gave a comprehensive view
  of the state of the art
• Important infrastructure and middleware
  efforts initiated
      Globus, Legion, Condor, SRB, etc.
• 2000+: Beginnings of a Global Grid
      Evolution of the Global Grid Forum
      Some projects evolving to de facto
       standards (e.g. Globus, Condor)



The Grid Problem
Resource sharing & coordinated problem solving
in dynamic, multi-institutional virtual organizations




                                      Foster, Kesselman, Tuecke
Open Grid Services Architecture
• Anatomy vs Physiology
• Present Grid Architecture is a
  services architecture
• Implemented using Web Services
  Technology
• OGSA will provide
     Naming / Authorization / Security / Privacy
     Higher level services: Workflow, Transactions, Data Mining, Knowledge Discovery, …
• Exploiting Synergy: Commercial
  Internet with Grid Services
UNiform Interface to COmputing REsources
 UNICORE is a vertically integrated Grid environment offering
 seamless, secure and intuitive access to distributed computing
 resources




e-Science
• ‘e-Science is about global collaboration in key areas
  of science, and the next generation of infrastructure
  that will enable it.’

• ‘e-Science will change the dynamic of the way science
  is undertaken.’
                                 John Taylor, DG of UK OST

•    ‘[The Grid] intends to make access to computing
    power, scientific data repositories and experimental
    facilities as easy as the Web makes access to
    information.’
                                             Tony Blair, 2002

UK e-Science Initiative
• $180M Programme over 3 years
• $130M is for Grid Applications in all areas of science
  and engineering
      Particle Physics and Astronomy (PPARC)
       - $25M GridPP and $8M AstroGrid
      Engineering and Physical Sciences (EPSRC)
       - funding 6 projects at around $5M each
      Biology, Medical and Environmental Science
       - funding projects with total value of $35M
• $50M ‘Core Program’ to encourage development of
  generic ‘industrial strength’ Grid middleware

Some UK e-Science Projects
 •   GRIDPP (PPARC)            •   Climateprediction.com (NERC)
 •   ASTROGRID (PPARC)         •   Oceanographic Grid (NERC)
 •   Comb-e-Chem (EPSRC)       •   Molecular Environmental Grid (NERC)
 •   DAME (EPSRC)              •   NERC DataGrid (NERC + OST-CP)
 •   DiscoveryNet (EPSRC)      •   Biomolecular Grid (BBSRC)
 •   GEODISE (EPSRC)           •   Proteome Annotation Pipeline (BBSRC)
 •   myGrid (EPSRC)            •   High-Throughput Structural Biology (BBSRC)
 •   RealityGrid (EPSRC)       •   Global Biodiversity (BBSRC)

 •   Biology of Ageing (BBSRC + MRC)
 •   Sequence and Structure Data (MRC)
 •   Molecular Genetics (MRC)
 •   Cancer Management (MRC + PPARC)
 •   Clinical e-Science Framework (MRC)
 •   Neuroinformatics Modeling Tools (MRC)

 •   Interdisciplinary Research Collaborations ‘Grand Challenge’
          Advanced Knowledge Technologies
          Medical Images and Signals
          Equator
          DIRC (Dependability)
UK e-Science Grid
[Map of sites: Edinburgh, Glasgow, DL, Newcastle, Belfast, Manchester, Oxford, Cambridge, RAL, Hinxton, Cardiff, London, Southampton]
Access Grid
[Figure: Access Grid nodes]
Today’s Grid activities across the world




Observations
• The Grid has been about large scale computation
• But the applications are also about collaboration

• Middleware has provided some computational
  interoperability
• But we now need semantic interoperability

• The old problem was lots of different computers
• The new problem is lots of different projects!


              The Semantic Web
Vision
 “The Semantic Web is an extension of the current Web in which
 information is given a well-defined meaning, better enabling
 computers and people to work in cooperation. It is the idea of
 having data on the Web defined and linked in a way that it can be
 used for more effective discovery, automation, integration and
 reuse across various applications. The Web can reach its full
 potential if it becomes a place where data can be processed by
 automated tools as well as people”

                   From the W3C Semantic Web Activity statement



Resource Description Framework




Richer semantics
                   Semantic
                     Web

                   Classical
                     Web



OWL Web Ontology Language
  “The World Wide Web as it is currently constituted
  resembles a poorly mapped geography. Our insight into
  the documents and capabilities available are based on
  keyword searches, abetted by clever use of document
  connectivity and usage patterns. The sheer mass of this
  data is unmanageable without powerful tool support. In
  order to map this terrain more precisely, computational
  agents require machine-readable descriptions of the
  content and capabilities of web accessible resources.
  These descriptions must be in addition to the human-
  readable versions of that information.”

                                                     The OWL Guide
SW Tools




Observations
• Semantic Web requires a metadata-enabled Web
• Where will the metadata come from?

• Semantic Web requires ontologies
• Where will the ontologies come from?

• What will motivate the generation of ontologies and
  metadata?




             The Semantic Grid
Grid vision
 "Grid computing has emerged as an important new field,
 distinguished from conventional distributed computing by its
 focus on large-scale resource sharing, innovative applications,
 and, in some cases, high-performance orientation...we review the
 "Grid problem", which we define as flexible, secure, coordinated
 resource sharing among dynamic collections of individuals,
 institutions, and resources - what we refer to as virtual
 organizations."


        From "The Anatomy of the Grid: Enabling Scalable Virtual
                Organizations" by Foster, Kesselman and Tuecke

Classical                 Classical
  Web                       Grid

        More computation
   Grid is metadata based middleware
[Figure: layered architecture of the Astronomy Sky Survey Data Grid]
1. Portals and Workbenches
2. Knowledge & Resource Management: Concept space
3. Metadata View, Data View, Catalog Analysis, Bulk Data Analysis
      Standard APIs and Protocols
4. Grid Security, Caching, Replication, Backup, Scheduling
5. Information Discovery, Metadata delivery, Data Discovery, Data Delivery
      Standard Metadata format, Data model, Wire format
6. Catalog Mediator, Data mediator
      Catalog/Image Specific Access
7. Compute Resources, Derived Collections, Catalogs, Data Archives
For example…
• Annotations of results, workflows and database entries could be
  represented by RDF graphs using controlled vocabularies
  described in RDF Schema and OWL
• Personal notes can be XML documents annotated with metadata
  or RDF graphs linked to results or experimental plans
• Exporting results as RDF makes them available to be reasoned
  over
• RDF graphs can be the “glue” that associates all the components
  (literature, notes, code, databases, intermediate results, sketches,
  images, workflows, the person doing the experiment, the lab they
  are in, the final paper)
• Provenance trails can keep a record of how a collection of
  services was orchestrated, so that the run can be replicated or
  replayed, or act as evidence
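
The “glue” idea can be illustrated with subject/predicate/object triples. The following is a minimal sketch in plain Python rather than a real RDF toolkit; every identifier and vocabulary term (`ex:result42`, `ex:producedBy` and so on) is hypothetical and invented for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers and vocabulary terms, invented for this sketch.
GRAPH = {
    Triple("ex:result42", "ex:producedBy", "ex:workflow7"),
    Triple("ex:result42", "ex:annotatedWith", "ex:note3"),
    Triple("ex:workflow7", "ex:runBy", "ex:alice"),
    Triple("ex:workflow7", "ex:usedService", "ex:alignmentService"),
    Triple("ex:alice", "ex:worksIn", "ex:lab1"),
    Triple("ex:paper1", "ex:reports", "ex:result42"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


def linked_to(graph, node):
    """All nodes reachable from `node`, following triples in either direction."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for t in graph:
            for a, b in ((t.subject, t.obj), (t.obj, t.subject)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen - {node}


if __name__ == "__main__":
    print(match(GRAPH, predicate="ex:runBy"))
    print(sorted(linked_to(GRAPH, "ex:paper1")))
```

Starting from the paper, the graph leads back to the result, the workflow that produced it, the services used, the person and the lab, which is the kind of trail a provenance record would need.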
More…
• Represent the syntactic data types of e-Science objects using
  XML Schema data types
• Represent domain ontologies for the semantic mediation between
  database schema, an application’s inputs and outputs, and
  workflow work items
• Represent domain ontologies and rules for parameters of
  machines or algorithms to reason over allowed configurations
• Use reasoning over execution plans, workflows and other
  combinations of services to ensure the semantic validity of the
  composition
• Use RDF as a common data model for merging results drawn
  from different resources or instruments
• Capture the structure of messages that are exchanged between
  components
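
As a rough illustration of using a common data model to merge results, the sketch below converts records from two sources with different native formats into triples and then merges them by set union. It is plain Python, not an RDF library; the two source formats, the property names and the sample values are all hypothetical.

```python
def instrument_a_to_triples(record: dict) -> set:
    """Source A (hypothetical) reports one dict per sample."""
    subject = "ex:" + record["sample"]
    return {(subject, "ex:meltingPointC", str(record["mp_c"]))}


def instrument_b_to_triples(row: str) -> set:
    """Source B (hypothetical) reports 'sample,density' CSV rows."""
    sample, density = row.split(",")
    return {("ex:" + sample.strip(), "ex:densityGPerCm3", density.strip())}


def merge(*graphs: set) -> set:
    """Merging triple sets is plain set union; duplicate triples collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph: set, subject: str) -> dict:
    """Collect every property recorded for one subject."""
    return {p: o for s, p, o in sorted(graph) if s == subject}


# Made-up sample values, for illustration only.
merged = merge(
    instrument_a_to_triples({"sample": "crystal1", "mp_c": 131}),
    instrument_b_to_triples("crystal1, 1.52"),
)

if __name__ == "__main__":
    print(describe(merged, "ex:crystal1"))
```

Because both sources end up naming the same subject, their properties line up without either source needing to know the other's schema.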
And more…
• At the data/computation layer: classification of computational
  and data resources, performance metrics, job control,
  management of physical and logical resources
• At the information layer: schema integration, workflow
  descriptions, provenance trail
• At the knowledge layer: problem solving selection, intelligent
  portals
• Governance of the Grid, for example access rights to
  databases, personal profiles and security groupings
• Charging infrastructure, computational economy, support for
  negotiation; e.g. through an auction model
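
The slide does not say which auction model is meant. As one possible reading, here is a minimal sketch of a first-price sealed-bid auction for a compute slot; the bidder names, prices and the `reserve_price` parameter are assumptions made for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class Bid(NamedTuple):
    bidder: str
    price: float  # offered price per CPU-hour, in hypothetical units


def run_sealed_bid_auction(bids: Iterable[Bid], reserve_price: float = 0.0) -> Optional[Bid]:
    """First-price sealed-bid auction.

    The highest bid at or above the reserve wins; ties go to the
    earliest bid. Returns None if no bid meets the reserve.
    """
    winner = None
    for bid in bids:
        if bid.price < reserve_price:
            continue
        if winner is None or bid.price > winner.price:
            winner = bid
    return winner


if __name__ == "__main__":
    bids = [Bid("projA", 2.0), Bid("projB", 3.5), Bid("projC", 3.5)]
    print(run_sealed_bid_auction(bids))
    print(run_sealed_bid_auction(bids, reserve_price=5.0))
```

A resource provider could run such an auction per time slot; other models (second-price, continuous double auction) would change only the winner and price rules.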
Semantic Grid
• There is currently a gap between grid computing
  endeavours and the vision of Grid computing in which
  there is a high degree of easy-to-use and seamless
  automation and in which there are flexible
  collaborations and computations on a global scale.
• To support the full richness of the grid computing
  vision we need to apply Semantic Web technologies to Grid
  middleware and applications; i.e. the Semantic Grid

                                      www.semanticgrid.org
Richer semantics
                   Semantic                Semantic
                     Web                     Grid

                   Classical                 Classical
                     Web                       Grid

                           More computation
                                           Source: Norman Paton
                The Story So Far
The Semantic Grid Initiative
• Originally motivated by
      3 layer model
      Agent-based computing
• In 2001 we aimed to
      promote service-oriented architecture
      bridge Grid and Semantic Web communities
      clarify knowledge layer
      encourage holistic approach



                            Source: Keith Jeffery

Agent Technology:
A Canonical View
[Figure labels: Agent; Interactions; Organisational relationships; Environment; Sphere of influence]
                                             Source: Jennings, CACM
Technical Report of the National e-Science Centre
UKeS-2002-02, 2001.

Research Agenda for the
Semantic Grid:
A Future e-Science Infrastructure

              David De Roure - Distributed Systems, Web
      Nigel Shadbolt - Advanced Knowledge Technologies
                Nick Jennings - Agent Based Computing
                         Mark Baker - Grid technologies
Aim


      A Research Agenda aiming to move from the
      current state-of-the-art in e-Science infrastructure
      to the future infrastructure that is needed to
      support the full richness of the e-Science vision.




The report
• Commissioned for UK e-Science Programme
• Draft distributed in July 2001, samizdat publication
  was influential
• Completed in December; the report is now split into two
  documents:
   • The Evolution of the Grid (De Roure, Baker, Jennings,
     Shadbolt)
   • The Semantic Grid (De Roure, Jennings, Shadbolt)
• See also Semantic Web and Grid Computing (Goble,
  De Roure)


Grid -> Semantic Web community
WWW2002 Semantic Web Track
      Chairs: Carole Goble & Eric Miller
      12 refereed papers, 2 panels, 2 workshops, lots of posters
      50% developers day at SW track.
1st International Semantic Web Conference ISWC
      Chairs: Ian Horrocks & Jim Hendler
      4 tutorials, 40 refereed papers
Keynotes


 Ian Foster at WWW2002
 Carl Kesselman at ISWC2002




 Plus panel at WWW2002…



WWW2002 Semantic Grid Panel
• What do grid computing and semantic web have in common? Where do they differ?
• Does the Grid need the Semantic Web?
• Does the Semantic Web need the Grid?
• Where do you think it's going in 50 years?
• What is the biggest challenge we must address to realise the semantic grid?
[Image text: Grid meets Web]
SW -> Grid Community
Global Grid Forum 5 in Edinburgh
• Semantic Grid BOF
• Ontologies and the Grid tutorial
• Semantic Web keynote
[Photos: Dave, Carole, Nigel]
             Semantic Grid RG Charter
Goal
 The goal of this RG is to realise the added value of
 Semantic Web technologies for Grid users and
 developers.
 It will provide a forum to track Semantic Web
 community activities and advise the Grid community
 on the application of Semantic Web technologies in
 Grid applications and infrastructure, to identify case
 studies and share good practice.


          Semantic Grid RG Charter


When the scientist seeks answers to problems such as
the following, we wish to have the answer obtained by
automatic linking of databases and computational
resources by means of metadata, ontologies and
reasoning over both; i.e. Semantic Web and Grid
technologies…




               Semantic Grid RG Charter
• "Correlate the new molecular structure with the existing structural
  databases; what are the likely physical properties of the crystal?"
• "Retrieve & align 2000nt 5' from every serine/threonine kinase in
  Fabaceae expressed exclusively in the root cortex whose expression
  increases 5x or more upon infection by Rhizobium but is not
  affected by osmotic or heavy-metal stresses & is <40%
  homologous in the active site to kinases known to be involved in
  cell-cycle regulation in any other species"
• “How many cows in Texas? And how many will there be if we
  increase land tax?” 


              Semantic Grid RG Charter
Projected Tasks
1. Track semantic web activities and inform the Grid community on
   what tools and ideas to use now and which to watch
2. Provide a forum to discuss and share best practice in 'semantic
   grid' projects
3. Create links with other RG and WG to both push Semantic Grid
   expertise and to offer a service of expertise. For example,
   participation in the proposed working group on scheduling
   ontology.
4. Operate a community web portal
5. Encourage engagement between the Grid and Semantic Web
   communities.

        Semantic Grid Projects
Semantic Grid projects
 We are investigating Semantic Grids in
 e-Science projects:
    myGrid
    Comb-e-Chem
    Geodise
    CoAKTinG
    GRIA
Semantic Grid Aspects
• Comb-e-Chem – automation requires machine-
  processable descriptions
• myGrid – ontologies in bioinformatics domain, plus
  ontologies for service description
• Geodise – knowledge in a problem solving environment
• CoAKTinG – ontologies to augment collaboration and
  communicate events
• GRIA – descriptive and operational metadata,
  description of negotiations



Comb-e-Chem Project - Automation
[Figure labels: X-Ray e-Lab; Diffractometer; Video; Analysis; Structures Database; Simulation; Properties; Properties e-Lab; Grid]
    myGrid Project - bioinformatics
• Imminent ‘deluge’ of genomics data
• Highly heterogeneous
• Highly complex and inter-related
• Convergence of data and literature archives
1. Database access from the Grid
2. Process enactment on the Grid
3. Personalisation services
4. Metadata services
   Grid Services + Ontologies


Geodise Project
Geodise will provide grid-based seamless access to an intelligent knowledge
repository, a state-of-the-art collection of optimisation and search tools,
industrial strength analysis codes, and distributed computing & data resources
[Figure labels]
      Engineer; GEODISE PORTAL; Visualization
      Reliability, Security, QoS
      Knowledge repository; Session database; Traceability
      Ontology for Engineering, Computation, & Optimisation and Design Search
      OPTIMISATION: OPTIONS System; Optimisation archive
      APPLICATION SERVICE PROVIDER: Intelligent Application Manager; Licenses and code
      CAD System: CADDS, IDEAS, ProE, CATIA, ICAD
      Analysis: CFD, FEM, CEM
      Design archive
      COMPUTATION: Globus, Condor, SRB; Intelligent Resource Provider
      Parallel machines, Clusters, Internet Resource Providers, Pay-per-use
CoAKTinG will provide tools to assist scientific
collaboration by integrating
   intelligent meeting spaces
   ontologically annotated media streams from online
    meetings
   decision rationale and group memory capture
   meeting facilitation
   issue handling, planning and coordination support
   constraint satisfaction
   instant messaging/presence.

    http://www.aktors.org/coakting/
[Figure labels: Integrity and Authentication; PKI Keystore; Business Policy; Data Decryption; Data Encryption; Data Signature; Negotiation Service; Capacity Estimation; Resourcing Policy; Authorisation; Resource Interface; Resource Manager; Application Service]
                   Industrial applications
                 Closing remarks
The Grid as a killer app for SW?
• Grid apps are a very good example of the type of application
  envisaged for the Semantic Web.
• Grid is a real application: the emphasis is on deployment and on
  high performance, it operates on a large scale, and it has
  established communities of users.
• The Grid genuinely needs Semantic Web technologies.
• It will stress Semantic Web solutions.
• It is self-contained, with a well-defined community who already
  work with common tools and standards.
• Aspects of the Semantic Web could be applications of grid
  computing, for example in search, data mining, translation and
  multimedia information retrieval.

Grid and Pervasive computing
• The Grid shares many distributed systems issues with pervasive
  (ubiquitous) computing (e.g. service discovery and composition)
• e-Science needs pervasive computing, e.g. the ‘smart lab’ and
  support for collaboration
• The Equator project investigates the convergence of the digital and
  physical worlds – the Grid is a digital world!




            Technical innovation in physical
            and digital life

[Figure: convergence of digital and physical environments]
      Increasingly Rich Digital environments: Limited Digital Environment; FTP, Shared Info Stores; Conferencing and Groupware Systems; Web and Virtual Worlds
      Growing Presence of the Digital in the Physical World: Mainframes; Multi User Machines; Networked PCs; Mobile Devices, Wearables, Novel Displays
      Grid
      Seamless Meshing of Digital and Physical Interaction
      Fully Converged Digital and Physical Environment
 Grid-based Devices
 for everyday health
• Providing medical information
  onto the Grid
• Focus on combining medical
  information with motion
  information to provide
  context
• Requires timely Grid
  computation
• Information reported remotely
  to mobile devices


Summary
• Middleware enables interoperable use of heterogeneous computer
  systems
• Grid applications involve a wide range of problem driven
  pioneering and provide challenges in Information, Knowledge and
  Collaboration as well as high performance computation
• Semantic Grid enables interoperable use of heterogeneous Grid
  projects!
• Semantic Web technologies should be applied now for machine-
  processable descriptions and future semantic interoperability
• Need to
      Track Semantic Web developments e.g. OWL tools
      Investigate enhanced collaboration environments
• Aim is to accelerate the scientific process and not just scientific
  computation – this is the reward that will motivate SG
Credits
•   Mark Baker, University of Portsmouth
•   Carole Goble, University of Manchester
•   Nick Jennings, University of Southampton
•   Nigel Shadbolt, University of Southampton
•   Many colleagues in Grid and Semantic Web communities
