Web 2.0 in a Web Services and Grid - Indiana University

					 Web 2.0 in a Web Services and
          Grid Context
Part I: CTS2007 Web 2.0 Tutorial
                           CTS 2007
 Embassy Suites Hotel-Lake Buena Vista Resort, Orlando, FL, USA
                         May 25 2007

           Geoffrey Fox and Marlon Pierce
              Computer Science, Informatics, Physics
                Pervasive Technology Laboratories
             Indiana University Bloomington IN 47401

                        gcf@indiana.edu
        Applications, Infrastructure, Technologies
   This field is confused by inconsistent use of terminology
    – this is what I mean
   Web Services, Grids and Web 2.0 (Enterprise 2.0) are
    technologies
   These technologies combine and compete to build
    electronic infrastructures termed e-infrastructure or
    Cyberinfrastructure
   e-moreorlessanything is an emerging application area
    of broad importance that is hosted on the
    infrastructures e-infrastructure or Cyberinfrastructure
e-moreorlessanything is the Application
    ‘e-Science is about global collaboration in key areas of science,
    and the next generation of infrastructure that will enable it.’ from
    its inventor John Taylor, Director General of Research Councils
    UK, Office of Science and Technology
   Similarly e-Business captures an emerging view of corporations as
    dynamic virtual organizations linking employees, customers and
    stakeholders across the world.
   Net Centric computing is a similar DoD vision
   This generalizes to e-moreorlessanything
   A deluge of data of unprecedented and inevitable size must be
    managed and understood.
   People (see Web 2.0), computers, data and instruments must be
   On demand assignment of experts, computers, networks and
    storage resources must be supported
    Role of Electronic infrastructure
   Supports integration of data, people, computers for
    • Distributed Science or e-Science (US, Cyberinfrastructure)
    • Command and Control (US, Global Information Grid)
    • e-Business e-Science etc. (Europe, e-Infrastructure)
   Exploits Internet technology (Web 2.0) adding (via Grid
    technology) management, security, supercomputers etc.
   It has two aspects: parallel – low latency (microseconds)
    between nodes and distributed – highish latency (milliseconds)
    between nodes
   Parallel needed to get high performance on individual 3D
    simulations, data analysis etc.
   Distributed aspect integrates already distinct components
   Electronic infrastructure is in general a distributed collection
    of parallel systems and presented as services (often Web
    services) that are “just” programs or data sources packaged
    for distributed access
        Not so controversial Ideas
   Distributed software systems are being “revolutionized” by
    developments from e-commerce, e-Science and the consumer
    Internet. There is rapid progress in technology families termed
    “Web services”, “Grids” and “Web 2.0”
   The emerging distributed system picture is of distributed services
    with advertised interfaces but opaque implementations
    communicating by streams of messages over a variety of protocols
    • Complete systems are built by combining either services or predefined/
      pre-existing collections of services together to achieve new capabilities
   Currently Grids are built using Web Services with possible
    enhancements like WSRF which we call Narrow or Web service Grids
   We expect that future systems will be built as Broad Grids which
    are a collection of services mixing Web Service and Web 2.0
      Web 2.0 and Web Services I
   Web Services have clearly defined protocols (SOAP) and a well
    defined mechanism (WSDL) to define service interfaces
    • There is good .NET and Java support
    • The so-called WS-* specifications provide a rich sophisticated but
      complicated standard set of capabilities for security, fault tolerance,
      metadata, discovery, notification etc.
   “Narrow Grids” build on Web Services and provide a robust
    managed environment with growing adoption in Enterprise
    systems and distributed science (so called e-Science)
   Web 2.0 supports a similar architecture to Web services but has
    developed in a more chaotic but remarkably successful fashion
    with a service architecture with a variety of protocols including
    those of Web and Grid services
    • Over 400 Interfaces defined at http://www.programmableweb.com/apis
   Web 2.0 also has many well known capabilities with Google
    Maps and Amazon Compute/Storage services of clear general
   There are also Web 2.0 services supporting novel collaboration
    modes and user interaction with the web as seen in social
    networking sites, portals, MySpace, YouTube,
      Web 2.0 and Web Services II
   I once thought Web Services were inevitable but this is
    no longer clear to me
   Web services are complicated, slow and non-functional
     • WS-Security is unnecessarily slow and pedantic
       (canonicalization of XML)
     • WS-RM (Reliable Messaging) seems to have poor
       adoption and doesn't work well in collaboration
     • WSDM (distributed management) specifies too much
   There are de facto standards like Google Maps and
    powerful suppliers like Google which “define the rules”
   One can easily combine SOAP (Web Service) based
    services/systems with HTTP messages but the “lowest
    common denominator” suggests additional
    structure/complexity of SOAP will not easily survive
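
To make the “additional structure/complexity of SOAP” concrete, here is a minimal sketch that builds the same request twice: once as a SOAP 1.1 envelope and once as a plain JSON body of the kind commonly sent over HTTP by Web 2.0 APIs. The operation name `getMap`, its parameters and the `urn:example:maps` namespace are hypothetical, chosen only for illustration.

```python
import json
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:maps"  # hypothetical application namespace


def soap_request(operation, params):
    """Wrap an operation call in a SOAP 1.1 Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def json_request(params):
    """The same parameters as a plain JSON message body."""
    return json.dumps(params, sort_keys=True)


params = {"county": "Marion", "zoom": 12}  # hypothetical parameters
soap_msg = soap_request("getMap", params)
json_msg = json_request(params)
print(len(soap_msg), "characters of SOAP")
print(len(json_msg), "characters of JSON")
```

Both messages carry the same two parameters; the SOAP form adds the envelope, body and namespace declarations around them.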
    Old and New (Web 2.0) Community Tools
   e-mail and list-serves are oldest and best used
   Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P
    Collaboration – text, audio-video conferencing, files
   del.icio.us, Connotea, Citeulike, Bibsonomy, Biolicious manage
    shared bookmarks
   MySpace, YouTube, Bebo, Hotornot, Facebook, or similar sites
    allow you to create (upload) community resources and share
    them; Friendster, LinkedIn create networks
    • http://en.wikipedia.org/wiki/List_of_social_networking_websites
   Writely, Wikis and Blogs are powerful specialized shared
    document systems
   ConferenceXP and WebEx share general applications
   Google Scholar tells you who has cited your papers while
    publisher sites tell you about co-authors
    • Windows Live Academic Search has similar goals
   Note sharing resources creates (implicit) communities
     • Social network tools study graphs to both define communities
       and extract their properties
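
A minimal sketch of how shared resources define implicit communities, assuming the simplest possible rule: two users belong to the same community if they are connected through any chain of shared resources. The function name and the sample data are hypothetical; real social network tools use richer graph measures.

```python
from collections import defaultdict


def implicit_communities(shares):
    """Group users into communities: users are linked if they share a resource.

    `shares` is an iterable of (user, resource) pairs. Communities are the
    connected components of the resulting user graph, found with union-find.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_user = {}  # resource -> first user seen sharing it
    for user, resource in shares:
        find(user)  # register the user even if nobody else shares with them
        if resource in first_user:
            union(user, first_user[resource])
        else:
            first_user[resource] = user

    groups = defaultdict(set)
    for user in parent:
        groups[find(user)].add(user)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical bookmark-sharing data: (user, bookmarked URL)
shares = [("ann", "url1"), ("bob", "url1"), ("bob", "url2"),
          ("cat", "url2"), ("dan", "url3")]
communities = implicit_communities(shares)
print(communities)
```

Here ann and cat never bookmark the same URL, yet they land in one community because bob links them.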
      “Best Web 2.0 Sites” -- 2006
   Extracted from http://web2.wsj2.com/
   Social Networking
   Start Pages
   Social Bookmarking
   Peer Production News
   Social Media Sharing
   Online Storage
    Web 2.0 Systems are Portals, Services, Resources
   Captures the incredible development of interactive
    Web sites enabling people to create and collaborate
                Mashups v Workflow?
   Mashup Tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63
   Workflow Tools are reviewed by Gannon and Fox
   Both include scripting in PHP, Python, sh etc. as both implement
    programming at level of services
   Mashups use all types of service interfaces and do not have the
    potential robustness (security) of Grid service
   Typically “pure” HTTP (REST)
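
“Programming at level of services” can be sketched as a short script that chains service calls. The three services below are hypothetical stand-ins written as local Python functions so that the example runs without a network; in a real mashup each would be an HTTP request to a remote API, and the coordinates are approximate illustrative values.

```python
def geocode_service(place):
    """Stand-in for a geocoding service: place name -> (lat, lon)."""
    table = {"Bloomington IN": (39.17, -86.53), "Orlando FL": (28.54, -81.38)}
    return table[place]


def photo_service(lat, lon):
    """Stand-in for a photo-search service: photo titles near a point."""
    return [f"photo near {lat:.2f},{lon:.2f}"]


def map_service(lat, lon, markers):
    """Stand-in for a mapping service: returns a description of the map."""
    return {"center": (lat, lon), "markers": list(markers)}


def photo_map_mashup(place):
    """The mashup itself: a script that chains the three services."""
    lat, lon = geocode_service(place)
    photos = photo_service(lat, lon)
    return map_service(lat, lon, photos)


result = photo_map_mashup("Orlando FL")
print(result)
```

A workflow engine would express the same three-step chain as a recorded graph of services; the mashup expresses it directly in the scripting language.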
Grid Workflow Datamining in Earth Science
   Work with Scripps Institute
   Grid services controlled by workflow process real time
    data from ~70 GPS Sensors in Southern California
[Figure: workflow diagram with stages labelled NASA GPS, Streaming Data, Data Checking, Hidden Markov Datamining (JPL), Real Time, Display (GIS)]
    Web 2.0 uses all types of Services
   Here a Gadget Mashup uses a 3 service workflow with
    a JavaScript Gadget Client

                    Web 2.0 APIs
   http://www.programmableweb.com/apis has (May 14 2007) 431 Web 2.0 APIs
    with GoogleMaps the most often used in Mashups
   This site acts as a “UDDI” for Web 2.0
The List of Web 2.0 APIs
   Each site has API and its features
   Divided into broad categories
   Only a few used a lot (42 APIs used in more than 10 Mashups)
   RSS feed of new APIs
   Amazon S3 growing
    in popularity
      APIs/Mashups per Protocol
[Figure: bar chart of the number of APIs and number of Mashups per protocol. Protocol categories: REST, SOAP, XML-RPC, three combined categories beginning with REST, JS, Other. Labelled APIs: del.icio.us, yahoo! search, yahoo! geocoding, yahoo! images, trynt, yahoo! local, google search, amazon S3, amazon ECS, flickr, live.com, virtual earth]
   4 more Mashups each day
   For a total of 1906 April 17 2007 (4.0 a day over last
   Note ClearForest runs Semantic Web Services Mashup competitions (not
   Some Mashup types: aggregators, search aggregators, visualizers, mobile,
    maps, games
   Growing number of commercial Mashup Tools
    Web 2.0
Display too large to
be a Gadget

Searched on Transit/Transportation

[Figure: architecture diagram. Boxes, top to bottom: Google Maps Server; Marion County Map Server (ESRI ArcIMS), Hamilton County Map Server (AutoDesk), Cass County Map (OGC Web Map …); one Adapter per map server; Tile Server; Cache Server; Browser + Google Map API]
   Must provide adapters for each Map Server type.
   Tile Server requests map tiles at all zoom levels with all layers.
    These are converted to uniform projection, indexed, and stored.
    Overlapping images are combined.
   The cache server fulfills Google map calls with cached tiles at the
    requested bounding box that fill the bounding box.
   Browser client fetches image tiles for the bounding box using
    Google Map API.
   A “Grid” Workflow (built in Java!)
   Uses Google Maps clients and server and non Google map APIs
    Indiana Map Grid Workflow/Mashup

GIS Grid of “Indiana Map” and ~10 Indiana counties with accessible Map (Feature)
Servers from different vendors. Grids federate different data repositories (cf Astronomy
VO federating different observatory collections)
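
The tile and cache servers above index tiles by zoom level and position and then answer bounding-box requests. The slides do not say which indexing scheme was used, so the sketch below assumes the common Web Mercator XYZ tiling used by web map clients. The class and function names are hypothetical and this is not the Java implementation described in the slides.

```python
import math


def tile_index(lat, lon, zoom):
    """Web Mercator (x, y) tile index containing a lat/lon point at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the outer edges stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(south, west, north, east, zoom):
    """All (zoom, x, y) keys of tiles that cover the bounding box."""
    x_min, y_min = tile_index(north, west, zoom)  # y grows southwards
    x_max, y_max = tile_index(south, east, zoom)
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


class TileCache:
    """In-memory stand-in for the cache server's tile index."""

    def __init__(self):
        self._tiles = {}

    def store(self, key, image_bytes):
        self._tiles[key] = image_bytes

    def fetch_bbox(self, south, west, north, east, zoom):
        """Return cached tiles covering the box; missing tiles map to None."""
        return {key: self._tiles.get(key)
                for key in tiles_for_bbox(south, west, north, east, zoom)}


cache = TileCache()
cache.store((1, 0, 0), b"tile-bytes")  # placeholder image data
tiles = cache.fetch_bbox(-10.0, -10.0, 10.0, 10.0, zoom=1)
print(sorted(tiles))
```

Because every county map is converted to one projection before storage, a single key of zoom, x and y is enough to find a tile regardless of which vendor's map server produced it.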
Grid-style portal as used in Earthquake Grid
   The Portal is built from portlets, providing user interface fragments
    for each service that are composed into the full interface; uses OGCE
    technology as does planetary science VLAB portal with
    University of Minnesota

                                  Now to Portals
Note the many competitions powering Web 2.0
Mashup Development
         Portlets v. Google Gadgets
   Portals for Grid Systems are built using portlets with
    software like GridSphere integrating these on the
    server-side into a single web-page
   Google (at least) offers the Google sidebar and Google
    home page which support Web 2.0 services and do not
    use a server side aggregator
   Google is more user friendly!
   The many Web 2.0 competitions are an interesting model
    for promoting development in the world-wide
    distributed collection of Web 2.0 developers
   I guess Web 2.0 model will win!
           Typical Google Gadget Structure
Google Gadgets are an example of
Start Page technology
See http://blogs.zdnet.com/Hinchcliffe/?p=8

    <Module> <ModulePrefs title="…" /> <Content type="html"> <![CDATA[ … Lots of HTML and JavaScript ]]> </Content> </Module>
    Portlets build User Interfaces by combining fragments in a standalone Java Server
    Google Gadgets build User Interfaces by combining fragments with JavaScript on the client
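
The server-side half of this contrast can be sketched in a few lines: each service contributes an HTML fragment and an aggregator composes them into one page before anything reaches the browser. This is only an illustration of the idea in Python, not the JSR168 portlet API, and the fragment titles and contents are hypothetical. With Gadgets the equivalent composition happens in JavaScript inside the browser.

```python
from html import escape


def render_fragment(title, body_html):
    """One user interface fragment, as a portlet might return for its service."""
    return f'<div class="fragment"><h2>{escape(title)}</h2>{body_html}</div>'


def aggregate_page(page_title, fragments):
    """Server-side aggregation: combine all fragments into a single web page."""
    parts = "\n".join(render_fragment(t, b) for t, b in fragments)
    return (f"<html><head><title>{escape(page_title)}</title></head>\n"
            f"<body>\n{parts}\n</body></html>")


page = aggregate_page("Earthquake Grid", [
    ("Job Status", "<p>2 jobs running</p>"),  # hypothetical fragment content
    ("Map Viewer", "<p>map goes here</p>"),
])
print(page)
```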
          Web 2.0 v Narrow Grid I
   Web 2.0 and Grids are addressing a similar application class
    although Web 2.0 has focused on user interactions
     • So technology has similar requirements
   Web 2.0 chooses simplicity (REST rather than SOAP) to lower
    barrier to everyone participating
   Web 2.0 and Parallel Computing tend to use traditional (possibly
    visual) (scripting) languages for equivalent of workflow whereas
    Grids use visual interface backend recorded in BPEL
   Web 2.0 and Grids both use SOA (Service Oriented Architectures)
   “System of Systems”: Grids and Web 2.0 are likely to build
    systems hierarchically out of smaller systems
     • We need to support Grids of Grids, Webs of Grids, Grids of
       Services etc. i.e. systems of systems of all sorts
           Web 2.0 v Narrow Grid II
   Web 2.0 has a set of major services like GoogleMaps or Flickr
    but the world is composing Mashups that make new composite
    • End-point standards are set by end-point owners
    • Many different protocols covering a variety of de-facto standards
   Narrow Grids have a set of major software systems like Condor
    and Globus and a different world is extending with custom
    services and linking with workflow
   Popular Web 2.0 technologies are PHP, JavaScript, JSON,
    AJAX and REST with “Start Page” (e.g. Google Gadgets) interfaces
   Popular Narrow Grid technologies are Apache Axis, BPEL,
    WSDL and SOAP with portlet interfaces
   Robustness of Grids demanded by the Enterprise?
   Not so clear that Web 2.0 won't eventually dominate other
    application areas and with Enterprise 2.0 it's invading Grids
                                 The world does itself in large numbers!
           Web 2.0 v Narrow Grid III
   Narrow Grids have a strong emphasis on standards and
    structure; Web 2.0 lets a 1000 flowers (protocols) and a million
    developers bloom and focuses on functionality, broad usability
    and simplicity
     • Semantic Web/Grid has structure to allow reasoning
     • Annotation in sites like del.icio.us and uploading to
       MySpace/YouTube is unstructured and free text search
       replaces structured ontologies
   Portals are likely to feature both Web and “desktop client” technology
    although it is possible that Web approach will be adopted more or less
   Web 2.0 has a very active portal activity which has similar architecture to
     • A page has multiple user interface fragments
   Web 2.0 user interface integration is typically Client side using Gadgets,
    AJAX and JavaScript while
     • Grids are in a special JSR168 portal server side using Portlets, WSRP and
        Java
          The Ten areas covered by the 60 core WS-*
WS-* Specification Area           Typical Grid/Web Service Examples
1: Core Service Model             XML, WSDL, SOAP
2: Service Internet               WS-Addressing, WS-MessageDelivery; Reliable
                                  Messaging WSRM; Efficient Messaging MTOM
3: Notification                   WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions      BPEL, WS-Choreography, WS-Coordination
5: Security                       WS-Security, WS-Trust, WS-Federation, SAML,
6: Service Discovery              UDDI, WS-Discovery
7: System Metadata and State      WSRF, WS-MetadataExchange, WS-Context
8: Management                     WSDM, WS-Management, WS-Transfer
9: Policy and Agreements          WS-Policy, WS-Agreement
10: Portals and User Interfaces   WSRP (Remote Portlets)
                        WS-* Areas and Web 2.0
WS-* Specification Area           Web 2.0 Approach
1: Core Service Model             XML becomes optional but still useful
                                  SOAP becomes JSON, RSS, ATOM
                                  WSDL becomes REST with API as GET, PUT etc.
                                  Axis becomes XMLHttpRequest
2: Service Internet               No special QoS. Use JMS or equivalent?
3: Notification                   Hard with HTTP without polling; JMS perhaps?
4: Workflow and Transactions      Mashups, Google MapReduce
   (no Transactions in Web 2.0)   Scripting with PHP, JavaScript ...
5: Security                       SSL, HTTP Authentication/Authorization,
                                  OpenID is Web 2.0 Single Sign on
6: Service Discovery              http://www.programmableweb.com
7: System Metadata and State      Processed by application; no system state;
                                  Microformats are a universal metadata approach
8: Management==Interaction        WS-Transfer style Protocols GET, PUT etc.
9: Policy and Agreements          Service dependent. Processed by application
10: Portals and User Interfaces   Start Pages, AJAX and Widgets (Netvibes), Gadgets
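
The row “WSDL becomes REST with API as GET PUT etc.” means that the service interface shrinks to a few uniform verbs applied to addressable resources. A minimal sketch, assuming an in-memory store in place of a real HTTP server; the class name, paths and payloads are hypothetical, and the status codes follow common HTTP usage.

```python
class ResourceStore:
    """In-memory stand-in for a REST service: resources addressed by path."""

    def __init__(self):
        self._resources = {}

    def handle(self, method, path, body=None):
        """Dispatch on the HTTP method; returns (status_code, body)."""
        if method == "GET":
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":
            created = path not in self._resources
            self._resources[path] = body
            return (201 if created else 200), body
        if method == "DELETE":
            if path in self._resources:
                del self._resources[path]
                return 204, None
            return 404, None
        return 405, None  # method not allowed


store = ResourceStore()
print(store.handle("PUT", "/gadgets/clock", {"title": "Clock"}))
print(store.handle("GET", "/gadgets/clock"))
print(store.handle("DELETE", "/gadgets/clock"))
print(store.handle("GET", "/gadgets/clock"))
```

No interface description comparable to WSDL is needed here: a client only has to know the resource paths and the shape of the payload, which is typically conveyed by the API documentation.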
               Drivers for Future
   Web 2.0 has momentum as it is driven by success of
    social web sites and the user friendly protocols
    attracting many developers of mashups
   Grids momentum driven by the success of eScience and
    the commercial web service thrusts largely aimed at
   We expect applications such as business and DoD
    where predictability and robustness important to be
    built on a Web Service (Narrow Grid) core with Web
    2.0 functionality enhancements
   Simplicity, supporting many developers are forces
    pressuring Grids!
   Robustness and coping with unstructured blooming of
    a 1000 flowers are forces pressuring Web 2.0
