NaradaBrokering for DS-RT 2005 Grid Tutorial
 IEEE DS-RT 2005 Montreal Canada Oct. 9 2005

                 Geoffrey Fox

          CTO Anabas Corporation and
     Computer Science, Informatics, Physics
       Pervasive Technology Laboratories
    Indiana University Bloomington IN 47401

               gcf@indiana.edu
            http://www.infomall.org
        Grid (Web Service) Messaging
   Build distributed systems from “interoperable” services linked by messages
    (SOAP) – architect capabilities as services
   Grids are “just” large scale sets of such services
   Need to support real time streams and NOT just files (collections of
    messages) consistent with WS standards (P2P and “central”)
   Open Source http://www.naradabrokering.org (4 downloads/day) is a
    scalable distributed pub-sub system supporting multiple standards (JMS,
    WS) and subscription methods
     • Implements “Service Internet” and Notification areas of WS-*
        Infrastructure
   Manage messaging for
     • Optimize communication for bad links, firewalls etc
     • Collaboration (multi-cast streams)
     • Fault tolerance with re-transmitted messages and Replicated Services
     • Replay – access any message at any time
     • Virtualize addressing with pub-sub metaphor
     • Performance from protocol (UDP v Parallel TCP ..) and representation
     • Heterogeneous Clients – filter to and from PDAs
   Candidate for Axis2-MOM (Message Oriented Middleware) infrastructure
NaradaBrokering
[Figure: NaradaBrokering broker network linking queues and streams. Captions: "NB supports messages and streams"; "NB role for Grid is similar to ..."]
    Traditional NaradaBrokering Features
Multiple protocol         Transport protocols supported include TCP, Parallel TCP
transport support         streams, UDP, Multicast, SSL, HTTP and HTTPS.
In publish-subscribe      Communications through authenticating proxies/firewalls &
Paradigm with different   NATs. Network QoS based Routing
Protocols on each link    Allows Highest performance transport
Subscription Formats      Subscription can be Strings, Integers, XPath queries, Regular
                          Expressions, SQL and tag=value pairs.
Reliable delivery         Robust and exactly-once delivery in presence of failures
Ordered delivery          Producer Order and Total Order over a message type. Time
                          Ordered delivery using Grid-wide NTP based absolute time
Recovery and Replay       Recovery from failures and disconnects.
                          Replay of events/messages at any time. Buffering services.
Security                  Message-level WS-Security compatible security
Message Payload options   Compression and Decompression of payloads
                          Fragmentation and Coalescing of payloads
Messaging Related         Java Message Service (JMS) 1.0.2b compliant
Compliance                Support for routing P2P JXTA interactions.
Grid Feature Support      NaradaBrokering enhanced Grid-FTP. Bridge to Globus GT3.
Web Services supported    Implementations of WS-ReliableMessaging, WS-Reliability
                          and WS-Eventing.
Features for March—June 2005 Releases
   Production implementations of WS-Eventing, WS-
    Notification, WS-RM and WS-Reliability.
   SOAP message support and NaradaBrokers viewed as SOAP
    Intermediaries
   Active replay support: Pause and Replay live streams.
   Stream Linkage: can link permanently multiple streams –
    used in annotating real-time video streams
   Replicated storage support for fault tolerance and resiliency
    to storage failures.
   Management: HPSearch Scripting Interface to streams and
    services
   Broker Discovery: Locate appropriate brokers
                         Summary
   NaradaBrokering provides a fully distributed queue
    manager where queues buffer streams with overheads of a
    few milliseconds per broker
     • << 30 ms frame interval
     • << 100s of ms network delay
     • Much faster than using databases or writing files
   Collaboration is implemented by sharing synchronizing
    streams
   Compatible with Grids, Web Services, Java Message
    Service
   Streams are “first class entities” with rich set of features
     • Don't open a socket; hand data to NaradaBrokering
   Software Overlay Network or Message Oriented
    Middleware
NaradaBrokering Services
           Reliable Delivery Service
   Guaranteed delivery in multiple producer/consumer
    settings. Guarantees hold true in the presence of
    •   Node/Link failures
    •   Links that lose messages or garble message order
    •   Storage failures: stores need to recover after failure
    •   Prolonged entity disconnects
   Exactly-once and ordered delivery of events
   Uses both positive and negative acknowledgements
   Supports Replay and Fast Recovery from failures
   Independent of underlying archival system.
   Was used to enhance fault tolerance in Grid-FTP.
   Uses “Reliable Storage” to keep messages temporarily
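The combination of positive and negative acknowledgements with exactly-once, ordered delivery can be sketched as follows. This is a minimal illustration and not the NaradaBrokering implementation: the class name OrderedReceiver, the use of per-stream sequence numbers starting at 1, and the shape of the returned (ack, naks) pair are all assumptions made for the example.

```python
class OrderedReceiver:
    """Hypothetical receiver: delivers each event exactly once, in sequence order."""

    def __init__(self):
        self.next_seq = 1      # next sequence number to hand to the application
        self.pending = {}      # out-of-order events held back, keyed by sequence number
        self.delivered = []    # payloads handed to the application, in order

    def on_event(self, seq, payload):
        """Process one arriving event and return (ack, naks)."""
        # Already-delivered or already-buffered events are duplicates and are dropped.
        if seq >= self.next_seq and seq not in self.pending:
            self.pending[seq] = payload
        # Release every event that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        # Positive acknowledgement: highest sequence number delivered so far.
        ack = self.next_seq - 1
        # Negative acknowledgements: gaps below the highest buffered event.
        highest = max(self.pending, default=ack)
        naks = [s for s in range(self.next_seq, highest) if s not in self.pending]
        return ack, naks
```

A sender that receives a negative acknowledgement would retransmit the listed sequence numbers, for example from the reliable storage mentioned above.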
          Transit delays/Standard deviations in a 3 broker network.
         NB-BestEffort(BE)(TCP) Vs NB-ReliableDelivery(RD)(UDP)
[Figure: mean delay and standard deviation for NBRD-UDP and NBBE-TCP. X axis: Content Payload Size in Bytes (0 to 10000). Y axis: 0 to 16.]
         Transit delays/Standard deviations in a single broker network.
           NB-Best Effort(TCP) Versus NB-Reliable Delivery(UDP)
[Figure: mean delay and standard deviation for NBRD-UDP and NBBE-TCP. X axis: Content Payload Size in Bytes (0 to 10000). Y axis: 0 to 12.]
Dealing with large payload sizes
   To cope with large payloads, the substrate incorporates
    two sets of services.
   Compression/Decompression service: The substrate
    incorporates support for zlib based compression and
    decompression of payloads.
   Fragmentation/Coalescing Service: The fragmentation service can
    break up a large payload into smaller fragments. The
    coalescing service can take these smaller fragments and
    coalesce them into the original large payload.
    • This was used to deal with transfer of large payloads (up to 1
      GB) in the NB enhanced Grid-FTP application.
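A minimal sketch of how compression and fragmentation can be combined, using Python's standard zlib module. The function names fragment and coalesce, the (index, total, chunk) fragment layout, and the choice to compress before fragmenting are assumptions for illustration; the slides do not describe how NaradaBrokering lays out its fragments.

```python
import zlib


def fragment(payload: bytes, fragment_size: int) -> list:
    """Compress a payload, then split it into (index, total, chunk) fragments."""
    compressed = zlib.compress(payload)
    chunks = [compressed[i:i + fragment_size]
              for i in range(0, len(compressed), fragment_size)]
    total = len(chunks)
    return [(index, total, chunk) for index, chunk in enumerate(chunks)]


def coalesce(fragments) -> bytes:
    """Reassemble fragments (in any arrival order) and decompress the payload."""
    ordered = sorted(fragments, key=lambda f: f[0])
    if not ordered:
        raise ValueError("no fragments supplied")
    total = ordered[0][1]
    # Every index from 0 to total-1 must be present exactly once.
    if [f[0] for f in ordered] != list(range(total)):
        raise ValueError("missing or duplicate fragments")
    return zlib.decompress(b"".join(chunk for _, _, chunk in ordered))
```

Carrying the index and total in every fragment lets the receiving side detect missing pieces before it attempts to decompress.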
                 Replay Services
   Replay requestors can specify replays based on several
    parameters
    • A range of sequence numbers can be specified.
    • Additionally, constraints on an event's content synopsis can
      be specified.
    • A time range can be specified.
   Replay services have been tested with applications such
    as Audio/Video conferencing, Whiteboards etc.
   Essential for recording and replay of collaborative
    sessions
   Important special case supports rewind and similar
    operations on a real-time stream
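The three replay parameters (sequence range, time range, content synopsis) can be pictured as filters over stored events. The sketch below is illustrative only: StoredEvent, select_for_replay, and the treatment of the synopsis constraint as a substring match are assumptions, not the NaradaBrokering replay API.

```python
from dataclasses import dataclass


@dataclass
class StoredEvent:
    seq: int          # sequence number assigned when the event was archived
    timestamp: float  # time of the event, in seconds
    synopsis: str     # short description of the event content
    payload: bytes


def select_for_replay(events, seq_range=None, time_range=None, synopsis=None):
    """Return stored events matching every supplied constraint, in sequence order."""
    selected = []
    for event in sorted(events, key=lambda e: e.seq):
        if seq_range is not None and not (seq_range[0] <= event.seq <= seq_range[1]):
            continue
        if time_range is not None and not (time_range[0] <= event.timestamp <= time_range[1]):
            continue
        # Assumption: a synopsis constraint is a simple substring match.
        if synopsis is not None and synopsis not in event.synopsis:
            continue
        selected.append(event)
    return selected
```

A rewind on a live stream would then be a replay whose time range ends at the current time.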
                Buffering Service
   This service is incorporated into the system to facilitate
    the buffering of events prior to releasing them.
   Buffering service time-orders events and releases events
    based on three metrics
    • Number of events in the buffer
    • Size of the buffer
    • Time spent by event in a buffer.
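A minimal sketch of a buffer that time-orders events and releases them when any of the three metrics is exceeded. The class name BufferingService and its parameters are hypothetical, and the age of an event is approximated here by the current time minus the event timestamp, which is an assumption rather than a description of the actual service.

```python
import heapq


class BufferingService:
    """Hypothetical buffer: holds events in timestamp order until a limit is hit."""

    def __init__(self, max_events, max_bytes, max_age):
        self.max_events = max_events  # limit on number of buffered events
        self.max_bytes = max_bytes    # limit on total buffered payload size
        self.max_age = max_age        # limit on event age, in seconds
        self._heap = []               # entries are (timestamp, counter, payload)
        self._bytes = 0
        self._counter = 0             # tie-breaker so equal timestamps keep arrival order

    def add(self, timestamp, payload, now):
        """Buffer one event and return any events that must now be released."""
        heapq.heappush(self._heap, (timestamp, self._counter, payload))
        self._counter += 1
        self._bytes += len(payload)
        return self.release(now)

    def release(self, now):
        """Release the earliest events while any of the three limits is exceeded."""
        released = []
        while self._heap and (
            len(self._heap) > self.max_events
            or self._bytes > self.max_bytes
            or now - self._heap[0][0] >= self.max_age
        ):
            timestamp, _, payload = heapq.heappop(self._heap)
            self._bytes -= len(payload)
            released.append((timestamp, payload))
        return released
```

Because the heap is keyed on the event timestamp, events that arrive out of order are still released in time order.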
         Time Differential Service
   This service is essential to reduce jitter in large
    distributed environments.
    • Networks introduce unpredictable delays that increase jitter.
   Service takes events released by buffering service, and
    ensures that it preserves time spacing between events.
   TDS can provide time spacing resolution of up to 1
    millisecond between events.
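The idea of preserving time spacing can be shown with a small calculation: each event is released at a time whose distance from the first release equals the distance between the original timestamps. The function names and the millisecond values used in the usage note are made up for illustration and are not measurements from the system.

```python
def release_schedule(timestamps, start):
    """Return release times that reproduce the original spacing between events.

    timestamps: event generation times in milliseconds, already time-ordered.
    start: the time at which the first event is released.
    """
    if not timestamps:
        return []
    first = timestamps[0]
    return [start + (t - first) for t in timestamps]


def jitter(times, interval):
    """Deviation of each inter-event gap from the nominal interval."""
    return [(later - earlier) - interval for earlier, later in zip(times, times[1:])]
```

For events generated every 30 ms at times 0, 30, 60, 90 that arrive at 100, 141, 155, 197, the arrival gaps deviate from 30 ms by 11, -16 and 12 ms, while a schedule starting at 200 releases them at 200, 230, 260, 290 with zero deviation.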
        Jitter values comparing the Input to the Buffering
                 Service and the Output of the TDS
[Figure: Trans-Atlantic settings. Series: Buffering Input, TDS Output. X axis: Sample Number (0 to 1000). Y axis: -2 to 12.]
             Jitter values from the output of the TDS
[Figure: Trans-Atlantic settings. Series: TDS Output. X axis: Sample Number (0 to 1000). Y axis: -0.02 to 0.14.]
     NaradaBrokering NTP Service
   NaradaBrokering includes an implementation of the Network
    Time Protocol (NTP)
   All entities within the system use NTP to communicate with
    atomic time servers maintained by organizations like NIST and
    USNO to compute offsets
     • Offset is the computed difference between global time and the
       local time.
     • The offset is computed based on the time returned from
       multiple atomic time servers.
     • The NTP algorithm weighs results from individual time
       clocks based on the distance of the atomic server from the
       entity.
   This ensures that all entities are within 1 ms of each other.
   The timestamps account for clock drifts on machines
     • Time returned corrects software clocks which can slow down
       with increased computing load on the machine.
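The offset computation can be illustrated with the standard NTP on-wire formulas, where t0 is the client send time, t1 the server receive time, t2 the server transmit time and t3 the client receive time. The combination step below is a deliberate simplification: weighting each server by the inverse of its round-trip delay is an assumption used here as a stand-in for "distance", and is not the selection and clustering procedure a full NTP implementation uses.

```python
def offset_and_delay(t0, t1, t2, t3):
    """Standard NTP clock offset and round-trip delay for one exchange."""
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def combined_offset(samples):
    """Combine exchanges with several servers into one offset estimate.

    samples: iterable of (t0, t1, t2, t3) tuples, one per server.
    Assumption: closer servers (smaller round-trip delay) get larger weights.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for t0, t1, t2, t3 in samples:
        offset, delay = offset_and_delay(t0, t1, t2, t3)
        weight = 1.0 / max(delay, 1e-6)  # guard against a zero delay
        weighted_sum += weight * offset
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no samples supplied")
    return weighted_sum / total_weight
```

Adding the resulting offset to the local clock reading gives the entity's estimate of global time.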
       NTP Offset variations over a period of 4 hours
              Indiana Linux machine with
             a native NTP daemon process
[Figure: Offset Variation. X axis: Elapsed time in 100s of seconds (0 to 160). Y axis: -1 to 1.]
        Broker Discovery Service
   Locates the nearest available broker that a client can
    connect to.
    • Incorporates specialized nodes – broker discovery nodes – to
      maintain broker info.
    • Depending on load or security issues, brokers may decide to
      respond/ignore discovery requests.
    • If available the scheme can exploit IP multicast for discovery.
    • Nearest broker determined by ping times, loss rates and
      available bandwidth.
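One way to picture how ping times, loss rates and available bandwidth could be combined into a single "nearest broker" decision is a scoring function. Everything in this sketch is hypothetical: the BrokerMeasurement type, the score formula and its constants are invented for illustration, and the slides do not state how NaradaBrokering weighs the three metrics.

```python
from dataclasses import dataclass


@dataclass
class BrokerMeasurement:
    name: str
    ping_ms: float         # measured round-trip time
    loss_rate: float       # fraction of probes lost, between 0 and 1
    bandwidth_mbps: float  # estimated available bandwidth


def score(m: BrokerMeasurement) -> float:
    """Lower is better. The constants 10.0 and 100.0 are arbitrary assumptions."""
    latency_term = m.ping_ms * (1.0 + 10.0 * m.loss_rate)
    bandwidth_term = 100.0 / max(m.bandwidth_mbps, 0.001)
    return latency_term + bandwidth_term


def rank_brokers(measurements):
    """Return broker names ordered from most to least attractive."""
    return [m.name for m in sorted(measurements, key=score)]
```

A client would connect to the first broker in the ranking that actually responds, since brokers may choose to ignore discovery requests.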
Broker Discovery: Brokers at Indianapolis, NCSA, UMN, FSU, IU & San Diego
Supercomputing Center. Broker at Indy selects IU, NCSA and UMN for pings.
Broker Discovery: Brokers at Indianapolis, NCSA, University of Minnesota, FSU
& San Diego Supercomputing Center. Cardiff selects Indy, NCSA and UMN for pings
         Topic Discovery Service
   Allows publishers and subscribers to advertise topics.
   Creator of topic possesses credentials to indicate
    ownership of the topic.
   Discovery of topics takes into account credentials of
    client trying to discover topic.
    • Topic owner may restrict discovery to a limited authorized set
      of clients.
   Discovery requests can be made using simple strings or
    regular expression queries.
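A minimal sketch of credential-aware topic discovery with string and regular expression queries. The Topic type and discover function are hypothetical, and credentials are modelled as plain client identifiers, which is a strong simplification of whatever credential scheme the real service uses.

```python
import re
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    owner: str
    # Assumption: an empty set means the owner has not restricted discovery.
    allowed: frozenset = frozenset()


def discover(topics, client, query, use_regex=False):
    """Return names of topics the client may discover that match the query."""
    matches = []
    for topic in topics:
        restricted = bool(topic.allowed)
        if restricted and client != topic.owner and client not in topic.allowed:
            continue  # the owner limited discovery to an authorized set
        if use_regex:
            hit = re.fullmatch(query, topic.name) is not None
        else:
            hit = query == topic.name
        if hit:
            matches.append(topic.name)
    return sorted(matches)
```

Filtering on credentials before matching means an unauthorized client cannot learn that a restricted topic exists.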
    Security Service
[Figure: entities (publishers or subscribers) connect to broker nodes in the NaradaBrokering broker cloud and to the Key Management Center (KMC) over SSL encrypted communications. The numbered arrows 1 to 8 in the diagram correspond to the steps listed here.]
   1  Request permission to publish
   2  Respond back with topic key if authorized to publish
   3  Encrypt message with topic key
      Compute Message Digest (MD)
      Sign MD and message ID
      Publish Message
   4  Verify Signature & Permissions
      Check integrity by verifying MD
      Check ID for replay attacks
   5  Request permission to subscribe
   6  Respond back with topic key if authorized to subscribe
   7  Create subscription request
      Compute Message Digest
      Sign MD and message ID
      Issue Subscription request Message
   8  Verify Signature
      Verify Permissions for Subscribing
      Check integrity by verifying MD
      Check ID for replay attacks
Based on Message Level Security
    Messages organized into topics
    Each topic has a separate key; Topics can be organized into sessions
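The publish and verify steps of the security flow (compute a digest, sign the digest and message ID, then verify signature, integrity and replay protection at the broker) can be sketched with Python's standard library. This is not the NaradaBrokering security implementation: an HMAC with a shared key stands in for the signature, SHA-256 is an arbitrary choice of digest, and encryption with the topic key and the permission checks are omitted. The names publish and BrokerVerifier are hypothetical.

```python
import hashlib
import hmac


def publish(message_id: str, body: bytes, signing_key: bytes) -> dict:
    """Build a message carrying a digest and a signature over digest + ID."""
    digest = hashlib.sha256(body).digest()
    # Assumption: HMAC stands in for a real digital signature.
    signature = hmac.new(signing_key, digest + message_id.encode(), hashlib.sha256).digest()
    return {"id": message_id, "body": body, "digest": digest, "signature": signature}


class BrokerVerifier:
    """Hypothetical broker-side check: signature, integrity, then replay."""

    def __init__(self, signing_key: bytes):
        self.signing_key = signing_key
        self.seen_ids = set()  # message IDs already accepted

    def accept(self, message: dict) -> bool:
        expected = hmac.new(self.signing_key,
                            message["digest"] + message["id"].encode(),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, message["signature"]):
            return False  # signature does not verify
        if hashlib.sha256(message["body"]).digest() != message["digest"]:
            return False  # body was altered after the digest was computed
        if message["id"] in self.seen_ids:
            return False  # same ID seen before: treat as a replay attack
        self.seen_ids.add(message["id"])
        return True
```

Signing the message ID together with the digest is what lets the broker reject a captured message that is re-sent unchanged.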
         NaradaBrokering
Support for SOAP and Web Services
                  SOAP Support I
   The broker can receive SOAP messages (over HTTP)
    from any entity.
    • This removes any client dependence in client-broker
      interaction
   The broker can function as an intermediary
    performing multiple roles, which could be just routing
    but could also involve mapping using filters
   There can be multiple filter-pipelines, each comprising
    multiple filters, available at the broker node.
    • Some of these would be system filter-pipelines configured
      statically.
    • Filter-Pipelines can also be configured by users, dynamically,
      at run-time.
                   SOAP Support II
   Multiple roles could be associated with
    • Different servlets hosted by a broker.
    • A given servlet hosted by a broker.
   Scheme will allow filters to be registered for individual
    roles.
    • A filter could be part of multiple roles.
   There is a dedicated filter pipeline per role.
   This implies that a NaradaBroker can be used as a Web
    Service container although full container support is not
    yet available
   Filters are used internally by NB to implement
    performance monitoring
    The FilterPipeline-Filter model I
   The filter and filter-chain facilitate many of the
    interactions that are missing in JAX-RPC handlers.
    • Filters are the NaradaBrokering approach to the handlers used
      in Web Service containers
   Filters can inject messages at any time
    • These messages can be sent either to the application or over
      the network.
    • No limit on the number of messages that can be triggered
      because of a single message from application.
   Messages can be injected into a Filter Pipeline from
    either direction.
   Filters can generate responses automatically. No need
    to route to application.
    The FilterPipeline-Filter model II
   Applications have access to individual filters and
    the filter-pipeline at all times. They can explicitly direct which
    filters need to be skipped or added.
   Filters have access to position within Filter
    Pipeline, and can specify message injection at a
    specific location.
   Dynamic reconfiguration possible for Filter
    Chain.
   Allow different networking substrates to be
    registered. This can be dynamically changed.
     • Network substrate is last filter and is
       responsible ONLY for routing SOAP message.
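The filter behaviours listed on these two slides (injecting messages at a chosen position, generating responses without involving the application, and skipping filters on request) can be sketched in a few lines. All class names here are hypothetical, messages are plain strings rather than SOAP envelopes, and the sent list merely stands in for the network substrate at the end of the pipeline.

```python
class Filter:
    """Base filter: passes the message through unchanged."""

    def process(self, message, pipeline, position):
        return message


class UppercaseFilter(Filter):
    def process(self, message, pipeline, position):
        return message.upper()


class PingResponder(Filter):
    """Answers "ping" itself, so the request never reaches later filters."""

    def process(self, message, pipeline, position):
        if message == "ping":
            # Inject the response just after this filter's own position.
            pipeline.inject("pong", position=position + 1)
            return None  # returning None stops the original message
        return message


class FilterPipeline:
    def __init__(self, filters):
        self.filters = list(filters)
        self.sent = []  # stands in for the network substrate (last stage)

    def inject(self, message, position=0, skip=()):
        """Run a message through the filters from the given position onward."""
        for index in range(position, len(self.filters)):
            current = self.filters[index]
            if any(current is skipped for skipped in skip):
                continue  # the application asked to skip this filter
            message = current.process(message, self, index)
            if message is None:
                return
        self.sent.append(message)
```

Passing the pipeline and the position into each filter is what allows a filter to know where it sits and to inject at a specific location.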
          Web Services Support I
   Currently we have incorporated support for the
    following Web Service specifications
    • WS-Eventing (WSE): This is a publish/subscribe based
      notification framework from Microsoft and IBM.
    • WS-ReliableMessaging (WSRM): This is a protocol for
      ensuring the guaranteed delivery of SOAP messages between
      2 Web Service endpoints. This specification is from IBM and
      Microsoft.
    • WS-Reliability (WSR): This is a competing specification from
      Oracle and Sun in the area of reliable messaging between
      Web Services.
   These handlers are available for use in Axis1.2 or,
    exploiting NB SOAP Intermediary support, without a
    container
    • Axis1.2 version can be used inside container or as a Proxy
        Web Services Support - II
   We are also working on implementing support for the
    WS-Notification (WSN) suite of specifications that is
    part of the Web Services Resource Framework
    (WSRF).
   WS-Notification explicitly adds brokers to Eventing
   Note that almost all these specifications leverage the
    WS-Addressing (WSA) specification.
    • We have incorporated support for all the rules associated
      with WSA.
    NaradaBrokering in Web Services
   a) WSRM, WSR, WSN and WSE support for Axis1.2, which is
    available as standalone handlers without need for any
    NaradaBrokers
   b) The support described in a) implemented as a
    separate proxy and inside containers
   c) NaradaBrokers used as SOAP Intermediaries
   d) NaradaBrokers can support filters in SOAP
    intermediaries forming limited light-weight containers
   e) NaradaBrokers can be Brokers defined in WSN
    Specification
   f) One can use NaradaBrokers in non-brokered
    publish-subscribe such as WS-Eventing to make it
    scalable
      Implementation of WS-Reliable Messaging (WSRM) I

Operation                           Mean     Std Dev  Std Error  Outlier  Min    Max    Mem (Bytes)
Create an XMLBeans based            121.29   25.77    2.65       6        110    333    2192
  Envelope Document
Create an Axis based SOAPMessage    85.76    79.36    8.22       7        34     540    1824
Convert an EnvelopeDocument to a    3503.8   758.48   80.85      12       2632   5406   57152
  SOAPMessage
Convert SOAPMessage to              730.08   392.35   41.58      11       327    1911   34424
  EnvelopeDocument
Create a WS-Addressing EPR          84.61    25.61    2.67       8        72     301    2072
  (Contains just a URL address)
Create a WS-Addressing EPR          133.13   35.64    3.71       8        114    354    2648
  (Contains WSA ReferenceProperties)
Create an Envelope targeted to a    157.98   12.19    1.27       8        140    219    7184
  specific WSA EPR
Create an Envelope targeted to a    263.20   35.73    3.74       9        240    471    13880
  specific WSA EPR with most WSA
  message information headers
    Implementation of WS-Reliable Messaging (WSRM) II

Operation                           Mean     Std Dev  Std Error  Outlier  Min    Max    Mem (Bytes)
Parse an EnvelopeDocument to        711.74   231.61   23.76      5        555    1317   61024
  retrieve WSA Headers
Create a Wsrm Fault                 413.80   239.17   25.07      9        271    1212   18096
Create a Wsrm SequenceRequest       268.95   37.93    3.97       9        212    374    16392
Create a Wsrm SequenceResponse      234.97   17.40    1.81       8        212    324    18160
Create a Wsrm SequenceDocument      43.812   2.99     0.30       4        42     53     2424
Add a WsrmSequenceDocument to an    13.01    0.57     0.05       4        11     15     464
  existing envelope (Contains
  sequence identifier and message
  number)
Create a WSRM                       461.17   172.40   18.27      11       301    1043   20624
  SequenceAcknowledgement based on
  a set of message numbers
Create a WSRM TerminateSequence     20.95    1.30     0.13       4        20     25     2072
Transport Layer in
NaradaBrokering
                 Transport Layer
   Support for multiple network protocols such as TCP,
    UDP, Multicast, SSL, RTP, HTTP and Parallel TCP.
    • Support for both blocking and non-blocking IO in the TCP
      support.
    • The UDP support manages payloads greater than the 64K
      datagram limit. It also incorporates a pinging mechanism to
      detect connection losses in a connectionless setting.
   Tunnel through firewalls/proxies
    • Microsoft's ISA, Checkpoint, Apache
         Mean transit delay for message samples in
       NaradaBrokering: Different communication hops
[Figure: series hop-2, hop-3, hop-5, hop-7. X axis: Message Payload Size (Bytes), 100 to 1000. Y axis: Transit Delay (Milliseconds), 0 to 9. Test environment: Pentium-3, 1GHz, 256 MB RAM, 100 Mbps LAN, JRE 1.3 Linux.]
         Standard Deviation for message samples in NaradaBrokering
              Different communication hops - Internal Machines
[Figure: series hop-2, hop-3, hop-5, hop-7. X axis: Message Payload Size (Bytes), 1000 to 5000. Y axis: 0 to 0.8.]
Performance of NaradaBrokering
in collaborative settings
          Average Latencies and Jitters for Audio Conferencing Clients.
                         Single Broker, Single Meeting
[Figure: series Average Latency, Average Jitter. X axis: Number of Users (0 to 1600). Y axis: 0.1 to 100.]
       Average Latencies and Jitters for Video Conferencing Clients.
                      Single Broker, Single Meeting
[Figure: series Average Latency, Average Jitter. X axis: Number of Users (0 to 900). Y axis: 0.1 to 1000.]
  Average Latencies for Video Conferencing Clients at different Brokers.
                        4 Brokers, Single Meeting
[Figure: series Latency at B1, B2, B3, B4. X axis: Number of Users per broker (200 to 900). Y axis: 10 to 1000.]
 Average Latencies for Video Conferencing Clients at different Brokers.
         4 Brokers, Multiple Meetings (20 Users per Meeting)
[Figure: series Latency at B1, B2, B3, B4. X axis: Number of Meetings (20 to 100). Y axis: 1 to 100.]
 Average Latencies for Video Conferencing Clients at different locations.
             Sites in Indiana, Florida, New York and Cardiff
[Figure: series Indiana, New York, Florida, Cardiff UK. X axis: Number of Users per Site (0 to 160). Y axis: 1 to 100.]
    “GridMPI” v. NaradaBrokering
   In parallel computing, MPI and PVM provided “all the features
    one needed” for inter-node messaging
   NB aims to play same role for the Grid but the requirements and
    constraints are very different
    • NB is not MPI ported to a Grid/Globus environment
   MPI typically aims at microsecond latency, but for the Grid the time
    scales are different
    • 100 millisecond quite normal network latency
    • 30 millisecond typical packet time sensitivity (this is one audio or video
      frame) but even here can buffer 10-100 frames on client (conferencing to
      streaming)
    • 1 millisecond is time for a Java server to “think”
   Jitter in latency (transit time through broker) due to routing,
    processing (in NB) or packet loss recovery is an important property
   Grids need and can use software supported message functions and
    trade-offs between hardware and software routing different from
    parallel computing
    HPSearch Management Engine
   HPSearch is an engine for orchestrating distributed
    Web Service interactions
    • It uses an event system and supports both file transfers and
      data streams.
   HPSearch flows can be scripted with JavaScript
    • HPSearch engine binds the flow to a particular set of services
      and executes the script.
   HPSearch can access and set NaradaBrokering features
    (create topics, display performance data)
   ProxyWebService: a wrapper class that adds
    notification and streaming support to a remote Web
    Service.
   HPSearch is a streaming sensitive workflow engine
WMS GIS service and a data Layer
[Figure: HPSearch workflow connecting a WMS GIS service to a data layer over a NaradaBroker network.
 Components: WMS; WS Context (Tambora); GPS Database (Gridfarm001); Data Filter (Danube); Pattern Informatics (Danube): Accumulate Data, Run PI Code, Create Graph, Convert RAW -> GML; GML (Danube); HPSearch (TRex); HPSearch (Danube); Workflow (BPEL) Fragment.
 Annotations: Data can be stored and retrieved from the 3rd party repository (Context Service). WMS submits script execution request (URI of script, parameters). HPSearch hosts an AXIS service for remote deployment of scripts. NaradaBroker network: used by HPSearch engines as well as for data transfer. HPSearch Engines communicate using NB Messaging infrastructure.
 Legend: Actual Data flow; Virtual Data flow; HPSearch controls the Web services; Final Output pulled by the WMS.]
    SensorML and NaradaBrokering
   OGC defined SensorML, a set of specifications
    indicating how to integrate Sensors with its GIS
    Services
   We are using Southern California SCIGN GPS data to
    prototype this
[Figure: Sensor Source -> RYO Binary -> Filter -> Text -> Filter -> GML, with the versions available as NaradaBrokering Topics. Caption: You can access whichever version you want!]
           NaradaBrokering Futures
   Support for replicated storage within the system.
    • In a system with N replicas the scheme can sustain the loss of N-1 replicas.
   Clarification and expansion of NB Broker to act as a WS
    container
   Integration with Axis 2.0 as Message Oriented Middleware
    infrastructure
   Support for High Performance transport and representation for
    Web Services
    • Needs Context catalog under development
   Performance based routing
    • The broker network will dynamically respond to changes in the network
      based on metrics gathered at individual broker nodes.
   Replicated publishers for fault tolerance
   Pure client P2P implementation (originally we linked to JXTA)
   Security Enhancements for fine-grain topic authorization,
    multicast keys, Broker attacks

								