

Open DMIX

High Performance Web Services for
Data Mining, Data Integration, and
Data Exploration

Robert L Grossman &
Steve Eick, Yunhong Gu,
David Hanley & Xinwei Hong
National Center for Data Mining

Goal: Integrate & Explore Distributed Data

Middleware   # Sites   Critical Path   Avg. Arch.   Avg. Data Set   Avg. Select
Data Grid    1000      cycles          PB           TB              GB
Data Web     1M        access &        TB           GB              MB

High Performance Web Services

                                    Bandwidth
Discovery          UDDI
Description        WSDL
Packaging          XML              DSTP
Transport          SOAP/HTTP        SOAP+
Network Protocol   TCP              SABUL/UDT

* Open DMIX is an open source collection of web services
  for data mining, data integration, and data exploration.

SABUL/UDT Protocol Overview

• Uses both Rate Control (RC) and (window-based) Flow Control (FC)
  – Constant RC interval to remove RTT bias
  – Employs bandwidth estimation
• Selective acknowledgement (ACK)
  – Reduces control traffic & results in faster recovery
• Uses packet delay as well as packet loss to indicate congestion
• Slow start
  – Controlled by FC
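
The interplay of rate control and flow control listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SABUL/UDT implementation: the class name Sender, the helper loss_ranges, and every constant (the 0.01 s interval, the 0.875 back-off, the 0.125 gain) are hypothetical choices made for the example. The sketch shows a rate update that fires on a fixed timer rather than once per round trip, treats either loss or rising delay as congestion, caps sending by a window that grows during slow start, and compresses missing sequence numbers into ranges of the kind a selective acknowledgement could report.

```python
from __future__ import annotations

from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the slides.
RC_INTERVAL_S = 0.01   # constant rate-control interval, independent of RTT
DECREASE = 0.875       # multiplicative back-off applied on congestion
GAIN = 0.125           # fraction of spare estimated bandwidth claimed per tick


@dataclass
class Sender:
    rate_pps: float = 1000.0   # rate control: packets per second
    window: int = 16           # flow control: max unacknowledged packets
    in_flight: int = 0         # packets sent but not yet acknowledged
    slow_start: bool = True

    def on_rc_tick(self, loss: bool, delay_rising: bool, est_bw_pps: float) -> None:
        """Rate control, run once per RC_INTERVAL_S rather than once per RTT."""
        if loss or delay_rising:
            # Packet loss and growing delay are both treated as congestion.
            self.rate_pps *= DECREASE
            self.slow_start = False
        else:
            # Move part of the way toward the estimated available bandwidth.
            spare = max(est_bw_pps - self.rate_pps, 0.0)
            self.rate_pps += GAIN * spare

    def on_ack(self, acked: int) -> None:
        """Flow control: free window space; grow the window during slow start."""
        self.in_flight = max(self.in_flight - acked, 0)
        if self.slow_start:
            self.window += acked

    def budget(self) -> int:
        """Packets allowed in the next interval: the tighter of RC and FC."""
        by_rate = int(self.rate_pps * RC_INTERVAL_S)
        by_window = self.window - self.in_flight
        return max(min(by_rate, by_window), 0)


def loss_ranges(received: set[int], highest: int) -> list[tuple[int, int]]:
    """Compress missing sequence numbers in [0, highest] into (first, last) ranges."""
    ranges: list[tuple[int, int]] = []
    start = None
    for seq in range(highest + 1):
        if seq not in received:
            if start is None:
                start = seq          # a new gap begins here
        elif start is not None:
            ranges.append((start, seq - 1))
            start = None
    if start is not None:
        ranges.append((start, highest))
    return ranges
```

For example, loss_ranges({0, 1, 2, 5, 6, 9}, 9) yields [(3, 4), (7, 8)]: two ranges stand in for four individual loss reports, which is the sense in which selective acknowledgement reduces control traffic.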

UDT: Fast, Fair & Friendly, Easy & Efficient

  – Friendly to TCP flows
  – Fair to other high-performance flows

New Trans-Atlantic Milestone for FFFEE Data Transport

For More Information

Robert Grossman
grossman at uic dot edu

Please join our open source project developing
network protocols, data protocols, and web
services on SourceForge.
