Docstoc

Algorithm

Document Sample
Algorithm Powered By Docstoc
					Adaptive Ordering of
Pipelined Stream Filters

    S. Babu, R. Motwani, K. Munagala, I.
    Nishizawa, and J. Widom
    In Proc. of SIGMOD 2004, June 2004
Outline

   Introduction
   Ordering of filters
   Adaptive ordering algorithm
   Preliminaries
   Algorithms
   Experimental evaluation
   Conclusion
Introduction

   Why “Stream”?
       Many modern applications deal with data that
           Is updated continuously
           Needs to be processed in real-time

   Why “Adaptive”?
       Arrival characteristics of streams may vary significantly
        over time

   Characteristic of filters
       Direct
       Commutative
Ordering of filters

   Challenges

       Selectivities across filters may be correlated.

       An exhaustive algorithm becomes infeasible.
           Greedy approach

       Arrival characteristics of streams may vary
        significantly over time
           Adaptive approach
Adaptive ordering algorithm

   Three-way tradeoff

       Run-time overhead
           Monitor the changes of statistics
           Determine when to change the current order

       Convergence properties
           If data and filter characteristics stabilize, adaptive algorithm
            should converge to a solution with desirable properties

       Speed of adaptivity
           The time the convergence takes when data and filter
            characteristics change
Preliminaries
       symbol            meaning
                           query
                       input stream
                           filters
                       stream tuple
                         ordering
                conditional drop probability
                Processing time per tuple



                         total cost
Algorithms

   A-GREEDY algorithm
   The SWEEP algorithm
   The INDEPENDENT algorithm
   The LOCALSWAP algorithm
(Static) Greedy algorithm

   Greedy approach





A-GREEDY Profiler
A-GREEDY Reoptimizer




   violation
A-GREEDY Algorithm (1/2)
A-GREEDY Algorithm (2/2)
A-GREEDY properties
   Convergence
     Good

    




   Run-time overhead
     Profile-tuple creation
     Profile-window maintenance

     Matrix-view update

     Detection and correction of GI violations



   Adaptivity
     rapidly
Algorithms

   A-GREEDY algorithm
   The SWEEP algorithm
   The INDEPENDENT algorithm
   The LOCALSWAP algorithm
The SWEEP Algorithm
   Rotating   over
SWEEP properties

   Convergence
       Good


   Run-time overhead
       A-Greedy:
       Sweep:


   Adaptivity
       Violations may remain undetected for a relatively long time
       Up to        stages
Algorithms

   A-GREEDY algorithm
   The SWEEP algorithm
   The INDEPENDENT algorithm
   The LOCALSWAP algorithm
The INDEPENDENT Algorithm
   Assume the filters are independent
       frequently in database literature
       seldom true in practice





    
INDEPENDENT properties

   Convergence
       independent
           converge to the optimal ordering
       dependent
           can be      times worse than the GI ordering


   Run-time overhead
       lower than A-Greedy

   Adaptivity
       rapidly
Algorithms

   A-GREEDY algorithm
   The SWEEP algorithm
   The INDEPENDENT algorithm
   The LOCALSWAP algorithm
The LOCALSWAPS Algorithm
   Detect
       a swap between adjacent filters in   would
        improve performance





    
LOCALSWAP properties

   Convergence
       dependent on the way characteristics change
       in some case,     times higher cost than GI ordering


   Run-time overhead
       lower than A-Greedy


   Adaptivity
       may take more time to converge
       may get stuck in a local optima
Experimental evaluation

   Three parts
       Convergence experiments
       Run-time overhead experiments
       Adaptivity experiments

Convergence experiments

   Factors
       Number of filters
       Filter selectivities
       Cost of filters
       Correlation among filters

   Comparison
       Optimal
       A-Greedy
       Independent
Convergence experiments (number)
   Factors
       Number of filters
Convergence experiments (selectivity)
   Factors
       Filter selectivities
Convergence experiments (correlation)
   Factors
       Correlation among filters
Run-time overhead experiments

   Factors
       Number of filters
       Filter selectivities
       Cost of filters

   Comparison
       Optimal
       A-Greedy
       Sweep
       LocalSwaps
       Independent
Run-time overhead experiments (time)

   Factors
       Spending time



                                ≥ 98%
Run-time overhead experiments (number)
   Factors
       Number of filters
Run-time overhead experiments (cost)

   Factors
       Cost of filters
Adaptivity experiments

   Factors
       Rate of change
       Cost of filters


   Comparison
       A-Greedy
       Sweep
       LocalSwaps
       Independent
Adaptivity experiments (speed)
Adaptivity experiments (rate)
   Factors
       Rate of change
Adaptivity experiments (cost)
   Tradeoff
       Run-time overhead ↔ adaptivity
Conclusion

   Different points along the tradeoff spectrum
       Fast Adaptivity
           A-Greedy
       Low run-time overhead
           Slow adaptivity, good convergence
               Sweep
           Independent filters
               Independent
           Swap between adjacent filters, unpredicted convergence
               LocalSwaps
Appendix
stable
correlation factor Γ
       filters are divided into             groups
       Each contains Γ filters

   Filters in
     different groups

           independent
       the same groups
           80% the same result of input tuples


       most correlated

       completely independent
Example 1
Example 2




   For                       ,
   Total cost = 20

Input     1   2       1   4       7   2   5   4   total
Cost      4   3       4   2       1   3   1   2    20
Example 6

      1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …


                              INDEPENDENT
                              permutation of         +
                              cost:
 1          1      50
 2          2      51
 …          …      …          A-GREEDY
 49         49     99
                              one of           +   +permutation of others
                   100
                              cost:
Example 7

    1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …


                           Before,

                           After,
                           A-GREEDY
x                     x
                                         + permutation of others
                      x
                      x    cost:                /100 ?
    x                      LOCALSWAP
            x          x

                           cost:              /100 ?

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:21
posted:9/28/2012
language:Unknown
pages:43