Your Federal Quarterly Tax Payments are due April 15th Get Help Now >>

An Evolutionary Algorithm for Query Optimization by yaofenji

VIEWS: 29 PAGES: 20

									An Evolutionary Algorithm for
    Query Optimization
                  in Database
                       Kayvan Asghari,
                    Ali Safari Mamaghani
                   Mohammad Reza Meybodi



                    International Joint Conferences
   on Computer, Information, and Systems Sciences, and Engineering
                           CISSE 2007

                                                                     1
                          Introduction
Optimizing the database queries is one of hard research problems
 If the number of relations is more than five or six relations,
  exhaustive search techniques will bear high cost regarding the
  memory and time.
 Examples of queries with large number of relations can be found
  in:
       Deductive database management systems
       Expert systems
       Engineering database management systems (CAD/CAM)
       Decision Support Systems
       Data mining
       Scientific database management systems
   Whatever the reason, database management systems need the use
    of query optimizing techniques with low cost in order to
    counteract with such complicated queries.

                                                                2
Searching Algorithms for Suitable Join Order
   Exact algorithms that search all of state space and sometimes they reduce this
    space by heuristic methods:
        Dynamic programming method which at first introduced by Selinger et al.
         for optimizing the join ordering in System-R.
        Minimum selectivity algorithm

        KBZ algorithm

        AB algorithm

    But Exhaustive search techniques like dynamic programming is suitable for
    queries with a few relation and we have to use random and evolutionary
    methods
   Random algorithms have been introduced for showing the inability of exact
    algorithms versus large queries.
        Iterative improvement

        Simulated annealing

        Two-phase optimization

        Toured simulated annealing

        Random sampling

        Evolutionary algorithms

                                                                               3
               Evolutionary algorithms

   Genetic algorithm has been done by Bennet et al.
   Some other works have been done by Steinbrunn et al. that
    they have used different coding methods and genetic operators
   genetic programming which is introduced by Stillger et al.
   CGO genetic optimizer has also been introduced by Mulero et
    al.

     In our paper a hybrid evolutionary algorithm has been
      proposed that uses two methods of genetic algorithm
     and learning automata synchronically for searching the
                    states space of problem.

                                                                4
               The Definition of Problem
   DBMS selects the best query execution plan (qep) from among
    execution plans, in a way that query execution bears the low
    cost, especially the cost of input/output operations and time of
    processing.
   If we show the all of allocated execution plans for responding to
    the query with S set, each member qep that belongs to S set has
    cost(qep) The purpose of each optimization algorithm is finding
    a member like qep0 which belongs to S set, so that:
                               cost(qep )  minqep S cost(qep)
                                        0            
   Processing and Optimizing the join operators in query are
    difficult
   Join operator considers two relations as input and combines
    their tuples one by one on the basis of a definite criterion and
    produce a new relation as output.
   The join operator has associative and commutative features thus
    the number of execution plans for responding to a query
    increases exponentially when the number of joins among
    relations increases.                                             5
                   Learning automata
   Learning automata approach for learning involves
    determination of an optimal action from a set of allowable
    actions.
   It selects an action from its finite set of actions.




                                                                 6
    Proposed Hybrid Algorithm for Solving
           Join Ordering Problem

 Combining genetic algorithms and learning automata
 Generation, penalty and reward are some of features
  of hybrid algorithm.
 In proposed algorithm, unlike classical genetic
  algorithm, binary coding or natural permutation
  representations aren't used for chromosomes.
 Each chromosome is represented by learning
  automata of object migration kind
 Each of genes in chromosome is attributed to one of
  automata actions, and is placed in a definite depth of
  that action.
                                                           7
     Learning automata of object migration
                    kind
   In these automata, α  {α1, α2 , ... , αk } is set of allowed
    actions of automata.
     {1 ,  2 , ... ,  KN } Is set of states and N is memory depth for
    automata.
   Now consider the following query:
                (A∞C) and (B∞C) and (C∞D) and (D∞E)
   An example of a query graph :
          B          P2
                              P3
                              E              P4
                P1        C             D             E
          A
                                                                          8
Learning automata of object migration kind



Display of joins
permutation (p3, p2, p1, p4)
by learning automata based
on Tsetlin automata
connections




                                         9
                      Fitness function
   The purpose of searching the optimized order of query joins is
    finding permutation of join operators, so that total cost of
    query execution is minimized in this permutation.
   One important point in computing fitness function is the
    number of references to the disc so we can define the fitness
    function of F for an execution plan like qep as follows:

                                                   1
                         F (qep) 
                                     The number of referencesto disk


                           Ctotal  C ( R1 )  C ( R2 )  C ( R1R2 )

                                                                        10
    Operators of Hybrid genetic algorithm
 Selection operator: The selection used for this
  algorithm is roulette wheel.
 Crossover Operator: In this operator, two parent
  chromosomes are selected and two genes i and j are
  selected randomly in one of the two parent
  chromosomes.
 Mutation operator: For executing this operator, we can
  use different method which are suitable for work with
  permutations. For example in swap mutation, two
  actions (genes) from one automata (chromosome) are
  selected randomly and replaced with each other.
                                                      11
Crossover Operator




                     12
Mutation operator




                    13
      Penalty and Reward Operator

 In each chromosome, evaluating the fitness rate of a
  gene which is selected randomly, penalty or reward is
  given to that gene.
 As a result of giving penalty or reward, the depth of
  gene changes.
 For example, in automata like Tsetlin connections, if
  p2 join be in states set {6,7,8,9,10}, and the cost for
  p2 join in the second action will be less than average
  join costs of chromosome, reward will be given to
  this join and it's Certainty will be increased and
  moves toward the internal states of that action.


                                                        14
An Example of Reward Operator




                                15
The manner of giving penalty to the join
 that is in a state except boundary state




                                        16
The manner of giving penalty to the
  join that is in a boundary state




                                      17
               Experiment Results
                                                                            Automata      GA       GALA
                                                     70000
                                                     60000




                           Cost of Query Execution
Comparison of averaged                               50000
   cost obtained from                                40000
  hybrid algorism and                                30000
learning automata based                              20000
 on Tsetline and genetic                             10000

        algorithm                                       0
                                                             10   20   30     40    50    60    70     80   90   100
                                                                             Number of Join Operator
                                                                            Automata       GA        GALA

    Comparison of                                    70000
                           Cost of Query Execution



averaged cost obtained                               60000
                                                     50000
 from hybrid algorithm
                                                     40000
and learning automata                                30000
 based on Krinsky and                                20000
   genetic algorithm                                 10000
                                                         0
                                                             10   20   30     40    50    60    70     80   90 100
                                                                             Number of Join Operator           18
               Experiment Results
                                                                                   Automata      GA       GALA
                                                      70000




                            Cost of Query Execution
                                                      60000
 Comparison of averaged                               50000
cost obtained from hybrid                             40000
  algorithm and learning                              30000
automata based on Krylov                              20000
  and genetic algorithm                               10000
                                                         0
                                                              10        20    30    40     50    60     70     80   90   100
                                                                                    Number of Join Operator
                                                                                    Automata      GA       GALA
                                                      70000
 Comparison of averaged
                            Cost of Query Execution



                                                      60000
cost obtained from hybrid                             50000
  algorithm and learning                              40000
                                                      30000
automata based on Oomen
                                                      20000
  and genetic algorithm                               10000
                                                          0
                                                                   10    20    30     40   50     60    70     80   90 100
                                                                                     Number of Join Operator
                                                                                                                       19
              Experiment Results
                                                                                               GALA - Krinisky      GALA - Krylov
                                                                                               GALA - Oomen         GALA - Tsetlin
                                                                 70000




                            Cost of Query Execution
Comparison of averaged                                           60000
                                                                 50000
  cost obtained from
                                                                 40000
hybrid algorithms based                                          30000
 on different Automata                                           20000
                                                                 10000
                                                                     0
                                                                                     10   20    30   40     50    60       70   80   90   100
                                                                                                     Number of Join Operator
                                                                                                GALA - Krinisky        GALA - Krylov
                                                                                                GALA - Oomen           GALA - Tsetlin




 Comparison of averaged
                                              C ost of Q ue ry E xe cution



                                                                             43000

cost obtained from hybrid
                                                                             38000
  algorithms based on
different Automata depth                                                     33000


                                                                             28000


                                                                             23000
                                                                                     1          3         5            7         9         20
                                                                                                                                           11
                                                                                                       De pth of Automata

								
To top