Four Main Activities of Operation Manager - PowerPoint

Document Sample
Four Main Activities of Operation Manager - PowerPoint Powered By Docstoc
					Query Processing




        Presented by

                Aung S. Win
Objectives

   Query processing and optimization.
   Static versus dynamic query optimization.
   How a query is decomposed and
    semantically analyzed.
   How to create a relational algebra tree to
    represent a query.
   The rules of equivalence for the relational
    algebra operations.
(Cont.)

   Heuristic transformation rules.
   The types of database statistics required to
    estimate the cost of operations.
   The different strategies for implementing the
    relational algebra operations.
   The difference between materialization and
    pipelining.
   The advantages of left-deep trees.
Query Processing

   The activities involved in retrieving data from
    the database.
   The aims of query processing
     (1)to transform a query written in a high-level
        language into a low-level language
     (2)to execute the strategy to retrieve the
        required data.
(Cont.)

   Query processing can be divided into four
    main phases: decomposition, optimization,
    code generation, and execution.
Query Decomposition

   The aims of query decomposition
    (1)to transform a high-level query into a
    relational algebra query.
    (2)to check that the query is syntactically and
    semantically correct.
(Cont.)

   The typical stages of query decomposition
    are analysis, normalization, semantic
    analysis, simplification, and query
    restructuring.
Analysis

   The query is lexically and syntactically
    analyzed using the techniques of
    programming language compilers.
   Verifies that the relations and attributes
    specified in the query are defined in the
    system catalog.
   Verifies that any operations applied to
    database objects are appropriate for the
    object type.
(Cont.)

   On completion of the               Root
    analysis, the high-level
    query has been
    transformed into some
    internal representation    Intermediate operations
    (query tree) that is
    more suitable for
    processing.
                                      leaves
Normalization

   Converts the query into a normalized form
    that can be more easily manipulated.
   There are two different normal forms,
    conjunctive normal form and disjunctive
    normal form.
Conjunctive normal form

   A sequence of conjuncts that are connected
    with the and operator. Each conjunct
    contains one or more terms connected by the
    or operator.
    for example
    (position=‘Manager’ V salary>20000) ^
    branchNo = ‘B003’
Disjunctive normal form

   A sequence of disjuncts that are connected
    with the or operator. Each disjunt contains
    one or more terms connected by the and
    operator.
    for example
    (position=‘Manager’ ^ branchNo = ‘B003’) V
    (salary>20000 ^ branchNo = ‘B003’)
Semantic analysis

   The objective is to reject normalized queries
    that are incorrectly formulated or
    contradictory.
Simplification

   To detect redundant qualifications, eliminate
    common subexpressions , and transform the
    query to a semantically equivalent but more
    easily and efficiently computed form.
   Access restrictions, view definitions, and
    integrity constraints are considered at this
    stage.
Query restructuring

   The final stage of query decomposition.
   The query is restructured to provide a more
    efficient implementation.
Query optimization

   The activity of choosing an efficient execution
    strategy for processing a query.
   An important aspect of query processing is
    query optimization.
   The aim of query optimization is to choose
    the one that minimizes resource usage.
(Cont.)

   Every method of query optimization depend on
    database statistics.
   The statistics cover information about relations,
    attribute, and indexes.
   Keeping the statistics current can be problematic.
   If the DBMS updates the statistics every time a tuple
    is inserted, updated, or deleted, this would have a
    significant impact on performance during peak
    period.
(Cont.)

   An alternative approach is to update the
    statistics on a periodic basis, for example
    nightly, or whenever the system is idle.
Dynamic query optimization

   Advantage: all information required to select
    an optimum strategy is up to date.
   Disadvantage: the performance of the query
    is affected because the query has to be
    parsed, validated, and optimized before it
    can be executed.
Static query optimization

 The query is parsed, validated, and
  optimized once that is similar to the approach
  taken by a compiler for a programming
  language.
 Advantages
1)The runtime overhead is removed
2)More time available to evaluate a larger
  number of execution strategies.
(cont.)

   Disadvantage: the execution strategy that is
    chosen as being optimal when the query is
    compiled may no longer be optimal when the
    query is run.
Transformation Rules for the
Relational Algebra Operations

   By applying transformation rules, we can
    transform one relational algebra into an
    equivalent expression that is more efficient.
   There are twelve rules that can be used to
    restructure the relational algebra tree
    generated during query decomposition.
Heuristics rules

   Many DBMSs use heuristics to determine strategies
    for query processing.
   Heuristics rules include
    -performing Selection and Projections as early as
    possible.
    -combining Cartesian product with a subsequent
    selection whose predicate represents a join
    condition into a join operation.
(Cont.)

 -using associativity of binary operations to
 rearrange leaf nodes so that leaf nodes with
 the most restrictive Selections are executed
 first.
Cost estimation

   Depends on statistical information held in the
    system catalog.
   Typical statistics include the cardinality of
    each base relation, the number of blocks
    required to store a relation, the number of
    distinct values for each attribute, the
    selection cardinality of each attribute, and the
    number of levels in each multilevel index.
Join operation

   Block nested loop join
   Indexed nested loop join
   Sort-merge join
   Hash join
Pipelining

   In materialization, the output of one operation is
    stored in a temporary relation for processing by the
    next operation.
   An alternative approach is to pipeline the results of
    one operation to another operation without creating a
    temporary relation to hold the intermediate result.
   By using it, we can save on the cost of creating
    temporary relations and reading the results back in
    again.
Left – deep trees

   A relational algebra tree where the right-hand
    relation is always a base relation.
   Advantages: reducing the search space for
    the optimum strategy and allowing the query
    optimizer to be based on dynamic processing
    techniques.

				
DOCUMENT INFO
Shared By:
Categories:
Stats:
views:17
posted:4/13/2011
language:English
pages:28
Description: Four Main Activities of Operation Manager document sample