Docstoc

ANSI Join SQL

Document Sample
ANSI Join SQL Powered By Docstoc
					To ANSI or Not To ANSI


Gravenstein, Costello, Maurer
Agenda
 Brief History
 Review Join Technologies
 Analysis
 Recommendation
ANSI/Oracle Support History
 ANSI here refers to SQL/99 Join Syntax
   The standard to which all RDBMS vendors
    strive to comply

 SQL/99 support started with Oracle 9i in 2001
   Two years after the release of the standard,
    Oracle supports it.

 This presentation restricted to the Oracle
  implementation of the ANSI standard
Join Condition Types
 Equijoin
     Columns with the same name
     Columns with different names
 Outerjoin
     Left (left driving table)
     Right (right driving table)
     Full (both tables driving)
 Cross/Cartesian product
Traditional Equijoin - Same Name
 Traditional Oracle Approach

SELECT e.ename AS Employee_name,      Table prefix is required to
                                        remove ambiguity on
       d.deptno,
                                          common columns
       d.dname AS Department_name
  FROM emp e,
       dept d
 WHERE e.deptno = d.deptno;


                                Join conditions must be
                                         listed
  ANSI Equijoin Natural Syntax
   ANSI SQL Natural Join

  SELECT ename AS employee_name,    No table prefix if column is
                                      part of join condition.
         deptno,
         dname AS department_name
    FROM emp                            No commas between
         NATURAL JOIN dept;                   tables.



                                        Join columns implied,
God forbid that you accidentally
                                        based on columns that
 add a non-join column to both
                                         have the same name
tables…(um.. Audit columns...)
ANSI Equijoin Using Syntax
 ANSI SQL Join USING
                                  Table prefix allowed on columns that
SELECT d.dname,                   are not part of the using clause (join
       e.ename                                  condition)
  FROM emp d
       JOIN dept d USING ( deptno );




                                                Several columns share
   Add additional join columns using
                                               same name, only joining
       ( deptno, col2, col3, …)
                                               on some of them, in this
                                                    case deptno
Traditional Equijoin Syntax
 Traditional Join, columns different

SELECT d.department_name,
       l.city
  FROM departments d,
       locations l
 WHERE d.location_id = l.id;


                                 Join column names are
                                         different
ANSI On Equijoin Syntax
 ANSI SQL ON

SELECT d.department_name,
       l.city
  FROM departments d
       JOIN locations l ON ( d.location_id = l.id );




      Use ON when join column    List join conditions here
        names are different       like traditional syntax
  ANSI Equijoin Syntax
 ANSI SQL Multi Table On

                            Bring in first table join
SELECT e.empno,
       l.loc_id,
       d.dname,                          No commas between tables
       l.state_tx
  FROM locations l
       JOIN dept d ON ( d.location_id = l.id )
       JOIN emp e ON ( d.deptno = e.deptno );


Bring in second table join          Any prior table column is visible
                                        – joins from left to right
 ANSI Equijoin Syntax
 ANSI SQL INNER
SELECT e.emp_id,
       l.city,
       d.dept_name,            ON clause allows reference to join
                                    columns by table name
       c.city,
       d.deptno
  FROM locations l
       INNER JOIN dept d ON ( d.location_id = l.id )
       INNER JOIN emp e ON d.deptno = e.deptno;



     INNER – an optional keyword           Parenthesis are optional, we
    stating this is an equijoin (not an   like to include them for clarity
           outer or cross join)
 Traditional Outerjoin Syntax
 Traditional Outer Join
                                 NULL in name if no employees in
SELECT e.ename,                          the department.
       d.dname
  FROM emp e,
       dept d
 WHERE e.deptno (+) = d.deptno



                       Traditional Outer Join Notation
                        (+) indicator denotes expand
                       records on this side if needed
       ANSI Outerjoin Syntax
     Left Outer Join                         NULL in last name if no employees
                                                     in the department.
    SELECT e.ename,
           d.dname                  OUTER keyword is optional.
      FROM dept d
           LEFT OUTER JOIN emp e
               ON (e.deptno = d.deptno);




LEFT denotes that the dominant table is to the left (dept)
 and that all of it’s rows will be returned. The right table
              is expanded with NULL records
 ANSI Outerjoin Syntax
 Left Outer Join

SELECT e.ename,
       d.dname
  FROM dept d
       NATURAL LEFT JOIN emp e;


                            We don’t recommend using it here
                                         either!

NATURAL can be used in an
 INNER and OUTER join.
  ANSI Outerjoin Syntax
 Right Outer Join
                                        NULL in last name if no employees
SELECT e.ename,                                in the department.
       d.dname
  FROM emp e
       RIGHT OUTER JOIN dept d
           ON (e.deptno = d.deptno);




RIGHT OUTER denotes that the dominant table is
  to the right. The left table gets expanded with
                      NULLS.
 Traditional Outerjoin Syntax
 Full Outer Join
      Can only be represented with a UNION query.
SELECT e.ename,
       d.dname
                                    Shouldn’t see many of these
  FROM emp e, dept d
                                     (We’ve never needed one)
 WHERE e.deptno (+) = d.deptno
UNION
SELECT e.ename,
       d.dname                      UNION ALL is incorrect as it
                                     results in duplicate rows
  FROM emp e, dept d
 WHERE e.deptno = d.deptno (+)

                                   UNION performs an implicit
                                  “DISTINCT” on both queries –
                                 possibly removing desired rows
  ANSI Outerjoin Syntax

Full Outer Join                            NULL in last name if no employees
                                                   in the department.

SELECT e.ename,
       d.dname                 NULL in department name if
  FROM emp e                  employee not in a department.
       FULL OUTER JOIN dept d
           ON (e.deptno = d.deptno);
                                                        OUTER is optional

 FULL OUTER denotes that the table to the right
AND the table to the left will have all their records
                    returned
Traditional Cross Join/Cross Product
 Cross Join

SELECT emp_id,
       ename,
       dname
  FROM emp e,
       dept d
WHERE d.deptno = 10;


                No join condition between tables…
                      um, normally not good.
ANSI Cross Join/Cross Product
 Cross Join

SELECT emp_id,
       ename,
       dname
  FROM emp e
       CROSS JOIN dept d
WHERE d.deptno = 10;



                              Explicit CROSS condition,
                           impossible to do this inadvertently.
ANSI Correlated Join
 Correlated Query Syntax Join
                             ANSI doesn’t allow join clause on
SELECT empno,                          first table
       ename
  FROM emp e                      Only tables in “current”
 WHERE EXISTS                     FROM clause visible to
                                      ANSI join logic
       ( SELECT NULL
           FROM dept d
                INNER JOIN locations l
                   ON ( l.loc_id = d.loc )
          WHERE d.deptno = e.deptno
       );
                                            Not really a “mixed”
                                                syntax join
ANSI Correlated Subquery Issues
SELECT *
  FROM dept d
       INNER JOIN locations2 l
         USING ( loc )
 WHERE EXISTS                Don’t be tempted to remove the
       (                              table prefix “l”
       SELECT NULL
         FROM emp e
        WHERE e.loc = l.loc
        )                   Scope of reference dictates that the
                                       closest column be used
ORA-25154: column part of USING clause cannot
  have qualifier
                            “NATURAL” joins have the same
                                        issue
ANSI Outer Join Subtleties
SELECT d.deptno, e.ename, e.job
    FROM dept d
         LEFT JOIN emp e
          ON (     e.deptno = d.deptno
               AND e.job = 'SALESMAN');

DEPTNO   ENAME    JOB
======   ======   ========
  30     ALLEN    SALESMAN        Filter is applied before join is
  30     WARD     SALESMAN                   executed
  30     MARTIN   SALESMAN
  30     TURNER   SALESMAN
  50
  40
  20
  10
                             A number of rows returned that
                             have “OUTER” joined emp data
ANSI Outer Join Subtleties
SELECT d.deptno, e.ename, e.job
    FROM dept d
         LEFT JOIN emp e
          ON (    e.deptno = d.deptno)
   WHERE e.job = 'SALESMAN';

DEPTNO   ENAME    JOB
======   ======   ========
  30     ALLEN    SALESMAN      Filter is applied after join is
  30     WARD     SALESMAN                 executed
  30     MARTIN   SALESMAN
  30     TURNER   SALESMAN



                             No outer joined data
Mixed Traditional/ANSI Join
 Mixed Syntax Join

SELECT emp_id,
       ename,
       dname
  FROM emp e
       INNER JOIN dept d USING (deptno),
       dual
 WHERE d.deptno = 10;

                                       Uhg….
                             Choose one or the other, but
                                  not both please!
ANSI vs Traditional Join Analysis
 Impact areas
     Flexibility
     Code Clarity
          Readability
     Ease of Use
          Developer Training
          DBA Training
     Legacy Code
     Standards
    Flexibility –
    (+) Restrictions not present in ANSI
   You cannot specify the (+) operator in a query block that also contains FROM clause join syntax.
   The (+) operator can appear only in the WHERE clause or, in the context of left-correlation (that is, when specifying
    the TABLE clause) in the FROM clause, and can be applied only to a column of a table or view.
   If A and B are joined by multiple join conditions, then you must use the (+) operator in all of these conditions. If you do
    not, then Oracle Database will return only the rows resulting from a simple join, but without a warning or error to
    advise you that you do not have the results of an outer join.
   The (+) operator does not produce an outer join if you specify one table in the outer query and the other table in an
    inner query.
   You cannot use the (+) operator to outer-join a table to itself, although self joins are valid. For example, the following
    statement is not valid:

             SELECT employee_id, manager_id
              FROM employees                                                          Oracle recommends
             WHERE employees.manager_id(+) = employees.employee_id;
                                                                                      using ANSI OUTER
   However, the following self join is valid:                                            JOIN syntax
             SELECT e1.employee_id, e1.manager_id, e2.employee_id
              FROM employees e1, employees e2
             WHERE e1.manager_id(+) = e2.employee_id;

   The (+) operator can be applied only to a column, not to an arbitrary expression. However, an arbitrary expression
    can contain one or more columns marked with the (+) operator.
   A WHERE condition containing the (+) operator cannot be combined with another condition using the OR logical
    operator.
   A WHERE condition cannot use the IN comparison condition to compare a column marked with the (+) operator with
    an expression.
   A WHERE condition cannot compare any column marked with the (+) operator with a subquery.
Flexibility - Performance
 In theory the optimizer should find the same
  “best” execution plan for all logically
  equivalent SQL statements
 In practice, the more complex the statement,
  the less likely logical equivalent statements
  result in the same plan
     Our preliminary research has shown ANSI
      OUTER joins can result in more efficient
      execution plans when compared with
      equivalent traditional joins
Code Clarity (Traditional)
SELECT /*+ qb_name(orig) */
       fdla.dim_borrower_v_id dim_borrower_v_id
  FROM dim_as_of_date_vw daod,
       dim_daily_loan_applctn_detl ddlad,                               Can you quickly
       dim_disbursement_date_vw dddv,
       dim_loan_originator dlo,                                       determine how tables
       fact_daily_loan_application fdla,
       dim_loan_applctn_status_vw dlasv                                    are joined?
 WHERE     daod.dim_as_of_date_v_id = ddlad.dim_as_of_date_v_id
       AND daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id
       AND ddlad.dim_daily_loan_applctn_detl_id =fdla.dim_daily_loan_applctn_detl_id
       AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id
       AND dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id
       AND dlo.dim_loan_originator_id = fdla.dim_loan_originator_id
       AND dlasv.DIM_LOAN_APPLCTN_STATUS_V_ID = fdla.DIM_LOAN_APPLCTN_STATUS_V_ID
       AND NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740')
       AND NOT (dlasv.status_code BETWEEN '000' AND '429')
       AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN
                                               LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) +
                                                         &c_DEFAULT_SLICE_OFFESET
                                         ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT )
                                    END)
       AND dddv.disburse_date BETWEEN
   TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM')
                             AND TRUNC (LAST_DAY
   (NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1))))
       AND ddlad.loan_transfer_status_code != 'T'
                                                               This is a real join we’ve
           Is a table join condition                           implemented as part of
        missing? How do you know?                                  a recent project
Code Clarity (ANSI)
SELECT /*+ qb_name(orig) */
       fdla.dim_borrower_v_id dim_borrower_v_id
  FROM
        dim_as_of_date_vw daod
  INNER JOIN fact_daily_loan_application fdla                                               Can
          ON (daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id)
  INNER JOIN dim_daily_loan_applctn_detl ddlad
                                                                                            Not
          ON (     ddlad.dim_as_of_date_v_id = daod.dim_as_of_date_v_id
               AND ddlad.dim_daily_loan_applctn_detl_id =
                                                                                            Add
                   fdla.dim_daily_loan_applctn_detl_id                                      Table
               AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id )
  INNER JOIN dim_disbursement_date_vw dddv                                                  Without
          ON (     dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id)
  INNER JOIN dim_loan_originator dlo                                                        Join
          ON (dlo.dim_loan_originator_id = fdla.dim_loan_originator_id)
  INNER JOIN dim_loan_applctn_status_vw dlasv
                                                                                            Condition
          ON (dlasv.dim_loan_applctn_status_v_id = fdla.dim_loan_applctn_status_v_id)
 WHERE      NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740')
        AND NOT (dlasv.status_code BETWEEN '000' AND '429')
        AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN
                                                LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) +
                                                          &c_DEFAULT_SLICE_OFFESET
                                               ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT )
                                         END)
        AND dddv.disburse_date BETWEEN
    TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM')
                               AND TRUNC (LAST_DAY
    (NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1))))
        AND ddlad.loan_transfer_status_code != 'T'
Ease of Use/Developer & DBA
            Training
 Traditional Join
     Long time Oracle developers do nothing
     New Oracle Developers need to learn Oracle
      syntax
     Easier to make mistakes
 ANSI Join
     Works with SQL Server/Oracle/MySQL/…
     Syntax is more readable/self documenting
     Natural join is “un-natural”
Installed Code Base
 ANSI Joins not present in the Oracle installed
  code base
 ANSI Joins present in other RDBMS installed
  code
     More of these databases coming all the time
 Harm in having two join syntaxes
     Support personnel have to be comfortable with
      both syntaxes
     Additional training required
Corporate Standards
 Does it make sense to allow both traditional and
  ANSI?
 Other existing examples of “equivalent” functions
      TRUNC vs ROUND                    DECODE is Traditional,
         Both allowed                      CASE is ANSI
      DECODE vs CASE statement
         Standards says no new DECODE
      CAST vs TO_CHAR, TO_NUMBER, TO_DATE, …
         Both are allowed
                                       TO_* is Traditional,
         No documented standard
                                         CAST is ANSI
      ANSI Join
         ANSI has not been allowed

         No documented standard
Stay with Traditional Joins
 Positives
     Long time Oracle professionals have
      familiarity with the syntax
     Installed code base is all this way
     Going forward let’s not introduce additional
      methods/styles which need to be supported
     Additionally ANSI functionality, rarely, if ever
      needed
     No “dangerous” natural join possibility
Allow ANSI Joins
 Positives
     Clearer, more readable code
        join syntax is segregated out of the predicate

     More flexible
        Oracle is adding enhancements to this syntax not

         available with traditional join syntax
        Apparent path of progress

     Standards based
        other DB vendors support ANSI SQL (MS, mysql,…)

        Simplified multi-vendor RDBMS support

     Much harder to make join errors
Fishbone Diagram
       Only Allow Traditional Join Syntax




                                            Error Free
                                               SQL




            Allow ANSI Join Syntax
Recommendations
 Allow both
 Provide training so that all are familiar with both
 Place some restrictions on ANSI syntax to prevent
  problems
      Do not allow natural joins
 Single SQL statements should use one or the other
  but not both
      New development should try to use same syntax
       throughout
 Long term goal, ANSI only
     ?
Open Discussion

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:10
posted:9/29/2011
language:English
pages:37