Docstoc

Characteristic Functions_1_

Document Sample
Characteristic Functions_1_ Powered By Docstoc
					Characteristic Functions
Characteristic Functions
  (from fin_data table in Sybase Sample Database)

  Have:

  Year    quarter   code    amount
  2001    Q1        e1        198
  2001    Q2        e1        204
  2001    Q3        e1        214
  2001    Q4        e1        231

  Want:

  Year    Code      Q1Amt     Q2Amt   Q3Amt    Q4Amt
  2001    e1          198       204     214      231
Characteristic Functions
  Have: Normalized

  Year   quarter   code    amount
  2001   Q1        e1        198
  2001   Q2        e1        204
  2001   Q3        e1        214
  2001   Q4        e1        231


  Want: Un-normalized

  Year   Code      Q1Amt     Q2Amt   Q3Amt   Q4Amt
  2001   e1          198       204     214     231
Characteristic Functions
  Similar to Excel Database Format

  Year   quarter   code    amount
  2001   Q1        e1        198
  2001   Q2        e1        204
  2001   Q3        e1        214
  2001   Q4        e1        231


  Pivot Table

  Year    Code     Q1Amt     Q2Amt   Q3Amt   Q4Amt
  2001    e1         198       204     214     231
    Possible Solutions
(without using Characteristic Functions)



            Self Join
          as in type 4 queries
            Possible Solution – Self Join
Use a self join as in type 4 queries

  Start with just Q1 and Q2, code E1, Year 2001

     Year     code     Q1Amt      Q2Amt
     2001     e1          198          204

     Write the SQL query
      Possible Solution – Self Join
SELECT F1.year, F1.code,         Year   code Q1Amt Q2Amt
 F1.amount AS Q1Amt,             2001 e1            198   204
 F2.amount AS Q2Amt

FROM fin_data F1, fin_data F2

WHERE F1.year = 2001 // Only 2001
AND F2.year = 2001

AND F1.code = „e1‟      // Only financial code e1
AND F2.code = „e1‟

AND F1.quarter = „Q1‟   // Get Q1 amount
AND F2.quarter = „Q2‟   // Get Q2 amount



                                                          Code
         Possible Solution – Self Join

Expand to all years

  Year     code       Q1Amt   Q2Amt
  1999     e1           101      93
  2000     e1           153     149
  2001     e1           198     204
           Possible Solution – Self Join
Expand to all years
                                       Year   code Q1Amt Q2Amt
   SELECT F1.year, F1.code,            1999 e1         101    93
    F1.amount AS Q1Amt,                2000 e1         153   149
    F2.amount AS Q2Amt                 2001 e1         198   204

   FROM fin_data F1, fin_data F2

   WHERE F1.year = F2.year // Same Year

   AND F1.code = „e1‟      // Only financial code e1
   AND F2.code = „e1‟

   AND F1.quarter = „Q1‟   // Get Q1 amount
   AND F2.quarter = „Q2‟   // Get Q2 amount


                                                             Code
            Possible Solution – Self Join
   Expand to all four quarters

Year   code    Q1Amt     Q2Amt     Q3Amt   Q4Amt
1999   e1         101         93     129     145
2000   e1         153        149     157     163
2001   e1         198        204     214     231
              Possible Solution – Self Join
All four quarters                Year   code Q1Amt Q2Amt Q3Amt Q4Amt

 SELECT F1.year, F1.code,        1999 e1            101      93   129          145
  F1.amount AS Q1Amt,            2000 e1            153    149    157          163
  F2.amount AS Q2Amt,            2001 e1            198    204    214          231
  F3.amount AS Q3Amt,
  F4.amount AS Q4Amt
 FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4
 WHERE F1.year = F2.year       // Same Year
 AND F2.year = F3.year
 AND F3.year = F4.year
 AND F1.code = „e1‟            // Only financial code e1
 AND F2.code = „e1‟
 AND F3.code = ‘e1’
 AND F4.code = ‘e1’
 AND F1.quarter = „Q1‟         // One record for each quarter
 AND F2.quarter = „Q2‟
 AND F3.quarter = ‘Q3’
 AND F4.quarter = ‘Q4’
                                                                        Code
      Possible Solution – Self Join
              Problems
Coding:
Suppose we wanted months instead of quarters…

Performance:
Suppose fin_data had 100,000 records instead of 84…



       We need a better solution!
    Possible Solutions
(without using Characteristic Functions)



         SubQueries
        Possible Solution – SubQueries
Use a subquery as a field in the select clause

  Start with just Q1 and Q2, code E1, Year 2001

     Year     code    Q1Amt      Q2Amt
     2001     e1          198       204

     Write the SQL query
        Possible Solution – SubQueries
Use a subquery as a field in the select clause

 SELECT F1.year, F1.code, F1.amount AS Q1Amt,
 (
   SELECT F2.amount as Q2Amt        Year code Q1Amt Q2Amt
   FROM fin_data F2
                                    2001 e1     198   204
   WHERE F2.quarter = „Q2‟
   AND F2.code = „e1‟
   AND F2.year = 2001
 )
 FROM fin_data F1
 WHERE F1.quarter = „Q1‟
 AND F1.code = „e1‟
 AND F1.year = 2001


                                                      Code
     Possible Solution – SubQueries

Expand to all years

  Year     code       Q1Amt   Q2Amt
  1999     e1           101      93
  2000     e1           153     149
  2001     e1           198     204
         Possible Solution – SubQueries
Expand to all years
   SELECT F1.year, F1.code, F1.amount AS Q1Amt,
   (
     SELECT F2.amount as Q2Amt         Year code Q1Amt Q2Amt
     FROM fin_data F2
                                       1999 e1     101    93
     WHERE F2.quarter = „Q2‟           2000 e1     153   149
     AND F2.code = „e1‟                2001 e1     198   204
     AND F2.year = F1.year
   )
   FROM fin_data F1
   WHERE F1.quarter = „Q1‟
   AND F1.code = „e1‟
   //AND F1.year = 2001


This is now a correlated subquery
                                                          Code
       Possible Solution – SubQueries
 Expand to all four quarters

Year   code    Q1Amt      Q2Amt      Q3Amt   Q4Amt
1999   e1          101          93     129     145
2000   e1          153         149     157     163
2001   e1          198         204     214     231
         Possible Solution – SubQueries
All four quarters          Year    code Q1Amt Q2Amt Q3Amt Q4Amt
                            1999   e1           101     93    129          145
SELECT F1.year, F1.code,
                            2000   e1           153    149    157          163
F1.amount AS Q1Amt,
                            2001   e1           198    204    214          231
(
  SELECT F2.amount as Q2Amt
  FROM fin_data F2                      (
  WHERE F2.quarter = „Q2‟                   SELECT F2.amount as Q4Amt
  AND F2.code = „e1‟                        FROM fin_data F2
  AND F2.year = F1.year                     WHERE F2.quarter = „Q4‟
),                                          AND F2.code = „e1‟
(                                           AND F2.year = F1.year
  SELECT F2.amount as Q3Amt             )
  FROM fin_data F2
  WHERE F2.quarter = „Q3‟               FROM fin_data F1
  AND F2.code = „e1‟                    WHERE F1.quarter = „Q1‟
  AND F2.year = F1.year                 AND F1.code = „e1‟
),
                                                                    Code
          Possible Solution – Self Join
                  Problems
Coding:
Again, Suppose we wanted months instead of quarters…
Performance:
Our example requires the effective execution of 10
    queries
• The F1 query is run once to return the year, code and
    Q1Amt columns
• The F2 query is run 9 times return the Q2, Q3, and
    Q4 amounts for 1999, 2000 and 2001
If our table had 100,000 rows F2 would run 300,000
    times!
         We still need a better solution!
      Possible Solution – Temporary Table
                              Strategy

Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt
Insert Query to add records and fill in year, code, Q1Amt
Update Query to add Q2Amt
Update Query to add Q3Amt
Update Query to add Q4Amt
      Possible Solution – Temporary Table
                        Create a Table

Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt


CREATE TABLE QAmt
(
   year char(4),
   code char(2),
   Q1Amt numeric(9),
   Q2Amt numeric(9),
   Q3Amt numeric(9),
   Q4Amt numeric(9)
);
      Possible Solution – Temporary Table
                         DROP a Table

There will be times when you want to get rid of a table you have created.
Perhaps you made an error when you created it. Or, you may simply be
through using it. To get rid of a table you DROP it from the database.


DROP TABLE QAmt
      Possible Solution – Temporary Table
                         INSERT Query

Insert Query to add records and fill in year, code, Q1Amt


INSERT INTO QAmt (year, code, Q1Amt)
   SELECT year, code, amount
   FROM fin_data
   WHERE quarter = 'Q1'
   AND code = 'e1'
       Possible Solution – Temporary Table
                          Delete Query

There may also be times when you want to keep a table, but get rid of all the
records in the table… to empty it. If, for example, you find a problem with your
INSERT query, you may want to empty the table before you run the corrected
INSERT query. To empty a table you DELETE all the records FROM the table.
To empty the QAmt table you:


DELETE FROM QAmt
       Possible Solution – Temporary Table
                      After INSERT Query

Check the results:


SELECT *
FROM QAmt


Year   code Q1Amt    Q2Amt    Q3Amt    Q4Amt
1999   e1     101    (NULL)   (NULL)   (NULL)
2000   e1     153    (NULL)   (NULL)   (NULL)
2001   e1     198    (NULL)   (NULL)   (NULL)
       Possible Solution – Temporary Table
                      Update Query
Update Query to add Q2Amt


UPDATE QAmt Q
SET Q2Amt =
( SELECT amount
  FROM fin_data F
  WHERE F.year = Q.year
  AND F.code = Q.code                 We do two more update queries
  AND F.quarter = 'Q2„ )              to fill in Q3Amt and Q4Amt


Year    code Q1Amt   Q2Amt   Q3Amt    Q4Amt
1999    e1     101      93   (NULL)   (NULL)
2000    e1     153     149   (NULL)   (NULL)
2001    e1     198     204   (NULL)   (NULL)
      Possible Solution – Temporary Table
                        Is this really different
At this point you may have noticed that the temporary table approach looks a lot
like the subquery approach. In fact, it is virtually the same. It seems
conceptually simpler because the temporary table allows us break down the
problem into separate and distinct subtasks. We can complete part of the
solution, check the results, complete some more, etc. For this reason, you will
see it used. It has some legitimate applications, but this isn‟t one of them.

This point becomes even more evident if we use either the self-join approach or
the subquery approach to populate the table in one step.
      Possible Solution – Temporary Table
                    One Step Using Self Join
INSERT INTO QAmt
 (year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)

SELECT F1.year, F1.code,
 F1.amount,
 F2.amount,
 F3.amount,                        AND      F1.code = 'e1'
 F4.amount                         AND      F2.code = 'e1'
FROM                               AND      F3.code = 'e1'
 fin_data F1, fin_data F2,         AND      F4.code = 'e1'
 fin_data F3, fin_data F4          AND      F1.quarter = 'Q1'
WHERE F1.year = F2.year            AND      F2.quarter = 'Q2'
AND F2.year = F3.year              AND      F3.quarter = 'Q3'
AND F3.year = F4.year              AND      F4.quarter = 'Q4'
      Possible Solution – Temporary Table
                  One Step Using SubQuery
INSERT INTO QAmt
 (year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)

SELECT year, code, amount,

(SELECT amount                       (SELECT amount
  FROM fin_data F                      FROM fin_data F
  WHERE F.year = Q.year                WHERE F.year = Q.year
  AND F.code = Q.code                  AND F.code = Q.code
  AND F.quarter = 'Q2'                 AND F.quarter = 'Q4'
 ),                                   )

(SELECT amount                       FROM fin_data Q
  FROM fin_data F                    WHERE quarter = 'Q1'
  WHERE F.year = Q.year              AND code = 'e1'
  AND F.code = Q.code
  AND F.quarter = 'Q3'
 ),
       Possible Solution – Temporary Table
                                    Problems

So… using a temporary table really isn‟t a distinct approach at all. It can make
the solution conceptually simpler by allowing us to divide the solution into
discrete steps, but it does nothing to reduce overall coding complexity or to
improve performance.
In addition, it introduces potential problems with regard to updates made
between the time the table is created and the time it is used (currency). If you
really do want a snapshot of the data at a particular time then this can be
useful. In general, it‟s undesirable
You can overcome the currency problem by creating a view instead of a table
but, so far, you‟re still basically using either a self-join or a subquery method. If
you want to create a View you‟re better off using Characteristic Functions to do
it!


              We still need a better solution!
Better Solution

Characteristic
 Functions
                Characteristic Functions
                               Strategy

Distribute amount to one of four new columns:

year   code   quarter amount   Q1Amt Q2Amt   Q3Amt Q4Amt
1999   e1     Q1        101      101     0       0     0
1999   e1     Q2         93        0    93       0     0
1999   e1     Q3        129        0     0     129     0
1999   e1     Q4        145        0     0       0   145
2000   e1     Q1        153      153     0       0     0
2000   e1     Q2        149        0   149       0     0
2000   e1     Q3        157        0     0     157     0
2000   e1     Q4        163        0     0       0   163
etc.
                  Characteristic Functions
                                 Strategy
year   code   quarter   amount    Q1Amt   Q2Amt   Q3Amt   Q4Amt
1999   e1     Q1           101      101       0       0       0
1999   e1     Q2            93        0      93       0       0
1999   e1     Q3           129        0       0     129       0
1999   e1     Q4           145        0       0       0     145
etc.

and GROUP BY year, summing the amounts in the new columns for each
year.


year   fin_code                  Q1Amt Q2Amt      Q3Amt Q4Amt
1999   e1                          101    93        129   145
etc.
               Characteristic Functions
                     WHERE can be Bad

In the previous examples, the reason we needed four copies
of the table or four different queries is that the determination
of which rows to include was made in the WHERE clause.

WHERE quarter = „Q1‟ allows us to put only the amounts for
Q1 in the Q1Amt column. That‟s good. But… it also
removes the possibility of filling in the Q2Amt column in the
same query. So Q2Amt must be filled in using a separate
copy of the table or a separate subquery with a WHERE
clause that allows us to see the amounts in the rows WHERE
quarter = „Q2‟
            Characteristic Functions


Characteristic Functions allow us to control which rows get
  represented in a particular column without using a
  WHERE clause.

With Characteristic Functions the determination of whether
  amount is represented in Q1Amt or Q2Amt or Q3Amt or
  Q4Amt is made in the SELECT clause.
               Characteristic Functions
                             1 or 0
A Characteristic Functions is an expression that evaluates to:
• 1 if a field should be represented in a particular column
• 0 if it should not.


For example, a characteristic function to determine whether
an amount should be represented in Q1Amt would return 1 if
quarter is „Q1‟, 0 if it isn‟t. We might call this particular
characteristic function CF1.
                  Characteristic Functions
Assume for a moment that we have created CF1. We could then use it to
control whether amount is represented in Q1Amt by defining Q1Amt as
CF1 * amount.
                         SELECT   year, code, quarter, amount,
                                  CF1, CF1 * amount AS Q1Amt
                         FROM fin_data
                         WHERE code = „e1‟

                                                           Note: this
 Would yield:
                                                           obviously
 year    code   quarter amount    CF1   Q1Amt              won‟t run yet
 1999     e1      Q1       101      1     101              since we
 1999     e1      Q2        93      0       0              haven‟t
 1999     e1      Q3       129      0       0              actually
 1999     e1      Q4       145      0       0              created CF1
 2000     e1      Q1       153      1     153              at this point.
 2000     e1      Q2       149      0       0
 etc.
                  Characteristic Functions
We then create CF2 to return 1 if quarter contains „Q2‟, 0 otherwise.
We use CF2 in the same SELECT clause to control whether amount is
represented in Q2Amt
                         SELECT   year, code, quarter, amount,
                                  CF1, CF1 * amount AS Q1Amt,
                                  CF2, CF2 * amount AS Q2Amt
                         FROM fin_data
                         WHERE code = „e1‟

 Would yield:
 year    code quarter amount     CF1 Q1Amt         CF2 Q2Amt
 1999     e1    Q1       101       1   101           0     0
 1999     e1    Q2        93       0     0           1    93
 1999     e1    Q3       129       0     0           0     0
 1999     e1    Q4       145       0     0           0     0
 2000     e1    Q1       153       1   153           0     0
 2000     e1    Q2       149       0     0           1   149
 etc.
                 Characteristic Functions
We then GROUP BY year and SUM the Q1Amt and Q2Amt for the year

         SELECT   year, MAX(code),
                  SUM (CF1 * amount) AS Q1Amt,
                  SUM (CF2 * amount AS Q2Amt
         FROM fin_data
         WHERE code = „e1‟
         GROUP BY year


  Would yield:
  year    code   Q1Amt   Q2Amt
  1999     e1      101      93
  2000     e1      153     149
  2001     e1      198     204
              Characteristic Functions
                    Implementing CF1

Recall that CF1 evaluates to:
• 1 if quarter is „Q1‟,
• 0 otherwise.
Both Sybase and SQL Server contain a CASE statement that
can be used to implement CF1 as
(CASE WHEN quarter = „Q1‟ THEN 1 ELSE 0 END)
                 Characteristic Functions
                           CF1 and CF2
SELECT   year, code, quarter, amount,
         (CASE WHEN quarter = „Q1‟ THEN 1 ELSE 0 END) as CF1,
         (CASE WHEN quarter = „Q2‟ THEN 1 ELSE 0 END) as CF2
FROM fin_data
WHERE code = „e1‟

Yields:

Year      code   quarter   amount   CF1   CF2
1999      e1     Q1          101      1     0
1999      e1     Q2           93      0     1
1999      e1     Q3          129      0     0
1999      e1     Q4          145      0     0
2000      e1     Q1          153      1     0
2000      e1     Q2          149      0     1
                  Characteristic Functions
                        Using CF1 & CF2
SELECT year, code, quarter, amount,
    (CASE WHEN quarter = „Q1‟ THEN 1 ELSE 0 END) * amount AS Q1Amt,
    (CASE WHEN quarter = „Q2‟ THEN 1 ELSE 0 END) * amount AS Q2Amt
FROM fin_data
WHERE code = „e1‟

 Yields:

 Year      code   quarter   amount   Q1Amt Q2Amt
 1999      e1     Q1          101      101     0
 1999      e1     Q2           93        0    93
 1999      e1     Q3          129        0     0
 1999      e1     Q4          145        0     0
 2000      e1     Q1          153      153     0
 2000      e1     Q2          149        0   149
                  Characteristic Functions
                          Group By Year
SELECT year, MAX(code) AS fin_code,
SUM((CASE WHEN quarter = „Q1‟ THEN 1 ELSE 0 END) * amount) AS Q1Amt,
SUM((CASE WHEN quarter = „Q2‟ THEN 1 ELSE 0 END) * amount) AS Q2Amt
FROM fin_data
WHERE code = „e1‟
GROUP BY year

 Yields:

 year      fin_code   Q1Amt   Q2Amt
 1999      e1           101      93
 2000      e1           153     149
 2001      e1           198     204
                  Characteristic Functions
                              All 4 quarters
SELECT year, MAX(code) AS fin_code,
 SUM((CASE WHEN quarter = „Q1‟ THEN 1    ELSE 0   END) * amount) AS Q1Amt,
 SUM((CASE WHEN quarter = „Q2‟ THEN 1    ELSE 0   END) * amount) AS Q2Amt,
 SUM((CASE WHEN quarter = „Q3‟ THEN 1    ELSE 0   END) * amount) AS Q3Amt,
 SUM((CASE WHEN quarter = „Q4‟ THEN 1    ELSE 0   END) * amount) AS Q4Amt
FROM fin_data
WHERE code = „e1‟
GROUP BY year
 Yields:

 year      fin_code   Q1Amt     Q2Amt Q3Amt       Q4Amt
 1999      e1           101        93   129         145
 2000      e1           153       149   157         163
 2001      e1           198       204   214         231
       Characteristic Functions
 Can you modify the query to show all
         codes, not just e1?

year    code   Q1Amt   Q2Amt Q3Amt   Q4Amt
1999    e1       101      93   129     145
1999    e2       403     459   609     632
1999    e3      1437    2033  2184    2145
1999    e4       623     784   856    1043
1999    e5       381     402   412     467
1999    r1      1023    2033  2998    3014
1999    r2       234     459   601     944
2000    e1       153     149   157     163
2000    e2       643     687   898     923
etc.
                  Characteristic Functions
              Can you modify the query to show all
                      codes, not just e1?
SELECT year, code,
  SUM((CASE WHEN quarter = 'Q1'   THEN 1   ELSE 0   END) * amount) AS Q1Amt,
  SUM((CASE WHEN quarter = 'Q2'   THEN 1   ELSE 0   END) * amount) AS Q2Amt,
  SUM((CASE WHEN quarter = 'Q3'   THEN 1   ELSE 0   END) * amount) AS Q3Amt,
  SUM((CASE WHEN quarter = 'Q4'   THEN 1   ELSE 0   END) * amount) AS Q4Amt
FROM fin_data
GROUP BY year, code
ORDER BY year, code

       year    code      Q1Amt      Q2Amt Q3Amt          Q4Amt
       1999    e1          101         93   129            145
       1999    e2          403        459   609            632
       1999    e3         1437       2033  2184           2145
       1999    e4          623        784   856           1043
       etc.
       Characteristic Functions
               Result:
We create a “Pivot” with only a single
      pass through the table
                 and
 without resorting to programming!

year    code   Q1Amt   Q2Amt Q3Amt   Q4Amt
1999    e1       101      93   129     145
1999    e2       403     459   609     632
1999    e3      1437    2033  2184    2145
1999    e4       623     784   856    1043
1999    e5       381     402   412     467
1999    r1      1023    2033  2998    3014
1999    r2       234     459   601     944
etc.
                  Characteristic Functions
                                Add Slide

Timing for Join
Timing for SubQuery
Timing for Temporary Table
Timing for Characteristic Function

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:2
posted:10/1/2011
language:English
pages:49