# ra

Document Sample

```					IELM 511: Information System design

Introduction

Part 1. ISD for well structured data – relational and other DBMS
Info storage (modeling, normalization)
Info retrieval (Relational algebra, Calculus, SQL)
DB integrated API’s

ISD for systems with non-uniformly structured data
Basics of web-based IS (www, web2.0, …)
Markup’s, HTML, XML
Design tools for Info Sys: UML

Part III: (one out of)
API’s for mobile apps
Security, Cryptography
IS product lifecycles
Algorithm analysis, P, NP, NPC
Agenda

Relational Algebra

Relational Calculus

Structured Query Language (SQL)

DB API’s
Recall our Bank DB design
BRANCH( b-name, city, assets)

CUSTOMER( cssn, c-name, street, city, banker, banker-type)

LOAN( l-no, amount, br-name)

PAYMENT( l-no, pay-no, date, amount)

EMPLOYEE( e-ssn, e-name, tel, start-date, mgr-ssn)
1

ACCOUNT( ac-no, balance)

SACCOUNT( ac-no, int-rate)
n
n                   1       n
CACCOUNT( ac-no, od-amt)                        n        m
m

BORROWS( cust-ssn, loan-num)                                                 n
1            1
DEPOSIT( c-ssn, ac-num, access-date)                         n

DEPENDENT( emp-ssn, dep-name)
Background: Algebra

What is an algebra ?

Study of systems of mathematical objects and operations defined on the objects

Examples of algebras:

Integers, with operations: +, -, ×, /, % …

Real numbers, with operations: +, -, ×, /, …

Vectors, with operations: +, -, , ×, …

Booleans, with operations: , , , …
Relational Algebra

Relational Algebra:
objects: instances of relational schemas (namely, tables)
operations: s, P, ×, set-theoretic operations: , , -, ‚

Key concepts:
Operator arguments:
Arguments of operators are instances of schemas (table)

Operation closure:
The outcome of the operator is an instance of schema

Expressions:
A sequence of operations can be written as an expression

Operator precedence:
The sequence of application of operations in an expression is fixed.
Compare these concepts to those in other algebras
Relational Algebra: select, s

Notation: in remainder, we will refer to an instance of a schema as a table

s : unary operator, input: one table; output: table

LOAN
loan_number   amount   branch_name
L17           1000     Downtown                  s[amount > 1200](LOAN)
L23           2000     Redwood
L15           1500     Pennyridge
L93           500      Mianus
loan_number   amount    branch_name
L11           900      Round Hill
L23           2000      Redwood
L16           1300     Pennyridge
L15           1500      Pennyridge
L16           1300      Pennyridge
Relational Algebra: select, s
q conditions of s operator:
- Denote the criterion for selection of a given tuple
- Must be evaluated one tuple at a time
- Must evaluate to ‘true’ or ‘false’
- Output = set of tuples for which q-conditions are ‘true’

LOAN
loan_number   amount   branch_name     s[(amount > 1200)  (branch_name = ‘Pennyridge’)]
L17           1000     Downtown        (LOAN)
L23           2000     Redwood
L15           1500     Pennyridge                  loan_number     amount      branch_name
L93           500      Mianus                      L15             1500        Pennyridge
L11           900      Round Hill                  L16             1300        Pennyridge
L16           1300     Pennyridge
Relational Algebra: project, P

P : unary operator, input: one table; output: table

P[list of attributes] (TABLE)

P[loan_number, amount] (LOAN)

LOAN
loan_number   amount     branch_name                 loan_number   amount
L17           1000       Downtown                    L17           1000
L23           2000       Redwood                     L23           2000
L15           1500       Pennyridge                  L15           1500
L93           500        Mianus                      L93           500
L11           900        Round Hill                  L11           900
L16           1300       Pennyridge                  L16           1300
Relational Algebra: project, P

Project returns a set of tuples; the number of rows may be smaller that input

Example: Find the names of all branches that have given loans

P[branch_name] (LOAN)

LOAN
loan_number   amount   branch_name                 branch_name
L17           1000     Downtown                    Downtown
L23           2000     Redwood                     Redwood
L15           1500     Pennyridge                  Pennyridge
L93           500      Mianus                      Mianus
L11           900      Round Hill                  Round Hill
L16           1300     Pennyridge
Relational Algebra: combining operations

Example: Find the names of all branches that have given loans larger than 1200

P[branch_name] ( s[(amount > 1200) ] (LOAN))
LOAN
loan_number    amount   branch_name
L17            1000     Downtown
branch_name
L23            2000     Redwood
Redwood
L15            1500     Pennyridge
Pennyridge
L93            500      Mianus
L11            900      Round Hill
L16            1300     Pennyridge
Relational Algebra: combining operations
Note: expressions impose a sequence in which operations are perfromed

Example: Find the names of all branches that have given loans larger than 1200

X = ( s[(amount > 1200) ] (LOAN))          Y = P[branch_name] (X)

LOAN                                   X                                      Y
loan_number   amount    branch_name     loan_number   amount   branch_name    branch_name
L17           1000      Downtown        L23           2000     Redwood        Redwood
L23           2000      Redwood         L15           1500     Pennyridge     Pennyridge
L15           1500      Pennyridge      L16           1300     Pennyridge
L93           500       Mianus
L11           900       Round Hill
L16           1300      Pennyridge
Relational Algebra: join, ×

Join is useful when the information required is in two (or more) tables.

Tables are sets of tuples, and
the join of two tables produces a cartesian product of the two sets

Background (set theory): cartesian product, A × B = { (x, y) | x  A, y  B}

Example:
A = { 1, 2, 3 }, B = { a, s}

A × B = { (1, a), (1, s), (2, a), (2, s), (3, a), (3, s) }
Relational Algebra: join, ×
BORROWS                 LOAN
customer      loan_no    loan_number   amount         branch_name
111-12-0000   L17        L17           1000           Downtown
222-12-0000   L23        L23           2000           Redwood
333-12-0000   L15        L15           1500           Pennyridge
444-00-0000   L93        L93           500            Mianus
666-12-0000   L17        L11           900            Round Hill
111-12-0000   L11        L16           1300           Pennyridge
999-12-0000   L17
777-12-0000   L16
Cartesian product, BORROWS × LOAN                       5 columns

customer        loan_no        loan_number   amount   branch_name
111-12-0000     L17            L17           1000     Downtown
111-12-0000     L17            L23           2000     Redwood
111-12-0000     L17            L15           1500     Pennyridge
48 rows     …
…
777-12-0000     L16            L16           1300     Pennyridge
Relational Algebra: join, ×
Usually, a cartesian product produces several tuples with un-related
information.
q-join specifies a q-condition (same as a selection criterion) to restrict
the output of a join to meaningful tuples only.

Example: Find the loan no, amount and branch name for all customers.

BORROWS ×[loan_no = loan_number] LOAN                   5 columns
customer      loan_no   loan_number   amount     branch_name
111-12-0000   L17       L17           1000       Downtown
222-12-0000   L23       L23           2000       Redwood
8 rows [Why ?]    333-12-0000   L15       L15           1500       Pennyridge
444-00-0000   L93       L93           500        Mianus
666-12-0000   L17       L17           1000       Downtown
111-12-0000   L11       L11           900        Round Hill
999-12-0000   L17       L17           1000       Downtown
777-12-0000   L16       L16           1300       Pennyridge
Relational Algebra: dot-notation in join, ×
Two tables being joined may have the same attribute name
(possibly denoting two different things). To distinguish the columns
in the q-join, the names of attributes use dot-notation

The following are all equivalent:

C = BORROWS ×[loan_no = loan_number] LOAN

C = BORROWS ×[BORROWS.loan_no = LOAN.loan_number] LOAN

A = BORROWS
B = LOAN
C = A ×[A.loan_no = B.loan_number] B
Relational Algebra: set theoretic operations, 

Since a table is a set of tuples, it is possible to make a union of two
tables.
BUT: we require closure (union of two tables should be a table).
 Union is defined for two tables with identical schemas.

Example: Find the names of customers who have either a deposit, or a loan
with the bank
name
A = P[customer] (BORROWS)  P[c_ssn] (DEPOSIT)
Jones
Smith
RESULT = P[name] (A ×[A.customer= CUSTOMER.ssn] CUSTOMER )
Hayes
Curry
Turner
Williams
Johnson
Brooks
Lindsay
Relational Algebra: set theoretic operations, 

Other set theoretic operations can be applied with same rules.

Example: Find the names of customers who have both, a deposit and a loan
with the bank

A = P[customer] (BORROWS)  P[c_ssn] (DEPOSIT)

RESULT = P[name] (A ×[A.customer= CUSTOMER.ssn] CUSTOMER )
RESULT
c_ssn             customer          customer                    name
888-12-0000      111-12-0000   =   111-12-0000                 Jones
222-12-0000       222-12-0000       222-12-0000                 Smith
333-12-0000       333-12-0000       333-12-0000                 Hayes
555-00-0000       444-00-0000
111-12-0000       666-12-0000
000-12-0000       999-12-0000
777-12-0000
Relational Algebra: set theoretic operations, -

Other set theoretic operations (same rules).

Example: Find the names of customers who have a loan but no deposits.

A = P[customer] (BORROWS) - P[c_ssn] (DEPOSIT)

RESULT = P[name] (A ×[A.customer= CUSTOMER.ssn] CUSTOMER )
RESULT
c_ssn             customer          customer                    name
888-12-0000   -   111-12-0000   =   888-12-0000                 Johnson
222-12-0000       222-12-0000       555-12-0000                 Turner
333-12-0000       333-12-0000       000-12-0000                 Lindsay
555-00-0000       444-00-0000
111-12-0000       666-12-0000
000-12-0000       999-12-0000
777-12-0000
Relational Algebra: set theoretic operations, ÷
Set division extends the meaning of integer division, in the sense that it
‘cancels away’ common multiples. It is useful in answering ‘for all’ queries.

Example: Do all the loan officers have the same manager ?
A solution: Find the ssn of the person who manages all the loan officers.

A = P[banker] (s[b_type=LO] (CUSTOMER) )
B = P[mgr_ssn, e_ssn] (EMPLOYEE)
RESULT = B ÷ A

B   mgr_ssn          e_ssn                 A                                         RESULT
321-32-4321     111-22-3333            banker
111-22-3333     333-11-4444                                                     mgr_ssn
÷         333-11-4444
111-22-3333     123-45-6789                                                      111-22-3333
123-45-6789
321-32-4321     555-66-8888
888-99-9999     987-65-4321
777-77-7777     888-99-9999
777-77-7777     321-32-4321   Note: for this example, we have to specify that
null   777-77-7777   the common divisor in B is e_ssn.
Relational Algebra: set theoretic operations, ÷
Generic definition of ÷

Attribute restrictions:
A ÷ B is defined only for A( R, C) and B( C), where R, C are sets of attributes.

Output:
The output contains each ti[R] such that
 tuples tj[C]  B,  a tuple, t  A in which t[C] = tj[C] and t[R] = ti[R].

common attribute set, C
OUTPUT
attribute set, R
r1…   rm
r1…    rm     c1 …   ck               c1 …   ck

÷                    t1       …
…
tn

…
Relational Algebra: concluding remarks

RA provides a formal language to get information from the database

RA can potentially answer any query, as long as the query pertains to
exactly one row of some table derivable using expressions.

Limitations of RA: aggregation and summary information
Examples:
find the average amount of assets in the branches
find the total assets of the bank,
…

RA is procedural, namely, an expression of RA specifies a step by
step procedure for computing the result.
Relational Calculus (RC)

Background: what is a calculus ?

RC is based on a formal system in logic, first order predicate calculus (fopc)

A formal system has:
a set of symbols;
rules about how the symbols can be arranged in well formed formulae (wff)
a (logical) mechanism to derive if a wff is true/false.
additionally, fopc allows wff with ‘variables’ and quantifiers (, ).

A query in RC takes the form:        {t | P(t) }

Meaning: the set of all tuples, t, for which some Proposition, P(t) is true.
P is also called a predicate.
Relational Calculus (RC) examples

1. Report the loans that exceed \$1200:

{ t | t  LOAN  t[amount] > 1200}

2. Find the names of customers who took a loan from the Pennyridge branch.

{ t[name] |        s  BORROWS  s[customer] = t[ssn]
  u  LOAN  u[loan_number] = s[loan_no]
 u[branch_name] = ‘Pennyridge’}
Relational Calculus (RC) remarks

RC is non-procedural – any way that the predicate P can be evaluated
is valid.

RC is the formal basis for Structured Query Language (SQL)

SQL is the de facto standard language for all RDBMSs

In terms of functionality (i.e. the power to get some information from any
DB) RA and RC are equivalent). Namely, any query that can be written in
RC has an equivalent RA formula, and vice versa.

Advantage of RC (over RA): conceptually, it is better to allow the user to
define the logic of the query, but leave the procedure for computing it
to the program [why ?].
Bank tables..
BRANCH                                  EMPLOYEE
branch_name   city         assets       e_ssn         e_name   tel       start_date   mgr_ssn
Downtown      Brooklyn        9000000   111-22-3333   Jones      12345    Nov-2005    321-32-4321
Redwood       Palo Alto       2100000   333-11-4444   Smith      54321    Mar-1998    111-22-3333
Pennyridge    Horseneck       1700000   123-45-6789   Lee        54321    Mar-1998    111-22-3333
Mianus        Horseneck        400000   555-66-8888   Turner     55555    Aug-2002    321-32-4321
Round Hill    Horseneck       8000000   987-65-4321   Jones      87621    Mar-1995    888-99-9999
Pownal        Bennington       300000   888-99-9999   Chan       87654    Feb-1980    777-77-7777
North Town    Rye             3700000   321-32-4321   Adams      77777    Feb-1990    777-77-7777
Brighton      Brooklyn        7100000   777-77-7777   Black      99111    Jan-1980              null
CUSTOMER                                                                   DEPOSIT
ssn           name       street       city         banker         b_type   c_ssn         ac_num    accessDate
111-12-0000   Jones      Main         Harrison      321-32-4321     CRM    888-12-0000   A101      Jan 1, 09
222-12-0000   Smith      North        Rye           321-32-4321     CRM    222-12-0000   A215      Feb 1, 09
333-12-0000   Hayes      Main         Harrison      321-32-4321     CRM    333-12-0000   A102      Feb 28, 09
444-12-0000   Curry      North        Rye           333-11-4444      LO    555-00-0000   A305      Mar 10, 09
555-12-0000   Turner     Putnam       Stamford      888-99-9999      DO    888-12-0000   A201      Mar 1, 98
666-12-0000   Williams   Nassau       Princeton     333-11-4444      LO    111-12-0000   A217      Mar 1, 09
777-12-0000   Adams      Spring       Pittsfield    123-45-6789      LO    000-12-0000   A101      Feb 25, 09
888-12-0000   Johnson    Alma         Palo Alto     888-99-9999      DO
999-12-0000   Brooks     Senator      Brooklyn      123-45-6789      LO             BORROWS
000-12-0000   Lindsay    Park         Pittsfield    888-99-9999      DO              customer      loan_no
111-12-0000   L17
LOAN
loan_number   amount     branch_name                                                 222-12-0000   L23

L17           1000       Downtown                                                    333-12-0000   L15

L23           2000       Redwood                                                     444-00-0000   L93

L15           1500       Pennyridge                                                  666-12-0000   L17

L93           500        Mianus                                                      111-12-0000   L11

L11           900        Round Hill                                                  999-12-0000   L17

L16           1300       Pennyridge                                                  777-12-0000   L16

Silberschatz, Korth, Sudarshan, Database Systems Concepts, McGraw Hill

Next: SQL and DB API’s

```
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
 views: 12 posted: 9/28/2011 language: English pages: 27