Date on Database Writings

							 Critique of Relational Database Models


                   Why relational?
      Relational, network and CODASYL DBs
            Advantages of RDBs classified


1/18/2011               1            CS319 Theory of Databases
    Orientation / schedule for module 2005
Wk 1-2         Generalities on databases        3
Wk 2-6         Relational database theory      13
Wk 7           Evaluating relational databases  3
Wk 8           SQL and object-relational DBs    4
Wk 9           Temporal Relational Databases    4
Wk 10          Reflection on DBs                3

Hugh Darwen in weeks 8 and 9
  Week 8 - Monday 2pm + 5pm, Thursday 2pm + 5pm
  Week 9 - Monday 2pm + 5pm, Thursday 2pm + 5pm


 Why relational?                            C.J. Date

Relational Database Writings 1985-1989

Purpose of the paper ...

... a succinct and reasonably comprehensive summary of
    the main advantages of the relational approach

… concerned with technical not business advantages

… to evaluate relational models in DBs fully we must
 also consider the most fundamental issues


   The agenda for reading Why Relational?

Where is Date coming from? What is his bias?


How do we classify Date's perceived virtues of relational
models? Some virtues differ in nature from others ...

To what extent are the qualities of relational databases
  fundamentally to do with relations?

What is the future for databases as a concept?



    Orientation on the issues raised by Date

Paper has a rationale behind it - to defend relational
models from emerging new technologies (c. 1989)

Date has a long history as a relational DB champion

Even the initial claim of the paper is contested (by 1989)

First and primary advantage of RDB model: simplicity

Issue: are SQL and ORACLE simple ... ?
       … but with what is it being compared?
    Context: candidate abstract data models
3 classical models:

hierarchical
   e.g. Information Management System (IMS)
   developed late 1960s for Apollo mission

network
  Conference on Data Systems Languages
  CODASYL : standardised COBOL
  CODASYL : Database Task Group (DBTG)
  Official CODASYL reports 1971-1978
Context: candidate abstract data models (cont.)
3 classical models:

hierarchical, network, ...

relational
   proposed by E.F. Codd in 1970
   E.F. Codd was at IBM San Jose RL
   Examples:
        System R [Sequel -> SQL],
        Ingres [Quel], QBE, PRTV [ISBL]
   Commercial Relational Systems in 1980s

          Context: Other Candidate Models
Clear that relational databases are good for many
  commercial enterprises involved in data processing

What about other applications? Do they need different models?

• interactive design
  human interaction & intervention essential in design
• real-time applications
  need fast response, no encoding overheads
• integrated project support environments
  need to store pieces of code, diagrams etc.

          Context: Other Candidate Models

Possible alternative approaches


Extensions to relational e.g. deductive dbs
  Datalog (proper subset of Prolog)
  logic language
  cf. Kowalski Logic for Problem Solving


object-oriented databases
  application of OOP to DBs dates from late 1980s:
  e.g. Orion, Kim, Cactis, Gemstone, O2, Iris

           Putting Date's view in context ....

• is Date biased?
  list of advantages could go on for ever, or at least for
  a very long time (p3)
  anywhere from 5-fold to 20-fold increases in
  productivity (p5) cf. quotes from other sources ...
  tables are sufficient, in the sense that there is no
  known data that cannot be represented in tabular
  form (p5) (what about "the Mona Lisa", or "the sound
  of the last act of Marriage of Figaro"?)
 Useful to put Date's view in historical context
Brief history establishes the historical context ….

CODASYL databases on the network model 1
Network model for the HVFC (Happy Valley Food Coop):
  MEMBERS (NAME, ADDRESS, BALANCE)
  ORDERS (ORDER_NO, NAME, ITEM, QUANTITY)
  SUPPLIERS (SNAME, SADDR, ITEM, PRICE)
Develop an entity-relationship diagram … have two
many-many relationships
  SUPPLIES (SUPPLIERS, ITEMS)
  ORDERS (MEMBERS, ITEMS)

Principle of querying in a CODASYL model
    • replace many-many relationships by functions
    • navigate around sets of records via functions
CODASYL databases on the network model 2

A many-many relationship between X and Y can be expressed as
a⁻¹ . b where a: R → X & b: R → Y are many-one functions


Example: to factorise many-many relationship in HVFC
                ORDERS (MEMBERS, ITEMS)


Introduce a set of records to represent ORDERS

Typical record is (m_name, i_name, quantity)


CODASYL databases on the network model 3
A many-many relationship between X and Y can be expressed as
a⁻¹ . b where a: R → X & b: R → Y are many-one functions
Factorise ORDERS into two projection maps:
  MEMBORD : ORDERS → MEMBERS
  ITEMORD : ORDERS → ITEMS
where
  MEMBORD (m_name, i_name, quantity) = m_name
  ITEMORD (m_name, i_name, quantity) = i_name
Represent many-many ORDERS relationship by
               MEMBORD⁻¹ . ITEMORD
by combining the two projections thus:
          MEMBERS ← ORDERS → ITEMS
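
A minimal Python sketch of this factorisation may help. It assumes a few
invented ORDERS records (the names and quantities are illustrative only), and
the helper names membord_inverse, items_ordered_by and orders_relationship are
hypothetical. It shows that following the inverse of MEMBORD and then ITEMORD
recovers the many-many relationship between MEMBERS and ITEMS.

```python
# Hypothetical sample data: ORDERS records of the form (m_name, i_name, quantity).
ORDERS = [
    ("Brooks", "Granola", 5),
    ("Brooks", "Milk", 2),
    ("Field", "Granola", 1),
]


def membord(order):
    """Projection MEMBORD : ORDERS -> MEMBERS (a many-one function)."""
    return order[0]


def itemord(order):
    """Projection ITEMORD : ORDERS -> ITEMS (a many-one function)."""
    return order[1]


def membord_inverse(m_name):
    """Inverse of MEMBORD: every ORDERS record that maps to this member."""
    return [o for o in ORDERS if membord(o) == m_name]


def items_ordered_by(m_name):
    """Inverse of MEMBORD followed by ITEMORD: member -> orders -> items."""
    return {itemord(o) for o in membord_inverse(m_name)}


def orders_relationship():
    """The many-many relationship ORDERS (MEMBERS, ITEMS) as a set of pairs."""
    return {(membord(o), itemord(o)) for o in ORDERS}


print(items_ordered_by("Brooks"))
print(orders_relationship())
```

Each ORDERS record belongs to exactly one member and exactly one item, which
is why two many-one functions suffice to carry a many-many relationship.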
                A Sample CODASYL query
     "Find how much Granola Brooks has ordered"
NAME := "Brooks"
FIND MEMBERS RECORD USING CALC-KEY
LOOP: repeat forever
   FIND NEXT ORDERS RECORD IN CURRENT MEMBORD SET
   if FAIL then break LOOP
   FIND OWNER OF CURRENT ITEMORD SET
   GET ITEMS; INAME
   if ITEMS.INAME = "Granola" then do
         FIND CURRENT OF ORDERS RECORD
         GET ORDERS; QUANTITY
         print QUANTITY
         break LOOP
   end
end LOOP
     Commentary on the CODASYL query 1
NAME := "Brooks"
> find the MEMBERS record associated with Brooks
> assume stored by CALC-KEY (hash-code) NAME
FIND MEMBERS RECORD USING CALC-KEY
LOOP: repeat forever
    FIND NEXT ORDERS RECORD
    IN CURRENT MEMBORD SET
    > traverse link  MEMBORD: ORDERS → MEMBERS
    > current MEMBERS record is Brooks’s
    >  link to his orders
    if FAIL then break LOOP
    FIND OWNER OF CURRENT ITEMORD SET
    > apply link  ITEMORD: ORDERS → ITEMS
    > to determine what item was ordered

   Commentary on the CODASYL query (cont.)
  ...
  > apply link  ITEMORD: ORDERS → ITEMS
  > to determine what item was ordered
  GET ITEMS; INAME
  > access name of the item ordered
  if ITEMS.INAME = "Granola" then do
        > check to see if item ordered is Granola
        FIND CURRENT OF ORDERS RECORD
        > current orders record is order by Brooks of Granola
        GET ORDERS; QUANTITY
        > access quantity of Granola ordered by Brooks
        print QUANTITY
        break LOOP
  end
end LOOP
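
The record-at-a-time navigation described in this commentary can be imitated
in Python. This is only a sketch under stated assumptions: the dictionaries
below are hypothetical stand-ins for the CODASYL record types and set links
(not a real DBTG interface), and the sample data is invented.

```python
# Hypothetical in-memory stand-in for the CODASYL database.
MEMBERS = {"Brooks": {"name": "Brooks"}}  # located by CALC key (hash on NAME)
ITEMS = {"Granola": {"iname": "Granola"}, "Milk": {"iname": "Milk"}}

# Each ORDERS record carries a link to its owner in each of the two sets.
ORDERS = [
    {"membord_owner": "Brooks", "itemord_owner": "Milk", "quantity": 2},
    {"membord_owner": "Brooks", "itemord_owner": "Granola", "quantity": 5},
]


def quantity_ordered(member_name, item_name):
    """Walk the links one record at a time, as the CODASYL program does."""
    # FIND MEMBERS RECORD USING CALC-KEY
    if member_name not in MEMBERS:
        return None
    # FIND NEXT ORDERS RECORD IN CURRENT MEMBORD SET
    for order in ORDERS:
        if order["membord_owner"] != member_name:
            continue
        # FIND OWNER OF CURRENT ITEMORD SET, then GET ITEMS; INAME
        item = ITEMS[order["itemord_owner"]]
        if item["iname"] == item_name:
            # GET ORDERS; QUANTITY
            return order["quantity"]
    # FAIL: no more ORDERS records in the member's set
    return None


print(quantity_ordered("Brooks", "Granola"))
```

The programmer, not the system, decides the access path: the loop, the link
traversals and the termination test are all written by hand.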
         About the CODASYL environment
Issue: are SQL and ORACLE simple ... ?

  “ The sheer range of FIND commands and their almost
  Byzantine intricacy is one of the reasons why DBTG
  databases are programmed by experts … ”


  “ The efficiency of CODASYL implementations for
  performing access and update has been a very large factor
  in their widespread use. This efficiency has been
  purchased at the cost of using a baffling variety of storage
  strategies and DML commands … ”


                      Peter Gray: Logic, Algebra and Databases
      SQL is simple - relative to CODASYL

ORDERS (ORDER_NO, NAME, ITEM, QUANTITY)

"Find how much Granola Brooks has ordered"

  select QUANTITY
  from ORDERS
  where NAME='Brooks' and ITEM='Granola'
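
The query above can be tried out with Python's built-in sqlite3 module. The
sketch below is illustrative only: the column types and the sample rows are
assumptions, and SQLite stands in for whatever relational system is to hand.

```python
import sqlite3

# In-memory database holding the ORDERS relation from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ORDERS ("
    "ORDER_NO integer primary key, NAME text, ITEM text, QUANTITY integer)"
)

# Invented sample rows, for illustration only.
conn.executemany(
    "insert into ORDERS values (?, ?, ?, ?)",
    [
        (1, "Brooks", "Granola", 5),
        (2, "Brooks", "Milk", 2),
        (3, "Field", "Granola", 1),
    ],
)

# "Find how much Granola Brooks has ordered"
rows = conn.execute(
    "select QUANTITY from ORDERS where NAME = ? and ITEM = ?",
    ("Brooks", "Granola"),
).fetchall()
print(rows)
```

The statement says what is wanted, not how to reach it: no loop, no current
record, no links to follow.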

The SQL-CODASYL comparison highlights reason for
Date-Darwen concern about ‘back-to-the-future’ in DBs

                 Why relational? 1
CODASYL is bad, but is relational good?
[ also beware! CODASYL is bad, but is network bad? ]

... first try to understand Date's claims by comparing the
    two models ....

Areas of usefulness for relational model:
  data manipulation
  database design
  database definition
  database installation
  ....

                 Why relational? 2
Advantages of relational technology:

  usability

  productivity

... promotes end-user programming

Evident in relation to the CODASYL alternative!
cf Korth and Silberschatz file system vs DBMS


                Why relational? 3
Perceived advantages of relational DBs:
• simple data structure
• simple operators
• no frivolous distinctions
• SQL support
• the view mechanism
• sound theoretical base
• small number of concepts
• the dual-mode principle
• physical data independence
• logical data independence
                  Why relational? 4
Perceived advantages of relational DBs (cont.):
•   ease of application development
•   dynamic data definition
•   ease of installation and ease of operations
•   simplified database design
•   integrated dictionary
•   distributed database support
•   performance
•   extendability

… all evident in relation to CODASYL comparison
    A brief elaboration of Date's concerns 1
• simple data structure
  table is the basis of the relational model
• simple operators
  5 relational operators for completeness
  set-level operations / closure / declarative
• no frivolous distinctions
  uniform methods of interaction with DB
  e.g. to update a relation, or to impose a constraint
• SQL support
  high-level queries / widespread use, acceptance
• the view mechanism
  means to customise the DB without new concepts
• sound theoretical base
  relational model is mathematically rigorous
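
As an illustration of "set-level operations / closure", here is a small Python
sketch. It assumes that the five operators meant are the usual primitives
(select, project, product, union, difference), which the slide does not list.
The representation of a relation as a set of frozensets is a choice made for
this sketch, and the sample rows are invented.

```python
def relation(rows):
    """Build a relation: a set of rows, each a frozenset of (attribute, value)."""
    return {frozenset(r.items()) for r in rows}


def select(rel, predicate):
    """Keep the rows satisfying predicate (called with each row as a dict)."""
    return {row for row in rel if predicate(dict(row))}


def project(rel, attrs):
    """Keep only the named attributes; duplicate rows collapse automatically."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def product(r, s):
    """Cartesian product; assumes the attribute names of r and s are disjoint."""
    return {a | b for a in r for b in s}


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


ORDERS = relation([
    {"NAME": "Brooks", "ITEM": "Granola", "QUANTITY": 5},
    {"NAME": "Brooks", "ITEM": "Milk", "QUANTITY": 2},
    {"NAME": "Field", "ITEM": "Granola", "QUANTITY": 1},
])

# Closure: every operator returns a relation, so operators compose freely.
granola = select(ORDERS, lambda r: r["ITEM"] == "Granola")
names = project(granola, {"NAME"})
print(sorted(dict(row)["NAME"] for row in names))
```

Each operator takes whole relations and returns a whole relation, with no
notion of a current record to keep track of.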
    A brief elaboration of Date's concerns 2
• small number of concepts
  single mode of representation + uniform update
  cf multi-mode + proliferation of mechanisms
• the dual-mode principle
  embedded DML to access the DB from programs
  autonomous activity resembles user interaction
• physical data independence
  separate conceptual model / physical database
• logical data independence
  separate conceptual model / user views
• ease of application development
  makes application generators possible
  makes high-level prototyping easy
• dynamic data definition
  can modify a relational DB design incrementally
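
Logical data independence and dynamic data definition can be seen together in
a short sketch using Python's built-in sqlite3 module. The view name DEBTORS,
the added PHONE column and the sample row are all hypothetical, chosen only
for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table MEMBERS (NAME text primary key, ADDRESS text, BALANCE real)"
)
# Invented sample row.
conn.execute("insert into MEMBERS values ('Brooks', '7 Apple Rd', -12.5)")

# A user view: one user's window onto the conceptual model.
conn.execute(
    "create view DEBTORS as "
    "select NAME, BALANCE from MEMBERS where BALANCE < 0"
)
before = conn.execute("select * from DEBTORS").fetchall()

# Dynamic data definition: extend the design incrementally, with data in place.
conn.execute("alter table MEMBERS add column PHONE text")

# The existing view still answers as before.
after = conn.execute("select * from DEBTORS").fetchall()
print(before, after)
```

A program written against DEBTORS needs no change when MEMBERS grows a new
column, which is the point of separating user views from the conceptual model.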
    A brief elaboration of Date's concerns 3
• ease of installation and ease of operations
  robust, easy to manage by few personnel
• simplified database design
  have principles for database design
• integrated dictionary
  consistent interface for meta-level access
  metadata-driven programs can be written
• distributed database support
  high semantic content of queries, declarative nature
  cf. problems of breaking up procedural chains
• performance
  down to optimiser, not applications programmer
• extendability
  can easily build on relational database models
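
The "integrated dictionary" point can be illustrated with SQLite, which keeps
its catalogue in an ordinary table (sqlite_master) that is queried like any
other. The sketch below is a small metadata-driven program; the function name
describe is hypothetical and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table MEMBERS (NAME text, ADDRESS text, BALANCE real)")
conn.execute(
    "create table SUPPLIERS (SNAME text, SADDR text, ITEM text, PRICE real)"
)


def describe(connection):
    """Return {table name: [column names]} by querying the catalogue itself."""
    tables = [
        row[0]
        for row in connection.execute(
            "select name from sqlite_master where type = 'table' order by name"
        )
    ]
    # pragma table_info yields one row per column; index 1 is the column name.
    return {
        t: [col[1] for col in connection.execute(f"pragma table_info({t})")]
        for t in tables
    }


catalogue = describe(conn)
print(catalogue)
```

The program knows nothing about MEMBERS or SUPPLIERS in advance; it discovers
the schema through the same query interface used for the data.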
         Date's concerns and CODASYL 1
• simple data structure?
  cf complexity of DBTG sets and pointers
• simple operators?
  no high-level operators, nothing at the set-level
  have to record state, pointers create modes
• no frivolous distinctions?
  complex methods of interaction with DB
  e.g. update relation and impose constraint would be
  dealt with in entirely separate ways
• SQL support?
  no concept of high-level query, yet CODASYL was widespread!
• the view mechanism?
  has no analogue for CODASYL
• sound theoretical base?
  no discernible theory in CODASYL framework
          Date's concerns and CODASYL 2
• small number of concepts?
  multi-mode representation + proliferation of mechanisms
  for select, insert, delete, update
• the dual-mode principle?
  no clear distinction between high-level queries and
  application programmer's mode of access
• physical data independence?
  conceptual model mixed with physical database
• logical data independence?
  no provision for user views
• ease of application development?
  CODASYL doesn't make data access much easier
• dynamic data definition?
  DB design has to be carefully preconceived and can't
  easily be adapted
         Date's concerns and CODASYL 3
• ease of installation and ease of operations?
  CODASYL probably keeps program surgery busy
• simplified database design?
  principles for database design more suspect
• integrated dictionary?
  meta-level issues not addressed within model
• distributed database support?
  who'd like to parallelise CODASYL updates?
• performance?
  was traditionally better than relational models!
• extendability?
  CODASYL not something to be built on ...
Interpreting Date’s defence of relational models
Date’s arguments in defence of relational models are
very powerful when seen in the context of CODASYL

Need to understand them in relation to what might be the
best data modelling practices for today and the future

Important for this purpose to classify the defences:
• defence from theory: is the theory adequate?
• defence from practice: will the practice change?
• special qualities exhibited by the relational model:
  are they particular to RDBs, or generalisable?
 Classifying the advantages cited by Date 1
Will classify Date’s list of advantages into
  THEORY, PRINCIPLES and CONSEQUENCES
and further subdivide
  PRINCIPLES into PRACTICAL & FOUNDATIONAL

THEORY
• simple data structure
• simple operators
• no frivolous distinctions
• sound theoretical base
• small number of concepts
 Classifying the advantages cited by Date 2
PRINCIPLES - PRACTICAL ASPECT
  • SQL support
  • the view mechanism
  • the dual-mode principle
  • physical data independence
  • logical data independence
  • dynamic data definition
PRINCIPLES - FOUNDATIONAL ASPECT
  • simplified database design
  • integrated dictionary: metadata-driven
  • distributed database support: atomicity
 Classifying the advantages cited by Date 3


CONSEQUENCES
• ease of application development
• ease of installation and ease of operations
• performance
• extendability

The status of these advantages is relevant when we
come to consider what is really significant about the
relational model in comparison with other alternatives ...


… will return to express personal views concerning the
 defence of the relational position later … turn next to
 the issue of ‘Why not Relational?’

   whynotrel.ppt




What are the virtues of the relational model? 1
Certain features of relational models we wish to retain ...

The defence from theory …
• simple data structure
  want elegant and consistent structures
• simple operators
  want high-level operators
  need techniques at the set-level
  don't want to have to record state
  don't want to maintain pointers
What are the virtues of the relational model? 2
Certain features of relational models we wish to retain ...

The defence from theory …
• small number of concepts
  want a unified view for representation
  uniform ways to manipulate
• sound theoretical base
  want to be able to apply mathematical techniques

... but all these attributes apply to Miranda, for example,
    and this hasn't made it widely / wildly successful
What are the virtues of the relational model? 3
The defence from practice ...
• SQL support
    need concept of high-level query
• the view mechanism
    must be able to represent different user views
• the dual-mode principle
    invoking user commands automatically is a powerful
    principle for program development and debugging
• physical and logical data independence
    must be possible to separate concerns
    at high and low levels of abstraction
... but do these qualities fit into a general scheme or are
    they specific to the relational framework?
What are the virtues of the relational model? 4
Evidence of special suitability for real-world modelling ...

• simplified database design
  have principles for database design
  contrast the messiness of CODASYL

• integrated dictionary
  can write metadata-driven programs
  no chance to take high-level view in CODASYL

• distributed database support
  high semantic content of queries, atomicity of action
  queries in CODASYL not much about the real-world
What are the virtues of the relational model? 5
Evidence of special suitability for real-world modelling ...

... database design reveals very direct connections
    between dependencies amongst attributes of real-
    world objects and forms for their representation in
    relation schemes

content = real-world meaning
      dictating form = structure of the representation

Fundamental conflict between theory and practice
over the relationship between form and content
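
One way to see the link between real-world dependencies and relation schemes
is to test functional dependencies on sample data. The Python sketch below is
illustrative only: the function names holds and project are hypothetical, and
the SUPPLIERS rows are invented, on the assumption that each supplier has a
single address.

```python
def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first sighting of a key records its value; later ones must agree.
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, attrs):
    """Projection onto attrs, as a sorted list of distinct tuples."""
    return sorted({tuple(r[a] for a in attrs) for r in rows})


# Invented sample data for SUPPLIERS (SNAME, SADDR, ITEM, PRICE).
SUPPLIERS = [
    {"SNAME": "Sunshine", "SADDR": "16 River St", "ITEM": "Granola", "PRICE": 1.29},
    {"SNAME": "Sunshine", "SADDR": "16 River St", "ITEM": "Lettuce", "PRICE": 0.89},
    {"SNAME": "Purity", "SADDR": "180 Industrial Rd", "ITEM": "Granola", "PRICE": 1.25},
]

print(holds(SUPPLIERS, ["SNAME"], ["SADDR"]))          # content: one address each
print(holds(SUPPLIERS, ["SNAME"], ["PRICE"]))          # price depends on item too
print(holds(SUPPLIERS, ["SNAME", "ITEM"], ["PRICE"]))

# Form follows content: the dependency SNAME -> SADDR suggests a separate scheme.
supplier_addresses = project(SUPPLIERS, ["SNAME", "SADDR"])
print(supplier_addresses)
```

Here a fact about the world (a supplier has one address) dictates the shape of
the representation (addresses stored once per supplier, not once per item).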
What are the virtues of the relational model? 6
Important aspects of relational DBs (in WMB’s view)

THEORY aspect       underlying algebraic model

• provides basis for unambiguous evaluation
• closure properties
• potential for optimisation & axiomatisation

PRINCIPLES represented in
  views + application generators + spreadsheets


What are the virtues of the relational model? 7
Important aspects of relational DBs (in WMB’s view)


PRACTICAL aspect

• involve state essentially, so not purely declarative
• good for expressing agent actions / views
• good for representing levels of abstraction
cf ACE & A Small Matter of Programming, Bonnie Nardi

Represents a framework for managing state that is cleaner
than procedural programming and more expressive than FP
What are the virtues of the relational model? 8

Important aspects of relational DBs (in WMB’s view)


FOUNDATIONAL aspect
• concerned with metaphor not symbolic representation
• invokes form and content in combination

Notes on these respective issues
• metaphor: the form reflects the content
  [as is true to some degree of relational models]
• cf logicism debate in AI:
  A Critique of Pure Reason McDermott et seq
        Issues for database development 1
How to avoid "back to the future"?
• need theoretical foundation
• need qualities of declarative query
• need principles to handle abstraction at many levels:
  data independence
• need to support interaction of agents at high levels of
  abstraction
• need to retain / replace the form-content relationships
  that relational DB design theory introduces

        Issues for database development 2
Modern database demands
• enormous volumes of data
• high performance e.g. for multi-media, real-time
• support for metaphor e.g. visual image not table
• concurrent access, distributed data
• closer integration between direct (human) and
  programmed (computer) data access
• support for modern data abstractions: objects,
  inheritance, aggregation
• applicability to design environment needs:
  incremental intensional change

						