Date on Database Writings
Critique of Relational Database Models
Why relational?
Relational, network and CODASYL DBs
Advantages of RDBs classified
1/18/2011 1 CS319 Theory of Databases
Orientation / schedule for module 2005
Wk 1-2 Generalities on databases 3
Wk 2-6 Relational database theory 13
Wk 7 Evaluating relational databases 3
Wk 8 SQL and object-relational DBs 4
Wk 9 Temporal Relational Databases 4
Wk 10 Reflection on DBs 3
Hugh Darwen in weeks 8 and 9
Week 8 - Monday 2pm + 5pm, Thursday 2pm + 5pm
Week 9 - Monday 2pm + 5pm, Thursday 2pm + 5pm
Why relational? C.J. Date
Relational Database Writings 1985-1989
Purpose of the paper ...
... a succinct and reasonably comprehensive summary of
the main advantages of the relational approach
… concerned with technical not business advantages
… to evaluate relational models in DBs fully we must
also consider the most fundamental issues
The agenda for reading Why Relational?
Where is Date coming from? What is his bias?
How do we classify Date's perceived virtues of relational
models? Some virtues differ in nature from others ...
To what extent are the qualities of relational databases
fundamentally to do with relations?
What is the future for databases as a concept?
Orientation on the issues raised by Date
Paper has a rationale behind it - to defend relational
models from emerging new technologies (c. 1989)
Date has a long history as a relational DB champion
Even the initial claim of the paper is contested (by 1989)
First and primary advantage of RDB model: simplicity
Issue: are SQL and ORACLE simple ... ?
… but with what is it being compared?
Context: candidate abstract data models
3 classical models:
hierarchical
e.g. Information Management System (IMS)
developed late 1960s for Apollo mission
network
Conference on Data Systems Languages
CODASYL : standardised COBOL
CODASYL : Database Task Group (DBTG)
Official CODASYL reports 1971-1978
Context: candidate abstract data models (cont.)
3 classical models:
hierarchical, network, ...
relational
proposed by E.F. Codd in 1970
E.F. Codd was at IBM San Jose RL
Examples:
System R [Sequel -> SQL],
Ingres [Quel], QBE, PRTV [ISBL]
Commercial Relational Systems in 1980s
Context: Other Candidate Models
Clear that relational databases are good for many
commercial enterprises involved in data processing
What about other applications? need different models?
• interactive design
human interaction & intervention essential in design
• real-time applications
need fast response, no encoding overheads
• integrated project support environments
need to store pieces of code, diagrams etc.
Context: Other Candidate Models
Possible alternative approaches
Extensions to relational e.g. deductive dbs
Datalog (proper subset of Prolog)
logic language
cf. Kowalski Logic for Problem Solving
object-oriented databases
application of OOP to DBs dates from late 1980s:
e.g. Orion, Kim, Cactis, Gemstone, O2, Iris
Putting Date's view in context ....
• is Date biased?
list of advantages could go on for ever, or at least for
a very long time (p3)
anywhere from 5-fold to 20-fold increases in
productivity (p5) cf. quotes from other sources ...
tables are sufficient, in the sense that there is no
known data that cannot be represented in tabular
form (p5) (what about "the Mona Lisa", or "the sound
of the last act of Marriage of Figaro"?)
Useful to put Date's view in historical context
Brief history establishes the historical context ….
CODASYL databases on the network model
Outline of a network model for the HVFC
MEMBERS (NAME, ADDRESS, BALANCE)
ORDERS (ORDER_NO, NAME, ITEM, QUANTITY)
SUPPLIERS (SNAME, SADDR, ITEM, PRICE)
Develop an entity-relationship diagram ...
CODASYL databases on the network model 1
Network model for the HVFC:
MEMBERS (NAME, ADDRESS, BALANCE)
ORDERS (ORDER_NO, NAME, ITEM, QUANTITY)
SUPPLIERS (SNAME, SADDR, ITEM, PRICE)
Develop an entity-relationship diagram … have two
many-many relationships
SUPPLIES (SUPPLIERS, ITEMS)
ORDERS (MEMBERS, ITEMS)
Principle of querying in a CODASYL model
• replace many-many relationships by functions
• navigate around sets of records via functions
CODASYL databases on the network model 2
A many-many relationship between X and Y can be expressed as
a⁻¹ . b where a: R → X & b: R → Y are many-one functions
Example: to factorise many-many relationship in HVFC
ORDERS (MEMBERS, ITEMS)
Introduce a set of records to represent ORDERS
Typical record is (m_name, i_name, quantity)
CODASYL databases on the network model 3
A many-many relationship between X and Y can be expressed as
a⁻¹ . b where a: R → X & b: R → Y are many-one functions
Factorise ORDERS into two projection maps:
MEMBORD : ORDERS → MEMBERS
ITEMORD : ORDERS → ITEMS
where
MEMBORD (m_name, i_name, quantity) = m_name
ITEMORD (m_name, i_name, quantity) = i_name
Represent many-many ORDERS relationship by
MEMBORD⁻¹ . ITEMORD
by combining the two projections thus:
MEMBERS ← ORDERS → ITEMS
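The factorisation can be sketched in Python, as a minimal illustration only. The sample ORDERS records and the helper names `items_ordered_by` and `many_many` are assumptions made for this example, not part of the HVFC material. Each ORDERS record is mapped to exactly one member and exactly one item, and the many-many relationship is recovered by taking the preimage under MEMBORD and then applying ITEMORD.

```python
# Hypothetical ORDERS records of the form (m_name, i_name, quantity).
ORDERS = [
    ("Brooks", "Granola", 5),
    ("Brooks", "Unbleached Flour", 10),
    ("Robin", "Granola", 3),
]


def membord(order):
    """Many-one projection MEMBORD : ORDERS -> MEMBERS."""
    return order[0]


def itemord(order):
    """Many-one projection ITEMORD : ORDERS -> ITEMS."""
    return order[1]


def items_ordered_by(member):
    """Apply MEMBORD^-1 and then ITEMORD: one member maps to a set of items."""
    preimage = [o for o in ORDERS if membord(o) == member]  # MEMBORD^-1
    return {itemord(o) for o in preimage}


def many_many():
    """The many-many ORDERS relationship as a set of (member, item) pairs."""
    return {(membord(o), itemord(o)) for o in ORDERS}


print(sorted(items_ordered_by("Brooks")))
print(sorted(many_many()))
```

Each projection is a function (one result per order), while their combination is a genuine many-many relation: Brooks is paired with two items, and Granola with two members.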
A Sample CODASYL query
"Find how much Granola Brooks has ordered"
NAME := "Brooks"
FIND MEMBERS RECORD USING CALC-KEY
LOOP: repeat forever
    FIND NEXT ORDERS RECORD IN CURRENT MEMBORD SET
    if FAIL then break LOOP
    FIND OWNER OF CURRENT ITEMORD SET
    GET ITEMS; INAME
    if ITEMS.INAME = "Granola" then do
        FIND CURRENT OF ORDERS RECORD
        GET ORDERS; QUANTITY
        print QUANTITY
        break LOOP
    end
end LOOP
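The navigational style of this query can be imitated in Python. This is a rough sketch under stated assumptions, not a CODASYL implementation: the dictionaries, the sample data and the function name `quantity_ordered` are all hypothetical, with a dictionary lookup standing in for CALC-key access and explicit lists standing in for set occurrences.

```python
# Hypothetical in-memory stand-ins for CODASYL record types.
members = {"Brooks": {"NAME": "Brooks"}, "Robin": {"NAME": "Robin"}}
items = {"Granola": {"INAME": "Granola"}, "Lettuce": {"INAME": "Lettuce"}}

# Each ORDERS record holds a link to its owner in the ITEMORD set.
orders = [
    {"QUANTITY": 2, "itemord_owner": items["Lettuce"]},
    {"QUANTITY": 5, "itemord_owner": items["Granola"]},
    {"QUANTITY": 3, "itemord_owner": items["Granola"]},
]

# MEMBORD set occurrences: each member owns a chain of ORDERS records.
membord_set = {"Brooks": [orders[0], orders[1]], "Robin": [orders[2]]}


def quantity_ordered(member_name, item_name):
    """Navigate member -> orders -> item, one record at a time."""
    member = members.get(member_name)      # FIND MEMBERS RECORD USING CALC-KEY
    if member is None:
        return None
    # FIND NEXT ORDERS RECORD IN CURRENT MEMBORD SET, repeated until FAIL
    for order in membord_set.get(member["NAME"], []):
        item = order["itemord_owner"]      # FIND OWNER OF CURRENT ITEMORD SET
        if item["INAME"] == item_name:     # GET ITEMS; INAME
            return order["QUANTITY"]       # GET ORDERS; QUANTITY
    return None                            # loop ended on FAIL


print(quantity_ordered("Brooks", "Granola"))
```

Even in this simplified form the program has to know which links exist and in which order to follow them, which is the point of the comparison with the declarative query.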
Commentary on the CODASYL query 1
NAME := "Brooks"
> find the MEMBERS record associated with Brooks
> assume stored by CALC-KEY (hash-code) NAME
FIND MEMBERS RECORD USING CALC-KEY
LOOP: repeat forever
    FIND NEXT ORDERS RECORD
        IN CURRENT MEMBORD SET
    > traverse link MEMBORD: ORDERS → MEMBERS
    > current MEMBERS record is Brooks’s
    > link to his orders
    if FAIL then break LOOP
    FIND OWNER OF CURRENT ITEMORD SET
    > apply link ITEMORD: ORDERS → ITEMS
    > to determine what item was ordered
Commentary on the CODASYL query (cont.)
    ...
    > apply link ITEMORD: ORDERS → ITEMS
    > to determine what item was ordered
    GET ITEMS; INAME
    > access name of the item ordered
    if ITEMS.INAME = "Granola" then do
        > check to see if item ordered is Granola
        FIND CURRENT OF ORDERS RECORD
        > current orders record is order by Brooks of Granola
        GET ORDERS; QUANTITY
        > access quantity of Granola ordered by Brooks
        print QUANTITY
        break LOOP
    end
end LOOP
About the CODASYL environment
Issue: are SQL and ORACLE simple ... ?
“The sheer range of FIND commands and their almost
Byzantine intricacy is one of the reasons why DBTG
databases are programmed by experts …”
“The efficiency of CODASYL implementations for
performing access and update has been a very large factor
in their widespread use. This efficiency has been
purchased at the cost of using a baffling variety of storage
strategies and DML commands …”
Peter Gray: Logic, Algebra and Databases
SQL is simple - relative to CODASYL
ORDERS (ORDER_NO, NAME, ITEM, QUANTITY)
"Find how much Granola Brooks has ordered"
select QUANTITY
from ORDERS
where NAME='Brooks' and ITEM='Granola'
The SQL-CODASYL comparison highlights reason for
Date-Darwen concern about ‘back-to-the-future’ in DBs
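For comparison, the SQL query can be run as written against a small in-memory SQLite database from Python. This is a sketch only: the column types and the sample rows are assumptions made for illustration, and SQLite is simply a convenient stand-in for any relational system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS ("
    "ORDER_NO INTEGER PRIMARY KEY, NAME TEXT, ITEM TEXT, QUANTITY INTEGER)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?, ?)",
    [
        (1, "Brooks", "Granola", 5),
        (2, "Brooks", "Lettuce", 2),
        (3, "Robin", "Granola", 3),
    ],
)

# The query states what is wanted; no navigation or loop appears in it.
rows = conn.execute(
    "SELECT QUANTITY FROM ORDERS WHERE NAME = ? AND ITEM = ?",
    ("Brooks", "Granola"),
).fetchall()
print(rows)
```

The result is itself a table (here one row with one column), with no currency pointers or record-at-a-time state for the programmer to manage.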
Why relational? 1
CODASYL is bad, but is relational good?
[ also beware! CODASYL is bad, but is network bad? ]
... first try to understand Date's claims by comparing the
two models ....
Areas of usefulness for relational model:
data manipulation
database design
database definition
database installation
....
Why relational? 2
Advantages of relational technology:
usability
productivity
... promotes end-user programming
Evident in relation to the CODASYL alternative!
cf Korth and Silberschatz file system vs DBMS
Why relational? 3
Perceived advantages of relational DBs:
• simple data structure
• simple operators
• no frivolous distinctions
• SQL support
• the view mechanism
• sound theoretical base
• small number of concepts
• the dual-mode principle
• physical data independence
• logical data independence
Why relational? 4
Perceived advantages of relational DBs (cont.):
• ease of application development
• dynamic data definition
• ease of installation and ease of operations
• simplified database design
• integrated dictionary
• distributed database support
• performance
• extendability
… all evident in relation to CODASYL comparison
A brief elaboration of Date's concerns 1
• simple data structure
table is the basis of the relational model
• simple operators
5 relational operators for completeness
set-level operations / closure / declarative
• no frivolous distinctions
uniform methods of interaction with DB
e.g. for update relation, or impose constraint
• SQL support
high-level queries / widespread use, acceptance
• the view mechanism
means to customise the DB without new concepts
• sound theoretical base
relational model is mathematically rigorous
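Two of these points, closure and the view mechanism, can be illustrated with a short SQLite sketch in Python. The view name `GRANOLA_ORDERS`, the helper `large_granola_orders` and the sample rows are assumptions for this example. The view is defined by a query, and because a query result is itself a table, the view can be queried with the same operators as a base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ORDERS (
        ORDER_NO INTEGER PRIMARY KEY, NAME TEXT, ITEM TEXT, QUANTITY INTEGER
    );
    INSERT INTO ORDERS VALUES (1, 'Brooks', 'Granola', 5);
    INSERT INTO ORDERS VALUES (2, 'Brooks', 'Lettuce', 2);
    INSERT INTO ORDERS VALUES (3, 'Robin', 'Granola', 3);

    -- Hypothetical view: a customised table defined purely by a query.
    CREATE VIEW GRANOLA_ORDERS AS
        SELECT NAME, QUANTITY FROM ORDERS WHERE ITEM = 'Granola';
    """
)


def large_granola_orders():
    """Query the view exactly as if it were a base table (closure)."""
    return conn.execute(
        "SELECT NAME FROM GRANOLA_ORDERS WHERE QUANTITY > 4 ORDER BY NAME"
    ).fetchall()


first = large_granola_orders()
# The view is not a copy: a later change to ORDERS shows through it.
conn.execute("INSERT INTO ORDERS VALUES (4, 'Hart', 'Granola', 7)")
second = large_granola_orders()
print(first, second)
```

No new concept is needed to use the view: the same `SELECT` applies to it as to `ORDERS`.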
A brief elaboration of Date's concerns 2
• small number of concepts
single mode of representation + uniform update
cf multi-mode + proliferation of mechanisms
• the dual-mode principle
embedded DML to access the DB from programs
autonomous activity resembles user interaction
• physical data independence
separate conceptual model / physical database
• logical data independence
separate conceptual model / user views
• ease of application development
makes application generators possible
makes high-level prototyping easy
• dynamic data definition
can modify a relational DB design incrementally
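Physical data independence and dynamic data definition can be sketched with SQLite from Python. The index name `MEMBERS_BY_NAME`, the added `PHONE` column and the sample rows are hypothetical, chosen only for this illustration. An existing query keeps returning the same answer after a storage-level change (a new index) and after an incremental change to the design (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MEMBERS (NAME TEXT, ADDRESS TEXT, BALANCE REAL);
    INSERT INTO MEMBERS VALUES ('Brooks', '7 Apple Rd', 10.0);
    INSERT INTO MEMBERS VALUES ('Robin', '12 Heather St', -4.5);
    """
)

query = "SELECT NAME, BALANCE FROM MEMBERS WHERE NAME = 'Brooks'"
before = conn.execute(query).fetchall()

# Physical change: add an access path. The query text is untouched.
conn.execute("CREATE INDEX MEMBERS_BY_NAME ON MEMBERS (NAME)")

# Dynamic data definition: extend the design while the database is in use.
conn.execute("ALTER TABLE MEMBERS ADD COLUMN PHONE TEXT")

after = conn.execute(query).fetchall()
phones = conn.execute("SELECT PHONE FROM MEMBERS").fetchall()
print(before, after, phones)
```

Existing rows simply show a null `PHONE`, and the program issuing `query` needs no modification.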
A brief elaboration of Date's concerns 3
• ease of installation and ease of operations
robust, easy to manage by few personnel
• simplified database design
have principles for database design
• integrated dictionary
consistent interface for meta-level access
metadata-driven programs can be written
• distributed database support
high semantic content of queries, declarative nature
cf. problems of breaking up procedural chains
• performance
down to optimiser, not applications programmer
• extendability
can easily build on relational database models
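The integrated dictionary point can be made concrete with a small metadata-driven program. This sketch assumes SQLite, whose catalogue is exposed through the `sqlite_master` table and `PRAGMA table_info`; other systems expose their dictionaries differently, and the sample tables and rows are hypothetical. The program discovers the tables and columns by querying the database itself, so it carries no schema knowledge of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MEMBERS (NAME TEXT, ADDRESS TEXT, BALANCE REAL);
    CREATE TABLE ORDERS (ORDER_NO INTEGER, NAME TEXT, ITEM TEXT, QUANTITY INTEGER);
    INSERT INTO MEMBERS VALUES ('Brooks', '7 Apple Rd', 10.0);
    INSERT INTO ORDERS VALUES (1, 'Brooks', 'Granola', 5);
    INSERT INTO ORDERS VALUES (2, 'Brooks', 'Lettuce', 2);
    """
)

# The dictionary is queried with ordinary SELECT, like any other table.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info yields one row per column; index 1 is the column name.
schema = {
    t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
    for t in tables
}

# A generic report driven entirely by the metadata gathered above.
row_counts = {
    t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables
}
print(schema)
print(row_counts)
```

Table names are interpolated into SQL here only because they come from the catalogue itself; names from untrusted input would need validating first.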
Date's concerns and CODASYL 1
• simple data structure?
cf complexity of DBTG sets and pointers
• simple operators?
no high-level operators, nothing at the set-level
have to record state, pointers create modes
• no frivolous distinctions?
complex methods of interaction with DB
e.g. update relation and impose constraint would be
dealt with in entirely separate ways
• SQL support?
no concept of high-level query, yet was in widespread use!
• the view mechanism?
has no analogue for CODASYL
• sound theoretical base?
no discernible theory in CODASYL framework
Date's concerns and CODASYL 2
• small number of concepts?
multi-mode + proliferation of mechanisms
for representation and for ways to select, insert, delete, update
• the dual-mode principle?
no clear distinction between high-level queries and
application programmer's mode of access
• physical data independence?
conceptual model mixed with physical database
• logical data independence?
no provision for user views
• ease of application development?
CODASYL doesn't make data access much easier
• dynamic data definition?
DB design has to be carefully preconceived and can't
easily be adapted
Date's concerns and CODASYL 3
• ease of installation and ease of operations?
CODASYL probably keeps program surgery busy
• simplified database design?
principles for database design more suspect
• integrated dictionary?
meta-level issues not addressed within model
• distributed database support?
who'd like to parallelise CODASYL updates?
• performance?
was traditionally better than relational models!
• extendability?
CODASYL not something to be built on ...
Interpreting Date’s defence of relational models
Date’s arguments in defence of relational models are
very powerful when seen in the context of CODASYL
Need to understand them in relation to what might be the
best data modelling practices for today and the future
Important for this purpose to classify the defences:
• defence from theory: is the theory adequate?
• defence from practice: will the practice change?
• special qualities exhibited by the relational model:
are they particular to RDBs, or generalisable?
Classifying the advantages cited by Date 1
Will classify Date’s list of advantages into
THEORY, PRINCIPLES and CONSEQUENCES
and further subdivide
PRINCIPLES into PRACTICAL & FOUNDATIONAL
THEORY
• simple data structure
• simple operators
• no frivolous distinctions
• sound theoretical base
• small number of concepts
Classifying the advantages cited by Date 2
PRINCIPLES - PRACTICAL ASPECT
• SQL support
• the view mechanism
• the dual-mode principle
• physical data independence
• logical data independence
• dynamic data definition
PRINCIPLES - FOUNDATIONAL ASPECT
• simplified database design
• integrated dictionary: metadata-driven
• distributed database support: atomicity
Classifying the advantages cited by Date 3
CONSEQUENCES
• ease of application development
• ease of installation and ease of operations
• performance
• extendability
The status of these advantages is relevant when we
come to consider what is really significant about the
relational model in comparison with other alternatives ...
… will return to express personal views concerning the
defence of the relational position later … turn next to
the issue of ‘Why not Relational?’
whynotrel.ppt
What are the virtues of the relational model? 1
Certain features of relational models we wish to retain ...
The defence from theory …
• simple data structure
want elegant and consistent structures
• simple operators
want high-level operators
need techniques at the set-level
don't want to have to record state
don't want to maintain pointers
What are the virtues of the relational model? 2
Certain features of relational models we wish to retain ...
The defence from theory …
• small number of concepts
want a unified view for representation
uniform ways to manipulate
• sound theoretical base
want to be able to apply mathematical techniques
... but all these attributes apply to Miranda, for example,
and this hasn't made it widely / wildly successful
What are the virtues of the relational model? 3
The defence from practice ...
• SQL support
need concept of high-level query
• the view mechanism
must be able to represent different user views
• the dual-mode principle
invoking user commands automatically is a powerful
principle for program development and debugging
• physical and logical data independence
must be possible to separate concerns
at high and low levels of abstraction
... but do these qualities fit into a general scheme or are
they specific to the relational framework?
What are the virtues of the relational model? 4
Evidence of special suitability for real-world modelling ...
• simplified database design
have principles for database design
contrast the messiness of CODASYL
• integrated dictionary
can write metadata-driven programs
no chance to take high-level view in CODASYL
• distributed database support
high semantic content of queries, atomicity of action
queries in CODASYL not much about the real-world
What are the virtues of the relational model? 5
Evidence of special suitability for real-world modelling ...
... database design reveals very direct connections
between dependencies amongst attributes of real-
world objects and forms for their representation in
relation schemes
content = real-world meaning
dictating form = structure of the representation
Fundamental conflict between theory and practice
over the relationship between form and content
What are the virtues of the relational model? 6
Important aspects of relational DBs (in WMB’s view)
THEORY aspect underlying algebraic model
• provides basis for unambiguous evaluation
• closure properties
• potential for optimisation & axiomatisation
PRINCIPLES represented in
views + application generators + spreadsheets
What are the virtues of the relational model? 7
Important aspects of relational DBs (in WMB’s view)
PRACTICAL aspect
• involve state essentially, so not purely declarative
• good for expressing agent actions / views
• good for representing levels of abstraction
cf ACE & A Small Matter of Programming, Bonnie Nardi
Represents a framework for managing state that is cleaner
than procedural programming and more expressive than FP
What are the virtues of the relational model? 8
Important aspects of relational DBs (in WMB’s view)
FOUNDATIONAL aspect
• concerned with metaphor not symbolic representation
• invokes form and content in combination
Notes on these respective issues
• metaphor: the form reflects the content
[as is true to some degree of relational models]
• cf logicism debate in AI:
A Critique of Pure Reason McDermott et seq
Issues for database development 1
How to avoid "back to the future"?
• need theoretical foundation
• need qualities of declarative query
• need principles to handle abstraction at many levels:
data independence
• need to support interaction of agents at high-levels of
abstraction
• need to retain / replace the form-content relationships
that relational DB design theory introduces
Issues for database development 2
Modern database demands
• enormous volumes of data
• high-performance e.g. for multi-media, real-time
• support for metaphor e.g. visual image not table
• concurrent access, distributed data
• closer integration between direct (human) and
programmed (computer) data access
• support for modern data abstractions: objects,
inheritance, aggregation
• applicability to design environment needs:
incremental intensional change