

									The Object Oriented Paradigm; and DB!
                              Chris Porter
                        University of Malta
1. Intro
2. Class and object instantiation mechanisms
3. Class hierarchy
4. ODMG Introduction
5. Advanced Modelling in OO
6. Aggregation
7. Methods
8. Inheritance
9. Query Language and Processing
10. Case Studies
If you're visiting Iceland you should know
that tipping at a restaurant is considered as
                   an insult!
           Businesses Need

High performance Processing on Complex Data

              So Why OODB?
   Telecommunications
    ◦ Taken for granted
      Especially Wireless

    ◦ Required
      High Availability
        Cell to cell hopping without dropping
      High Performance
         High volumes of transmission in voice and data in near real-time
        Fault analysis and corrective action (e.g. Cell breathing)
      Data Distribution
        Distributed Base Stations
        Multiple VLRs
          Not to mention roaming!
   Malta might have over 672 GSM cells
        As at August 30th 2009

   Have a peek at the KML in Google Earth!

    It is evident that something reliable and
    robust is required.
    ◦ Potentially capable of handling distributed data
      In almost real-time!
   Profile
    ◦ Wireless R&D company
    ◦ Mid 1990, Qualcomm developed CDMA
      Phones
      Base Stations
      Chips
        ... More phones served by fewer cells due to the code
         division multiplexing technology, unlike time multiplexing
         used in GSM
        Increased bandwidth efficiency
   Need (from Qualcomm Case Study)
    ◦ DBMS to allow its Intelligent Base Station Controller
      (ISBC) to manage hundreds of cells and the
      corresponding transmitter sites
    ◦ Real-time fault analysis was a critical component of the
       DBMS choice has a deep impact on the above requirements!

   Objectivity/DB
    ◦ Chosen while developing CDMA
    ◦ Foundation of the Management Information Base (MIB)
      for its CDMA Intelligent Base Station Controller
   Why?
    ◦ Ability of Objectivity/DB to specify objects cleanly
      According to telecommunications protocols (which
       might not be so straightforward)
    ◦ Ability to guarantee non-stop operation
    ◦ Capability of data replication in a transparent way
      (distributed DB)

   Result?
    ◦ ISBC developed while saving 3 man-months of
      development expenses
   Stanford Linear Accelerator Centre (SLAC)
    ◦ National Research Lab
    ◦ Probing the structure of matter!

   B Factory
    ◦ PEPII accelerator
    ◦ BaBar particle detector
   Objectivity/DB
    ◦ Can handle complex structures
       We're talking about analysis of matter here!

    ◦ Large volumes of data
       1 TB per day of data collected
      > 900TB (compressed) of data
        Largest DB in the world!

    ◦ Can make data available to research partners
      around the world
      Data distributed over 240 servers in a grid of over
       2000 application processors
   Sloan Digital Sky Survey (SDSS)
    Mapping the universe... that's all!
    ◦ Cataloguing everything in the sky
    $32 million project at Fermilab

   OODB
    ◦ Handles the multiplicity of object types in tremendous volumes
      Every star, galaxy, luminosity, spectral intensity, ...
    ◦ Capacity, Analysis, Persistency and Sharing required!
    ◦ Results of analysis are stored as identified objects
      Resulting in a catalogue of 100 million objects

   Traditional Data Models and Systems
    ◦ Relational
    ◦ Network
    ◦ Hierarchical

      ... have been successful
        in developing the DB technology
          required for traditional business applications
   But a number of shortcomings exist!
    ◦ Especially when complex environments come into play
        CAD
        CAM
        Computer Integrated Manufacturing
        GIS
        Multimedia,
        Telecommunications
◦ Requiring
  Complex structures for objects
  Longer duration transactions
     Might have a complex ecosystem of sub-operations for one transaction
  New data types
    Business discourse specific
  Need to define non-standard app-specific operations
   OODB model is not new...
    ◦ In development since the 1980s

    RDBMS won dominance in the commercial market
    ◦ Lock-in occurred at many levels
      Financial
      Vendor
      Systems and Infrastructure
   30+ years of relational technology
    ◦ Is it ageing?

   It is great when we talk about
    ◦ Simple tabular data
    ◦ Simple operations
    ◦ Centralized data storage

   Company/University examples!

    What happens when we ask more of our relational DBMS?
    ◦ Dr. Andrew E. Wade coined the phrase hitting the “Relational Wall”
   When the DB cannot scale (effectively) to our
    needs or to the needs of our analytical
    ◦ Add capacity to RDBMS. How?
      Add RAM
      Add HDD
      Add CPUs (in parallel)
    ◦ What happens?
      Costs rise
      Management is more complex
      Cannot plan on growth... your systems cannot keep
       holding on!

   What about M:N relationships?
    ◦ Can be modelled in the relational data model
      BUT have to add extra elements in the schema and
       extra code at application level
        Loss of performance
        Complex!
   What about complex data structures?
    ◦ Relational Model allows for limited complexity
      Flat tables with a relationship in between!
    ◦ The more complexity added, the slower and heavier
      the schema becomes, affecting operations.
   A job requires the right tool
    If the job handles large volumes of complex data
    ◦ RDBMS might not be the best option
    ◦ ODBMS?
      In theory it enables
        Scalability (persists objects)
        Flexibility (as offered by the OOP)
        Performance (e.g. Insert affecting 20 tables)
[Figure: at Application Level, a single Object: Employee; at Database Level, it is spread over Table: Employee, Table: Department, Table: Project, Table: Roles, Table: Rights, Table: StaticInfo and Table: Country. Side note: CPUs, HDDs, $$$.]
   Note
    ◦ Object at code level needs to be persisted
    ◦ Schema is well structured and normalized

    ◦ ... this is the normal scenario which we face on a
      daily basis

   But in reality
    ◦ At code level we'll be dissolving the object which
      we need to persist
   Call the relevant stored procedures
    ◦ And pass parts of the object as parameters

      SP will attempt to insert this decoupled data (which
       logically is cohesive to the Employee object)
        While making sure that no constraints are violated

    ◦ At one point or another, our object will be 'neatly'
      stored on secondary storage
◦ Shall we go about the joins needed to load the
  employee back to our application for modifications?

◦ Do you need a clearer example?
   At night you want to put your car in the garage!
    ◦ Car  Garage – How complex is that?

   Using the relational model, the car will be
    normalized down (or disassembled) to its basic
    components which are to be stored in different
    tables (or compartments in your garage).
    ◦   Screws in drawer A
    ◦   Tyres in big drawer B
    ◦   Pistons on enclosed-shelf C
    ◦   and so forth...
   Next morning before going to work
    ◦ Get all parts from different drawers
    ◦ Reassemble car
    ◦ Drive

   ... it might sound funny...
    ◦ but that's what we're doing, ballec!!
   Translation layer required
    ◦ Between RDBMS and application
    ◦ To translate object into relations

   Code generators were introduced to help out
    ◦ A good business to say the least!

   Companies spend a lot of time on their perfect
    translation layer!

    ORMs have apparently solved the problem!
    ◦ BUT have they? Any comments? Any benchmarks?
   The translation layer maps objects into tables
    ◦ Errors in mapping might occur!
      Integrity Violations!

    Disassembly and re-assembly of objects
    takes time. The more complex the object is,
    the more time it takes.
At a very high level, it looks like so!

[Figure: Object: Employee at Application Level goes straight to the DBMS at Database Level.]
   Using ODBMS, an object myCar (as one instance of
    the class CAR) is stored in myGarage (as an instance
    of class GARAGE)
    ◦ How?

   ODBMS does not require any mapping between
    objects at code level and objects stored on DBMS
    ◦ Complexity of data structures and relationships (e.g.
      Classes as datatypes) are captured and handled by the
      DBMS itself.
    ◦ All developers share the same view of the data
      Integrity risk reduced!

   The more complex the structure is, the more time is
    saved by using an ODBMS rather than an RDBMS
   We will be digging deeper into this model
    ◦ We'll discuss
        What entry points are
        Why do we need extents
        How to create relationships
        What about redundancy which RDB Model solves?
Most lipstick brands contain fish scales!
   OO DBs provide
    ◦ Power to designers to specify
      Structure of complex objects
      Operations that can be applied to such
        ... in line with the use of OO languages and paradigm in

    ◦ Seamless integration with OO application
   Relational DBMS vendors saw these requirements
    ◦ Tried to integrate in their own products
      Object relational or extended relational DBMSs
         SQL 2003 contains such features

   Experimental and commercial OO DBMS are available
    ◦ Open OODB by Texas Instruments
    ◦ IRIS by HP
    ◦ ODE by AT&T Bell Labs

   Commercial
    ◦ DB4O (including .Net and Eclipse plugins)
    ◦ Objectivity/DB
    ◦ FastObjects (incl. .Net)
   Standardization was now required
    ◦ Rather than taking the formal (longer) path

    ◦ ODMG was formed
      Object Database Management Group
       Proposed a standard → ODMG-93
        Now revised

    ◦ OODBs adopted many OOP concepts
   Roots from SIMULA (1960s)
   Later the concept of abstract data types came
    ◦ Hides internal data structures and specifies
      external operations that can be applied to an object
      Encapsulation

    At Xerox PARC, Smalltalk was developed
    ◦ Incorporating the concept of Inheritance
    ◦ The first pure OOPL
   Objects
    ◦ Have a state (value)
    ◦ Have behaviour (operations)
    ◦ Are transient (exist only during execution)

    That's why OODBs are required
    ◦   Persistence of objects
    ◦   Indexing
    ◦   Concurrency Control
    ◦   Recovery
    ◦   Interface with OOPL
         Adding persistence and object sharing to their otherwise
          transient nature
   Object Integrity and Identity
    ◦ OODBs apply a system generated OID per object
      Object Identifier (~PK in relational)
      In Relational: If PK is modified, the identity of the entity is
       also modified

   Object Structure
    ◦ Can have arbitrary complexity
      Containing all the required information
       Describing the object's state and behaviour
    ◦ In RDB information is generally scattered
      Loss of direct correspondence between real world object and
       DB representation
   Attributes
    ◦ In objects we have Instance Variables
       Hold the values defining the object's state
        E.g. Person.Name
      Encapsulated within the object
        Not necessarily visible to external users
      Can be arbitrarily complex data types
        E.g. Other objects!
   Operations
    ◦ Object behaviour or functions defined in two parts
      Signature or Interface: Operation name and arguments
        Unique unless overridden
      Method or body: Implementation of the operation
    ◦ Invoked by passing a message to an object
      Including operation name and parameters
    ◦ Encapsulation permits modifications of internal
      structure in a safe way!
      Data/Operation independence
   Class Hierarchies and Inheritance
    ◦ Specification of new types or classes that inherit structure
      and operations from parent classes
    ◦ Allow incremental development of the system's data types
    ◦ Allow definition re-use for new types/objects

   Object relationships
    ◦ Complete encapsulation leads to no explicit representation
      of such (but define methods to do so)
    ◦ But what about complex databases with many relationships?
       ODMG introduced binary relationships
         Pair of inverse-references
         OIDs of related objects within objects themselves
           Referential integrity is thus kept
   Multiple Versions (Versioning)
    ◦ Of the same Object
      Essential in engineering applications
        Tested/Verified object to be retained until a new version is
         tested and verified
        New object may have a few new versions of its component
        Others may remain unchanged
    ◦ Of the same Schema
      Schema Versioning
        Changes in type declarations
        New relationships/types are created
      Not specific to OODBs, and it is suggested that they should
       be available to all DBMSs
   Operator overloading
    ◦ Operation applied to different object types
    ◦ One operation name → many behaviours or
      implementations (object dependent)
    ◦ Polymorphism

    ◦ E.g. CalcArea() (operation name)
      Implementations for
        Triangle, circle and square
       Late binding is required
         Operation name is bound to the appropriate method at runtime
           Once the type of object to which the operation is applied is known
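
The CalcArea() idea can be sketched in Java as follows. This is only an illustration: the class names, constructor parameters and the totalArea helper are assumptions made for the example, not part of the original slides. The call s.calcArea() is written once, and the method body that runs is picked at runtime from the actual type of each object.

```java
import java.util.Arrays;
import java.util.List;

public class LateBindingDemo {
    /** Signature (interface) of the operation: name and arguments only. */
    interface Shape {
        double calcArea();
    }

    /** Each class supplies its own method (body) for the same operation name. */
    static final class Triangle implements Shape {
        final double base, height;
        Triangle(double base, double height) { this.base = base; this.height = height; }
        public double calcArea() { return 0.5 * base * height; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double calcArea() { return Math.PI * radius * radius; }
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double calcArea() { return side * side; }
    }

    /** The caller only knows the operation name; binding happens per object at runtime. */
    static double totalArea(List<Shape> shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.calcArea();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Shape> shapes = Arrays.asList(new Triangle(4, 3), new Circle(1), new Square(2));
        System.out.println("Total area: " + totalArea(shapes));
    }
}
```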
   Object Identifiers

    ◦ Unique Identity for each object on DB

    ◦ System Generated

    ◦ Not visible to external users
      Used solely by DBMS
        To create and manage inter-object reference
          Coming in later on!
   Object Identifiers (cntd...)
    ◦ Immutable
      OID should not change
        Preserving identity of real-world object

    ◦ Should be used once
       Even if removed from DB, no re-assignment is made!

    ◦ Should not be tied to physical address of object in storage
      Address can change after reorganization of DB
        Pre-relational models had this problem!
   Object Identifiers (cntd...)
    ◦ Should simple values be given an OID?
      E.g. Integer!

    ◦ In theory it would be great!
      50 apples or 50 years?

    ◦ In practice?
      Too many OIDs generated
   So, most OODBMSs allow for the
    representation of Objects and Values

   Object Identifiers (cntd...)
    ◦ Objects → have OIDs
    ◦ Values → do NOT have OIDs

   Value typically stored within an object
    ◦ Cannot be referenced from other objects
   Are OIDs like PKs?
    ◦ No

    So what distinguishes OIDs from PKs?
    ◦ Let's start from here:
      Identities determine how entities are distinguished
       from each other
   PKs
    ◦ In an RDBMS, entities are identified by PKs
      Could be incremental number, GUID...

    ◦ Relationships in RDBMS are defined by matching PK
      with foreign key data!

    ◦ These relationships are then used in queries, and to
      join tables.

    ◦ So, Identity depends on the VALUES in key fields
   OIDs
    ◦ OODBMS stores a unique OID within each object

    ◦ OID used to indicate other objects to which it is related

    ◦ Generally a number

    ◦ Not visible to developers
      Not even to the application creating the object!

    ◦ Thus, OID is not related to the state of the object
      Not even to its properties or class definition
   OIDs (cntd...)
    ◦ Any benefits?
       Can have two distinct objects, with identical attribute values
        Consider Boeing 747 with serial number 1 and Boeing 747
         with serial number 2
          They are identical!
          But they are two different instances of the class Boeing 747
   OIDs (cntd...)
    ◦ Ok, you might say, we cannot have two with the
      same serial number!
       True. That's a business rule, and thus the application
       should handle this!
        Might hurt a bit hearing this!

    But let's look at it from another perspective
    ◦ If you change all the properties of an object, the
      object should remain the same object, just with
      different values.
   Any Questions?

       “But how would you maintain uniqueness if that is

       “Shouldn’t you have key constraints to ensure data
   It seems like having no key fields is a problem in
    ◦ You may be right to think so!
    ◦ We're used to adding PKs and constraints to our schema!

    Moreover, you cannot (easily) make the
    OODBMS check for duplicate values when
    persisting your objects!
    ◦ We can have 2 objects with the same serial number or
      identifier (not OID, but application level ID)
      E.g. 2 Boeing 747s with the same serial number!
    ◦ DB will store them with 2 different OIDs, and she's
    Isn't that too much of a burden on the
    ◦ Yes it's true
    ◦ Checking has to be carried out at code level
      Disallowing duplicates and so on!
    ◦ If the application is happy with 2 identical objects,
      then the DBMS should be able to persist whatever is
      given to it!

   What will the cost be?
   Is the idea of having all the checks and
    constraints down at the DBMS level a
    necessity ?

   Consider this!

   RDBMS and OO Programming are two
    different worlds. One doesn't know the other,
    and they talk two languages!
    ◦ What do you do?
   RDBMS is decoupled from the
    application using it!
    ◦ Making the two talk to each other requires
      a 'translation' layer, from OO to relational.
    ◦ This has risks
        To avoid this we implement the 'Fail Fast' principle
        Create rules in the DB to raise exceptions if
         the application misbehaves!
   OODBMSs are more tightly coupled with OOP

   Object instances can be stored in RAM during
    execution (e.g. Until db.close() issued)

   Checks are more straight-forward, and a
    simple if statement and a check against
    objects in RAM, will help us get results
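
A minimal sketch of such an application-level check, in Java. The Fleet class is a hypothetical stand-in for the objects an application holds in RAM while the database is open; it is not a real OODBMS API, and the names Aircraft, serialNo and store are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class DuplicateCheckDemo {
    static final class Aircraft {
        final String model;
        final String serialNo; // application-level identifier, NOT the OID
        Aircraft(String model, String serialNo) {
            this.model = model;
            this.serialNo = serialNo;
        }
    }

    /** Hypothetical in-memory stand-in for the objects already persisted. */
    static final class Fleet {
        private final List<Aircraft> stored = new ArrayList<>();

        /** Business rule enforced in code: reject a second aircraft with the same serial number. */
        boolean store(Aircraft candidate) {
            for (Aircraft existing : stored) {
                if (existing.serialNo.equals(candidate.serialNo)) {
                    return false; // duplicate: the application decides not to persist it
                }
            }
            stored.add(candidate);
            return true;
        }

        int size() {
            return stored.size();
        }
    }

    public static void main(String[] args) {
        Fleet fleet = new Fleet();
        System.out.println(fleet.store(new Aircraft("Boeing 747", "1")));
        System.out.println(fleet.store(new Aircraft("Boeing 747", "2")));
        System.out.println(fleet.store(new Aircraft("Boeing 747", "1")));
        System.out.println("Stored: " + fleet.size());
    }
}
```

If the application were happy with two identical objects, the check would simply be left out and both would be stored as distinct objects.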
    Graphically:

[Figure: OOP Application -- Translation Layer -- RDBMS, compared with OOP Application -- OODBMS.]


    If OID ≠ Any of the Object's Values
    ◦ There exists Equivalence and Equality

   Equivalence
    ◦ If two object references point to objects with the same
      OID they are equivalent
    ◦ Assume Employee “Joe Borg” with Mobile “+35679102030”
       E.g. 2 searches
          S1: Find Employee with name == 'Joe Borg'
          S2: Find Employee with phone == '+35679102030'
    ◦ Both S1 and S2 will have references to object w/ OID 345
    ◦ Thus these are equivalent
      Since same object can be updated through both references.
   Equality
    ◦ This occurs when two objects (with two different
      OIDs) have the same state (values)
      E.g.
        OID 321: Joe Borg with +35679102030
        OID 543: Joe Borg with +35679102030

    ◦ If a QBE is executed with typeof(Person), then you
      would get two references, one to each of these Person instances
       References are not equivalent, but instances are equal!
         Recall the if (myString.equals(myOtherString)) clause!
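
The two notions map loosely onto == and equals() in Java. The sketch below is an analogy only: Java references stand in for OIDs, and the Employee class with its equals() based on name and phone is an assumption made for the example.

```java
import java.util.Objects;

public class IdentityVsEqualityDemo {
    static final class Employee {
        String name;
        String phone;

        Employee(String name, String phone) {
            this.name = name;
            this.phone = phone;
        }

        /** Equality: same state (values), regardless of identity. */
        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Employee)) return false;
            Employee e = (Employee) other;
            return Objects.equals(name, e.name) && Objects.equals(phone, e.phone);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, phone);
        }
    }

    public static void main(String[] args) {
        Employee joe = new Employee("Joe Borg", "+35679102030");
        Employee s1 = joe;   // result of "search by name"
        Employee s2 = joe;   // result of "search by phone"
        Employee twin = new Employee("Joe Borg", "+35679102030");

        System.out.println(s1 == s2);          // equivalent: same object
        System.out.println(joe == twin);       // not equivalent: two objects
        System.out.println(joe.equals(twin));  // equal: same state

        s1.phone = "+35679000000";             // update through one reference...
        System.out.println(s2.phone);          // ...is visible through the other
        System.out.println(joe.equals(twin));  // states now differ
    }
}
```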
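    Deep Equality
    ◦ Identical States
    ◦ Identical OIDs at every level of the state

    ◦ E.g.
       o1 = (i1, tuple, <a1:i4, a2:i6>)
       o3 = (i3, tuple, <a1:i4, a2:i6>)
       o1 and o3 are distinct objects whose states reference the very same OIDs

    Shallow Equality
    ◦ Equal States
    ◦ Can have different OIDs

    ◦ E.g.
      o1 = (i1, tuple, <a1:i4, a2:i6>)
      o2 = (i2, tuple, <a1:i5, a2:i6>)
      Equal only if the atoms i4 and i5 hold the same value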
    An object can be looked at as O(i, c, v):
    ◦ i : OID
    ◦ c : Type constructor (indicating how object state is to
      be constructed, and not class itself)
    ◦ v : Object state

   Type Constructors
    ◦   Atom (int, real, char, …)
    ◦   Tuple
    ◦   Set
    ◦   List
    ◦   Bag
    ◦   Array
    List is similar to Set, but the OIDs in a list are ordered

    Array is finite (max size).

    Set has distinct values (a Bag may hold duplicates)

   Set, List, Array and Bag
    ◦ Collection Types (bulk types)
      State is a collection of objects: may be unordered (set,
       bag) or ordered (list, array)
   Object State V of O(i, c, v)
    ◦ Based on type constructor C
    ◦ If c == atom
      v is an atomic value

    ◦ If c == set
       v is a set of OIDs (typically a set of objects of the same type)
    ◦ If c == tuple
      v is a tuple of form <a1:i1, a2:i2, …, an:in>
        a: attribute name
        i: OID
    ◦ If c == list
       v is an ordered list [i1, i2, …, in] of OIDs of objects of the same type
    ◦ If c == array
      v is a one-dimensional array of OIDs
   An object can have arbitrary nesting of
    ◦   Set
    ◦   List
    ◦   Tuple
    ◦   …

    If the state of an object (v) is not an atom, it refers to other objects by OID
    ◦ Actual value only appears when state of an object
      is of type Atom.
          Otherwise OID used! Let's use an example!
   Recall: Object (OID, Type Constructor, State)

    O1 = (i1, atom, “John”)
    O2 = (i2, atom, “Grech”)
    O3 = (i3, atom, “Mario”)
    O4 = (i4, atom, “25”)
    O5 = (i5, atom, “10/10/1910”)
    O6 = (i6, set, {i1, i2, i3}) //Contains O1 to O3
    O7 = (i7, tuple, <Name:i1, Age:i4, Mgr:i8, DOB:i5>)
    O8 = (i8, tuple, <Mgr:i9, Mgr_Start_Date:i5>)
    O9 = (i9, tuple, <FName:i1, LName:i2, …>)

    … note, only atomic values display the actual state
   O6 = (i6, set, {i1, i2, i3})
       Set-valued object representing set of
        employees for O6 (e.g. department).
    {i1, i2, i3} refers to atomic objects {'John',
    'Grech', 'Mario'}

O7 = (i7, tuple, <Name:i1, Age:i4, Mgr:i8, DOB:i5>)

    O8 = (i8, tuple, <Mgr:i9, Mgr_Start_Date:i5>)

            O9 = (i9, tuple, <FName:i1, LName:i2, …>)
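
The O(i, c, v) example can be played with in a small Java sketch. Everything here is illustrative: the class and method names are assumptions, only the atom, set and tuple constructors are modelled, the elided attributes of O9 are left out, and Age is taken to point at i4 on the assumption that it is meant to reference the atom holding 25. The resolve method follows OIDs until it reaches atoms, which is where actual values appear.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class ObjectTripleDemo {
    enum TypeConstructor { ATOM, SET, TUPLE }

    /** One object O(i, c, v); exactly one of the three state fields is used. */
    static final class Obj {
        final String oid;
        final TypeConstructor c;
        final String atom;               // value, used when c == ATOM
        final List<String> members;      // OIDs, used when c == SET
        final Map<String, String> attrs; // attribute name -> OID, used when c == TUPLE

        Obj(String oid, TypeConstructor c, String atom,
            List<String> members, Map<String, String> attrs) {
            this.oid = oid;
            this.c = c;
            this.atom = atom;
            this.members = members;
            this.attrs = attrs;
        }
    }

    private final Map<String, Obj> objects = new LinkedHashMap<>();

    void atom(String oid, String value) {
        objects.put(oid, new Obj(oid, TypeConstructor.ATOM, value, null, null));
    }

    void set(String oid, String... memberOids) {
        objects.put(oid, new Obj(oid, TypeConstructor.SET, null, Arrays.asList(memberOids), null));
    }

    /** Arguments alternate: attribute name, OID, attribute name, OID, ... */
    void tuple(String oid, String... nameOidPairs) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (int k = 0; k + 1 < nameOidPairs.length; k += 2) {
            attrs.put(nameOidPairs[k], nameOidPairs[k + 1]);
        }
        objects.put(oid, new Obj(oid, TypeConstructor.TUPLE, null, null, attrs));
    }

    /** Follows OIDs recursively until atomic values are reached. */
    String resolve(String oid) {
        Obj o = objects.get(oid);
        if (o == null) return "<" + oid + ">"; // OID not defined in this sketch
        switch (o.c) {
            case ATOM:
                return o.atom;
            case SET: {
                StringJoiner j = new StringJoiner(", ", "{", "}");
                for (String m : o.members) j.add(resolve(m));
                return j.toString();
            }
            default: { // TUPLE
                StringJoiner j = new StringJoiner(", ", "<", ">");
                for (Map.Entry<String, String> e : o.attrs.entrySet()) {
                    j.add(e.getKey() + ":" + resolve(e.getValue()));
                }
                return j.toString();
            }
        }
    }

    /** Builds O1..O9 from the example (O9 without its elided attributes). */
    static ObjectTripleDemo sample() {
        ObjectTripleDemo db = new ObjectTripleDemo();
        db.atom("i1", "John");
        db.atom("i2", "Grech");
        db.atom("i3", "Mario");
        db.atom("i4", "25");
        db.atom("i5", "10/10/1910");
        db.set("i6", "i1", "i2", "i3");
        db.tuple("i7", "Name", "i1", "Age", "i4", "Mgr", "i8", "DOB", "i5");
        db.tuple("i8", "Mgr", "i9", "Mgr_Start_Date", "i5");
        db.tuple("i9", "FName", "i1", "LName", "i2");
        return db;
    }

    public static void main(String[] args) {
        ObjectTripleDemo db = sample();
        System.out.println("O6 = " + db.resolve("i6"));
        System.out.println("O7 = " + db.resolve("i7"));
    }
}
```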
   Model features
    ◦ An object can be represented as a graph structure
    ◦ Constructed by applying type constructors
   Used to define the data structures for an OO Schema

   Date Type is defined as tuple rather than atomic
   Integer, String, Float, … used for atomic types.
   Attributes referring to other Objects
    ◦ … are references to other objects
        Represent relationships among object types!

    ◦ E.g. Attribute Dept in EMPLOYEE
      Of type DEPARTMENT
        Refer to specific Department object
      Value of Dept would be an OID
      Binary Relationship between Employee and Department
   Binary Relationships
    ◦ One Direction
      e.g. Employee has Dept

    ◦ Inverse Reference
       e.g. Department → employees: set(Employee);
        Easier to traverse relationship (from any direction)
        Employees of DEPARTMENT
          Value == set of references (i.e. OIDs) to EMPLOYEE objects
        Inverse: Department of EMPLOYEE (as above)

    Let's look at a simpler example
    Taken from “The Definitive Guide to db4o” by
    Edlich, Paterson and Hörning

   In a relational DB the schema defines
    ◦   Tables
    ◦   Attributes in each table
    ◦   Relationships between tables
    ◦   ... In DDL (e.g. CREATE TABLE)
    In an OODBMS the schema defines
    ◦   Classes
    ◦   Properties of classes
    ◦   Possible object relationships
    ◦   ... In ODL (Defined later on)

    ◦ DB4O stores objects as defined in C# or Java
         Native object storage (no separate schema required)
   It is known that objects in an OO system are
    related in various ways
    ◦ Aggregation
    ◦ Association
    ◦ Inheritance

    So such relationships should be maintained when
    persisting objects

    Let's have a second look at Binary Relationships
    ◦ That is, how relationships are persisted in a DB
   Relationships in memory are maintained by
    object references
    ◦ Once persisted, references are lost
      Thus we need to persist OIDs of the related objects

   Look at the following:
[Figure: Objects to be stored. Three Employee objects (Joe Caruana, Jimmy Corunio, Silvio Pace; each with dob 10/10/1943) hold Object References in their department attribute to two Department objects (name: Sales, name: Dev; each with a location).]

[Figure: Objects stored in DB. The same objects, grouped into an Employee Extent and a Department Extent; each department attribute now holds the OID of the related Department object.]
    Let's have a look at a practical example
    ◦ Create 3 employees
    ◦ Create 2 departments
    ◦ Link each employee's department attribute to a department

    ◦ Store employees on DB
    ◦ Open Object Manager and look at DB contents
      Objects
      Ids
      Relationships
    So what happened when Employees were stored?

    ◦ For each Employee, an OID of the respective
      Department object was stored
       Department objects were also persisted in order
        to maintain integrity!
        Can be shared by many objects (e.g. 2 employees in
         the same department)
      When object is stored, it stores all reachable objects
       which are referenced in the object graph
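
Storing "all reachable objects" can be sketched as a small graph walk. The ToyStore below is a hypothetical in-memory stand-in, not the db4o API; the Persistent interface and the sequential OIDs are assumptions made for the example. An object already seen keeps its OID, so a Department shared by two employees is stored once.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ReachabilityDemo {
    /** Anything storable reports the objects it references directly. */
    interface Persistent {
        List<Persistent> references();
    }

    static final class Department implements Persistent {
        final String name;
        Department(String name) { this.name = name; }
        public List<Persistent> references() { return Collections.emptyList(); }
    }

    static final class Employee implements Persistent {
        final String name;
        final Department department;
        Employee(String name, Department department) {
            this.name = name;
            this.department = department;
        }
        public List<Persistent> references() {
            return Collections.<Persistent>singletonList(department);
        }
    }

    /** Hypothetical store: assigns one system-generated OID per distinct object. */
    static final class ToyStore {
        private final Map<Persistent, Long> oids = new IdentityHashMap<>();
        private long nextOid = 1;

        /** Stores root and, recursively, every object reachable from it. */
        long store(Persistent root) {
            Long known = oids.get(root);
            if (known != null) return known; // already stored: reuse its OID
            long oid = nextOid++;
            oids.put(root, oid);
            for (Persistent ref : root.references()) {
                store(ref);
            }
            return oid;
        }

        long oidOf(Persistent p) { return oids.get(p); }
        int count() { return oids.size(); }
    }

    public static void main(String[] args) {
        Department sales = new Department("Sales");
        Department dev = new Department("Dev");
        ToyStore db = new ToyStore();
        db.store(new Employee("Joe", sales));
        db.store(new Employee("Jimmy", sales)); // shares the Sales object
        db.store(new Employee("Silvio", dev));
        System.out.println("Objects stored: " + db.count());
    }
}
```

Only Employee objects are passed to store, yet the Department objects end up in the store as well because they are reachable from the employees.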
   Inverse Reference or Relationship
    ◦ How could we deduce who works in department X?

   In the example given above, we cannot
    ◦ Why?
      Department objects do not have a reference to the OID
       of their respective Employees
◦ So we can say that we defined Unilateral relationships

◦ The only relationships that can be used for queries
  are those which have been pre-defined by storing
  the appropriate OID
    ◦ Thus this model (OO) can be seen as a navigational
      Class Graph!

    ◦ But what if we require traversals from both directions?
      E.g. Get all employees working in Dept X!

   Inverse Relationships
    ◦ Add respective OIDs in both related objects
      Employee has Dept OID
      Department has Employee OID
   Any immediately apparent problems?

    ◦ We have enforced a 1:1 relationship!!

    ◦ Employee works in one Department and
      Department has one Employee

   Let‟s assume 1:1 relationships for now!
    ◦ So now we can traverse the relationships in both directions
[Figure: Objects stored in DB, with an Employee Extent and a Department Extent. Each Employee (e.g. Joe Caruana, Silvio Pace) holds the OID of its Department in department, and each Department (Sales, Dev) holds the OID of its Employee in employee.]
    In an Inverse Relationship, it doesn't matter which
    object we store
    ◦ If Department instance is stored
      Related Employee instance will also be stored
    ◦ If Employee instance is stored
      Related Department instance will also be stored

   Inverse relationships are used to enforce
    relationship integrity

   When retrieving an object, the related and
    reachable objects are also retrieved
class Employee
{
  attribute string name;
  attribute string surname;
  attribute datetime dob;
  relationship Department dept
      inverse Department :: employee;
}

class Department
{
  attribute string name;
  relationship Employee employee
      inverse Employee :: dept;
}
   The previous definition means that:
    ◦ Employee object has a relationship to a Department
      object, labelled 'dept'
    ◦ The inverse of this relationship is labelled 'employee' for
      the Department object

    This differs from the way we define classes in
    an OOPL
    ◦ We just add an attribute in both classes which is an
      object reference to the respective class type instance

   Some ODBMS allow you to use the OOPL
    definitions, with no need of an ODL schema
   Assume the following app in a NEW db4o DB!
    ◦ Department
      ArrayList of Employees
    ◦ Employee

    Assume 3 employees are created and stored in
    an ArrayList, which is in turn assigned to a new
    Department

    How many extents have I got in my new DB4O DB?

    How many instances of objects have I got?
   Examining the DB we find that:
    ◦ Employee instances are stored within the Employee extent

    ◦ Department Instances are stored within the
      Department extent

    ◦ ArrayList is another class, whose instances are
      stored in their own specific extent
      ArrayList is given an OID
      And will contain OIDs to Employee instances
   Assuming a clean DB again
    ◦ Add a manager who can manage one department
    ◦ Create class manager with Department dept attrib

    ◦ Store Manager instance in DB

   How many extents will we have?

   How many object instances?
   Note that we can also traverse all the graph,
    starting off from the manager.

   Notice the OID next to the dept instance in

    Notice the OIDs in the dept instance's
    employees list

   It is a matter of object references building up
    our tree
        Common Example
         ◦ Car has many wheels

        ODL
 class Car
 {
          attribute string modelNo;
          attribute string chassisNo;
          relationship set<Wheel> wheels
                   inverse Wheel :: car;
 }

 class Wheel
 {
          attribute integer diameter;
          attribute string modelCode;
          relationship Car car
                   inverse Car :: wheels;
 }

If we literally implement the above in code, we'd end up with something like this:
Employee emp1 = new Employee("Joe", "Carabott", new Department("Sales",
"SLS", new ArrayList().Add(emp1)));
                                            Anything wrong?
                                                   - Think in BODMAS!
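
One reading of the puzzle: the innermost expression is evaluated first, so emp1 is used before it has been assigned (both Java and C# compilers reject this), and Add/add does not return the list itself. A hedged Java sketch of a two-step alternative follows; the hire helper and the field names are assumptions made for the example. Both objects are created first, and the two sides of the relationship are linked afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoStepLinkDemo {
    static final class Department {
        final String name;
        final String code;
        final List<Employee> employees = new ArrayList<>(); // inverse side
        Department(String name, String code) {
            this.name = name;
            this.code = code;
        }
    }

    static final class Employee {
        final String name;
        final String surname;
        Department dept; // set once the department exists
        Employee(String name, String surname) {
            this.name = name;
            this.surname = surname;
        }
    }

    /** Step 1: create the employee. Step 2: link both directions. */
    static Employee hire(String name, String surname, Department dept) {
        Employee e = new Employee(name, surname);
        e.dept = dept;
        dept.employees.add(e);
        return e;
    }

    public static void main(String[] args) {
        Department sales = new Department("Sales", "SLS");
        Employee emp1 = hire("Joe", "Carabott", sales);
        System.out.println(emp1.dept.name + " has " + sales.employees.size() + " employee(s)");
    }
}
```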
   Assume a Project and Employee relationship
    ◦ Project has many employees
    ◦ Employee works in many projects

    In a relational model, we generally use a
    third (join) table
    ◦ ProjectEmployee
      EmpID and ProjID as a composite PK

   What happens in the OODB model?
   Each object has a set of OIDs for the related
    object (Edlich, Paterson, Hörning)
    In ODL
             class Project
             {
                    attribute string name;
                    attribute string costCode;
                    relationship set<Employee> employees
                           inverse Employee :: projects;
             }
             class Employee
             {
                    attribute string name;
                    attribute string dob;
                    attribute string phone;
                    relationship set<Project> projects
                           inverse Project :: employees;
             }
    So what do we do in practice?

   Do we need the inverse relationship in our
    C#/Java classes?
    ◦ Depends on the flexibility required!
    ◦ Depends on the navigability required
   Inverse relationship is NOT necessary
    ◦ May be useful though

   If you need a report through which you want
    to find Employees for a given Project, but
    NOT the Projects for a given Employee
    ◦ Inverse Relationship would be useless
       Employee won't need to have a list of Projects

   But is M:N lost then?
    ◦ Any guesses?
   No, M:N is still maintained
    ◦ Many Project instances can each hold the OID of the
      same Employee within their own Employees list

   Let's try it out!
   Create 2 projects
    ◦ Create 2 collections of projects
       Collection 1: P1
       Collection 2: P1 and P2

   Create 3 employees
    ◦ E1 and E2 are instantiated with Project Collection 1
    ◦ E3 is instantiated with Project Collection 2

   Create inverse relationship
    ◦ Create 2 collections of employees
       Collection 1: E1, E2 and E3
       Collection 2: E3
    ◦ Store Collection 1 in p1.Employees
    ◦ Store Collection 2 in p2.Employees

   Persist in DB
   Result?
    ◦ Graph can be navigated from either Employee or
      from Project

   How about complexity?
    ◦ Inverse relationships may be avoided unless
      explicitly required, to keep things simple.
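The experiment above can be sketched in plain Java, without any database calls. The assignTo helper is an assumption added here so that both directions of the M:N link are written together; the slides build the two sets of collections by hand instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Project {
    final String name;
    final String costCode;
    final List<Employee> employees = new ArrayList<>(); // inverse side

    Project(String name, String costCode) {
        this.name = name;
        this.costCode = costCode;
    }
}

class Employee {
    final String name;
    final List<Project> projects = new ArrayList<>();

    Employee(String name) {
        this.name = name;
    }

    // Hypothetical helper: records the link in both directions at once,
    // so the two collections cannot drift apart.
    void assignTo(Project p) {
        if (!projects.contains(p)) {
            projects.add(p);
            p.employees.add(this);
        }
    }
}

class ManyToManyDemo {
    static List<Project> build() {
        Project p1 = new Project("Project 1", "PRJ1");
        Project p2 = new Project("Project 2", "PRJ2");
        Employee e1 = new Employee("Joe");
        Employee e2 = new Employee("Jim");
        Employee e3 = new Employee("May");
        e1.assignTo(p1);          // E1 and E2 work on P1 only
        e2.assignTo(p1);
        e3.assignTo(p1);          // E3 works on P1 and P2
        e3.assignTo(p2);
        return Arrays.asList(p1, p2);
    }

    public static void main(String[] args) {
        for (Project p : build()) {
            System.out.println(p.name + ": " + p.employees.size() + " employee(s)");
        }
    }
}
```

Starting from either project, every employee and the other project can be reached by following references, which is the navigability the inverse relationship buys.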
   We have an inverse relationship, whereby we can find Projects by
    Employee and Employees by Project
    ◦ E.g. May has PCol2's OID in projects, which in turn has 2 OIDs to
      the PRJ1 and PRJ2 instances. PRJ2 has an OID to ECol2 (employees),
      which has May's OID stored.

   [Figure: resultant object graph. ArrayList PCol1 and PCol2 hold Project
    entries (Project 1 / PRJ1, Project 2 / PRJ2); ArrayList ECol1 and ECol2
    hold Employee entries (Joe, Jim, May), each Employee with name, dob,
    phone and projects fields.]
   BUT!
    ◦ It is up to the developer to ensure that
      relationships are maintained!

    ◦ One could easily create incorrect M:N links if
      choosing an inverse relationship strategy

    ◦ Having circular references, we can traverse
      relationships from any instance, reaching any other

    ◦ Thus, we can store the entire object graph's content
      by choosing any one object! E.g. May!
   Trivia!
    ◦ What happens if we remove the collection of
      Projects from the Employee class definition?

    ◦ What is stored on the DB?

    ◦ Is the object graph still traversable?

    ◦ Answer: All objects will be stored, but we have less
      traversability! We can only reach Projects through
      the Projects extent, but we can reach Employee
      instances from anywhere else! Employees won't
      store Project OIDs
See you in 5 minutes
   We agree that we'll always have an extent
    associated with an object type
    ◦ Holding persistent objects of that sub/type
   Every Object sits in an extent
    ◦ Corresponding to a subtype
      Must be a member of the extent corresponding to its
       supertype

   Root of all objects?
    ◦ ROOT class in an OODBMS
      Its extent contains ALL the other objects in the
       system
   Classification of types and subtypes, starting
    from the ROOT/OBJECT class
    ◦ Produces a Class/Type Hierarchy
      With the primary extent for ROOT/OBJECT object types
      And all other extents as subsets of it

   Two types of inheritance
    ◦ Type Inheritance (interface Time: Object)
    ◦ Extent Inheritance (class FACULTY extends PERSON)
   Let's settle the basics first

   Inheritance == Central Concept of OOP!

   Is it possible in an RDBMS?
    ◦ No

   Do OODBMSs support inheritance?
    ◦ They MUST
      Otherwise it's not OO at all!
   ODL supports multiple inheritance

   But C# and Java do not handle multiple
    inheritance
    ◦ What's the use?

   C++!
   Let's keep it simple for now!

   A company has 2 types of employees
    ◦ Operator
    ◦ Manager

   Both employee types have a name, dob and phone
    ◦ Manager has Department field
    ◦ Operator has Skill field

   Let's implement this!
   Create 3 instances of Operator
   Create 1 Manager

   How many objects should we have in the DB?
    ◦ 4 instances!
    ◦ 3 classes!

   Notice that the 4 object instances fall under the
    Employee class type!
    ◦ Thus having 1 extent (Employee)
      Holding 2 other extents, that is, Operator and Manager!
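The counting exercise above can be sketched in Java. The extent helper below is a hypothetical in-memory stand-in for an extent lookup, and the sample names, dates and skills are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Employee {
    final String name;
    final String dob;
    final String phone;

    Employee(String name, String dob, String phone) {
        this.name = name;
        this.dob = dob;
        this.phone = phone;
    }
}

class Operator extends Employee {
    final String skill;

    Operator(String name, String dob, String phone, String skill) {
        super(name, dob, phone);
        this.skill = skill;
    }
}

class Manager extends Employee {
    final String department;

    Manager(String name, String dob, String phone, String department) {
        super(name, dob, phone);
        this.department = department;
    }
}

class ExtentDemo {
    // Hypothetical extent lookup: every stored object that is an instance
    // of the given class or of one of its subclasses.
    static <T> List<T> extent(List<?> stored, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Object o : stored) {
            if (type.isInstance(o)) {
                result.add(type.cast(o));
            }
        }
        return result;
    }

    // 3 Operators and 1 Manager, as in the exercise (values are made up).
    static List<Employee> sample() {
        return Arrays.<Employee>asList(
                new Operator("Ann", "01/01/1980", "111", "Welding"),
                new Operator("Ben", "02/02/1981", "222", "Packing"),
                new Operator("Cal", "03/03/1982", "333", "Driving"),
                new Manager("Dee", "04/04/1970", "444", "Sales"));
    }

    public static void main(String[] args) {
        List<Employee> stored = sample();
        System.out.println("Employee extent: " + extent(stored, Employee.class).size());
        System.out.println("Operator extent: " + extent(stored, Operator.class).size());
        System.out.println("Manager extent:  " + extent(stored, Manager.class).size());
    }
}
```

The Operator and Manager extents are subsets of the Employee extent, since every Operator and every Manager is also an Employee.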
   Assume the previous case

   Instances of classes at different levels within
    the ancestry will have a separate extent in the
    DB
    ◦ DB will hold an extent for each class


   [Figure: Worker and Manager extents]
   One of the main motivators towards OO
    ◦ Desire to represent complex objects

   Two main types
    ◦ Structured
    ◦ Unstructured
   DBMS permits Unstructured Complex Objects
    ◦ Storage and retrieval of large objects
    ◦ Requires a large amount of storage
      e.g. BLOBs (binary large objects, such as bitmap images)
      e.g. CLOBs (character large objects)

   Unstructured?
    ◦ DBMS does not know what their structure is!
    ◦ Only applications know how to use them!
   Object Structure is defined and known to
    the DBMS

   Live in an extent of their own type

   May have relationships defined

   Have a known state... can be read by the
   Two types of reference semantics
    ◦ Ownership Semantics
    ◦ Reference Semantics
   Sub-objects of a complex object are
    encapsulated within the complex object (as part
    of that object)

   Also known as
    ◦ Is-Part-Of relationship
    ◦ Is-Component-Of relationship

   Objects can only be accessed by the methods
    defined in the complex object

   Objects don't need an OID

   Deleted if complex object is deleted
   Reference Semantics
    ◦ Components of complex object are themselves
      independent objects referenced from the object

      E.g. DEPARTMENT's Employee, where
              EMPLOYEE is an independent object

   Also known as
    ◦ Is-Associated-With relationship
   Independent Objects with own OID and

   Not encapsulated within object itself, but

   Define Relationships among independent
    objects

   Referenced component may be referenced by
    more than one complex object
   Transient Objects vs. Persistent Objects

    ◦ Transient: exist only while the program executes

    ◦ Persistent: outlive application execution
   How do we make objects persistent?
    ◦ Typically using Naming and Reachability

   OODBMS generally closely coupled with OOPL
    ◦ E.g. db4o Eclipse or FastObjects.Net
    ◦ OOPL used to specify method implementation
    ◦ Object created by invoking object constructor

   Since we're persisting objects and their
    states in our DB
    ◦ How will we access them later on?
   Naming mechanism
    ◦ Giving a unique persistent name to an object
    ◦ Name used to retrieve object later on by same or
      other programs

   Named persistent objects
    ◦ Used as entry points to the DB
    ◦ Users/Applications start their DB access through a
      named object.
   BUT
    ◦ Can we name ALL objects in a large DB?
      Take the OODB for Boeing's 747!

      NO! So we use Reachability
   Reachability mechanism
    ◦ Making an object reachable from some persistent
      object

    ◦ Object B is said to be reachable from object A
      If a sequence of references in the object graph leads
       from A to B.

    ◦ E.g. o1, o2 and o3 are reachable through o6
      Hence if o6 is made persistent, o1 through o3 also
       become persistent
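Reachability is plain graph traversal. The sketch below, in Java, collects everything reachable from a starting object. The Node class and the exact shape of the sample graph are assumptions; the slides only state that o1, o2 and o3 are reachable through o6.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical object with outgoing references to other objects.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class ReachabilityDemo {
    // Returns the root plus every object that can be reached from it by
    // following references. Identity comparison is used, as with OIDs.
    static Set<Node> reachableFrom(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (seen.add(n)) {            // false if already visited (cycle-safe)
                for (Node r : n.refs) {
                    todo.push(r);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node o1 = new Node("o1");
        Node o2 = new Node("o2");
        Node o3 = new Node("o3");
        Node o4 = new Node("o4");        // nothing refers to o4
        Node o6 = new Node("o6");
        o6.refs.add(o1);
        o6.refs.add(o2);
        o2.refs.add(o3);
        for (Node n : reachableFrom(o6)) {
            System.out.println(n.name + " would become persistent");
        }
    }
}
```

An object such as o4 that no persistent object refers to stays transient under this scheme.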
   Mechanism details
    ◦ Named persistent object N whose state is a set/list
      of objects of class C.

    ◦ We can make objects of C persistent by ADDING
      them to N's set/list
      C's instances are reachable through object N

    ◦ Thus N defines a persistent collection of objects of
      class C.
      E.g. Class DEPARTMENT_SET with state of type
       set(DEPARTMENT)
    ◦ Assume ALL_DEPARTMENTS object is created (of type
      DEPARTMENT_SET)
       Made persistent through Naming

    ◦ Adding DEPARTMENT object to the set of ALL_DEPARTMENTS
       add_dept()

    ◦ Each added DEPARTMENT instance becomes persistent
       Since it is now reachable from ALL_DEPARTMENTS

   ALL_DEPARTMENTS object == Extent of class
    DEPARTMENT
    ◦ Holds all persistent objects of type DEPARTMENT

   ODMG ODL defines ways of naming an extent as part of
    the class definition
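A rough Java sketch of the naming idea follows. NameTable is a hypothetical stand-in for the database's naming mechanism (it is not a db4o or ODMG API); DepartmentSet and addDept mirror DEPARTMENT_SET and add_dept() from the slides.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class Department {
    final String dname;

    Department(String dname) {
        this.dname = dname;
    }
}

// Mirrors DEPARTMENT_SET: its state is a set of DEPARTMENT objects.
class DepartmentSet {
    final Set<Department> members = new LinkedHashSet<>();

    void addDept(Department d) {
        members.add(d);
    }
}

// Hypothetical name table: maps a unique persistent name to a root object.
class NameTable {
    private final Map<String, Object> roots = new HashMap<>();

    void bind(String name, Object root) {
        if (roots.containsKey(name)) {
            throw new IllegalArgumentException("name already in use: " + name);
        }
        roots.put(name, root);
    }

    Object lookup(String name) {
        return roots.get(name);
    }
}

class NamingDemo {
    public static void main(String[] args) {
        NameTable db = new NameTable();
        db.bind("ALL_DEPARTMENTS", new DepartmentSet());

        // Any program can start from the name and reach the members.
        DepartmentSet all = (DepartmentSet) db.lookup("ALL_DEPARTMENTS");
        all.addDept(new Department("Sales"));
        System.out.println("Departments reachable by name: " + all.members.size());
    }
}
```

Only ALL_DEPARTMENTS needs a name; each Department added to it is found through the named set.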

   Recall our examples
    ◦ If we stored a Project instance myProj, and myProj
      has several other objects associated, then all
      'reachable' objects in the graph are stored!

   If we store an object, being the root of a large
    and complex tree, db4o will store the whole
    tree
    ◦ E.g. Storing Project, will also store the ArrayList
      object (referenced in Project)
    ◦ Clear example of persistence by reachability
   An OODBMS should let you define how far it can 'reach'
    down the object tree while it is determining what to persist
    ◦ DB4O == update depth
       Infinite by default
          Limitation is set by the available memory

   Note that when we store an object, db4o will
    ◦ Check if there is an active transaction, otherwise a new one is
      started

 includes (but hides) StartTransaction, Store and
       DB4O works on Invisible Transactions, which guard for integrity

   Db.close() will terminate the transaction
   As seen earlier
    ◦ Class definition DEPARTMENT specifies
      Type
      Operations

   Persistent object for DEPARTMENT objects
    ◦ Defined separately in a class of type
      set(DEPARTMENT) with values as a collection of
      references to all persistent DEPARTMENT objects.
    ◦ Certain OODBMSs automatically create Extents for
      each class
   Queries in a DB context
    ◦ Retrieve stored or computed entities and relationships in
      two [access] methods
      Direct: get entity at location X
      Associative: get employee where name == X

   Each DB Query Model should give us
    ◦ Declarative or Procedural Languages
    ◦ High level (abstracting from host system)
    ◦ Independence from software and data
    ◦ Accountability
    ◦ Optimisability
   At times this is assumed
    ◦ Authors define it as a fundamental DB capacity

   The relational model was successful due to
    the standardized SQL query language
    ◦ Easy to use (using human-readable constructs)
    ◦ Designed for ad-hoc queries
   Returns tabular record sets according to the
    criteria set

   Record-by-record analysis is also possible

   Not efficient when loading objects from DB!
    ◦ If entities are structured, more than one query
      might be required to build up your object instance
      E.g. Load Employee Leave Entries: you might need to
        Load Employee details (in one or more tables)
        Load leave entries by employee ID
        Create leave entry object instances at code level
   Queries return an object reference to your
    ◦ Single object
    ◦ List of objects

   Single object
    ◦ Other object reachable through this object

    ◦ E.g. Load Employee instance
      Loads a list of Leave Entry instances (if referenced)
         List of objects!

   Tight coupling between Data Model and Object
    Model used to build the application
   Recall that when we store an object
    ◦ We also store all the objects on the traversable
      object graph

    ◦ Same thing happens in querying

   Attention required
    ◦ Computational effort might be too expensive
      We might end up loading all the Airbus 380
       components simply because we wanted a screw!
   Solutions? 2!
    ◦ Reduce the depth (Activation Depth): Limit the
      depth of object references to be followed
      Active objects are those instances stored/retrieved in
       the current transaction

    ◦ Re-structure your model: E.g. Remove inverse
      relationships
        Limits query flexibility
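The effect of limiting the depth can be illustrated with a small Java sketch. It does not use any db4o API, and the depth numbering (depth 0 loads only the root) is a convention chosen here, not db4o's definition. The Part class and sample parts are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical component tree: a part made of sub-parts.
class Part {
    final String name;
    final List<Part> children;

    Part(String name, Part... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

class ActivationDemo {
    // Counts the objects that would be loaded when references are followed
    // at most `depth` levels below the root. Assumes a tree (no cycles).
    static int loadedCount(Part root, int depth) {
        int count = 1; // the root itself
        if (depth > 0) {
            for (Part child : root.children) {
                count += loadedCount(child, depth - 1);
            }
        }
        return count;
    }

    static Part sampleAircraft() {
        Part screw = new Part("screw");
        Part panel = new Part("panel", screw);
        Part wing = new Part("wing", panel);
        Part engine = new Part("engine");
        return new Part("aircraft", wing, engine);
    }

    public static void main(String[] args) {
        Part aircraft = sampleAircraft();
        for (int depth = 0; depth <= 3; depth++) {
            System.out.println("depth " + depth + ": "
                    + loadedCount(aircraft, depth) + " object(s) loaded");
        }
    }
}
```

A shallow depth keeps the load small, at the price of having to fetch deeper objects explicitly when they are needed.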
   Do we have SQL equivalence in OODBs?

   Let's start off from QBE
    ◦ Query By Example
       // Create template object with name field filled in
       Employee templateEmp = new Employee("John", null, null);

       // Execute the query
       ObjectSet result = db.Get(templateEmp);

    ◦ In DB4O the Get() is deprecated and replaced by
       ObjectSet result = db.QueryByExample(templateEmp);
   Examples

    ◦ Query by Class Type
       IObjectSet allEmps = db.QueryByExample(new
        Classes.Employee(null, null));

    ◦ Query by Class Type and 1 Value
      db.QueryByExample(new Classes.Department(null,
       lbDepts.SelectedValue.ToString(), null));

    ◦ Query by Class Type and n Values
       IObjectSet allEmps = db.QueryByExample(new
        Classes.Employee("Joe", "Caruana"));
   Advantages

    ◦ NULL fields in templates do not participate in QBE

    ◦ QBE is also Type Safe!
      Since queries are created in Java/C#/...
         Then the compiler won't let you store values in incorrectly
          typed fields (e.g. cannot store a String in a float variable)

    ◦ Simple to comprehend since you're tightly coupled with
      your DB

    ◦ Coding is simple, since Query Language IS your OOPL!
   Disadvantages (in DB4O)

    ◦ Only equivalence queries can be carried out!

    ◦ NULLs and 0 values indicate non-participating
      fields (0 applies to numeric fields)
      So how could I get all the students with 0 marks?

   Is QBE sophisticated enough?
    ◦ Not quite!
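The zero-marks problem can be reproduced with an in-memory imitation of QBE matching in Java. This is a sketch of the matching rule described above, not db4o's implementation; the Student class and the sample data are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Student {
    final String name; // null in a template means "any name"
    final int marks;   // 0 in a template means "any marks"

    Student(String name, int marks) {
        this.name = name;
        this.marks = marks;
    }
}

class QbeDemo {
    // Equality-only matching: null and 0 template fields do not participate.
    static boolean matches(Student template, Student candidate) {
        boolean nameOk = template.name == null || template.name.equals(candidate.name);
        boolean marksOk = template.marks == 0 || template.marks == candidate.marks;
        return nameOk && marksOk;
    }

    static List<Student> queryByExample(List<Student> stored, Student template) {
        List<Student> result = new ArrayList<>();
        for (Student s : stored) {
            if (matches(template, s)) {
                result.add(s);
            }
        }
        return result;
    }

    static List<Student> sample() {
        return Arrays.asList(
                new Student("Ann", 0),
                new Student("Bob", 75),
                new Student("Cid", 0));
    }

    public static void main(String[] args) {
        List<Student> stored = sample();
        // Works: exactly the students with 75 marks.
        System.out.println(queryByExample(stored, new Student(null, 75)).size());
        // Does not work as intended: 0 is ignored, so every student matches.
        System.out.println(queryByExample(stored, new Student(null, 0)).size());
    }
}
```

Because the template cannot distinguish "marks equal 0" from "marks not specified", a richer query mechanism is needed for such cases.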
   Object Data Management Group
    ◦ 1991 – 2001
    ◦ Goal: to provide a set of specifications for
      developers to write portable applications for OODB
      and ORM products
    ◦ Disbanded in 2001 after the last revision (ODMG v.
      3.0) was published.

   ODMG created a query language for OODBs
    ◦ OQL – Object Query Language
   Declarative Language
    ◦ Set of conditions describing a solution

    ◦ Opposed to imperative styles of C#/Java
      Programmer specifies list of instructions in a particular
       order

   E.g.
    ◦ SELECT e FROM Employee e WHERE e.address.city = "Mosta"
   Similar to SQL?
    ◦ At first glance, yes, BUT
      'e' is an object instance
      Uses dot notation to navigate through object
       references (accessing the city field of the Address instance in e)

    ◦ OQL can be used to return tables of values
      Specify fields in the query
        E.g. SELECT e.name, e.address.street FROM Employee e

    ◦ No INSERT or UPDATE available: OQL can only be used to
      retrieve data
   Q1: Find all blue vehicles manufactured by a
    company located in Malta and whose
    manager is under 50 years of age

SELECT vehicle
FROM    vehicle
WHERE (colour = "blue"
    AND manufacturer.loc = "Malta"
    AND manufacturer.manager.age < 50)
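For comparison, Q1 could be expressed over in-memory objects in Java roughly as follows. The class and field names follow the query; everything else (constructors, the method name, the use of streams) is an assumption for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class Person {
    final int age;

    Person(int age) {
        this.age = age;
    }
}

class Company {
    final String loc;
    final Person manager;

    Company(String loc, Person manager) {
        this.loc = loc;
        this.manager = manager;
    }
}

class Vehicle {
    final String colour;
    final Company manufacturer;

    Vehicle(String colour, Company manufacturer) {
        this.colour = colour;
        this.manufacturer = manufacturer;
    }
}

class PathQueryDemo {
    // Each dot in the path expression becomes a reference navigation.
    static List<Vehicle> blueMalteseYoungManager(List<Vehicle> extent) {
        return extent.stream()
                .filter(v -> v.colour.equals("blue")
                        && v.manufacturer.loc.equals("Malta")
                        && v.manufacturer.manager.age < 50)
                .collect(Collectors.toList());
    }
}
```

No join is written anywhere: the manufacturer and its manager are reached by following object references from each vehicle.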
   Q2: Assume class Employee with drives_car
    attribute whose domain is the class Vehicle. So
    find all blue vehicles driven by the Manager of
    the company that manufactured them

SELECT V
FROM   vehicle V
WHERE (colour = "blue"
     AND manufacturer.pres.drives_car = V)

   Returning all Vehicles which are blue and
    produced by the same company
   Q3: Assume class Employee has another
    attribute called manager whose domain is the
    class Employee. Find all managers of an
    employee named “Joe”

SELECT employee
FROM   employee
RECURSE employee.manager (Name = "Joe")

   … with cycles in the aggregation graph
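A recursive query like Q3 amounts to following the manager reference repeatedly. The Java sketch below does this in memory and guards against cycles; the class shape and method name are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class Employee {
    final String name;
    Employee manager; // null for the top of the hierarchy

    Employee(String name) {
        this.name = name;
    }
}

class ManagersDemo {
    // Follows employee.manager until the chain ends or an employee repeats.
    static List<Employee> managersOf(Employee e) {
        List<Employee> chain = new ArrayList<>();
        Set<Employee> seen = Collections.newSetFromMap(new IdentityHashMap<Employee, Boolean>());
        seen.add(e);
        Employee m = e.manager;
        while (m != null && seen.add(m)) { // seen.add is false on a cycle
            chain.add(m);
            m = m.manager;
        }
        return chain;
    }
}
```

Without the seen set, a cycle in the manager references would make the traversal loop forever.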
   Central features (from ODMG 2.0)
    ◦   Attributes and Relationships
    ◦   Object operations (behavior) and exceptions
    ◦   Multiple Inheritance
    ◦   Extents and keys
    ◦   Object Naming, lifetime and Identity
    ◦   Atomic, structured and collection literals
    ◦   List, set, bag and array collection classes
    ◦   Concurrency Control and object locking
    ◦   DB Operations
   The Object Model of ODMG
    ◦   Provides a standard model for object databases
    ◦   Supports object definition via ODL
    ◦   Supports object querying via OQL
    ◦   Supports a variety of data types and type
        constructors
   In Summary
    ◦ The basic building blocks of the object model are
         Objects
         Literals
   An object has four characteristics
    ◦ Identifier: unique system-wide identifier
    ◦ Name: unique within a particular database and/or
      program; it is optional
    ◦ Lifetime: persistent vs. transient
    ◦ Structure: specifies how object is constructed by the
      type constructor and whether it is an atomic object
   A literal has a current value but not an
    identifier

   Three types of literals
    ◦ atomic: predefined; basic data type values (e.g.,
      short, float, boolean, char)

    ◦ structured: values that are constructed by type
      constructors (e.g., date, struct variables)

    ◦ collection: a collection (e.g., array) of values or
      objects
   Interface == ODMG's key word for Class/Type

interface Date:Object {
 enum weekday{sun,mon,tue,wed,thu,fri,sat};
 enum Month{jan,feb,mar,…,dec};
 unsigned short year();
 unsigned short month();
 unsigned short day();
 boolean is_equal(in Date other_date);
};
   A collection object inherits the basic
    collection interface, to include methods
    such as:

        cardinality(): number of elements
        is_empty()
        insert_element()
        remove_element()
        contains_element()
        create_iterator(): to iterate over the collection
   Collection objects are further specialized into
    types like a set, list, bag, array, and dictionary

   Each collection type may provide additional
    interfaces, for example, a set provides:
        create_union()
        create_difference()
        is_subset_of()
        is_superset_of()
        is_proper_subset_of()
   Atomic objects are user-defined objects
    and are defined via keyword class

   Example:

class Employee (extent all_employees key ssn) {
     attribute string name;
     attribute string ssn;
     attribute short age;
     relationship Dept works_for;
     void reassign(in string new_name);
};
   An ODMG object can have an extent defined
    via a class declaration

    ◦ Each extent is given a name and will contain all
      persistent objects of that class

    ◦ For Employee class, for example, the extent is
      called all_employees

    ◦ This is similar to creating an object of type
        Set<Employee> and making it persistent
   A class key consists of one or more unique
    attributes

   For the Employee class, the key is ssn
    ◦ Thus each employee is expected to have a unique
      ssn

   Keys can be composite,
    ◦ Example:(key dnumber, dname)
   An object factory is used to generate
    individual objects via its operations

interface ObjectFactory {
     Object new ();
};

   new() returns new objects with an
    object_id

   One can create their own factory interface by
    inheriting the above interface
   ODMG supports two concepts for specifying
    object types:
    ◦ Interface
    ◦ Class

   There are similarities and differences between
    interfaces and classes

   Both have behaviors (operations) and state
    (attributes and relationships)
   An interface is a specification of the abstract
    behavior of an object type

    ◦ State properties of an interface (i.e. its attributes
      and relationships) cannot be inherited from

    ◦ Objects cannot be instantiated from an interface
   A class is a specification of abstract behavior
    and state of an object type

    ◦ A class is Instantiable

    ◦ Supports “extends” inheritance to allow both state
      and behavior inheritance among classes
   ODL supports semantic constructs of ODMG

   ODL is independent of any programming
    language

   ODL is used to create object specification
    (classes and interfaces)

   ODL is not used for database manipulation
   A very simple, straightforward class definition

class Degree {
    attribute string college;
    attribute string degree;
    attribute string year;
};
   A class definition with “extent”, “key”,
    and more elaborate attributes:

class Person (extent persons key ssn) {
     attribute struct Pname {string fname …} name;
     attribute string ssn;
     attribute date birthdate;
     short age();
};
   Note extends (inheritance) relationship
   Also note “inverse” relationship
class Faculty extends Person (extent faculty) {
    attribute string rank;
    attribute float salary;
    attribute string phone;
    relationship Dept works_in inverse
    relationship set<GradStu> advises inverse
    void give_raise (in float raise);
    void promote (in string new_rank);
};
   Inheritance via “:”

interface Shape {
  attribute struct point {…} reference_point;
  float perimeter ();
};

class Triangle: Shape (extent triangles) {
  attribute short side_1;
  attribute short side_2;
   OQL is ODMG's query language

   OQL works closely with programming
    languages such as C++

   Embedded OQL statements return objects
    that are compatible with the type system of
    the host language

   OQL's syntax is similar to SQL with additional
    features for objects
   Basic syntax: select…from…where…

    SELECT d.name
    FROM d in departments
    WHERE d.college = 'Engineering';

   An entry point to the database is needed for
    each query

   An extent name (e.g., departments in the
    above example) may serve as an entry point
   Iterator variables are defined whenever a
    collection is referenced in an OQL query

   Variable d in the previous example serves as
    an iterator and ranges over each object in the
    collection

   Syntactical options for specifying an iterator:
    ◦ d in departments
    ◦ departments d
    ◦ departments as d
   The data type of a query result can be any
    type defined in the ODMG model

   A query does not have to follow the
    select…from…where… format

   A persistent name on its own can serve as a
    query whose result is a reference to the
    persistent object. For example, departments;
    whose type is set<Departments>
   A path expression is used to specify a path to
    attributes and objects in an entry point

   A path expression starts at a persistent object
    name (or its iterator variable)

   The name will be followed by zero or more
    dot connected relationship or attribute names
    E.g., departments.chair;
   The define keyword in OQL is used to specify
    an identifier for a named query

   The name should be unique; if not, the
    results will replace an existing named query

   Once a query definition is created, it will
    persist until deleted or redefined

   A view definition can include parameters
   A view to include students in a department
    who have a minor:

    define has_minor(dept_name) as
    select s
    from   s in students
    where s.minor_in.dname=dept_name

   has_minor can now be used in queries
   An OQL query returns a collection

   OQL's element operator can be used to
    return a single element from a singleton
    collection that contains one element:

element (select d from d in departments
where d.dname = 'Software Engineering');

   If the collection is empty or has more than one element,
    an exception is raised
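The behaviour of element can be mimicked in Java with a small helper. This is an illustrative sketch; the helper name and the choice of IllegalStateException are assumptions, not part of OQL or of any OODBMS API.

```java
import java.util.Collection;

class ElementDemo {
    // Returns the only element of a singleton collection; otherwise throws.
    static <T> T element(Collection<T> c) {
        if (c.size() != 1) {
            throw new IllegalStateException(
                    "expected exactly one element, found " + c.size());
        }
        return c.iterator().next();
    }
}
```

Callers that expect a unique match (for example a lookup by key) get either the object itself or an immediate failure, never a silently truncated result.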
   OQL provides membership and quantification operators:
    ◦ (e in c) is true if e is in the collection c
    ◦ (for all e in c: b) is true if all e elements of collection c
      satisfy b
    ◦ (exists e in c: b) is true if at least one e in collection c
      satisfies b

   To retrieve the names of all students who completed
    CS101:
select s.name.fname, s.name.lname
from       s in students
where      'CS101' in
  (select c.name
   from c
   in s.completed_sections.section.of_course);
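The membership and quantification operators have close counterparts in Java collections and streams. The helper names below are chosen to mirror the OQL forms and are otherwise hypothetical; the course code MA110 is invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

class QuantifierDemo {
    // (e in c)
    static <T> boolean in(T e, Collection<T> c) {
        return c.contains(e);
    }

    // (for all e in c: b)
    static <T> boolean forAll(Collection<T> c, Predicate<T> b) {
        return c.stream().allMatch(b);
    }

    // (exists e in c: b)
    static <T> boolean exists(Collection<T> c, Predicate<T> b) {
        return c.stream().anyMatch(b);
    }

    public static void main(String[] args) {
        List<String> completed = Arrays.asList("CS101", "MA110");
        System.out.println(in("CS101", completed));
        System.out.println(forAll(completed, s -> s.length() == 5));
        System.out.println(exists(completed, s -> s.startsWith("PH")));
    }
}
```

Note that forAll is true for an empty collection, since no element violates the condition.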
   Collections that are lists or arrays allow
    retrieving their first, last, and ith elements

   OQL provides additional operators for
    extracting a sub-collection and concatenating
    two lists

   OQL also provides operators for ordering the
    results
   OQL also supports a grouping operator called
    group by

      To retrieve the average GPA of majors in
    each department having >100 majors:

select deptname, avg_gpa:
  avg (select p.s.gpa from p in partition)
  from s in students
  group by deptname: s.majors_in.dname
  having count (partition) > 100
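The same group by / having logic can be sketched over in-memory objects in Java. The Student class, the method name and the minMajors parameter are assumptions for illustration; each map entry plays the role of one partition.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class Student {
    final String dept; // name of the department the student majors in
    final double gpa;

    Student(String dept, double gpa) {
        this.dept = dept;
        this.gpa = gpa;
    }
}

class GroupByDemo {
    // Average GPA per department, keeping only departments with more than
    // minMajors students (the "having count(partition) > ..." condition).
    static Map<String, Double> avgGpaForLargeDepts(List<Student> students, int minMajors) {
        Map<String, List<Student>> partitions = students.stream()
                .collect(Collectors.groupingBy(s -> s.dept));
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Student>> p : partitions.entrySet()) {
            if (p.getValue().size() > minMajors) {
                double avg = p.getValue().stream()
                        .mapToDouble(s -> s.gpa)
                        .average()
                        .orElse(0.0);
                result.put(p.getKey(), avg);
            }
        }
        return result;
    }
}
```

Calling avgGpaForLargeDepts(students, 100) corresponds to the threshold used in the OQL query.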
   Object Database (ODB) vs. Relational
    Database (RDB)
    ◦ Relationships are handled differently
    ◦ Inheritance is handled differently
    ◦ Operations in ODB are expressed early on
      since they are a part of the class
   Relationships in ODB:
    ◦ Relationships are handled by reference attributes that
      include OIDs of related objects
    ◦ Single and collection of references are allowed
    ◦ References for binary relationships can be expressed
      in single direction or both directions via inverse

   Relationships in RDB:
    ◦ Relationships among tuples are specified by attributes
      with matching values (via foreign keys)
    ◦ Foreign keys are single-valued
    ◦ M:N relationships must be presented via a separate
      relation (table)
   Inheritance Relationships (ODB vs RDB)
    ◦ Inheritance structures are built in ODB (and achieved
      via “:” and extends operators)
    ◦ RDB has no built-in support for inheritance

   Early Specifications of Operations
    ◦ Another major difference between ODB and RDB is the
      specification of operations
      ODB: Operations specified during design (as part of class
       specification)
      RDB: Operations specification may be delayed until
       implementation
