					  ADBIS - DASFAA'2000
  Fourth International Symposium on Advances in Databases and Information Systems
  September 5-8, 2000, Prague, Czech Republic



                                                                Tutorial:
           Object-Oriented Query Languages and Views
                  Part 1: Basic concepts and issues


                             Lecturer:         Kazimierz Subieta
                             Polish-Japanese Institute of Information Technology,
                             Warsaw, Poland

                             Institute of Computer Science
                             Polish Academy of Sciences, Warsaw, Poland
                                          subieta@ipipan.waw.pl
                                          http://www.ipipan.waw.pl/~subieta
K.Subieta. Object-Oriented Query Languages and Views, slide 1                       Sept. 2000
                                              What Is a Query Language?
                 A lot of views...

                 User-friendly: A language for a person who does not want to know anything
                 about databases, but wants to operate with them.
                 Theoretical achievement: A syntactic variant of some famous mathematical
                 theory, e.g. logic. Used mainly to produce the next paper for the next conference (:-)).
                 Ad hoc interactive database language: A facility for quick retrieval and simple
                 updates through formalized commands (select...from..., update...set..., ...) or
                 through some simple visual interfaces: forms, graphs, menus, etc.
                 A very-high-level programming construct to be embedded into a popular
                 programming language, e.g. SQL embedded into C.
                 A very-high-level programming construct integrated with a new
                 programming language, e.g. PL/SQL, many 4GLs, and (recently) SQL3.



                                        Properties of Query Languages
             Abstract, conceptual, data independent: no concepts related to the physical level of
             data (file organizations, indices, moving data between disk and main memory, etc.).
             Declarative (non-procedural): determines what is to be done rather than how.
             Macroscopic: from the user's point of view, a query determines simultaneous
             (parallel) actions on many data.
             Natural: supporting a natural way of thinking of the user, supporting conceptual
             modeling, easy to learn and use.
             Efficient: acceptable response time (performance) through automatic query
             optimization.
             Universal: giving the possibility to express every (reasonable) request.
             Independent of an application domain: supporting all applications of the given
             DBMS.
             Lately bound, interpreted: supporting ad hoc queries and various abstractions
             stored in a database (views, stored procedures, active rules, etc.).

         A Query Language in a Database Environment

              A tool enabling an end user to query interactively (ad hoc),
              generate reports and update data stored in a database.
              Very-high-level conceptual language constructs for database programming.
              Defining integrity constraints preventing illegal operations/states.
              Defining subschemas and access restrictions.
              Defining virtual views, materialized views, derived (replicated) data, and
              procedures stored in a database.
              Components of scripts in 4GLs and RAD tools.
              Defining active rules (triggers) and deductive rules.
              Determining data to be selected and transmitted in distributed databases;
              interoperability between heterogeneous/remote databases (ODBC, JDBC, ...).
              ...... several other applications


                                                                Query Optimization
             A query language is unacceptable without automatic optimization of queries.
             Typical optimization methods:
             Rewriting. A query q1 is substituted with a semantically equivalent query q2
             promising better performance. E.g. performing selections before joins or
             removing dead (not used) parts of queries. The methods have no negative side
             effects. They require, however, regularity and homogeneity of the language
             definition and deep understanding of formal semantics;
             Auxiliary data structures and special data organizations: indices, tables of
             pointers, hash coding, etc;
             Caching results of queries and then reusing them (materialized views);
             Simultaneous optimization of many queries;
             Selecting an optimal query evaluation plan.
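             The rewriting rule "perform selections before joins" can be illustrated with a
             minimal sketch. The sample data, the function names and the cost measure (number
             of employee/department pairs examined) are assumptions made for illustration only;
             a real optimizer works on query trees, not on hand-written loops.

```python
# Hypothetical sample data; names and values are illustrative only.
employees = [
    {"name": "Brown", "salary": 3500, "dept": 1},
    {"name": "Jones", "salary": 2500, "dept": 2},
    {"name": "Smith", "salary": 2000, "dept": 1},
]
departments = [{"id": 1, "name": "R&D"}, {"id": 2, "name": "Sales"}]


def q1_join_then_select(min_salary):
    """Join every employee with every department, then filter by salary."""
    pairs_examined = 0
    joined = []
    for e in employees:
        for d in departments:
            pairs_examined += 1
            if e["dept"] == d["id"]:
                joined.append((e, d))
    result = [(e["name"], d["name"]) for e, d in joined if e["salary"] > min_salary]
    return result, pairs_examined


def q2_select_then_join(min_salary):
    """Filter by salary first, then join only the surviving employees."""
    pairs_examined = 0
    result = []
    for e in (e for e in employees if e["salary"] > min_salary):
        for d in departments:
            pairs_examined += 1
            if e["dept"] == d["id"]:
                result.append((e["name"], d["name"]))
    return result, pairs_examined


result1, cost1 = q1_join_then_select(2400)
result2, cost2 = q2_select_then_join(2400)
```

             Both variants return the same result, but the second examines fewer pairs because
             the selection has already discarded one employee before the join.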

    Optimization of object-oriented QLs is in its infancy. There is a lot of wishful thinking
    and poorly justified hopes concerning the optimization potential of particular theories.
    This is one of the reasons for the slow adoption of object QLs in the commercial world.
                  Optimizable and Non-optimizable Queries

               [Figure: a diagram of three regions labeled “An entire query language”,
               “Efficient (optimizable) queries” and “Most useful queries”.]

               In any query language there are non-optimizable queries.
               The non-optimizable part of a QL should be as small as possible.
               In real QLs not all useful queries are optimizable. Poorly defined, informal or
               irregular features of a QL decrease the optimization potential.
                                            OQL and SQL3 on One Slide
           OQL is a part of the ODMG standard. It is claimed to be a compatible extension
           of SQL, but actually OQL retains some syntactic patterns of SQL only.
           Semantically OQL is very different from SQL, because it follows an object model,
           which is incompatible with the relational model. OQL does not deal with updating
           and does not define SQL-like facilities such as views, triggers and stored
           procedures. OQL statements can be embedded into Java, C++ and Smalltalk, with
           a lot of impedance mismatch. The semantics of OQL is defined poorly and
           inconsistently, thus probably it is not fully implementable.
           SQL3 is a new SQL standard developed by ANSI and ISO. In contrast to its
           predecessors SQL3 is assumed to be a programming language with full
           computational power. The main data structure is a table, equipped however with a
           lot of options (thus using the term “relational” makes no sense). SQL3 supports
           user-defined abstract data types, including methods, object identifiers, subtypes,
           inheritance and polymorphism. Further facilities include control statements and
           parameterized types. Together with an extremely rich collection of various
           features, SQL3 is claimed to be downward compatible with SQL-92 and follows
           the sweet select...from...where... syntax. The standard is eclectic and extremely
           large (more than 1000 pages), thus it is probably not fully implementable.
               Is Java an Alternative to Query Languages?
                There is a lot of discussion around the role of Java in database programming.
                Java is a very important language.
                But...
                Java offers rather classical low-level (object-oriented) programming.
                The portability of Java bytecode is low-level. There are many details of database
                interfaces outside the Java bytecode standard.
                The Java object model is not powerful enough to be a database model.
                The Java database interfaces (JDBC, SQLJ,...) are not object-oriented. They
                present SQL interfaces to relational databases, wrapped into the Java syntax.
                Java does not solve the problem of storing important abstractions within a
                database (views, database procedures, triggers, etc.)

                Java itself is not an answer to database problems. It offers a much lower
                level of database programming than programming via query languages.
                Java + any query language does not avoid the impedance mismatch.

                  Requirements to Object Query Languages
               Conceptual simplicity, generality and minimality: clean and precise
               semantics, a small set of semantic primitives, no redundant constructs.
               Pragmatic universality: the possibility to formulate any request.
               Universality of an approach to semantics: no black holes in the semantics,
               treating every semantic inconsistency or irregularity as a very big problem.
               Compositionality and orthogonality: no big syntactic constructs, every
               reasonable combination of constructs should be allowed.
               Modularity: the possibility to create complex encapsulated reusable units.
               Homogeneous and consistent approach to all concepts of the underlying
               object model: complex objects, classes, types, interfaces, ADTs, inheritance,
               methods, encapsulation, polymorphism, etc.
               Homogeneous and consistent approach to integration of the query language
               with programming constructs (updating, etc.) and abstractions (views,
               methods, stored procedures, parameters of procedures, etc.).
               High potential for query optimization.

                                                    Coupling a QL with a PL
              Loose coupling (“embedding”): A QL is developed independently of a PL. An
              additional interface (“glue”) is implemented, enabling the use of QL within the
              underlying PL. The complexity of the interface presents a problem. The
              incompatibility between QL and PL is referred to as impedance mismatch. Typical
              cases: SQL + C, JDBC+Java, OQL + C++, OQL + Java
              Advantages: the programmers can deal with their favorite PL, the API to a
              database is independent of PLs.
              Disadvantages: aesthetically ugly, a lot of limitations, programs are longer than
              necessary, tricky programming, poor maintainability.

              Tight coupling (“seamless integration”): Queries are building blocks for
              programming constructs. No special interface between QL and programming
              constructs is provided.
              Typical cases: Oracle PL/SQL, many 4GLs, DBPL, LOQIS, SQL3
              Advantages: a consistent homogeneous solution, no impedance mismatch.
              Disadvantages: implies a new programming language, which is difficult to
              promote in the current commercial world.
                                         What is Impedance Mismatch?
              Incompatibility between a PL and a QL to be embedded. It concerns:
              Syntax: Two different grammars within one programming interface;
              Type systems: different types, lack of bulk types in PL, no static typing of QL;
              Semantics and pragmatics: declarative QL vs. procedural PL;
              Abstraction levels: data independence of QL vs. deep data dependence of PL;
              Binding phases and mechanisms: late binding of QL vs. early binding of PL;
              Name spaces and scoping rules: two incompatible name spaces the programmer
              deals with, stack based scoping rules of PL are not respected by QL;
              Null values: they are ignored in PL, special mapping tricks are required;
              Iteration mechanisms: implicit in QL (selection, projection,...), explicit in PL;
              Persistence: QL - only persistent data, PL - only volatile data; in PL special
              mechanisms are required to copy persistent data into volatile memory and vice versa.
              Generic programming: reflection in QL (see dynamic SQL), other techniques
              (e.g. casting, templates) in PL.
              Looking at the above, claims that ODMG Java + OQL avoid the impedance
              mismatch are at least dubious.
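              Several of the points above can be seen in a small loosely coupled program. The
              sketch below uses Python with its standard sqlite3 module purely as an assumed
              PL + QL pair (the slides name SQL + C, JDBC + Java and others); the table and its
              contents are hypothetical.

```python
import sqlite3

# An in-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Brown", 3500), ("Jones", None), ("Smith", 2000)],
)

# Syntax and typing: the query is an opaque string to the host language;
# its grammar and types are checked only when the database executes it.
rows = conn.execute(
    "SELECT name, salary FROM employee "
    "WHERE salary > ? OR salary IS NULL ORDER BY name",
    (3000,),
).fetchall()

# Null values and iteration: SQL NULL arrives as None and needs special
# handling, and the host program iterates explicitly over the result.
total = sum(salary if salary is not None else 0 for _, salary in rows)

conn.close()
```

              The query result has to be copied into host-language tuples before the program can
              work with it, which is the persistence point in the list above.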
                                   Object-Orientedness in Databases
                 The commercial world: manifestos, standards and products - no agreement.
                 OODB Manifesto, 3rd Generation DB Manifesto, Third Manifesto, ODMG standard,
                 SQL3 standard (SQL 1999, SQL 2000), persistent Java, XML as a database model,
                 CORBA as a database model, Gemstone, Versant, O2, ObjectStore, Poet, Objectivity/DB,
                 UniSQL, Oracle 8, Informix Dynamic Server, and others.
                 Useful. But... Eclectic solutions, legacy burden, design monsters, underspecified
                 semantics, non-universal solutions, redundant constructs, many inconsistencies.
                 The academic world: theories and prototypes - no agreement.
                 Nested relational algebras, F-logic, comprehensions, monoid calculus, object algebras,
                 object calculi, structural recursion, functional approaches, and others.
                 Theories neglect vital aspects of object-orientedness, present wishful thinking
                 (e.g. concerning a mapping from OQL), are limited, are conceptually or
                 mathematically invalid (e.g. object algebras). False stereotypes inherited from the
                 relational model (e.g. concerning the role of an algebra in query optimization).

                 Today's state of the art is PREMATURE (despite thousands of papers).
                 Stability is not expected sooner than in 5-10 years. Thus in this tutorial we will
                 discuss concepts without referring to a particular standard, product or theory.
                                                                 Complex Objects
              Conceptual modeling:
              Many data hierarchy levels (no limitations);
              Nested repeating attributes (collections);
              Large objects (BLOBs) as regular values;
              References (pointers) to other objects.

              [Figure: a complex EMPLOYEE object]
              EMPLOYEE
                  ENO E127
                  NAME Smith
                  PHOTO
                  JOB designer
                  JOB analyst
                  PREV_JOB
                      COMPANY IBM
                      WHEN 1975-77
                  PREV_JOB
                      COMPANY ICL
                      WHEN 1977-90
                  WORKS_IN
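              As a rough illustration, the EMPLOYEE object can be written down as a nested
              in-memory structure. The use of Python dictionaries and lists, the empty PHOTO
              placeholder and the stub company object are assumptions of this sketch; the figure
              does not show the target of WORKS_IN.

```python
# Placeholder target for the WORKS_IN reference; not shown in the figure.
some_company = {"kind": "COMPANY"}

employee = {
    "ENO": "E127",
    "NAME": "Smith",
    "PHOTO": b"",                    # large object (BLOB) as a regular value; empty placeholder
    "JOB": ["designer", "analyst"],  # repeating attribute (collection)
    "PREV_JOB": [                    # nested repeating complex attribute
        {"COMPANY": "IBM", "WHEN": "1975-77"},
        {"COMPANY": "ICL", "WHEN": "1977-90"},
    ],
    "WORKS_IN": some_company,        # reference (pointer) to another object
}

# Navigating two hierarchy levels down without any join.
previous_companies = [job["COMPANY"] for job in employee["PREV_JOB"]]
```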

                                                                 Classes
             Bad definitions:
             A class is a collection of objects (wrong: what about methods, inheritance, ...?);
             A class is a blueprint for objects (wrong: only creation of objects is regarded).
             Correct definition:
             A class is a conceptual (imaginary) entity storing invariant properties of objects.
             (Invariant properties are factored out from objects to their classes.)
             Typical invariant properties include:
             • Names and types of objects’ attributes (i.e. a type of an object);
             • Methods that can be applied to an object.
             But invariant properties can be of various kinds:
             • A name of an object (ODMG);
             • Rules for events or exceptions that can occur on objects (CORBA, ODMG);
             • Links (pointers, references, relationships) to other objects (ODMG);
             • Active rules (triggers) and integrity constraints;
             • Default values for attributes;
             • ...
             Sometimes a class is a regular run-time object (Self).
                                                    Encapsulation, Interfaces
            Encapsulation and information hiding
            are basic principles of any engineering, including software engineering.
            A TV set encapsulates a lot of details, which the user does not need to know.
            De-encapsulation of these details may result in an electric shock to the user.
            A similar threat concerns a programmer working on non-encapsulated software.

         Interface - general definition: It consists of everything that the programmer can
         use or has to know in order to correctly process the object. An interface should not
         include unnecessary (physical) details of objects’ construction or operation.
         Interface - particular definition (CORBA, ODMG, Java): An interface is a
         specification of all public properties (attributes and methods) of an object that the
         programmer can use in a particular context.
         Interface is a concept different from class (e.g. classes can be sold, interfaces - not);
         Interface is a concept different from type (e.g. exceptions are not relevant to types).

         Bad understanding of encapsulation (all attributes are private, only some methods
         are public) has led to the absurd thesis on contradiction between encapsulation
         and query languages.
                                                                 Inheritance
       If two or more classes have common invariants, then they can be factored out to
       another class. Hence classes are organized in a hierarchy.
       An object inherits invariants from its class and from all its superclasses.
       Multiple inheritance – many superclasses are allowed.

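       [Figure: factoring common invariants out into a superclass]
       Before factoring:
           Employee: name, date of birth, salary, age, salary net
           Student: name, date of birth, faculty, grades, age, average grade
       After factoring:
           Person: name, date of birth, age
               Employee (subclass of Person): salary, salary net
               Student (subclass of Person): faculty, grades, average grade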
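       The factored hierarchy can be sketched in an object-oriented PL as follows. Treating
       age, salary net and average grade as methods, and the tax_rate parameter, are
       assumptions of this sketch; the slide only lists the property names.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    """Invariants common to Employee and Student, factored out."""
    name: str
    date_of_birth: date

    def age(self, today: date) -> int:
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years if had_birthday else years - 1


@dataclass
class Employee(Person):
    salary: float = 0.0

    def salary_net(self, tax_rate: float) -> float:
        # tax_rate is a hypothetical parameter, not part of the slide.
        return self.salary * (1.0 - tax_rate)


@dataclass
class Student(Person):
    faculty: str = ""
    grades: list = field(default_factory=list)

    def average_grade(self) -> float:
        return sum(self.grades) / len(self.grades)


smith = Employee("Smith", date(1970, 5, 1), 2000.0)
jones = Student("Jones", date(1980, 12, 24), "Physics", [4, 5])
```

       Both smith and jones answer age(), although the method is stored only once, in Person.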



                                                         Methods and Messages
              A method is a procedure stored within a class.
              It acts on an environment consisting of:
              internal environment (attributes) of the currently processed object;
              private and public properties of the same class and public properties of all its
              superclasses;
              base environment, which includes database, volatile variables/objects of the user
              session and global environment (environment variables, libraries);
              public properties of currently active program modules.
              A message is a call of a method.
              Message passing does not mean parallel asynchronous communication between
              autonomous agents (this was a false association made by OO pioneers).
              The usual syntax for messages: object . methodName [( parameters )]
              e.g.                                               (Person where name = “Smith”). age
              Query languages can introduce other syntax for messages,
              e.g.                                               (Person where age > 30) . name
                                                                 Types
            A type is an expression that constrains the content of objects or values.
            Types determine input/output properties of operators, functions, procedures
            and methods. Specification of types is obligatory in strongly typed languages.
            Types formally restrict a context of the use of objects, operators, methods, etc.
            Strong (static) type checking (strong typing): every use of objects, operators,
            functions, methods, etc. is checked at compilation time.
            Typing safety: more than 80% of programmer’s errors are detected at compilation.
            Dynamic type checking (dynamic typing): types are checked at run-time. Less
            efficient w.r.t. detecting errors.
            ODMG OQL is strongly typed, but the typing system is inconsistent (S.Alagic).
            SQL and SQL3 are dynamically typed.

            Strong typing decreases the power of a language (generic programming) and
            presents a difficulty for developers of DBMS. Thus the attitude of the commercial
            world to strong typing is undetermined (officially approved, practically neglected).

                                                Links (references, pointers)
     Objects can be connected by explicit pointer links. Links support conceptual
     modeling (see OMT, UML, etc.). Links can be directly used in queries (through path
     expressions). They greatly simplify queries and are more efficient than joins.
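     [Figure: three EMPLOYEE objects linked to one COMPANY object]
     EMPLOYEE: NAME Brown, SALARY 3500, WORKS_IN (link to COMPANY)
     EMPLOYEE: NAME Jones, SALARY 2500, WORKS_IN (link to COMPANY)
     EMPLOYEE: NAME Smith, SALARY 2000, WORKS_IN (link to COMPANY)
     COMPANY: NAME Syntex, LOC London, BOSS (link to an EMPLOYEE),
              EMPLOYS (three links to EMPLOYEE objects)

     A path expression in SBQL, returning the name of Smith's boss:
     (EMPLOYEE where NAME = “Smith”).WORKS_IN.COMPANY.BOSS.EMPLOYEE.NAME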
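     The navigation performed by this path expression can be imitated with plain object
     references. This is only a sketch: the dictionaries stand in for database objects, and
     the choice of Brown as the boss is an assumption, since the text of the figure does not
     say which employee the BOSS link points to.

```python
company = {"NAME": "Syntex", "LOC": "London"}

employees = [
    {"NAME": "Brown", "SALARY": 3500, "WORKS_IN": company},
    {"NAME": "Jones", "SALARY": 2500, "WORKS_IN": company},
    {"NAME": "Smith", "SALARY": 2000, "WORKS_IN": company},
]

company["EMPLOYS"] = employees   # collection-valued link
company["BOSS"] = employees[0]   # assumption: Brown is the boss


def boss_names_of(employee_name):
    """Follow WORKS_IN, then BOSS, then NAME for each matching employee."""
    return [
        e["WORKS_IN"]["BOSS"]["NAME"]
        for e in employees
        if e["NAME"] == employee_name
    ]


smith_boss = boss_names_of("Smith")
```

     No join condition is written anywhere; each step simply follows a stored link.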
                                                                 Class Extents
          An extent is a named collection of objects being current members of a class.
          The concept has roots in the relational model, where the declaration of a table (a
          forerunner of the class concept) is tied to the creation of the table (i.e. the extent).
          In PLs declarations of types/classes are separated from declaration/creation of
          corresponding variables/objects.
          An extent is a somewhat doubtful concept. For example, given a class Person and its
          subclass Employee, the extent for Person and the extent for Employee have a
          non-empty intersection: some parts of Employee objects are “virtual” parts of the extent
          for Person. This can be a source of inconsistencies and programmers’ errors.
          It is also easy to imagine a situation where a class must have not one but many
          extents. E.g. the class EmployeePhotoAlbum has an extent for each employee.
          The extent concept probably offers very little to database designers and requires
          additional attention from programmers. In my opinion, it should be dropped.



                                                                 Collections
                The relational model deals only with one collection - a table (relation).
                Object models (ODMG) assume more collections, in particular:
                sets (no duplicates, no order);
                bags (duplicates are allowed, no order);
                sequences (duplicates are allowed, the order is informative);
                arrays - like sequences, but with no inserting/deleting of elements except at the top;
                access is through order numbers (indices).
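                The difference between sets, bags and sequences shows up when the same values
                are loaded into each of them. The mapping to Python's set, collections.Counter
                and list is an assumption of this sketch, not part of the ODMG definitions.

```python
from collections import Counter

values = ["analyst", "designer", "analyst"]
reordered = ["designer", "analyst", "analyst"]

as_set = set(values)        # no duplicates, no order
as_bag = Counter(values)    # duplicates are counted, no order
as_sequence = list(values)  # duplicates kept, order is informative

same_set = as_set == set(reordered)
same_bag = as_bag == Counter(reordered)
same_sequence = as_sequence == reordered
```

                Reordering the input changes only the sequence; dropping a duplicate would
                change the bag and the sequence but not the set.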

        Collections can be nested, e.g. collection-valued attributes are allowed.
        Moreover, collection-valued pointer links are allowed too.

        Collections are very important for conceptual modeling.
        Nested collections simplify queries (navigations instead of joins).
        Typical object-oriented programming languages have no explicit collection types;
        collections must be modelled by some tricks.

                                                            The OODBMS Ideals
                 Orthogonal persistence: the same types can be applied to persistent and
                 volatile objects. Ergo: a query language should process persistent and volatile
                 data uniformly. This ideal greatly reduces the complexity of a QL.
                 Object relativism: each object consists of objects; there are no other concepts
                 that are used for description of objects. Ergo: (atomic) attributes are objects,
                 links are objects, each object can be a component of a higher-level object, etc.
                 This ideal greatly reduces the complexity of a QL.
                 Total internal identification: each run-time program entity, which can be
                 separately retrieved, bound, updated, inserted, indexed, protected, locked, etc.,
                 must possess a unique internal identifier. Internal identifiers can be used as
                 references (l-values) by various language constructs (e.g. updating). This ideal
                 greatly simplifies the semantics of a QL and makes it consistent.


                 Unfortunately, the above ideals are not respected by commercial OODBMSs
                 and standards.

      Syntax, Semantics and Pragmatics of Languages
         Each language, including query languages, has three aspects:
         Syntax: describes how to build correct expressions of the language from the
         alphabet (basic symbols).
         Semantics: describes what expressions of the language denote. Semantics is the
         basis for implementation, in particular for query optimization. Semantics is usually
         syntax-driven: semantic rules are built on top of syntax rules.
         Pragmatics: describes how to use the language to accomplish practical needs. It
         deals with mapping concrete problems or tasks onto expressions of the language.
         Pragmatics is informal, important for teaching, frequently the most difficult for
         users. (Even the developers of SQL3 have problems using their own creation!)
         Many languages are explained through syntax and pragmatics. Few languages are
         specified through precise formal semantics.

         If semantics is underspecified, then each implementation of the language augments
         the specification in its own way. This is the reason for the low portability of languages.

                                         Semantics of Query Languages
    Semantics of a query is a function which maps a state into a result.
    For any query language, what do we have to define?
                Query - a syntactic domain of all queries;
                Result - a domain containing all possible results of queries;
                State - a domain containing all possible states.

    General definitions of semantics:
               sem : Query → (State → Result)              For queries with no side effects;
               sem : Query → (State → (Result × State))    For queries with side effects;
               sem : Query → (State → State)               For imperative queries (e.g. the
                                                           update clause of SQL).
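    The three shapes of sem can be mirrored by higher-order functions. In this sketch a
    state is a dictionary of named collections and a "query" is reduced to the arguments
    of a Python function; both simplifications, and all function names, are assumptions
    made for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, List[int]]
Result = List[int]


def sem_select(name: str) -> Callable[[State], Result]:
    """Query -> (State -> Result): a query with no side effects."""
    def run(state: State) -> Result:
        return list(state[name])
    return run


def sem_take(name: str) -> Callable[[State], Tuple[Result, State]]:
    """Query -> (State -> (Result x State)): returns a result and a new state."""
    def run(state: State) -> Tuple[Result, State]:
        new_state = {**state, name: []}
        return list(state[name]), new_state
    return run


def sem_raise(name: str, amount: int) -> Callable[[State], State]:
    """Query -> (State -> State): an imperative query, like an SQL update."""
    def run(state: State) -> State:
        return {**state, name: [v + amount for v in state[name]]}
    return run


state = {"salary": [2000, 2500]}
selected = sem_select("salary")(state)
taken, state_after_take = sem_take("salary")(state)
state_after_raise = sem_raise("salary", 100)(state)
```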

    Formal semantics can be different from implementation (see SQL).
    Stateless approaches to query languages (logic, algebra) have severe limitations.
                                          Four sides of a query language
           As follows from the previous slide, description of any query language must involve
           four sides:

           Description of data stored in a database that are to be queried, i.e. the description
           of data/object model (formal definition of the set State);

           Description of syntax (formal definition of the set Query);

           Description of results returned by queries (formal definition of the set Result);

           Description of the mapping from queries into results (formal definition of the
           mapping sem).

           In practical languages these definitions are usually incomplete, inconsistent or
           even sloppy. If one is serious about implementation and query optimization,
           all these definitions must be as clean and precise as possible.

           Unfortunately, it is not easy to present all sides in detail during 180 minutes
           of the tutorial.
K.Subieta. Object-Oriented Query Languages and Views, slide 25                                 Sept. 2000
                                                                 What is “state”?
         For real QLs the state is more than the database state. A state includes:
         database: all data, objects, classes, methods, etc. in the database;
         volatile objects/variables/... of the run-time environment of a user session;
         local objects/variables/... of all currently executed procedures, functions, methods;
         global environment: environment variables, libraries, files, etc.

      A state also includes temporary internal structures of the query processing machine.
      Example: get the 10 best-paid employees:
      select * from Employee as x
      where count( select * from Employee as y where y.salary > x.salary ) < 10

      If the subquery select * from Employee as y where y.salary > x.salary
      is to be evaluated independently of its context, then the “free” variable x must be
      stored in an internal structure of the query processing machine.
      This internal structure (a stack) augments the concept of “state”.
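
      A minimal Python sketch of this idea follows. It is only an illustration under
      stated assumptions: employees are plain dictionaries, and the names env_stack,
      count_better_paid and best_paid are hypothetical. The subquery is a separate
      function that finds the free variable x on an explicit stack.

```python
def count_better_paid(env_stack, employees):
    """The subquery: counts employees y with y.salary > x.salary.

    The free variable x is not a parameter; it is looked up in the top
    section of the stack, which is part of the evaluation state.
    """
    x = env_stack[-1]["x"]
    return sum(1 for y in employees if y["salary"] > x["salary"])


def best_paid(employees, n=10):
    """The outer query: keeps x if fewer than n employees earn more than x."""
    env_stack = []
    result = []
    for x in employees:
        env_stack.append({"x": x})           # open a scope that binds x
        try:
            if count_better_paid(env_stack, employees) < n:
                result.append(x["name"])
        finally:
            env_stack.pop()                  # close the scope
    return result


staff = [
    {"name": "A", "salary": 100},
    {"name": "B", "salary": 200},
    {"name": "C", "salary": 300},
    {"name": "D", "salary": 300},
]
top2 = best_paid(staff, n=2)
```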
K.Subieta. Object-Oriented Query Languages and Views, slide 26                            Sept. 2000
                                                           The Closure Property
           It is a property of a query language (or a theoretical framework) saying that
           the input and output of queries should belong to the same formal domain.
           For relational QLs, the input consists of tables and the output is a table. For object-
           oriented QLs the input is a set of objects and the output is a set of objects too.
           According to its advocates, the closure property is a condition for nesting queries.
           I disagree. Essentially, the closure property is a false, inconsistent stereotype
           inherited from the relational model. The closure property does not hold even for
           SQL. E.g. input tables are named and output tables are unnamed; hence there is a
           big semantic difference between input and output tables.
           For object-oriented QLs the closure property is conceptual nonsense.
           In particular, it leads to the subdivision of queries into “object preserving” and
           “object generating”, which is nonsense too.


           We will formulate the semantics of QLs in consistent and formally correct terms,
           without subdividing queries into “object preserving” and “object generating”.
K.Subieta. Object-Oriented Query Languages and Views, slide 27                               Sept. 2000
                                                                 Results of Queries
               In ODMG terms, queries return literals. We generalize this concept.
               The recursive definition below defines the domain Result:
               Each atomic value ∈ Result.
               Each reference (to an object, attribute, link, method, view, etc.) ∈ Result.
               If v ∈ Result and n is a name, then n(v) ∈ Result. Such results will be called
               binders.
               If v1, v2, v3, ... ∈ Result, then row{ v1, v2, v3, ... } ∈ Result. In general, the order
               of elements in the row is essential. This construct generalizes a tuple known from
               relational systems.
               If v1, v2, v3, ... ∈ Result, then set{ v1, v2, v3, ... }, bag{ v1, v2, v3, ... },
               sequence{ v1, v2, v3, ... }, ... ∈ Result.
               There are no other results.

       In our terms queries never return objects, but can return references to objects,
       or more precisely, some structures built upon references, values and names.
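
       The recursive definition can be written down as data types. The Python sketch
       below is an assumption-laden illustration, not part of the tutorial: the class
       names Ref, Binder, Row and Bag and the checker is_result are hypothetical, and
       only bags are modelled (sets and sequences would be analogous).

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Ref:
    """A reference to an object, attribute, link, method, view, etc."""
    ident: str


@dataclass(frozen=True)
class Binder:
    """A named result n(v)."""
    name: str
    value: "Result"


@dataclass(frozen=True)
class Row:
    """row{v1, v2, ...}; the order of elements matters."""
    items: Tuple["Result", ...]


@dataclass(frozen=True)
class Bag:
    """bag{v1, v2, ...}; duplicates are allowed."""
    items: Tuple["Result", ...]


Atomic = Union[int, float, str, bool]
Result = Union[Atomic, Ref, Binder, Row, Bag]


def is_result(v) -> bool:
    """Checks membership in the domain Result, following the recursive definition."""
    if isinstance(v, (int, float, str, bool, Ref)):
        return True
    if isinstance(v, Binder):
        return isinstance(v.name, str) and is_result(v.value)
    if isinstance(v, (Row, Bag)):
        return all(is_result(x) for x in v.items)
    return False                             # there are no other results


example = Bag((Row((Ref("i1"), Ref("i21"))), Row((Ref("i7"), Ref("i26")))))
```

       Note that no class represents an object itself: a result can only refer to an
       object through Ref.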

K.Subieta. Object-Oriented Query Languages and Views, slide 28                                       Sept. 2000
                                                  Example results of queries
                         Atomic:
                         25, "Smith", i11, i18                   (i1, i2, ... denote references)
                         Complex:
                         row{i1, i21}
                         bag{ row{i1, i21}, row{i7, i26} }
                         bag{ row{ 2, Lecture(i26), Stud( bag{
                                                    row{ n("Russel"), y(i36) },
                                                    row{ n("Black"), y(i30) } } ) } }
         We present bags of rows as rectangular tables (similarly to relational tables),
         for example:

                     4           i33           p(i1)             “Russell”   i15   i2   52 lect(i21)
                                 i40           p(i7 )            “Jones”           i8   44 lect(i26)
                                 i47                             “Black”


K.Subieta. Object-Oriented Query Languages and Views, slide 29                                                        Sept. 2000
                                                                 Tutorial:
           Object-Oriented Query Languages and Views
                 Part 2: The Stack-Based Approach
K.Subieta. Object-Oriented Query Languages and Views, slide 30                      Sept. 2000
          Why the stack-based approach (SBA) to QLs?
The motto of SBA (frequently neglected by database researchers):
Every semantic problem, even an apparently small one, is a big problem.

             We would like to achieve:
             Universality concerning both data structures and QL/PL functionalities;
             Modularity and compositionality (the hierarchy of conceptual abstractions);
             Regularity, full orthogonality of the concepts;
             Minimality of semantic primitives;
             Clean and precise (formal) semantics;
             Uniform approach and full integration with procedural capabilities:
             updating, procedures, views, methods, etc.;
             Precise treatment of object-oriented concepts (classes, encapsulation, ...);
             High potential for query optimization.

K.Subieta. Object-Oriented Query Languages and Views, slide 31                              Sept. 2000
                                             The Environment Stack (ES)
             In PLs the environment stack is a basic mechanism to accomplish:
             Abstraction: the programmer can abstract from internal details of procedures.
             Semantic independence and program reuse: the meaning and the behaviour of
             procedural abstractions are independent of the context of their use.
             Recursion: a procedure (function, method, view) can call other procedures and,
             in particular, can call itself. Encapsulation of local environments is preserved.
             Consistent binding: a name x is bound to the most local definition or declaration
             of x. Other definitions or declarations of the name x should be allowed.
             Parameter passing: the stack makes it possible to store and manage parameters
             of procedures and to implement parameter passing methods consistently.
             Proper scoping: a program entity should act only on the data environment and
             name space that the programmer who wrote the entity was aware of.
             In SBA the environment stack has a new role: it is a consistent semantic
             mechanism for the definition and implementation of query operators.


K.Subieta. Object-Oriented Query Languages and Views, slide 32                            Sept. 2000
                                       Assumptions of the SBA: syntax
         Unification of PL expressions and queries:

         2+2
         (x + y) * z
         (x + (EMP where (NAME = “Smith”)).SAL) * z

         Here EMP, NAME, SAL are persistent data; x, y, z are volatile data.

         Atomic queries:      2, “Smith”, 1000, ... (literals);  x, y, z, EMP, NAME, SAL, ... (names)
         Binary operators:    +, -, *, /, =, >, where, ., ...
         Unary operators:     sin, sqrt, sum, count, distinct, ...
         If q1, q2 are queries and θ is a binary operator, then q1 θ q2 is a query.
         If q is a query and φ is a unary operator, then φ(q) is a query.

          Abstract syntax and compositionality: no big syntactic/semantic constructs,
          e.g. no famous select ... from ... where ... group by ... having ... order by ...
          Big constructs decrease orthogonality, maintainability, reusability and optimization
          potential, are more difficult to implement, and cause irregularities.
K.Subieta. Object-Oriented Query Languages and Views, slide 33                                               Sept. 2000
                                Assumptions of the SBA: semantics
      The naming-scoping-binding principle:
      Each name occurring in a query is bound to run-time database/program entities
      (persistent objects, volatile objects, attributes, procedures, parameters, views,
      methods,...) according to the actual scope for the name.
       This concerns:
       • names of persistent objects;
       • names of objects’ attributes, sub-attributes, ...;
       • auxiliary names (“variables”) defined within a query;
       • names of transient objects, programming variables and their attributes;
       • names of procedures, methods, operators, ...;
       • names of parameters of procedures, methods,...;
       • ..... any other name.
        • Scopes are organized in ES with the “search-from-the-top” rule.
        • ES is separated from the object store.
        • Binding of a name can be multi-valued (macroscopic binding).

    In SBA the domain State consists of: Object Store + Environment Stack.
K.Subieta. Object-Oriented Query Languages and Views, slide 34                        Sept. 2000
                                                     An Abstract Store Model
       A component of the “state”.

       I - a set of internal identifiers
       N - a set of external names
       V - a set of atomic values, blobs, compiled procedures, ...

       Objects:
       < i, n, v >       atomic object
       < i1, n, i2 >     link object
       < i, n, T >       complex object (T is a set of objects)

       Store:   a set of objects + a set of identifiers (roots) + some obvious constraints
                (uniqueness of identifiers, referential integrities)


             No record, tuple, array, set, and bag constructors in the model: essentially all
             of them are collections of objects (“environments”).
             No uniqueness of external names on any level of data hierarchy: modeling
             bulk data.
             Uniform treatment of relational, object-relational and pure object databases.
             Classes, inheritance and encapsulation require an extension (discussed later).

K.Subieta. Object-Oriented Query Languages and Views, slide 35                                               Sept. 2000
                                                                 Tiny Database

       Objects (identifier, name, value; sub-objects are indented under their owner):

       i1 EMP
            i2 NAME Brown
            i3 SAL 2500
            i4 WORKS_IN i13
       i5 EMP
            i6 NAME Smith
            i7 SAL 2000
            i8 WORKS_IN i17
       i9 EMP
            i10 NAME Jones
            i11 SAL 1500
            i12 WORKS_IN i17
       i13 DEPT
            i14 DNAME Toys
            i15 LOC Paris
            i16 LOC London
       i17 DEPT
            i18 DNAME Sales
            i19 LOC Berlin

       Schema:
       EMP[0..*]:   NAME, SAL, JOB[0..1], age, WORKS_IN (link to DEPT)
       DEPT[0..*]:  DNAME, LOC[1..*]
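
       The abstract store and the “Tiny database” can be sketched in Python as follows.
       The representation is an assumption chosen for illustration: the names Link, Obj,
       STORE, ROOTS and check_store are hypothetical, and an object < i, n, ... > is
       stored as an entry of a dictionary keyed by its identifier i.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Link:
    """Content of a link object < i1, n, i2 >: the identifier i2 it points to."""
    target: str


# Atomic value, link, or (for a complex object) identifiers of its sub-objects.
Content = Union[int, str, Link, List[str]]


@dataclass
class Obj:
    name: str
    content: Content


STORE: Dict[str, Obj] = {
    "i1": Obj("EMP", ["i2", "i3", "i4"]),
    "i2": Obj("NAME", "Brown"),
    "i3": Obj("SAL", 2500),
    "i4": Obj("WORKS_IN", Link("i13")),
    "i5": Obj("EMP", ["i6", "i7", "i8"]),
    "i6": Obj("NAME", "Smith"),
    "i7": Obj("SAL", 2000),
    "i8": Obj("WORKS_IN", Link("i17")),
    "i9": Obj("EMP", ["i10", "i11", "i12"]),
    "i10": Obj("NAME", "Jones"),
    "i11": Obj("SAL", 1500),
    "i12": Obj("WORKS_IN", Link("i17")),
    "i13": Obj("DEPT", ["i14", "i15", "i16"]),
    "i14": Obj("DNAME", "Toys"),
    "i15": Obj("LOC", "Paris"),
    "i16": Obj("LOC", "London"),
    "i17": Obj("DEPT", ["i18", "i19"]),
    "i18": Obj("DNAME", "Sales"),
    "i19": Obj("LOC", "Berlin"),
}
ROOTS = ["i1", "i5", "i9", "i13", "i17"]


def check_store(store: Dict[str, Obj], roots: List[str]) -> bool:
    """Referential integrity: every identifier used in the store must exist.

    Uniqueness of identifiers is guaranteed by the dictionary keys.
    """
    for obj in store.values():
        if isinstance(obj.content, Link) and obj.content.target not in store:
            return False
        if isinstance(obj.content, list) and any(c not in store for c in obj.content):
            return False
    return all(r in store for r in roots)
```

       Bulk data need no special constructor here: the three EMP objects, or the two LOC
       sub-objects of i13, are simply objects with the same name on the same level.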




K.Subieta. Object-Oriented Query Languages and Views, slide 36                                                 Sept. 2000
                                        Universality of the Store Model
                Complex hierarchical objects can be defined (no limit on the number of levels);
                Programming variables are treated as objects;
                Orthogonal persistence: we abstract from the persistence status of objects and
                variables, i.e. we define in the same way persistent and transient objects;
                Object relativity: no difference in treatment of objects on any hierarchy level -
                big advantage for the universality, minimality and simplicity of semantics.
                Total internal identification: each entity stored in the store model, including
                attributes, links, BLOBs, methods, views, etc. has a unique internal identifier;
                Binary relationships (associations) can be defined via link objects; as in the
                ODMG model we do not deal with ternary and higher order relationships and
                attributes of relationships (they must be reduced to binary ones);
                Bulk data: we deal with sets/bags. They are modeled by the same name assigned
                to many objects on the same hierarchy level;
                Relational structures: each tuple is understood as an object with subobjects.


K.Subieta. Object-Oriented Query Languages and Views, slide 37                                 Sept. 2000
                                                                 What is binding?
   Binding is the substitution of a name occurring in a query or a program
   by a run-time program entity (or entities).

   For example:
   • a procedure name occurring in a program is substituted by a call to machine code;
   • a variable name is substituted by a main-memory address;
   • an attribute name is substituted by an offset relative to the beginning of a structure;
   • an object name is substituted by an object identifier.

                 Binding is early or static, if the substitution is made before the program is
                 executed (i.e. during compilation and linking).

                 Binding is late or dynamic, if the substitution is made during run time.


    In query languages binding is usually dynamic, because of dynamic database features
    (inserting new data, removing data, creating/removing views, etc.)
    Static binding is sometimes used for optimization.
K.Subieta. Object-Oriented Query Languages and Views, slide 38                              Sept. 2000
                                                                 What is binder?
A binder is an internal structure used to determine bindings.
A binder consists of two parts:
      • an external name defined by the application designer or programmer;
      • an internal run-time program entity, e.g. an object identifier, a value, a procedure code.
   This is an abstract view. In an implementation binders may not be so explicit.

   Binding: for each external name occurring in a query a proper binder is found;
   then the name is substituted by the corresponding internal entity.

   Binders are written as n(x), where n is an external name, x is an internal entity.
   For a binder n( i ) name n may be different from the name of the object identified by i.

   In query languages binders have additional roles.
   Binders can be nested, i.e. x may consist of binders.
   In general, a binder will be understood as a query result equipped with a name.
            General definition of binders:       if n ∈ N and r ∈ Result, then n( r ) is a binder.
K.Subieta. Object-Oriented Query Languages and Views, slide 39                                                   Sept. 2000
                                                       The Environment Stack
       It consists of sections. Each section is a set of binders.
       The stack is growing and shrinking according to program/query nesting.

       The most local data are at the top; the most global data are at the bottom.

       Top
          ......
          Binders to local entities of the currently executed method
          .....
          NAME(i2) SAL(i3) WORKS_IN(i4)                   (the section of the currently processed object)
          Binders to global entities of the user session
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)     (the section of the “Tiny database”)
          Binders to entities of the global environment
       Bottom
K.Subieta. Object-Oriented Query Languages and Views, slide 40                                                        Sept. 2000
                        Binding through the environment stack
Binding a name - search from the top. Example ES (top section first):

          ......
          G(“Mary”) X(i221)
          .....
          NAME(i2) SAL(i3) WORKS_IN(i4)
          ...
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
          ...

          Binding( G ) = “Mary”
          Binding( X ) = i221
          Binding( SAL ) = i3
          Binding( EMP ) = {i1, i5, i9}
          Binding( DEPT ) = {i13, i17}

         • First the top section is visited, then lower sections are visited.
         • The search stops at the first section that contains a binder with the given name.
         • All binders with that name in that section form the result of the search.
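
         The search rule can be sketched in a few lines of Python. This is an illustration
         under assumptions: a binder n(x) is represented as a pair (n, x), a section as a
         list of binders, the stack as a list whose last element is the top, and the
         function name bind is hypothetical.

```python
from typing import List, Tuple

Binder = Tuple[str, object]      # (external name, internal entity)
Section = List[Binder]


def bind(stack: List[Section], name: str) -> List[object]:
    """Binds a name by searching the stack from the top (the last section)."""
    for section in reversed(stack):
        found = [entity for (n, entity) in section if n == name]
        if found:
            return found         # all binders with this name in the first matching section
    return []                    # no section contains a binder with this name


# The stack from the example above, listed here from bottom to top.
es = [
    [("EMP", "i1"), ("EMP", "i5"), ("EMP", "i9"), ("DEPT", "i13"), ("DEPT", "i17")],
    [("NAME", "i2"), ("SAL", "i3"), ("WORKS_IN", "i4")],
    [("G", "Mary"), ("X", "i221")],
]
```

         For example, bind(es, "EMP") returns all three EMP identifiers at once, which is
         the multi-valued (macroscopic) binding mentioned earlier.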
K.Subieta. Object-Oriented Query Languages and Views, slide 41                                                        Sept. 2000
                    Opening a new scope by a query operator
      In PLs opening a new scope (an activation record) at the top of an environment
      stack is associated with an activation of a block or a procedure.
      In SBA a new scope at the top of the environment stack is opened to evaluate a
      query component in the context determined by another component.

       Example:   EMP where SAL > 1000
                  EMP is the context, where is the operator, and SAL > 1000 is the subquery
                  evaluated in the context.

       The ES state before where:
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
       EMP is bound to {i1, i5, i9}.

       The ES state in one iteration (top section first):
          NAME(i2) SAL(i3) WORKS_IN(i4)                   (the new scope opened by where)
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
       SAL is bound to i3.
K.Subieta. Object-Oriented Query Languages and Views, slide 42                                                        Sept. 2000
                                                                 Function “nested”
       Given identifier i of a complex object, the function nested returns binders
       to direct sub-objects of the object identified by i.
       For “Tiny database” and the query EMP where SAL > 1000:
       EMP is the context and yields {i1, i5, i9}; SAL > 1000 is the subquery evaluated in the context.

                nested( i1 ) = { NAME( i2 ), SAL( i3 ), WORKS_IN( i4 ) }
                nested( i5 ) = { NAME( i6 ), SAL( i7 ), WORKS_IN( i8 ) }
                nested( i9 ) = { NAME( i10 ), SAL( i11 ), WORKS_IN( i12 ) }

       Function nested determines a new scope - the environment in which the subquery
       will be evaluated. The scope is pushed onto the top of the environment stack.

       Function nested is naturally generalized for any r ∈ Result.
       If l is a link, then nested( l ) returns the binder to the entity that the link points to.
       If b is a binder, then nested( b ) returns b (no change).
       For rows, nested( row{v1, v2, ...} ) = nested(v1) ∪ nested(v2) ∪ ...
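
       A minimal Python sketch of nested follows, covering the cases listed above. The
       representation is assumed for illustration only: Ref and Binder are hypothetical
       classes, a row is a Python tuple, and STORE holds just a fragment of the
       “Tiny database” as identifier -> (name, content).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    ident: str


@dataclass(frozen=True)
class Binder:
    name: str
    value: object


# A fragment of the "Tiny database". A list content marks a complex object,
# a Ref content marks a link object, anything else is an atomic value.
STORE = {
    "i1": ("EMP", ["i2", "i3", "i4"]),
    "i2": ("NAME", "Brown"),
    "i3": ("SAL", 2500),
    "i4": ("WORKS_IN", Ref("i13")),
    "i13": ("DEPT", ["i14", "i15", "i16"]),
    "i14": ("DNAME", "Toys"),
    "i15": ("LOC", "Paris"),
    "i16": ("LOC", "London"),
}


def nested(r):
    """Returns the list of binders that form the new scope for the result r."""
    if isinstance(r, Binder):
        return [r]                                   # a binder is returned unchanged
    if isinstance(r, tuple):                         # a row: union over its elements
        binders = []
        for v in r:
            binders.extend(nested(v))
        return binders
    if isinstance(r, Ref):
        _name, content = STORE[r.ident]
        if isinstance(content, list):                # complex object: its sub-objects
            return [Binder(STORE[c][0], Ref(c)) for c in content]
        if isinstance(content, Ref):                 # link: the entity it points to
            return [Binder(STORE[content.ident][0], content)]
    return []                                        # atomic values, atomic objects
```

       The last line reflects a choice made in this sketch: for atomic values and
       references to atomic objects the new scope is empty.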
K.Subieta. Object-Oriented Query Languages and Views, slide 43                                                      Sept. 2000
                                                             The Language SBQL
        A formalized variant of SQL-like languages, including ODMG OQL and SQL3.
        It is relevant to relational, object-relational and object-oriented models.
        Abstract syntax, free of sugar.

        Syntax:                            Literals, names, unary or binary operators, parentheses.

        Orthogonality. Example queries:
        1000      EMP      SAL      2+2      SAL > 1000      EMP where (SAL > 1000)

        ((( EMP where (SAL > 1000)) . WORKS_IN ) . DEPT ) . (DNAME, LOC)

        Relativity:
        Each query is evaluated relative to the state of the environment stack.
        Thus queries such as SAL > 1000 have semantics independent of the
        context (provided that ES contains a SAL binder).

        Query operators are subdivided into algebraic and non-algebraic.
        Algebraic operators do not deal with the environment stack.
        Non-algebraic operators operate on the environment stack.
K.Subieta. Object-Oriented Query Languages and Views, slide 44                                                Sept. 2000
                                             SBQL - Algebraic Operators

          • Numerical and string comparisons, operators and functions:
             =, <, +, *, concatenation, sqrt, sin, log, ...
          • Boolean and, or, and not
          • Aggregate arithmetic functions sum, max, min, avg
          • Function count, function for removing duplicates, function exists
          • Equality of complex query results (shallow, deep)
          • Dereferencing operator (usually implicit)
          • Coercion operators (changing representation and types);
          • Operators for bags (union, intersection, difference, equality, ...)
          • Operators for sets (union, intersection, difference, equality, containment)
          • Operators for sequences (concatenation, i-th element, sorting, ...)
          • Cartesian product
          • ... many other operators
         Examples of use:   2+2      SAL > 1000      NAME = “Smith” and SAL > 1000
                            (SAL and NAME are implicitly dereferenced)
K.Subieta. Object-Oriented Query Languages and Views, slide 45                                                Sept. 2000
                   SBQL - Declaration of an Auxiliary Name
       Unary algebraic operator.

       Syntax:      q as n
                    n - an auxiliary name,
                    q - a query returning a single-column table, e.g. bag{x1, x2, ..., xk}

       Semantics:   Let q return the bag{x1, x2, ..., xk}.
                    Then q as n returns the bag of binders bag{n(x1), n(x2), ..., n(xk)}.
                    Each value returned by q is equipped with the name n.

       Applications:
       SQL, OQL correlation variables (“synonyms”):    EMP as e     DEPT as d
       Variables bound by quantifiers:                 ∃ EMP as e ( ... )     ∀ EMP as e ( ... )
       Cursors in “for each” statements:               for each EMP as x do ...
       Virtual names in SQL-like views.
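
       Because the operator is algebraic, it needs no access to the environment stack.
       A minimal Python sketch, assuming a bag is a list and a binder n(x) is a pair
       (n, x); the function name as_name is hypothetical:

```python
from typing import List, Tuple


def as_name(bag: List[object], name: str) -> List[Tuple[str, object]]:
    """q as n: equips each element returned by q with the name n."""
    return [(name, x) for x in bag]


emp = ["i1", "i5", "i9"]            # the result of the query EMP
emp_as_e = as_name(emp, "e")        # the result of the query EMP as e
```

       When such a binder is later processed by a non-algebraic operator, nested returns
       it unchanged, so the name e becomes available on the stack for binding.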
K.Subieta. Object-Oriented Query Languages and Views, slide 46                                                     Sept. 2000
                                    SBQL - Non-algebraic Operators
  If θ is a non-algebraic operator, then in q1 θ q2 the evaluation order of q1 and q2 is
  essential. Hence non-algebraic operators do not follow the basic properties of algebraic
  expressions. In contrast to relational/object algebras, our non-algebraic operators are
  not indexed by (informal) meta-language expressions. There is no informal treatment of
  names: each name in a query, including names of attributes, links, etc., follows
  precisely the same scoping-binding discipline.
  Syntax:
                    q1 where q2                        (selection)
                    q1 . q2                            (projection, navigation)
                    q1 join q2                         (dependent join)
                    ∃ q1 ( q2 )     ∀ q1 ( q2 )        (quantifiers)
                    q1 order by q2                     (ordering)
                    q1 closed by q2                    (transitive closure)
                    (plus possibly other operators)
  Semantics - the uniform homogeneous idea:
  For each element r of the collection returned by q1 the query q2 is evaluated
  with the environment stack augmented by nested( r ).
  A partial result is a combination of r and the result returned by q2.
  All partial results are merged into the final result.
  All these operators are implemented in SBQL.
K.Subieta. Object-Oriented Query Languages and Views, slide 47                                                      Sept. 2000
                                                                 SBQL: Selection
    q1 where q2     For each row r returned by q1, query q2 is evaluated
                    with the stack ES augmented by nested( r ).
                    The row r belongs to the final result iff q2 returns TRUE for it.

    Example:   EMP where SAL > 1800

    Initial ES:   EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
    In each iteration nested( r ) is pushed on top of this section.

    r (from EMP)   nested( r ) pushed on ES                  SAL    SAL > 1800               In the final result
    i1             NAME( i2 ) SAL( i3 ) WORKS_IN( i4 )       i3     2500 > 1800 = TRUE       i1
    i5             NAME( i6 ) SAL( i7 ) WORKS_IN( i8 )       i7     2000 > 1800 = TRUE       i5
    i9             NAME( i10 ) SAL( i11 ) WORKS_IN( i12 )    i11    1500 > 1800 = FALSE      -

    The final result: i1, i5
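
    The evaluation of EMP where SAL > 1800 can be traced with a small Python sketch.
    It is a sketch under assumptions, not the SBQL implementation: queries are Python
    functions taking the stack, binders are pairs, identifiers are strings, links are
    not followed, and the names nested, bind, deref and where are hypothetical.

```python
# The EMP part of the "Tiny database": identifier -> (name, content).
# A list content marks a complex object; WORKS_IN values are kept as plain
# strings because links are not followed in this sketch.
STORE = {
    "i1": ("EMP", ["i2", "i3", "i4"]),
    "i2": ("NAME", "Brown"), "i3": ("SAL", 2500), "i4": ("WORKS_IN", "i13"),
    "i5": ("EMP", ["i6", "i7", "i8"]),
    "i6": ("NAME", "Smith"), "i7": ("SAL", 2000), "i8": ("WORKS_IN", "i17"),
    "i9": ("EMP", ["i10", "i11", "i12"]),
    "i10": ("NAME", "Jones"), "i11": ("SAL", 1500), "i12": ("WORKS_IN", "i17"),
}


def nested(ident):
    """Binders to the direct sub-objects of a complex object (else no binders)."""
    _name, content = STORE[ident]
    if isinstance(content, list):
        return [(STORE[c][0], c) for c in content]
    return []


def bind(stack, name):
    """Search from the top; stop at the first section that has the name."""
    for section in reversed(stack):
        found = [ident for (n, ident) in section if n == name]
        if found:
            return found
    return []


def deref(ident):
    """Dereferencing: the value stored in an atomic object."""
    return STORE[ident][1]


def where(stack, q1, q2):
    """q1 where q2: keeps each r from q1 for which q2 is true in the scope nested(r)."""
    result = []
    for r in q1(stack):
        stack.append(nested(r))              # open the new scope
        try:
            if q2(stack):
                result.append(r)
        finally:
            stack.pop()                      # close it again
    return result


def emp(stack):
    return bind(stack, "EMP")


def sal_gt_1800(stack):
    return deref(bind(stack, "SAL")[0]) > 1800


base = [[("EMP", "i1"), ("EMP", "i5"), ("EMP", "i9"), ("DEPT", "i13"), ("DEPT", "i17")]]
well_paid = where(base, emp, sal_gt_1800)
```

    The subquery sal_gt_1800 knows nothing about EMP: it only binds the name SAL on
    whatever stack it is given, which is the relativity property stated earlier.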
K.Subieta. Object-Oriented Query Languages and Views, slide 48                                                             Sept. 2000
                                            SBQL: Projection, navigation
        q1 . q2
                                           For each row r returned by q1, query q2 is evaluated
                                           with the stack ES augmented by nested( r ).
                                           The final result is the union of the tables returned by q2.

        Example: EMP . SAL

        ES (bottom section):  EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)

        Row r of EMP    nested( r ), pushed on top of ES            Result of SAL
        i1              NAME( i2 ) SAL( i3 ) WORKS_IN( i4 )         i3
        i5              NAME( i6 ) SAL( i7 ) WORKS_IN( i8 )         i7
        i9              NAME( i10 ) SAL( i11 ) WORKS_IN( i12 )      i11

        The final result: i3, i7, i11
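
The evaluation rule for the dot operator can be sketched in Python. This is only an illustration, not the Loqis implementation: the names STORE, BASE_SECTION, nested, bind, name and dot are assumptions introduced here, and the store is reduced to the identifiers shown on the slide.

```python
# Hypothetical in-memory store: object identifier -> binders to its sub-objects.
# The identifiers follow the slide; the representation itself is an assumption.
STORE = {
    "i1": {"NAME": "i2", "SAL": "i3", "WORKS_IN": "i4"},
    "i5": {"NAME": "i6", "SAL": "i7", "WORKS_IN": "i8"},
    "i9": {"NAME": "i10", "SAL": "i11", "WORKS_IN": "i12"},
}

# Bottom ES section: binders (name, identifier) to database objects.
BASE_SECTION = [("EMP", "i1"), ("EMP", "i5"), ("EMP", "i9"),
                ("DEPT", "i13"), ("DEPT", "i17")]


def nested(oid):
    """Return the binders to the sub-objects of the object oid."""
    return list(STORE.get(oid, {}).items())


def bind(n, es):
    """Search ES from the top; return the identifiers bound to n
    in the first section that contains such binders."""
    for section in reversed(es):
        hits = [oid for (binder_name, oid) in section if binder_name == n]
        if hits:
            return hits
    return []


def name(n):
    """A query consisting of a single name."""
    return lambda es: bind(n, es)


def dot(q1, q2, es):
    """q1 . q2: evaluate q2 once per row of q1, with nested(r) on top of ES."""
    result = []
    for r in q1(es):
        es.append(nested(r))      # open a new section for row r
        result.extend(q2(es))     # union of the partial results
        es.pop()                  # close the section
    return result


print(dot(name("EMP"), name("SAL"), [BASE_SECTION]))  # ['i3', 'i7', 'i11']
```

Each query is modelled as a function of the stack, so the same dot function works for any operands, including nested dot expressions.
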



                           SBQL: Navigational (Dependent) Join
        q1 join q2
                                           For each row r returned by q1, query q2 is evaluated
                                           with the stack ES augmented by nested( r ).
                                           A partial result is a concatenation of r with each row returned
                                           by q2. The final result is the union of the partial results.

        Example: EMP join SAL

        ES (bottom section):  EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)

        Row r of EMP    nested( r ), pushed on top of ES            Result of SAL    Partial result
        i1              NAME( i2 ) SAL( i3 ) WORKS_IN( i4 )         i3               i1 i3
        i5              NAME( i6 ) SAL( i7 ) WORKS_IN( i8 )         i7               i5 i7
        i9              NAME( i10 ) SAL( i11 ) WORKS_IN( i12 )      i11              i9 i11

        The final result: (i1 i3), (i5 i7), (i9 i11)
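
A minimal Python sketch of the dependent join follows. It differs from projection only in that each row r is kept and concatenated with the rows returned by q2. The names STORE, BASE_SECTION, nested, bind and join are assumptions made for this illustration and are not taken from any SBQL implementation.

```python
# Hypothetical store and bottom ES section, using the identifiers from the slide.
STORE = {
    "i1": {"NAME": "i2", "SAL": "i3", "WORKS_IN": "i4"},
    "i5": {"NAME": "i6", "SAL": "i7", "WORKS_IN": "i8"},
    "i9": {"NAME": "i10", "SAL": "i11", "WORKS_IN": "i12"},
}
BASE_SECTION = [("EMP", "i1"), ("EMP", "i5"), ("EMP", "i9"),
                ("DEPT", "i13"), ("DEPT", "i17")]


def nested(oid):
    """Binders to the sub-objects of the object oid."""
    return list(STORE.get(oid, {}).items())


def bind(n, es):
    """Return the identifiers bound to n in the topmost section that has them."""
    for section in reversed(es):
        hits = [oid for (binder_name, oid) in section if binder_name == n]
        if hits:
            return hits
    return []


def join(q1, q2, es):
    """q1 join q2: concatenate each row r of q1 with every row that q2
    returns when evaluated with nested(r) on top of ES."""
    result = []
    for r in q1(es):
        es.append(nested(r))
        partial = [(r, s) for s in q2(es)]   # partial result for row r
        es.pop()
        result.extend(partial)               # union of the partial results
    return result


emp = lambda es: bind("EMP", es)
sal = lambda es: bind("SAL", es)
print(join(emp, sal, [BASE_SECTION]))
# [('i1', 'i3'), ('i5', 'i7'), ('i9', 'i11')]
```
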



                                            Examples in OQL and SBQL

     OQL:
               select e.NAME, d.DNAME, count(select x from d.EMPLOYS as x)
               from EMP as e, e.WORKS_IN as d
               where e.JOB = "manager"

     SBQL:
               ((((EMP as e) join (((e.WORKS_IN). DEPT) as d))
               where ((e.JOB) = "manager"))) .
               (((e.NAME), (d.DNAME)), count((((d.EMPLOYS). EMP) as x) . x))


         Syntactic differences, but striking similarity of the ideas.



             Classes, inheritance and encapsulation on ES
         If some operator opens a section on ES for an object, then sections of its classes
         and super-classes are inserted onto ES in the proper order (listed below).

         The query: EMP where ... X ...
         The order of search during binding of the name X (the state of the environment stack):
               Binders to attributes of the tested EMP object
               Binders to properties of the EMP class
               Binders to properties of the PERSON class
               Some invisible sections (for lexical scoping)
               Binders to properties of the current session
               Binders to database objects, views, ...
               Binders to global procedures, variables, ...

         [Figure: the state of the object store, showing the classes PERSON, EMP and DEPT
         together with the stored EMP and DEPT objects.]

      Encapsulation:
      some sections inserted on ES contain only binders to public properties.
      Polymorphism: a side effect of the above ES rules.
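
The search order above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the ClassDef structure, the property names describe, full_name and salary_history, and the representation of an ES section as a dictionary. It shows how pushing class sections below the object's own section yields overriding, and how omitting binders to private properties yields encapsulation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassDef:
    name: str
    public: dict                                   # public properties
    private: dict = field(default_factory=dict)    # never exposed on ES
    parent: Optional["ClassDef"] = None


def open_sections(es, attributes, cls):
    """Push the sections for one object: super-classes lowest, then the class,
    then the object's own attributes on top. Returns the number pushed."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = cls.parent
    for c in reversed(chain):
        es.append(dict(c.public))   # encapsulation: public binders only
    es.append(dict(attributes))
    return len(chain) + 1


def bind(name, es):
    """Bind a name by searching ES from the top section downwards."""
    for section in reversed(es):
        if name in section:
            return section[name]
    raise NameError(name)


# Hypothetical classes: EMP overrides describe, which PERSON also defines.
PERSON = ClassDef("PERSON", public={"describe": "PERSON.describe",
                                    "full_name": "PERSON.full_name"})
EMP = ClassDef("EMP", public={"describe": "EMP.describe"},
               private={"salary_history": "EMP.salary_history"}, parent=PERSON)

es = [{"EMP": ["i1", "i5", "i9"]}]                  # binders to database objects
pushed = open_sections(es, {"NAME": "i2", "SAL": "i3"}, EMP)

print(bind("SAL", es))        # i3 (attribute of the tested object)
print(bind("describe", es))   # EMP.describe (the EMP section is above PERSON)
print(bind("full_name", es))  # PERSON.full_name (inherited)
```

Binding salary_history raises NameError in this sketch because no binder to a private property was placed on the stack.
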
                                                                 Object views
    Expectations:

               Customization, conceptualization, encapsulation. The user works with a part of a
               database that is relevant to his/her area of interest in a way that is convenient for
               his/her everyday processing routines and concepts. Views have the potential to
               improve programmer productivity significantly.
               Security, privacy, and autonomy. Views restrict access to the relevant part
               of a database. In federated databases this restriction is required to
               enable the autonomy of local databases.
               Interoperability, heterogeneity, schema integration, legacy applications. Views
               enable the integration of heterogeneous databases, so that foreign, legacy or
               remote databases can be understood and processed according to a common, unified
               schema.
               Data independence, schema evolution. Views enable the user to change database
               organization and schema without affecting already written applications.

    The current state of the art is immature; the above expectations are actually wishes.
                      Views are Stored Functional Procedures
          Function declaration:
              function RichEmp
                begin return EMP where SAL > 1800 end;

          Function invocation:
              (RichEmp where JOB = "designer"). NAME

      The name RichEmp is bound and the function is invoked. A new ES section contains
      only a return address. Then the query within the body of this function is executed.
      It returns bag{i1,i5}, containing the identifiers of the Brown and Smith objects. This bag is
      the function output.

      ES states during the invocation (sections listed from the top of the stack):

      The initial ES state:
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
      After invocation of the RichEmp function:
          The section for view invocation
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
      After where within the RichEmp function:
          NAME(i2) JOB(..) SAL(i3) WORKS_IN(i4)
          The section for EMP class: age(...)
          The section for view invocation
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
      After return from the function:
          EMP(i1) EMP(i5) EMP(i9) DEPT(i13) DEPT(i17)
      .....
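
The idea of a view as a stored functional procedure can be sketched in Python as follows. The data are assumptions: the slides give only the identifiers i1, i5, i9, the names Brown and Smith, and the threshold 1800, so the assignment of names to identifiers, the third name, the salaries and the JOB values are hypothetical, as are the function names rich_emp and rich_designer_names.

```python
# Hypothetical EMP objects; only the identifiers and the names Brown and Smith
# come from the slides.
EMP = {
    "i1": {"NAME": "Brown", "SAL": 2500, "JOB": "designer"},
    "i5": {"NAME": "Smith", "SAL": 2000, "JOB": "clerk"},
    "i9": {"NAME": "Jones", "SAL": 1500, "JOB": "designer"},
}

# Environment stack; the bottom section holds binders to database objects.
ES = [{"EMP": list(EMP)}]


def rich_emp():
    """View RichEmp: return EMP where SAL > 1800.
    The result is a bag of identifiers, not copies of objects."""
    ES.append({"return_address": "caller"})   # section for the view invocation
    try:
        return [oid for oid in ES[0]["EMP"] if EMP[oid]["SAL"] > 1800]
    finally:
        ES.pop()                              # section removed on return


def rich_designer_names():
    """(RichEmp where JOB = "designer"). NAME"""
    return [EMP[oid]["NAME"] for oid in rich_emp()
            if EMP[oid]["JOB"] == "designer"]


print(rich_emp())              # ['i1', 'i5']
print(rich_designer_names())   # ['Brown']
```

Because the function returns identifiers, the caller can continue to navigate from the returned objects exactly as from stored EMP objects.
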


                 Two points of view on the database content

          a) The database seen by the view definer:
                 Objects: EMP, EMP, EMP, DEPT, DEPT
                 Stored functional procedure: RichEmp

          b) The database seen by a view user:
                 Objects: EMP, EMP, EMP, DEPT, DEPT
                 Imaginary objects: RichEmp, RichEmp


        The imaginary RichEmp objects adopt the entire semantics of EMP objects:


                                 (RichEmp where age < 30).
                                        WORKS_IN.DEPT.DNAME

                                 RichEmp as r (r.age > 50
                                        and "Rome"  r.WORKS_IN.DEPT.LOC)

                                                           Generalized functions
     The stack-based semantics places no restriction on calling functions from inside
     functions, i.e. on defining a view through other views. Loqis makes it possible to define
     arbitrary stored functional procedures, including procedures with parameters (being
     arbitrary queries), with local objects, recursive, with side effects, and higher-order
     (procedures with parameters being procedures).

     For example, the WellPaid function has a parameter JobPar and a local variable a
     (the algorithm in the body can be complex):

     function WellPaid( JobPar )
       begin
         local a: real;
         a := avg(EMP.SAL);
         return EMP where JOB = JobPar and SAL > 2*a;
       end;

     Get the names of departments of well-paid clerks:

     WellPaid( "clerk" ). WORKS_IN. DEPT. DNAME
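
For comparison, a rough Python counterpart of the WellPaid function is given below. The data are entirely hypothetical (salaries, jobs, department names and the identifier i21 are invented for the illustration), and a Python dictionary stands in for the object store.

```python
from statistics import mean

# Hypothetical data; WORKS_IN holds the identifier of a DEPT object.
EMP = {
    "i1": {"NAME": "Brown", "JOB": "clerk", "SAL": 9000, "WORKS_IN": "i13"},
    "i5": {"NAME": "Smith", "JOB": "clerk", "SAL": 1000, "WORKS_IN": "i17"},
    "i9": {"NAME": "Jones", "JOB": "designer", "SAL": 1000, "WORKS_IN": "i13"},
    "i21": {"NAME": "Adams", "JOB": "designer", "SAL": 1000, "WORKS_IN": "i17"},
}
DEPT = {"i13": {"DNAME": "Sales"}, "i17": {"DNAME": "Toys"}}


def well_paid(job_par):
    """function WellPaid(JobPar): employees with the given job who earn
    more than twice the average salary."""
    a = mean(e["SAL"] for e in EMP.values())          # local variable a
    return [oid for oid, e in EMP.items()
            if e["JOB"] == job_par and e["SAL"] > 2 * a]


def well_paid_dept_names(job_par):
    """WellPaid(job_par). WORKS_IN. DEPT. DNAME"""
    return [DEPT[EMP[oid]["WORKS_IN"]]["DNAME"] for oid in well_paid(job_par)]


print(well_paid("clerk"))             # ['i1']
print(well_paid_dept_names("clerk"))  # ['Sales']
```

With these figures the average salary is 3000, so only the clerk earning 9000 exceeds twice the average.
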

                               Query modification to process views
       Macro-substitution of view invocation by the view body.


            Function declaration:
                function RichEmp
                  begin return EMP where SAL > 1800 end;

            Function invocation:
                (RichEmp where JOB = "designer"). NAME

            After macro-substitution:
                ((EMP where SAL > 1800) where JOB = "designer"). NAME

                      Because of the uniformity and universality of the stack-based semantics,
                      query modification in SBA is very easy to implement.

                      Query modification makes it possible to use other optimization methods:
                      rewriting, access through indices.
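
Macro-substitution can be sketched as a rewrite of a query syntax tree. The tuple-based tree, the VIEWS table and the functions substitute and show are assumptions introduced for this illustration; they do not describe how SBA implementations represent queries.

```python
# A query is a nested tuple: ("name", n), ("lit", v) or (operator, left, right).
# VIEWS maps a view name to the query in its body (single-query views only).
VIEWS = {
    "RichEmp": ("where", ("name", "EMP"),
                (">", ("name", "SAL"), ("lit", 1800))),
}


def substitute(query, views):
    """Replace every invocation of a view by the body of its definition.
    Assumes non-recursive views; a recursive view would never terminate."""
    if not isinstance(query, tuple):
        return query
    if query[0] == "name" and query[1] in views:
        return substitute(views[query[1]], views)   # views defined via views
    return (query[0],) + tuple(substitute(q, views) for q in query[1:])


def show(q):
    """Render a query tree in the textual form used on the slide."""
    kind = q[0]
    if kind == "name":
        return q[1]
    if kind == "lit":
        return f'"{q[1]}"' if isinstance(q[1], str) else str(q[1])
    if kind == "where":
        return f"({show(q[1])} where {show(q[2])})"
    if kind == ".":
        return f"{show(q[1])}. {show(q[2])}"
    return f"{show(q[1])} {kind} {show(q[2])}"      # comparison operators


invocation = (".",
              ("where", ("name", "RichEmp"),
               ("=", ("name", "JOB"), ("lit", "designer"))),
              ("name", "NAME"))

print(show(invocation))
# (RichEmp where JOB = "designer"). NAME
print(show(substitute(invocation, VIEWS)))
# ((EMP where SAL > 1800) where JOB = "designer"). NAME
```

After substitution the query mentions only stored objects, so a rewriting optimizer could, for instance, merge the two where conditions or use an index on SAL.
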
                                                            Performance of views
         Various approaches to object views do not consider the performance issue.
         If it remains unsolved, there will be no chance for object views.

         What can be done?

         Caching view results (materialized views): there is a side problem of keeping
         materialized views consistent with stored data (incremental updating algorithms).
         Query modification: macro-substitution of view invocations by the bodies of view
         definitions. Applicable only to views defined by single queries; the method can be
         difficult for more complex views.
         Predicate move-around: a predicate occurring in a query invoking a view modifies
         the view definition. The method has probably not been tested in practice.
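
The consistency problem of materialized views can be illustrated with a toy example. The class below is a hypothetical sketch, not a method proposed in the tutorial: it caches the result of a RichEmp-like view and updates the cache incrementally on each salary change instead of re-evaluating the view. The salary values are assumptions.

```python
class MaterializedRichEmp:
    """Cached result of 'EMP where SAL > threshold', maintained incrementally."""

    def __init__(self, salaries, threshold=1800):
        self.salaries = dict(salaries)     # identifier -> salary
        self.threshold = threshold
        # Initial materialization: one full evaluation of the view.
        self.cache = {oid for oid, sal in self.salaries.items()
                      if sal > threshold}

    def set_salary(self, oid, sal):
        """Update the stored data and adjust the cache for this object only."""
        self.salaries[oid] = sal
        if sal > self.threshold:
            self.cache.add(oid)
        else:
            self.cache.discard(oid)

    def result(self):
        """Return the materialized view without re-evaluating the query."""
        return sorted(self.cache)


view = MaterializedRichEmp({"i1": 2500, "i5": 2000, "i9": 1500})
print(view.result())          # ['i1', 'i5']
view.set_salary("i9", 1900)   # i9 enters the view
view.set_salary("i1", 1000)   # i1 leaves the view
print(view.result())          # ['i5', 'i9']
```

The sketch works only because every update goes through set_salary; keeping a cache consistent when the stored data can be changed by arbitrary programs is the hard part mentioned above.
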




                                                                 Updatable views

           A challenging theme.
           Typical theoretical approaches (algebras, calculi, logic, functional...) are stateless,
           hence updating operations are not expressible. There are naive beliefs that
           translations of view updates can be done in some auto-magical way, e.g. by
           employing some constraints.
           Oracle, with "instead of" triggers, is perhaps on the right track.
           Updating through methods encapsulated in a class of virtual objects (no generic
           updating operations). An idealistic view of object-orientedness and a poor
           understanding of encapsulation.
           Updating through references returned by view invocations (undesirable side
           effects, no universality).
           Overriding generic updating operations by code written by the view definer
           (the mentioned Oracle way).



                                                                 Conclusions
            Object-oriented and object-relational query languages are a very important feature
            of future DBMSs.
            Query languages have the potential to increase significantly the efficiency of
            programming, as well as the maintainability and reusability of programs.
            The current state of the art of object query languages is immature. For us
            researchers this is an encouraging opportunity: we have a lot of work to do.
            Commercial achievements, in particular OQL and SQL3, represent significant progress,
            but suffer from inconsistencies, underspecified semantics and eclecticism.
            Theoretical proposals have many limitations, are mathematically and/or
            conceptually inconsistent (object algebras), and present wishful thinking.
            Object views are still not developed to a satisfactory degree, both theoretically and
            practically.
            Optimization of object queries and views is not developed to a satisfactory degree.
            The stack-based approach to object query languages and views is a sound
            paradigm that can bring significant progress in the area.