					               Precise Enterprises and Imperfect Data
                                    P. Chountas & I. Petrounias
                                 Department of Computation, UMIST,
                                PO Box 88, Manchester M60 1QD, UK
                              e-mail: {chountap, ilias}@sna.co.umist.ac.uk
                            Fax: + 44 161 200 3324, Tel: + 44 161 200 3386

                                                Abstract
          One of the main uses of an information system is the representation and management of large
          amounts of indicative information from multiple sources describing the state of some enterprise.
          Most current information systems model enterprises that are crisp. A crisp enterprise is defined as
          one that is highly quantifiable; all relationships are fixed, and all attributes are atomic valued.
          This paper is concerned with precise enterprises in which the data are imperfect. In
          such cases information can be certain, imprecise or uncertain, temporal, or any
          combination of two of these, depending on the application domain. In domains
          where the information is perfect, all information sources are absolutely reliable and trustworthy.
          In more speculative domains, such as diagnosis, information may be asserted relative to
          time intervals over which it is possibly defined and probably believed. In such domains
          different sources of information may be assigned different degrees of reliability. This paper
          presents a framework for the conceptual integration and uniform treatment of all these types of
          information.

          Keywords
          Value imperfection, temporal imperfection, multiple information sources, consistency principle,
          conceptual modelling, nested relational databases.

1. Introduction
       Imperfect information is partial knowledge of the true value of the real world. It is an
epistemic property caused by lack of information. The elements of the enterprise ontology involved in
information imperfection are:
   • The might-happen ability of things, or the tendency of things to occur
   • The concept of time
   • The information source or provider
       A database attempts to represent an abstract version of the enterprise reality; the level of
abstraction is determined by the expected applications. Existing work in this area considers imperfect
information arising from elements of the enterprise ontology, but always in isolation from each other.
This paper suggests that the elements involved in information imperfection are related, and that they
affect each other and the concepts at the specification level. More than one information provider might
describe the same fragment of information, expressing logical views about the same facts. These views
are defined over the 3-dimensional space of time, belief and reliability, referred to here as the
multidimensionality of the source. Facts express associations between objects, not in
isolation but always in some form of relationship with each other.
       The rest of the paper is organised as follows. Section 2 evaluates existing work dealing with
imperfect information. Section 3 presents a framework for dealing with imprecise temporal
information. Section 4 presents a formalism for capturing and representing temporal and value
imperfection in a multisource environment, where each source is carrying a degree of reliability.
Section 5 provides the mapping of the formalism to a nested relational database. Section 6 suggests a
set of NF2 algebraic operators. Section 7 points at work in progress.
2. Related Work
      Most researchers deal with either temporal or value imperfection. The
difference between value and temporal imperfection can be characterised as that between “do not
know what” and “do not know when” information. Approaches for representing temporal or value
imperfection can be weighted or unweighted. Weights are normalised values in the range [0,1],
assigned to the alternatives or possibilities of an imperfect value. The weight of a possibility
expresses the might-happen ability of that possibility to be the actual value. Unweighted imperfect
information may be either restricted or unrestricted [1]. Other researchers consider the
existence of multiple conflicting sources accommodated in a database. All of these assume that
there are no intensional inconsistencies between different sources. It is assumed that the point of
conflict is that two or more information sources provide different answers to the same query
(extensional inconsistencies), provided that those sources do not have internal disharmonies (internal
extensional inconsistencies).
      This section evaluates approaches in the literature according to the following criteria:
relevance to real world representation of concepts, power offered by the proposed model, minimality
of concepts (formalisms should not contain overlapping concepts), formality in the representation to
avoid ambiguities. In addition, it examines whether support for temporal uncertainty is offered by the
models and whether uncertainty about both temporal and value aspects is supported. Finally, models
are evaluated as to whether they offer support for multiple information sources.

2.1 Possibilistic or Fuzzy Databases
      When uncertain information is considered, enterprises are treated as either precise or vague.
In vague enterprises it is assumed that attribute values are not precise and are presented as linguistic
terms. Early possibilistic approaches extended the relational model with the acceptance of fuzzy
functional [2] and fuzzy multivalued dependencies [3]. Based on the definition of a fuzzy
resemblance relation EQ(UAL) over domains of attributes, a set of inference rules for fuzzy
functional and multivalued dependencies is proposed ([2], [3]). The main disadvantage of both is
that they support only a limited number of possible associations between the elements of the
application domain in order to keep the models in 1NF. Recently these ideas moved towards the
support of similarity relations in a nested relational model for representing uncertain, complex data
[4].
      In the case of precise enterprises, information can be certain, uncertain or imprecise. The key
notion is that while one value applies to the enterprise, the database extension may contain a set. The
application of imprecise or uncertain information to the precise enterprise means that the value in the
database is a possibility distribution. This is taken to show the limits of knowledge concerning the
actual value and the significance of ordering. Other research has attempted to identify whether
the uncertainty property can be presented as part of ER diagrams [5]. This work is influenced by
approaches for treating uncertainty proposed by the AI community [6] and tries to embody
linguistic terms as part of the ER formalism or as part of relational theory.

2.2 Probabilistic Databases
       The key notion here is that while one value applies to the certain enterprise, the database
extension may contain a set (a probability distribution). In that way value imperfection is
accommodated. The field of probabilistic databases covers a wide spectrum of different approaches.
Probabilistic weights are used to express that an attribute value can be a set of alternative data values
([7], [8]) or to express the likelihood that a tuple belongs to a relation. Other approaches [9]
use one set of probabilistic weights to express the view that a tuple belongs to a relation and
a different set of probabilistic weights to express that an attribute value may be a set of alternative
data values.
      There is a debate on whether an interval of probabilities or a single probability is better for
expressing the tendency of things to occur ([9], [10]). There is also the question of whether events
should be considered dependent or independent. However, uncertainty is treated only at the database
level, ignoring the specification level. This leads to complex probabilistic reasoning with no knowledge
of the primitive notions of the model that can produce imperfect information.
      The main issue is whether the model is in 1NF or in NF2. It is argued in this paper that imperfection
should first be considered at the conceptual level. If imperfection is not accommodated by
conceptual modelling formalisms (e.g. ER diagrams, object role modelling approaches) then it cannot
appear in the resulting databases (which are, after all, the result of a mapping from the conceptual
schema).

2.3 Temporal Probabilistic Databases
       While in reality a time interval applies to an event, in temporal probabilistic databases the
database extension may contain a set of possible intervals.
       In valid time indeterminacy [1] it is known that an event did in fact occur, but it is not known
exactly when. The model is presented as an extension of the SQL data model. If a tuple k in relation
R is timestamped with the interval [t1...t2], this is interpreted as: tuple k holds at some point t in the
interval [t1...t2]. Query constructs are defined to specify belief (correlation credibility) in the
underlying data and plausibility (ordering plausibility) in the relationships among the data.
However, valid time indeterminacy is treated at the database level instead of arising from the
conceptual level, which states exactly which primitive notions may be involved in valid time
indeterminacy.
       A probabilistic temporal algebra is suggested in [11] for expressing information of the following
type: tuple d is in relation R at some point of time in interval [t1, t2] with probability between p1 and
p2. A range of probability distributions is supported to allocate the probability measure over the set
of time points of the interval. Different valid times related to a tuple may have different probability
distributions in nature.
       The main problem in ([1], [11]) is that if the type of the probability distribution is known then it
is known beforehand that some time points in an interval have greater probability, thus a subinterval
of the initial time interval is more probable. Therefore, there is a fact somewhere that makes our
knowledge about the real world more explicit but it is not present in the conceptual schema or in the
database. Temporal probabilistic databases are a natural extension of probabilistic databases.
Imperfection of the information is treated only at the database level ignoring the specification level,
leading to complex probabilistic reasoning with no explicit specification of concepts.
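
The argument about known distributions can be made concrete with a small sketch. The Python code below spreads a probability measure over the time points of an interval and compares a uniform allocation with a peaked one. The five-point interval and the weights are assumptions chosen for illustration; they are not taken from [1] or [11].

```python
from fractions import Fraction

def point_masses(weights):
    """Normalise non-negative integer weights into one probability mass per time point."""
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def interval_mass(masses, start, end):
    """Probability that the event falls on a time point with index in [start, end]."""
    return sum(masses[start:end + 1])

# Hypothetical interval [t1, t2] made of five consecutive time points.
uniform = point_masses([1, 1, 1, 1, 1])
peaked = point_masses([1, 2, 4, 2, 1])   # assumed shape, for illustration only

# Mass carried by the middle three time points under each distribution.
middle_uniform = interval_mass(uniform, 1, 3)   # Fraction(3, 5)
middle_peaked = interval_mass(peaked, 1, 3)     # Fraction(4, 5)
```

Under the peaked allocation the middle subinterval carries more of the mass than under the uniform one, which is the extra knowledge that the text argues should be stated explicitly rather than hidden in the choice of distribution.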

2.4   Databases with Multiple Information Sources
      The key notion here is the representation of a certain enterprise, where only one value applies,
while the database extension may contain a set because of different conflicting sources. The IST
approach [12] uses information source vectors to accommodate multiple conflicting sources and
to define the conditions under which a tuple is valid. Each attribute value in a tuple is associated with
an information source vector that states whether the attribute value is valid; therefore certainty about
certainty can be expressed. The approach in [13] assumes that data models can be mapped, resolving
only extensional inconsistencies. Both approaches ignore intensional inconsistencies between different
sources, since both treat conflicting values at the database level and ignore the specification
level. Furthermore, it is assumed that there are no internal extensional inconsistencies. Both
model the certain world, and both try to reconcile conflicting information coming from different
sources.

3.     Considerations for a Dynamic Conceptual Model
       In any enterprise environment with multiple information sources, it is inevitable that more than
one source will describe the same portion of the enterprise world differently. The conceptual model
acts as a gateway between different sources, permitting them to express information at
a single and highly abstract level, the level of concepts (metamodel level).
       The approach followed here is based on a type of object role modelling formalism [14]. A fact
is a true logical proposition about the modelled world. Each fact instance is a semantically irreducible
proposition in the real world about one or more entity instances. Irreducible means that the fact
cannot be split into facts involving fewer entities without loss of information. A dynamic database
environment presents certain or plausible information about the past and present of the modelled
world. Therefore, there is a need to express imperfect information as part of the fact formalism and
to identify the impact of the time and belief dimensions on it, before proceeding with database
considerations:
    • If a fact is related to the belief dimension with a degree of belief less than one, it is simply
         declared that an association between objects possibly stands in the enterprise world (value
         imperfection).
    • If a fact is linked with the time dimension, it is simply declared that a certain association
         between objects is valid for a certain time period. However, if the time dimension is
         associated with the belief dimension, it is declared that a certain fact is possibly defined
         over that period (temporal imperfection). In most research proposals only value or
         temporal imperfection can be expressed. This paper suggests that both kinds of imperfection
         (value and temporal) can be represented in a database environment.
    • If a fact is associated with the belief dimension and the time over which the fact is defined (valid
         time according to the temporal database literature [1]) is also tied to the belief dimension,
         then both value and temporal imperfection can be expressed. Expressing temporal and value
         imperfection simultaneously permits the representation of statements arising from everyday
         enterprise activities.
    • In cases of either value or temporal imperfection the belief dimension is affected by the
         reliability of the source. The reliability of the source expresses human concern about
         the identity and trustworthiness of the source that is responsible for a particular piece of
         information.

4.    The Temporal Multisource Belief Model (TMBM)
      The basic items that one wishes to reason about are objects in terms of the roles that they play
within a domain [14]. In general a fact type is composed of the arguments shown in Figure 1, where
n is the arity of the fact type. The way that one can refer to specific entities is through reference
labels.
      If the modelled world is certain then the label value is a single value. The time interval ∆t over
which a fact instance is defined has an explicit duration, since both ends of the interval are
unambiguously defined on the time line. If the modelled world is imperfect then a label value may be a
possible multiset of values (π_Label Type) [15]. Each member (value) of the possible multiset is an
alternative value with an indicative belief. The time interval over which a fact is defined is an
alternative from a multiset of time intervals, each with an indicative belief. In either the certain or the
imperfect modelled world, the reliability of the source forms the conclusive belief for the
timestamped fact. A graphical representation of the concepts is shown in Figure 2. Fact types can be
of any degree [14]. For example, in a ternary fact type (Figure 3) there will be three entity types
involved with three different roles. The relationship between two entity types of the ternary fact can
be regarded as an entity type itself (objectified fact type).

                 F = {{{<E1, L1, R1>, …, <En, Ln, Rn>}, {∆t ≤ T}}, m}

                 where:
                 F is a fact type consisting of k fact instances
                 T is the time interval over which an irreducible fact type is defined in the real world
                 Ei is the ith entity type playing a role in the fact type
                 Li is the label type (referencing Ei)
                 Ri is the ith role of the fact type
                 ∆t is the time interval over which an irreducible fact instance is defined in the real world
                 m is the reliability of the source that circulates a particular fact type. The reliability of
                 the source is a domain independent variable.


                                                          Figure 1: Fact Types
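
As an illustration only, the structure in Figure 1 could be held in memory roughly as follows. This is a minimal Python sketch: the class names, the encoding of intervals as pairs of integer time points, and the reading of ∆t ≤ T as "∆t is contained in T" are all assumptions, not part of the formalism.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval of integer time points (assumed encoding)

@dataclass(frozen=True)
class RoleArgument:
    entity_type: str   # Ei
    label_type: str    # Li, the label type referencing Ei
    role: str          # Ri

@dataclass
class FactInstance:
    labels: Tuple[str, ...]   # one label value per role argument
    delta_t: Interval         # ∆t, the interval over which the instance is defined

@dataclass
class FactType:
    arguments: List[RoleArgument]   # <E1, L1, R1>, ..., <En, Ln, Rn>
    T: Interval                     # interval over which the fact type is defined
    m: float                        # reliability of the source
    instances: List[FactInstance] = field(default_factory=list)

    @property
    def arity(self) -> int:
        return len(self.arguments)

    def add(self, inst: FactInstance) -> None:
        """Accept an instance only if it matches the arity and ∆t lies within T."""
        if len(inst.labels) != self.arity:
            raise ValueError("arity mismatch")
        lo, hi = inst.delta_t
        if not (self.T[0] <= lo <= hi <= self.T[1]):
            raise ValueError("∆t must lie within T")
        self.instances.append(inst)

# Hypothetical binary fact type with one instance.
sale = FactType(
    arguments=[RoleArgument("Supplier", "Supplier-Name", "Sells"),
               RoleArgument("Product", "Product-Name", "Is related to")],
    T=(0, 100),
    m=0.9,
)
sale.add(FactInstance(("Amber Smith", "Water"), (10, 20)))
```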



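          [Diagram: a fact type with the roles Role-1 and Role-2, timestamped with T and annotated
          with the source reliability m. Role-1 is played by an entity type referenced by a Label Type;
          Role-2 is played by an entity type referenced by a π-Label Type.]

                                    Figure 2: Graphical Notation of a Fact type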

      In Figure 3, the entity types (E) are (Supplier, Product, Location) and the corresponding
reference labels are (Supplier-Name, π(Product-Name), City-Name). Supplier-Name and City-Name
are deterministic label types. Product-Name is a stochastic label type (π). A stochastic label type
means that a label value can take a possible (π) set of values, each member of the
set being an alternative value with an indicative belief (probability, p). Based on the
possibility/probability consistency principle [16], a connection between the measure of randomness
or observation (p) and the measure of compatibility (π) can be achieved. In this way a fact may present
information that is observed and testified by one or more information sources; in that case a set of
alternatives is defined with p > 0 and π = 1. A fact may also represent information that is compatible
with its domain, with p = 0 and 0 < π < 1, based on some specified or unspecified criterion. In this case
the information source cannot testify these values, but does not have any reason to reject them. In that
way information that is more elementary and less context dependent can be represented.
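
The two cases described above can be written down as a simple check. The sketch below is one possible reading in Python; the function name and the labels "testified" and "compatible" are hypothetical, introduced here only for illustration.

```python
def classify_alternative(p: float, pi: float) -> str:
    """Classify one alternative value by its probability p and possibility pi."""
    if not (0.0 <= p <= 1.0 and 0.0 <= pi <= 1.0):
        raise ValueError("p and pi must lie in [0, 1]")
    if p > 0 and pi == 1:
        return "testified"    # observed and testified by at least one source
    if p == 0 and 0 < pi < 1:
        return "compatible"   # not testified, but the source has no reason to reject it
    raise ValueError("combination not covered by the two cases in the text")
```

For instance, an alternative with p = 0.5 and π = 1 falls in the first case, while one with p = 0 and π = 0.5 falls in the second.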
      Any instance of SALE-LOCATION in Figure 3 must exist during the period (or over the
same period) in which the corresponding SALE fact type exists. The following relationship between T
and T1 must hold: (T1 during T) or (T1 same as T). A time period is defined as a temporal constraint
over a linear hierarchy of time units, denoted Hr. Hr is a finite collection of distinct time units with a
linear order among those units. For instance, H1 = day ⊆ month ⊆ year and H2 = minute ⊆ hour ⊆ day ⊆
month ⊆ year are both linear hierarchies of time units defined over the Gregorian calendar. A time
point in a linear hierarchy is simply an instantiation of each time unit in Hr. A calendar r consists of a
linear hierarchy Hr of time units and a validity predicate that specifies a non-empty set of time points.
In that way an application may assume the existence of an arbitrary but fixed calendar.
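
A rough Python sketch of these notions follows. It assumes the hierarchy H1, encodes a time point as a (year, month, day) tuple, uses the standard library's `date` type as the validity predicate of the Gregorian calendar, and reads "during" in the strict sense (T1 starts after and ends before T). All function names are hypothetical.

```python
from datetime import date

# H1 listed from coarsest to finest unit so that tuples compare chronologically.
H1 = ("year", "month", "day")

def is_valid_point(point):
    """Validity predicate: True if (year, month, day) is a real Gregorian date."""
    try:
        date(*point)
        return True
    except ValueError:
        return False

def same_as(t1, t):
    """T1 same as T: both interval ends coincide."""
    return t1 == t

def during(t1, t):
    """T1 during T, read strictly: T1 starts after T starts and ends before T ends."""
    return t[0] < t1[0] and t1[1] < t[1]

def satisfies_constraint(t1, t):
    """The constraint from the text: (T1 during T) or (T1 same as T)."""
    return during(t1, t) or same_as(t1, t)

T = ((1999, 6, 10), (1999, 8, 15))
T1 = ((1999, 7, 1), (1999, 7, 31))
```

With these values `satisfies_constraint(T1, T)` holds, whereas an interval that merely overlaps T does not satisfy the constraint.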
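          [Diagram: the fact type SALE, timestamped with T and annotated with the source reliability m,
          relates SUPPLIER (S), referenced by Supplier-Name and playing the role 'Sells', to
          PRODUCT (P), referenced by π(Product-Name) and playing the role 'Is related to'. The
          objectified fact type SALE is in turn related to LOCATION, referenced by City-Name, through
          the role 'Committed in'; this second fact type is timestamped with T1.]

                                 Figure 3: Uncertain Timestamped Fact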

      In the case of temporal imperfection, the time interval over which a fact instance is valid is
accompanied by an indicative belief (probability) that the relationship between T and T1 still holds.
The validity lifespan over which a concept (e.g. SALE) is defined is the union of the time intervals over
which ‘sale’ instances are believed to be valid. If an entity type is involved in non-timestamped facts, the
interval [now - t1, now] is assigned to the non-timestamped fact types, where t1 is the smallest
granularity of all timestamped facts that the entity type participates in. A snapshot fact keeps only
current information.
      The reliability of a provider or source (m) (Figure 3) acts as a creditor of trust towards
facts expressed by this particular source and is associated with an instantaneous event e. The event
supplies the system with the reliability of the source. The time point (te) on the time line associated
with the event is recorded. The interval [te, now] is the period over which the reliability
measure is valid for a particular source. The upper bound ‘now’ will be updated to te1, where te1 is the
time point at which another instantaneous event e1 is triggered and subsequently modifies the
reliability measure of the source.
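
The bookkeeping described above can be sketched in a few lines of Python. The class below is hypothetical: it encodes time points as integers, uses `None` for the open upper bound 'now', and treats each closed period as half-open at its upper end when looking up a reliability, so that a time point equal to te1 returns the newer measure. That last choice is an assumption, since the text does not say which measure applies exactly at te1.

```python
class ReliabilityHistory:
    """Validity periods of one source's reliability measure m."""

    def __init__(self):
        # Each entry is [te, end, m]; end is None while the period is still [te, now].
        self.periods = []

    def record(self, te, m):
        """Register an instantaneous event at time point te that sets reliability m."""
        if not 0 < m <= 1:
            raise ValueError("reliability must satisfy 0 < m <= 1")
        if self.periods:
            if te <= self.periods[-1][0]:
                raise ValueError("events must be recorded in increasing time order")
            self.periods[-1][1] = te   # 'now' is replaced by te1
        self.periods.append([te, None, m])

    def reliability_at(self, t):
        """Return the reliability valid at time point t, or None if none is defined."""
        for start, end, m in self.periods:
            if start <= t and (end is None or t < end):
                return m
        return None

history = ReliabilityHistory()
history.record(10, 0.8)   # event e at te = 10
history.record(20, 0.5)   # event e1 at te1 = 20 closes the first period
```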

5.    Mapping to a Nested Relational Model
       The 1NF relational model is simple and mathematically tractable but not rich enough to model
complex objects. In order to represent complex objects, hierarchical structures are used instead of flat
tables [17]. A relation schema R is recursively defined as:
   i.      If {A1, …, An} ⊂ U and A1, …, An are atomic valued attributes, then R = {A1, …, An} is a
           relation schema.
   ii.     If {A1, …, An} ⊂ U and A1, …, An are atomic valued attributes and R1, …, Rn are relation
           schemas, then R = (A1, …, An, R1, …, Rn) is a relation schema.
       The atomic valued attributes A1, …, An are called zero order attributes. R1, …, Rn are called
relation-valued attributes or high order attributes.
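
The recursive definition maps naturally onto a recursive data type. The following Python sketch is illustrative only; the class and method names are assumptions, and the attribute names in the example are borrowed from the Sale fact type discussed in this section.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RelationSchema:
    name: str
    atomic: List[str] = field(default_factory=list)               # zero order attributes
    nested: List["RelationSchema"] = field(default_factory=list)  # high order attributes

    def is_1nf(self) -> bool:
        """Case (i): only atomic valued attributes, i.e. a flat relation."""
        return not self.nested

    def depth(self) -> int:
        """Number of nesting levels; a flat relation has depth 1."""
        return 1 + max((r.depth() for r in self.nested), default=0)

# Case (i): a flat schema with atomic attributes only.
product = RelationSchema("π(Product-Name)",
                         atomic=["Name/Probability", "Source", "Possibility"])

# Case (ii): atomic attributes together with a relation-valued attribute.
sale = RelationSchema("Sale", atomic=["Supplier-Name"], nested=[product])
```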

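          [Diagram: the fact type Sale, timestamped with T and annotated with the source reliability m,
          relates SUPPLIER, referenced by Supplier-Name and playing the role 'Supplies', to PRODUCT,
          referenced by π(Product-Name) and playing the role 'Is related to'.]

                                 Figure 4: The Sale Fact type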
       Consider the fact type Sale in Figure 4. Supplier-Name is an atomic valued attribute.
π(Product-Name) declares that a single label value can be a possible (π) set of values, each member
of the set being an alternative value with an indicative belief (probability, p). Product is a relation
schema or a high order attribute. The time interval ∆t ⊂ T over which a fact instance is defined may be
explicitly known or may be a set of possible time intervals (π), where each interval is accompanied by an
indicative belief. T is represented by a high order attribute. Another, separate relation represents the
information source and its reliability (0 < m ≤ 1). The reliability m affects only the probability
measure. In Figure 5, a sample population for the fact type Sale is presented. Two kinds of value
imperfection found in the real world can be modelled in it: the possible (π = 1) and probable
(0 < p ≤ 1), or the possible (0 < π < 1) and unexpected, improbable (p = 0). In Figure 6 Sale is
presented as a hierarchical structure. A node can be either an atomic valued attribute or a relation.

                                  Timestamped Fact Type Sale

       Multisource Fact Type Sale
       Supplier-Name     π(Product-Name)
                         Name / Probability (p)     Source      Possibility
       Amber Smith       Water / 0.5                Ivi         1
                         Wine / 0.2                 John        1
                         Oil / 0.3                  Minerva     1
                         Cigarettes / 0             Paul        0.5

       Valid Time
       ∆t / (p)                           Source
       [10/06/99, 15/08/99] / 0.6         Ivi, John
       [10/07/99, 15/10/99] / 0.4         Minerva

                                    Figure 5: Multisource Timestamped Fact Type Sale

6.    A Recursive NF2 Algebra
     A set of relational operators (Select, Project, Cartesian Product and Join) is presented, with the
emphasis on processing queries that include join operations in the nested relational model.
Operators are recursively defined so that each operator can be applied to subrelations at all levels.
     Selection (σ): For all nodes Sa, Sb ∈ node(S) where Sa ≠ Sb, if node Sa is a child of an ancestor of
node Sb, then Sa and Sb are called selection comparable nodes (Sa σ→ Sb).

          Timestamped Relation Sale (S)
            ├─ Valid Time (R1)
            │    ├─ Valid ∆t / (p)
            │    └─ Source (SP)
            └─ Sale (R2)
                 ├─ Supplier-Name
                 └─ π(Product-Name)
                      ├─ Name / Probability (p)
                      ├─ Source (SP)
                      └─ Possibility

               Figure 6: Nested schema tree for value and temporal imperfection
       For example, in Figure 6 (Valid Time (R1) σ→ Supplier-Name) and (Valid
Time (R1) σ→ π(Product-Name)) are selection comparable nodes. Since there is a path between
π(Product-Name) and Valid Time (R1), R1 is also comparable to Name/Probability (p).
However, Valid ∆t/(p) and Supplier-Name are not selection comparable nodes. Selection conditions are
comparisons between attributes and constants, and may also include membership operators.
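
The definition of selection comparable nodes can be checked mechanically. The Python sketch below encodes the schema tree of Figure 6 as a dictionary from node to children and tests whether the parent of Sa is an ancestor of Sb. The two occurrences of Source (SP) are given distinct labels so that node names are unique; that labelling, like the function names, is an assumption made for the example.

```python
TREE = {
    "S": ["R1", "R2"],
    "R1": ["Valid ∆t/(p)", "Source (SP) [time]"],
    "R2": ["Supplier-Name", "π(Product-Name)"],
    "π(Product-Name)": ["Name/Probability (p)", "Source (SP) [value]", "Possibility"],
}

def build_parent(tree):
    """Map every non-root node to its parent."""
    return {child: node for node, children in tree.items() for child in children}

def ancestors(node, parent):
    """Proper ancestors of a node, from its parent up to the root."""
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found

def selection_comparable(sa, sb, parent):
    """Sa σ→ Sb: Sa differs from Sb and Sa is a child of an ancestor of Sb."""
    return sa != sb and sa in parent and parent[sa] in ancestors(sb, parent)

PARENT = build_parent(TREE)
```

With this encoding, `selection_comparable("R1", "Supplier-Name", PARENT)` holds, while `selection_comparable("Valid ∆t/(p)", "Supplier-Name", PARENT)` does not, matching the examples in the text.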
       Projection (π′): A projection operation is a way of accessing attribute values or relation
schemas from the outermost level to the innermost level. A projection can be defined as a nesting of
multiple projections in the attribute domains of a relation schema.
       Many project operators have been proposed in the context of nested relational models [18], but
existing projection operators deal only with the projection of attribute values based on a selection
condition that is defined on the attribute domain (e.g. π′ (Supplier-Name = ‘John Smith’)).
       For all nodes ∈ node(S), if two nodes are selection comparable nodes then the projection
operator is defined. In Figure 6, (Supplier-Name π′→ π(Product-Name)) and
(π(Product-Name) π′→ Possibility) are selection comparable nodes. In this case the project operator is
defined as an ordered sequence of zero level attributes and relation valued attributes (section 5).
       Projection operators can be either simple or complex. A simple projection involves a one level
vertical or horizontal path (e.g. Supplier-Name π′→ π(Product-Name)). In this example, for a
Supplier-Name instance the whole relation valued attribute π(Product-Name) is derived.
       A complex projection involves the derivation of values through paths in the tree hierarchy (e.g.
Supplier-Name π′→ SP (Source Identity/reliability)). Duplicates are not eliminated when the
values of the timestamps are different or the conclusive beliefs are different.
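
To make the difference between a simple and a complex projection concrete, the sketch below follows a path through a nested Python structure holding the value part of the sample population in Figure 5. It is a simplification under stated assumptions: nested relations are lists of dictionaries, the key names are chosen for the example, and the function does not implement the duplicate handling rule of the text.

```python
SALE = {
    "Supplier-Name": "Amber Smith",
    "π(Product-Name)": [
        {"Name": "Water", "p": 0.5, "Source": "Ivi", "Possibility": 1.0},
        {"Name": "Wine", "p": 0.2, "Source": "John", "Possibility": 1.0},
        {"Name": "Oil", "p": 0.3, "Source": "Minerva", "Possibility": 1.0},
        {"Name": "Cigarettes", "p": 0.0, "Source": "Paul", "Possibility": 0.5},
    ],
}

def project(value, path):
    """Follow a path of attribute names from the outermost to the innermost level.

    Relation-valued attributes (lists) are traversed element by element.
    """
    if not path:
        return value
    if isinstance(value, list):
        return [project(v, path) for v in value]
    return project(value[path[0]], path[1:])

# Simple projection: one level, the whole relation-valued attribute is derived.
products = project(SALE, ["π(Product-Name)"])

# Complex projection: values derived through a longer path in the hierarchy.
sources = project(SALE, ["π(Product-Name)", "Source"])
```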
       Cartesian Product (×ε): The idea behind the extended Cartesian product is to combine
relations with common high order attributes not only at the top level but also at the subschema level.
Let R be a relation schema and T be the schema tree of R. The path Pr = (M1...Mk) is a
join-path of R if M1 is a child of root(T) and Mk is a non-leaf node of T.
       Path expressions describe routes along the composition hierarchy, and expressions describe
links between attribute domains. They flatten any nested relation structure in one way: there is no need
to break paths in the schema into several expressions and apply a fold up operator to each one. The
idea is to combine high order relational attributes not only at the top level but also at the
subschema level. The definition of the Cartesian Product does not have any major practical value,
since it is clearly a mathematical operation. However, it underlies the theoretical framework for
defining the P Join operator.
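
The join-path definition can be illustrated by enumerating the join-paths of the schema tree in Figure 6. The Python sketch below assumes that consecutive nodes on a path are parent and child, which the definition does not state explicitly, and it labels the two occurrences of Source (SP) differently so that node names stay unique.

```python
TREE = {
    "S": ["R1", "R2"],
    "R1": ["Valid ∆t/(p)", "Source (SP) [time]"],
    "R2": ["Supplier-Name", "π(Product-Name)"],
    "π(Product-Name)": ["Name/Probability (p)", "Source (SP) [value]", "Possibility"],
}

def join_paths(tree, root):
    """List every path (M1..Mk) with M1 a child of the root and Mk a non-leaf node."""
    paths = []

    def walk(node, prefix):
        for child in tree.get(node, []):
            if tree.get(child):            # child has children, so it is a non-leaf
                path = prefix + [child]
                paths.append(path)
                walk(child, path)

    walk(root, [])
    return paths

PATHS = join_paths(TREE, "S")
```

For this tree the enumeration yields three join-paths: (R1), (R2) and (R2, π(Product-Name)).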
       P Join (ρ×): The same attribute names in two join relations may appear in multiple subtrees.
The P Join can be extended with multiple path joins, which exploit the more general situation. In
Figure 5 only the information sources are stated, not their reliability. The way in which the reliability
measure changes over time has been discussed in section 4. Assuming that the following
relation describes the information source (Figure 7), a P Join can be used to relate the reliability (m)
of the source and the belief expressed for the value or temporal part of a fact instance.
       The source attribute presented by the Source relation (SP) in Figure 7 appears in the two
relational subschemas R1 and R2 of Figure 6. A join path between R1, R2 and SP can be defined.
Subrelation R1 expresses the time dimension of the fact type Sale. In defining the path join
between the relation SP and R1 the following relationship must hold:
                         Valid time (SPi) ∩ Valid time (Rti) ≠ ∅                   (1)
       If and only if (1) is true can a path join between R2 and SP be defined. Otherwise, it is
accepted that a source can be temporally imperfect. If the time at which an event occurred is not known
(no matter whether it is known to what extent the event did occur), the provided information is still
incomplete.
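
Condition (1) is an interval overlap test. A minimal Python sketch follows, assuming closed intervals encoded as (start, end) pairs of `date` values. The fact interval is the first one in Figure 5; the validity period of the source is a made-up value, since Figure 5 does not state it.

```python
from datetime import date

def intersect(a, b):
    """Intersection of two closed intervals, or None when they do not overlap."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def condition_1(source_valid, fact_valid):
    """Condition (1): the two valid times have a non-empty intersection."""
    return intersect(source_valid, fact_valid) is not None

fact_valid = (date(1999, 6, 10), date(1999, 8, 15))     # from Figure 5
source_valid = (date(1999, 1, 1), date(1999, 12, 31))   # hypothetical
stale_source = (date(1998, 1, 1), date(1998, 12, 31))   # hypothetical
```

Here `condition_1(source_valid, fact_valid)` holds, so the path join may proceed, whereas `condition_1(stale_source, fact_valid)` fails.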
          Source (SP)
            ├─ Source Identity / Source Reliability
            └─ Source Valid time (SP)

                                  Figure 7: Relation SP for the Information provider

       Maybe P Join (mρ×): A maybe P Join is defined as an extension of the P Join and is also an
extended natural join. With the maybe P Join, attribute values defined using a probability can
encapsulate the belief of the source and express the conclusive belief in a single complex value, thus
forming higher level attributes with complex data types. The conclusive belief for a possible value is
defined as the product of the reliability and probability measures, Cp = (p × m). The same applies when
two probabilities have to be joined (p1 × p2). When elements of different possibility distributions are
joined, the min(π1...πn) possibility is the common one. If n sources express the same belief (p)
for an attribute value but have different degrees of reliability (m), then the conclusive belief (Cp) of
the attribute value is defined by the following formula:
                         Cp = min(m1 × p, m2 × p, …, mn × p)                   (2)
       The time interval over which the conclusive belief is defined is the intersection of the time
intervals over which the sources (SP1, …, SPn) are defined:
                         ∆tCp = ∆t1SP1 ∩ ∆t2SP2 ∩ … ∩ ∆tnSPn                   (3)
It should be noted that (1) must always be true.
       Figure 8 presents the relation (S mρ× SP) after applying the Maybe P Join. Relation S is taken from
Figures 5 and 6. Relation SP is taken from Figure 7.
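
Formulas (2) and (3) can be tried out with a short Python sketch. The reliabilities and source validity intervals below are invented for the example (the figures do not give them), time points are encoded as integers, and the function names are hypothetical.

```python
def conclusive_belief(p, reliabilities):
    """Formula (2): Cp = min(m1 × p, ..., mn × p) for n sources sharing belief p."""
    if not reliabilities:
        raise ValueError("at least one source is required")
    return min(m * p for m in reliabilities)

def conclusive_interval(intervals):
    """Formula (3): intersection of the sources' validity intervals, or None if empty."""
    lo = max(start for start, _ in intervals)
    hi = min(end for _, end in intervals)
    return (lo, hi) if lo <= hi else None

# Two sources express the same belief p = 0.6 about a value or a time interval.
p = 0.6
reliabilities = [0.9, 0.8]               # hypothetical m1, m2
source_intervals = [(0, 50), (20, 80)]   # hypothetical ∆t1SP1, ∆t2SP2

cp = conclusive_belief(p, reliabilities)
dt_cp = conclusive_interval(source_intervals)
```

The least reliable source determines the conclusive belief, and the conclusive belief is only defined over the period during which every contributing source has a valid reliability measure.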
       Union (∪): In the fact formalism (section 4), two relations are union compatible if and only if
they have the same arity (degree) and their corresponding attributes are based on the same domain.
Attribute names need not be the same. Attributes may be zero-level attributes (e.g. Supplier-Name)
or relation-valued attributes such as π(Product-Name). If two relations are not union
compatible, the project operator can be used to identify their union-compatible attributes.
The projection operator accesses attribute values or relation schemas
from the outermost level to the innermost level, thus projecting zero-level or higher-order attributes at
different levels in the nested schema tree.
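
The union compatibility test can be sketched in Python as follows. The encoding of a schema as a tuple of (name, domain) pairs, where the domain of a relation-valued attribute is itself a nested schema, is a hypothetical representation chosen for this example, as are the attribute and domain names.

```python
from typing import Sequence, Tuple, Union

# Assumption: a domain is either a domain name (zero-level attribute) or a
# nested schema (relation-valued attribute). A schema is a tuple of
# (attribute name, domain) pairs.
Domain = Union[str, tuple]
Schema = Tuple[Tuple[str, Domain], ...]


def same_domain(d1: Domain, d2: Domain) -> bool:
    """Zero-level domains must be equal; nested schemas must be union compatible."""
    if isinstance(d1, str) and isinstance(d2, str):
        return d1 == d2
    if isinstance(d1, tuple) and isinstance(d2, tuple):
        return union_compatible(d1, d2)
    return False


def union_compatible(r: Schema, s: Schema) -> bool:
    """Same arity and pairwise matching domains; attribute names are ignored."""
    return len(r) == len(s) and all(
        same_domain(dr, ds) for (_, dr), (_, ds) in zip(r, s)
    )


def project(schema: Schema, positions: Sequence[int]) -> Schema:
    """Keep only the attributes at the given positions of the outermost level."""
    return tuple(schema[i] for i in positions)


r = (("Supplier-Name", "string"),
     ("Product-Name", (("Name", "string"), ("Source Identity", "source"))))
s = (("Vendor", "string"),
     ("Item", (("Title", "string"), ("Source", "source"))))
t = (("Supplier-Name", "string"),)

compatible = union_compatible(r, s)                        # names differ, domains match
after_projection = union_compatible(project(r, [0]), t)    # projection restores compatibility
```

Here r and t have different arity, so they only become union compatible once r is projected onto its first attribute.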

[Figure 8 schema: outermost level ((∆t / Cp1), (∆t Cp1, Source Identity)); attributes Supplier-Name and
π(Product-Name); nested level (((Name / Cp), (∆t Cp, Source Identity)) Possibility)]

                                  Figure 8: S mρ×SP Maybe P Join Example

      In Figure 8, projecting the source identity from the outermost level yields the sources that
supply the possible set of times at which the fact sale occurred. Applying the project operator to
π(Product-Name), a relation-valued attribute, yields the sources that give the possibility that a fact
instance is defined. Applying the union operator to these two projections derives the total population
of sources involved in value imperfection, temporal imperfection, or both. Instances of a timestamped
fact that involve the same entity instances are considered different if their timestamp values differ or
their conclusive beliefs differ.
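
The rule for distinguishing fact instances under union can be sketched in Python. The FactInstance type, its field names, and the sample values are hypothetical and serve only to illustrate the rule stated above.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class FactInstance(NamedTuple):
    """Hypothetical encoding of a timestamped fact instance."""
    entities: Tuple[str, ...]      # entity instances involved in the fact
    timestamp: Tuple[int, int]     # assumed closed interval (start, end)
    belief: float                  # conclusive belief Cp


def union(r: Iterable[FactInstance], s: Iterable[FactInstance]) -> Set[FactInstance]:
    """Two instances coincide only if entities, timestamp, and belief all agree."""
    return set(r) | set(s)


a = FactInstance(("Supplier A", "Product X"), (1, 10), 0.5)
b = FactInstance(("Supplier A", "Product X"), (1, 10), 0.25)  # same entities, other belief
c = FactInstance(("Supplier A", "Product X"), (5, 10), 0.5)   # same entities, other timestamp

result = union([a, b], [a, c])
```

Because a appears in both operands with identical timestamp and belief it is kept once, while b and c remain as separate instances.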
       Intersection (∩): The intersection operation is defined in analogy with the union operator.
Relations must be union compatible. As in the case of the union operator, the project operator can be
applied to guarantee union-compatible zero-level or relation-valued attributes. In Figure 8, projecting
the source identity from the outermost level and intersecting the result with the source identity
obtained by applying the project operator to π(Product-Name) derives the members of the population of
sources that are involved in both temporal and value imperfection.
       Difference (−): The difference operation accepts as inputs two zero-level or relation-valued
attributes and returns the instances that are members of the population of the first operand but
not members of the population of the second operand. The definition is based on the intuition that
if two attributes (zero-level or relation-valued) r and s represent the information that two different
actors (sources) have about the same world, then r − s should represent the information about the real
world that r has and s does not.
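
A minimal Python sketch of intersection and difference over projected source identities follows. The source identifiers SP1 to SP4 and the two populations are hypothetical; they stand for the results of projecting the source identity from the outermost level and from π(Product-Name) in Figure 8.

```python
from typing import Set

# Hypothetical projections of the source identity at the two levels of Figure 8.
temporal_sources: Set[str] = {"SP1", "SP2", "SP3"}  # outermost level
value_sources: Set[str] = {"SP2", "SP3", "SP4"}     # inside π(Product-Name)


def intersection(r: Set[str], s: Set[str]) -> Set[str]:
    """Members of the population of r that are also members of s."""
    return {x for x in r if x in s}


def difference(r: Set[str], s: Set[str]) -> Set[str]:
    """Members of the population of r that are not members of s."""
    return {x for x in r if x not in s}


both = intersection(temporal_sources, value_sources)        # temporal and value imperfection
only_temporal = difference(temporal_sources, value_sources)  # temporal imperfection only
only_value = difference(value_sources, temporal_sources)     # value imperfection only
```

The difference is not symmetric: swapping the operands answers the question for the other source population.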

7.    Conclusions
      A single conceptual framework has been proposed for treating either value or temporal
imperfection of information. A conceptual model describes the real world, and descriptions at the
database level must be defined according to the conceptual model, which states the static and dynamic
elements of an application domain that may generate imperfect information. Certainty about certainty
can be expressed. An algebra was presented in order to manipulate imperfect information at the
database level. Extensional inconsistencies between different sources can be represented and queried.
Work is being carried out to model and query internal extensional inconsistencies of a source.
Furthermore, dependencies between information sources that affect the reliability of an
individual information source have to be considered.

8.    References

[1]   C.E. Dyreson, R.T. Snodgrass, Support Valid-Time Indeterminacy, ACM Transactions on
      Database Systems, Vol. 23, No. 1, pp. 1-57, 1998
[2]   Dey Li, Dongbo Liu, A Fuzzy Prolog, Database System Research Studies, John Wiley, 1988

[3]   T Bhattacharjcee, A. Mazumdar. Axiomatisation of Fuzzy Multivalued Dependencies in a
      Fuzzy Relational Data Model, Elsevier, Fuzzy Sets and Systems, 1996
[4]   A. Yazici, A. Soyal, B. Buckles, F. Petry Uncertainty in a Nested Relational Database Model,
      Journal of Data and Knowledge Engineering, Vol. 30, 1999, pp. 275-302
[5]   R.Vandenberghe, N.Van Gyseghem, A, Van Schooten, R. De Caluwe,Integrating Fuzziness in
      Database Models, in P. Bosc, J. Kacprzyk (eds), Fuzziness in Database Management Systems,
      Physica-Verlag, pp. 71-114, 1995
[6]   E.H. Mamdani, On the Classification of Uncertainty Techniques in Relation to the Application
      needs in A. Motro, P. Smets (ed), Uncertainty Management in Information Systems: from
      Needs to Solutions, Kluwer Academic, pp. 397-408, 1997
[7]   D. Dey, S. Sarkar, A Probabilistic Relational Model and Algebra, ACM Transactions on
      Database Systems, Vol. 21, No. 3, 1996
[8]   D. Barbará, H. Garcia-Molina, D. Porter, The Management of Probabilistic Data, IEEE
      Transactions on Knowledge and Data Engineering, Vol. 4, No, 5, 1992
[9]   N. Fuhr, T. Rölleke, A Probabilistic NF2 Relational Algebra for Imprecision in Databases,
      Technical Report, University of Dortmund, 1995
[10] L.V.S. Lakshmanan, N. Leone, R. Ross, V. S. Subrahmanian, ProbView: A Flexible
     Probabilistic Database System, Department of Computer Science, Technical Report, Concordia
     University, Canada, 1997
[11] A. Dekhtyar, R. Ross, V. S. Subrahmanian, TATA Probabilistic Temporal Databases, I:
     Algebra, Department of Computer Science, University of Maryland, USA, 1999
[12] V.S. Alagar, F. Sadri, J. N. Said, Semantics of an Extended Relational Model for Managing
     Uncertain Information, Proceedings of ACM Conference on Information and Knowledge
     Management (CIKM), 1995
[13] A. Motro, A Formal Framework for Integrating Inconsistent Answers from Multiple
     Information Sources, Technical Report ISSE-TR-93-106, Department of Information and
     Software Systems Engineering, George Mason University, 1993
[14] I. Petrounias, A Conceptual Development Framework for Temporal Information Systems,
     Proceedings of Conceptual Modelling - ER '97, 16th International Conference on Conceptual
     Modelling, Los Angeles, California, USA, November 1997
[15] P. Chountas, I. Petrounias, Representing and Querying Multiple Information Sources in a
     Single Database Environment, Proceedings of 12th International Conference on Software &
     Systems Engineering and Applications (ICSSEA’   99), Paris, December 1999.
[16] H.C Liu, K. Ramamohanarao, Algebraic Equivalences Among Nested Relational Expressions,
     Proceedings of ACM Conference on Information and Knowledge Management (CIKM), 1994.
[17] M. Delgado, S. Moral, On the concept of Possibility-Probability Consistency in Fuzzy Sets For
     Intelligent Systems, D.Dubois, H.Prade and R. Yager (eds), Morgan Kaufman Publishers, pp
     247-250, 1993.
[18] L. Golby, A Recursive Algebra and Query Optimisation for Nested Relations, Proceedings of
     ACM SIGMOD International Conference on Management of Data, 1989.

				