Formal Theories for Layered Models

					Introduction to the Workshop
Axiomatically Grounded     Terminology Ontology
Ontology                   Application Ontology

(Axion                     Focused on classes,
LAO-CNR, IFOMIS,           concepts
L&C, FMA)                  ‘conceptual models’

Axioms, Theorems,          DAML+OIL
Definitions                Semantic Web
support reasoning,         Gene Ontology
reliability of curation,   UMLS
Axiomatically Grounded      Terminology Ontology
Ontology                    Application Ontology

Good definitions            Description logic
Rich logic                  ensures good
Poor computability          computability
Slow, largely manual        Poor definitions
population of information   (often trivial,
systems                     often circular)
Axiomatically Grounded   Terminology Ontology
Ontology                 Application Ontology

Needed for ontology      Absence of real
alignment: good          definitions means that
definitions of           the same terms are
foundational relations   used in different
like ‘is_a’, ‘part_of’   systems to mean
                         different things – which
                         prevents ontology
                         alignment
Parts and Classes in Biomedical
           Ontology
            Barry Smith
       http://ontologist.com
GO:0003673: cell fate commitment


 Definition: The commitment of cells to
 specific cell fates and their capacity to
 differentiate into particular kinds of cells.
  GO: asymmetric
protein localization
involved in cell fate
   commitment
The intended meaning of part-of
 as explained in the GO Usage Guide is:

 “part of means can be a part of, not is always a
 part of: the parent need not always encompass
 the child. For example, in the component
 ontology, replication fork is a part of the
 nucleoplasm; however, it is only a part of the
 nucleoplasm at particular times during the cell
 cycle”
          So, GO ‘part of’
means:
can be a part of, not is always a part of
         But what about:
GO: a flagellum is part-of cells

here ‘part of’ means:

some kinds of cells always have
flagella as parts
And what about:
GO: Cellular Component Ontology is part-of
 Gene Ontology
GO: Biological Process Ontology is part-of
 Gene Ontology
GO: Molecular Function Ontology is part-of
 Gene Ontology

here ‘part of’ means: one vocabulary is
  included in another vocabulary
  GO’s three meanings of part-of

   1. A time-dependent mereological inclusion
   relation between instances
A sometimes_part_of B =def ∃t ∃x ∃y
(inst(x, A, t) & inst(y, B, t) & part(x, y, t)).
   2. Some (types of) Bs have As as parts:
A part_ofGO B =def ∃C (C is_a B & A part_of C)
   3. Inclusion relations between vocabularies
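
The first reading can be checked mechanically on a small model. The sketch below is only an illustration: the instance names (fork1, np1), the time points and the helper always_part_of are assumptions introduced for contrast, not part of GO or of the definitions above.

```python
# Hypothetical toy model. inst facts are (instance, class, time) triples;
# part facts are (part, whole, time) triples.
INST = {
    ("fork1", "replication fork", 1),
    ("fork1", "replication fork", 2),
    ("np1", "nucleoplasm", 1),
    ("np1", "nucleoplasm", 2),
}
PART = {("fork1", "np1", 1)}  # the fork lies in the nucleoplasm only at t=1


def sometimes_part_of(a, b):
    """Some instance of a is part of some instance of b at some time."""
    return any(
        (x, a, t) in INST and (y, b, t) in INST
        for (x, y, t) in PART
    )


def always_part_of(a, b):
    """Assumed stronger reading: whenever x instantiates a, x is then part of some b."""
    return all(
        any((x, y, t) in PART and (y, b, t) in INST for (y, _, _) in INST)
        for (x, cls, t) in INST
        if cls == a
    )


print(sometimes_part_of("replication fork", "nucleoplasm"))
print(always_part_of("replication fork", "nucleoplasm"))
```

On this model the 'can be part of' reading holds while the 'is always part of' reading fails, which is the gap the Usage Guide describes.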
 GO’s use of ‘part of’ illustrates the
        following problems
One term being used to represent a plurality
  of different relations
One lexically simple term being used to
  represent a lexically complex concept
A term with an established use (inside and
  outside biomedical ontology) being used
  with a new non-standard use
Because we want to use GO
   to support reasoning
           GO’s Usage Guide

lists four ‘logical relationships’ between ‘is a’ and
   ‘part of’:
(1) (A part_of B & C is_a B) → A part_of C
(2) is_a is transitive
(3) part_of is transitive
(4) NOT: (A is_a B & C part_of A) → C part_of B
  Of these four logical relationships,
                  only
            (2) is_a is transitive

is valid, and even this law is mis-expressed by GO as:
if A is an instance of B
and B is an instance of C
then A is an instance of C

so that GO confuses classes with instances
      (3) part_ofGO is transitive
fails because of
         plastid part_ofGO cytoplasm
         cytoplasm part_ofGO cell (sensu
         Animalia)
But not: plastid part_ofGO cell (sensu Animalia).
        GO built by biologists
 who deliberately did not want to take
 account of any of the results of non-
 biologists working in fields such as
 ‘ontology’

But still: GO belongs to the world of KR

The ‘K’ of KR is characteristically a very odd
 fragment of what (e.g. scientists) would
 recognize as ‘knowledge’
The world of KR is a world of classes
    exclusively (e.g. WordNet)

Dictionary makers live in a world of classes
  exclusively
Terminologists live in a world of classes
  exclusively
Description logic lives in a world of classes
  (almost) exclusively
   GO’s confusion about part-of

1. A time-dependent mereological inclusion
    relation between instances
A sometimes_part_of B =def ∃t ∃x ∃y
(inst(x, A, t) & inst(y, B, t) & part(x, y, t)).
2. Some (types of) Bs have As as parts:
A part_ofGO B =def ∃C (C is_a B & A part_of C)
3. Inclusion relations between vocabularies
Entities
                     Entities

universals (classes, types, roles …)




particulars (individuals, tokens, instances …)



   Axiom: Nothing is both a universal and a particular
 Two Kinds of Elite Entities

classes, within the realm of universals

instances within the realm of particulars
Entities

classes
Entities

classes*




*natural, biological
   Entities

classes of objects




different axioms for classes
of functions, processes, etc.
Entities

 classes




instances
  Classes are natural kinds

Instances are natural exemplars of
natural kinds
(problem of non-standard instances
must be dealt with also)
                Entities

                  classes




                  instances
                instances



penumbra of borderline cases
             Entities

               classes



junk                              junk
             instances

                    junk


       example of junk: beachball desk
    Primitive opposition between
     universals and particulars


variables A, B, … range over universals
variables x, y, … range over particulars
          Primitive relations:
             inst and part
 inst(Jane, human being)
 part(Jane’s heart, Jane’s body)

A class is anything that is instantiated
An instance is anything (any individual) that
  instantiates some class
Entities

 human


    inst


  Jane
  Entities

    human




Jane’s heart part Jane
                Axioms for part
Axioms governing part (= ‘proper part’)
  (1) it is irreflexive
  (2) it is asymmetric
  (3) it is transitive
  (+ usual mereological axioms)

part is the usual mereological relation among
  individuals
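
The three listed properties can be tested directly on any finite set of instance-level part facts. This is a minimal sketch; the facts in PART are invented for illustration and the further mereological axioms are not covered.

```python
# Hypothetical instance-level facts: (part, whole) pairs.
PART = {
    ("left ventricle", "heart"),
    ("heart", "body"),
    ("left ventricle", "body"),
}


def irreflexive(rel):
    """No entity is part of itself."""
    return all(x != y for x, y in rel)


def asymmetric(rel):
    """If part(x, y) then not part(y, x)."""
    return all((y, x) not in rel for x, y in rel)


def transitive(rel):
    """If part(x, y) and part(y, z) then part(x, z)."""
    return all((x, z) in rel for x, y in rel for w, z in rel if y == w)


print(irreflexive(PART), asymmetric(PART), transitive(PART))
```

Removing the pair ("left ventricle", "body") from PART makes the transitivity check fail, so a curated fact base has to be closed under transitivity before it satisfies axiom (3).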
                   Definitions

class(A) =def ∃x inst(x, A)

instance(x) =def ∃A inst(x, A)


Theorem: Nothing can be both an instance
 and a class
    Axiom of Extensionality
Classes which share identical instances
are identical
(need to take care of the factor of time)
          Entities

            classes

differentiae (roles, qualities…)



             x, y, …
             Differentiae
Aristotelian Definitions
  An A is a B which exemplifies C
C is a differentia
No differentia is a class
exemp(individual, differentia)
exemp(Jane, rationality)
objects exemplify roles
             A is_a B
             genus(A)
            species(A)



classes




instances
 A is_a B =def ∀x (inst(x, A) → inst(x, B))
 genus(A) =def ∃B (B is_a A & B ≠ A)
 species(A) =def ∃B (A is_a B & B ≠ A)


classes




instances
            nearest species
nearestspecies(A, B) =def A is_a B &
∀C ((A is_a C & C is_a B) → (C = A or C = B))



                      B

                      A
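
The definitions of is_a, genus, species and nearestspecies can be evaluated over a finite set of inst facts. The sketch below assumes a small invented taxonomy. It also assumes that nearestspecies is meant to hold only between distinct classes, a condition the definition as stated does not spell out.

```python
# Hypothetical inst facts: (instance, class) pairs.
INST = {
    ("jane", "human"), ("jane", "mammal"), ("jane", "animal"),
    ("rex", "dog"), ("rex", "mammal"), ("rex", "animal"),
    ("tweety", "bird"), ("tweety", "animal"),
}
CLASSES = {c for _, c in INST}  # a class is anything that is instantiated


def instances_of(a):
    return {x for x, c in INST if c == a}


def is_a(a, b):
    """Every instance of a is an instance of b."""
    return instances_of(a) <= instances_of(b)


def genus(a):
    """a has some distinct class below it."""
    return any(is_a(b, a) and b != a for b in CLASSES)


def species(a):
    """a has some distinct class above it."""
    return any(is_a(a, b) and b != a for b in CLASSES)


def nearest_species(a, b):
    """a is immediately below b (a != b is an added assumption)."""
    between = [c for c in CLASSES if is_a(a, c) and is_a(c, b)]
    return a != b and is_a(a, b) and all(c in (a, b) for c in between)


print(nearest_species("human", "mammal"), nearest_species("human", "animal"))
```

In this model human is a nearest species of mammal but not of animal, because mammal lies between them.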
     Definitions


     highest genus




lowest species
 lowest species and highest genus
lowestspecies(A)=def
         species(A) & not-genus(A)
highestgenus(A)=def
         genus(A) & not-species(A)

Theorem:
class(A) → genus(A) or lowestspecies(A)
                   Axioms
 Every class has at least one instance

 Distinct lowest species never share instances

SINGLE INHERITANCE:
 (nearestspecies(A, B) & nearestspecies(A, C))
          → B = C
      Axioms governing inst
genus(A) & inst(x, A) →
    ∃B (nearestspecies(B, A) & inst(x, B))
EVERY GENUS HAS AN INSTANTIATED
  SPECIES

nearestspecies(A, B) → A’s instances are
  properly included in B’s instances
EACH SPECIES HAS A SMALLER CLASS
  OF INSTANCES THAN ITS GENUS
                 Axioms
nearestspecies(B, A) →
   ∃C (nearestspecies(C, A) & B ≠ C)
  EVERY GENUS HAS AT LEAST TWO
  CHILDREN

(nearestspecies(B, A) & nearestspecies(C, A) &
  B ≠ C) → not-∃x (inst(x, B) & inst(x, C))
  SPECIES OF A COMMON GENUS NEVER
  SHARE INSTANCES
                Theorems
(genus(A) & inst(x, A)) → ∃B (lowestspecies(B) & B
  is_a A & inst(x, B))
  EVERY INSTANCE IS ALSO AN INSTANCE OF
  SOME LOWEST SPECIES

(genus(A) & lowestspecies(B) & ∃x (inst(x, A) &
  inst(x, B))) → B is_a A
  IF AN INSTANCE OF A LOWEST SPECIES IS AN
  INSTANCE OF A GENUS THEN THE LOWEST
  SPECIES IS A CHILD OF THE GENUS
               Theorems
A is_a B & A is_a C
           → (B = C or B is_a C or C is_a B)
  CLASSES WHICH SHARE A CHILD IN
  COMMON ARE EITHER IDENTICAL OR
  ONE IS SUBORDINATED TO THE
  OTHER
               Theorems
(genus(A) & genus(B) & ∃x (inst(x, A) &
  inst(x, B))) → ∃C (C is_a A & C is_a B)

 IF TWO GENERA HAVE A COMMON
 INSTANCE THEN THEY HAVE A
 COMMON CHILD
                Theorems

class(A) & class(B) → (A = B or A is_a B or
  B is_a A or not-∃x (inst(x, A) & inst(x, B)))
  DISTINCT CLASSES EITHER STAND IN
  A PARENT-CHILD RELATIONSHIP OR
  THEY HAVE NO INSTANCES IN
  COMMON
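
This last theorem says that, on the instance level, any two classes are either nested or disjoint. A rough way to audit a fact base against it is sketched below. The two models and the function name nested_or_disjoint are invented for illustration.

```python
from itertools import combinations


def nested_or_disjoint(inst):
    """Check that any two classes are either nested or share no instances."""
    classes = sorted({c for _, c in inst})
    extension = {c: {x for x, k in inst if k == c} for c in classes}
    for a, b in combinations(classes, 2):
        ia, ib = extension[a], extension[b]
        if not (ia <= ib or ib <= ia or ia.isdisjoint(ib)):
            return False
    return True


# Hypothetical single-inheritance taxonomy.
TAXONOMY = {
    ("jane", "human"), ("jane", "mammal"),
    ("rex", "dog"), ("rex", "mammal"),
    ("tweety", "bird"),
}

# Adding a cross-cutting class 'pet' overlaps 'mammal' without nesting.
WITH_PET = TAXONOMY | {("rex", "pet"), ("tweety", "pet")}

print(nested_or_disjoint(TAXONOMY), nested_or_disjoint(WITH_PET))
```

The second model fails the check, which shows the kind of cross-classification that the single-inheritance axiom rules out.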
 The axioms and theorems above
          are non-trivial
Almost all of them can be found in Aristotle
Taken over by the Foundational Model of
  Anatomy
Forgotten because of dominance of set
  theory and its dark progeny (KR,
  description logic, model-theoretic
  semantics, ‘conceptual modeling’, etc.)
         Definition of is_a


A is_a B =def A and B are classes &
  ∀x (inst(x, A) → inst(x, B))
  Part_of as a relation between
   classes is more problematic

testis part_of human being ?


heart part_of human being ?
                WordNet
can’t deal with optional body parts, like
  warts, freckles, etc.
Nor with temporary optional body parts like
  pony-tail or five-o'clock shadow.
            Part_for and Has_part
    from Smith and Rosse, “The Role of Foundational
   Relations in the Alignment of Biomedical Ontologies”
A part_for B =def
       ∀x ( inst(x, A) → ∃y ( inst(y, B) & part(x, y) ) )

B has_part A =def
      ∀y ( inst(y, B) → ∃x ( inst(x, A) & part(x, y) ) )

  human testis part_for human being,
     But not: human being has_part human testis.
  human being has_part heart,
     But not: heart part_for human being.
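
The testis and heart examples can be reproduced on a small model. The individuals below (adam, eve, rex and their organs) are invented for illustration; only the two definitions come from the text.

```python
# Hypothetical inst facts (instance, class) and part facts (part, whole).
INST = {
    ("adam", "human being"), ("eve", "human being"), ("rex", "dog"),
    ("t1", "human testis"),
    ("h1", "heart"), ("h2", "heart"), ("h3", "heart"),
}
PART = {("t1", "adam"), ("h1", "adam"), ("h2", "eve"), ("h3", "rex")}


def instances_of(cls):
    return {x for x, c in INST if c == cls}


def part_for(a, b):
    """Every instance of a is part of some instance of b."""
    return all(
        any((x, y) in PART for y in instances_of(b))
        for x in instances_of(a)
    )


def has_part(b, a):
    """Every instance of b has some instance of a as part."""
    return all(
        any((x, y) in PART for x in instances_of(a))
        for y in instances_of(b)
    )


print(part_for("human testis", "human being"))  # every testis is in a human
print(has_part("human being", "human testis"))  # but eve has none
print(has_part("human being", "heart"))         # every human has a heart
print(part_for("heart", "human being"))         # but rex's heart is in a dog
```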
                Part_of
A part_of B =def A part_for B & B
   has_part A
This defines an Egli-Milner order
It guarantees that As exist only as parts of
   Bs and that Bs are structurally
   organized in such a way that As must
   appear in them as parts.
part_of NOT best understood as a relation
   between classes!
Foundational Model of Anatomy


distinguishes canonical anatomy – deals
with classes and with instances (generically)
plus instantiated anatomy (deals with
individual cases)
plus various variant anatomies to deal with
standard sorts of deviant instances
Relations in Foundational Model of
             Anatomy

clinical part of         part of
constitutional part of   possible part
forms                    regional part of
general part of          related part
member of                segmental composition of
necessary part           segmental contribution to
necessary whole          systemic part of
 Recall: problems with GO’s use of
               ‘part of’
One term used to represent a plurality of
   different relations
A term with an established use adopted with
   a new non-standard use
Lexically simple term is used to represent
   lexically complex concepts
LEADS TO CIRCULARITY:
if ‘part of’ means ‘can be part of’ then ‘can be
   part of’ means ‘can be can be part of’ …
         Circular definitions

endemic in biomedical terminology systems
  (GO, Snomed, etc.)

Circular definitions can be cheaply produced
  in large numbers to impress funding
  agencies
       UMLS Semantic Type
Semantic Type: Idea or Concept
Definition: An abstract concept, such as a
  social, religious or philosophical concept.
• problem: circularity
• (many other problems: Florence is an idea
  or concept)
  UMLS-SN Semantic Relation
Semantic Relation:
 conceptual_part_of
  Definition: Conceptually a portion,
  division, or component of some larger
  whole.
  Inverse: has_conceptual_part
• definition is semantically incoherent
  UMLS-SN Semantic Relation
Semantic Relation:
 associated_with
 Definition: has a significant or salient
 relationship to.
 Inverse: associated_with

• confuses entity with our cognition of the
  entity
  UMLS-SN Semantic Relation
Semantic Relation: isa
 Definition: The basic hierarchical link in
 the Network. If one item "isa" another item
 then the first item is more specific in
 meaning than the second item.
 Inverse: inverse_isa

• confuses class with concept/meaning
• is not a definition
  UMLS-SN Semantic Relation

Semantic Relation: part_of
  Definition: Composes, with one or more
  other physical units, some larger whole.
  This includes component of, division of,
  portion of, fragment of, section of, and
  layer of.
  Inverse: has_part
• bad inverse
  UMLS-SN Semantic Relation

Semantic Relation: location_of
 Definition: The position, site, or region of
 an entity or the site of a process.
 Inverse: has_location

• danger of confusing instance/class levels
  UMLS-SN Semantic Relation
Semantic Relation: occurs_in
  Definition: Takes place in or happens under
  given conditions, circumstances, or time periods,
  or in a given location or population. This
  includes appears in, transpires, comes about, is
  present in, and exists in.
  Inverse: has_occurrence

• same problem
• confusion of objects and processes (exists in)
  UMLS-SN Semantic Relation
Semantic Relation: prevents
  TUI: T148
  Definition: Stops, hinders or eliminates an
  action or condition.
  Inverse: prevented_by

• bad inverse:
  contraception prevents pregnancy
  pregnancy prevented by contraception
  UMLS-SN Semantic Relation

Semantic Relation: process_of
 Definition: Action, function, or state of.
 Inverse: has_process

• avoids circularity by introducing confusion
  UMLS-SN Semantic Relation

Semantic Relation: produces
  Definition: Brings forth, generates or creates.
  This includes yields, secretes, emits,
  biosynthesizes, generates, releases, discharges,
  and creates.
  Inverse: produced_by
• bad inverse:
  artificial insemination produces pregnancy
  pregnancy produced by artificial insemination
The UMLS Semantic Network
is ‘an upper-level ontology … in which all
concepts are given a consistent and
semantically coherent representation’.

Alexa McCray, “An upper level ontology for the
biomedical domain”. Comp Functional Genomics
2003; 4: 80-84.
               Conclusion
Work on biomedical ontologies and terminologies
  has focused almost exclusively on classes (often
  confusingly referred to as ‘concepts’).
The class-orientation of KR goes with the
  assumption that all that need be said about
  classes can be said without appeal to formal
  features of instantiation of the sorts described
  above.
KR-facts (e.g. about ‘conceptual parts’) are pulled
  out of the air in an unprincipled way.
This leads to an impoverished regime of definitions
  in which the use of identical terms (like ‘part’)
  masks underlying incompatibilities.
http://ontologist.com




     the end
              Conclusion 1/2
Matters have not been helped by the fact that
   description logic has been oriented primarily around
   reasoning with classes.
Certainly if we are to produce information systems with
   the requisite computational properties, then this
   entails recourse to a logical framework like that of
   description logic.
At the same time we must ensure that the data that
   serves as input to such systems is organized formally
   in a way that sustains rather than hinders successful
   alignment with other systems.
There are two complementary tasks: REFERENCE
   ONTOLOGY and APPLICATION ONTOLOGY
         Classes vs. Sums
Classes are distinguished by granularity: they
divide up the corresponding domain into whole
units or members, whose interior parts and
structure are traced over. The class of human
beings is instantiated only by human beings as
single, whole units.

A mereological sum is not granular in this sense.
            Classes vs. Sets
Both classes and sets are marked by granularity –
  but sets are timeless
Each class or set is laid across reality like a grid
  consisting (1) of a number of slots or pigeonholes
  each (2) occupied by some member.
But a set is determined by its members. This means
  that it is (1) associated with a specific number of
  slots, each of which (2) must be occupied by some
  specific member. A set is thus specified in a double
  sense.
A class survives the turnover in its instances, and so it
  is specified in neither of these senses, since both (1)
  the number of associated slots and (2) the individuals
  occupying these slots may vary with time.
A class is not determined by its instances as a state is
  not determined by its citizens.
         Classes vs. Sets
A set with n members has in every case
exactly 2^n subsets
The subclasses of a class are limited in
number
(which classes are subsumed by a larger
class is a matter for empirical science to
determine)
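
The count of 2^n subsets is easy to confirm by enumeration. The sketch below does so for a three-member set with invented member names. Nothing comparable can be computed for classes, since which subclasses exist is an empirical matter.

```python
from itertools import chain, combinations


def powerset(members):
    """Enumerate every subset of a finite collection as a tuple."""
    items = sorted(members)
    return list(
        chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    )


members = {"Jane", "John", "Mary"}  # hypothetical members
subsets = powerset(members)
print(len(subsets), 2 ** len(members))
```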
           Classes vs. sets

A set is an abstract structure, existing outside time
and space. The set of human beings existing at t is
(timelessly) a different entity from the set of human
beings existing at t′ because of births and deaths.
A class can survive changes in the stock of its
instances because classes exist in time. (An
organism can similarly survive changes in the stock
of cells or molecules by which it is constituted.)

D1* A is_a B =def ∀t ∀x ( inst(x, A, t) → inst(x, B, t) )

D1* will take care of false positives such as adult is_a
child
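
The adult/child false positive can be shown on a toy model with two time points. The sketch below assumes an untimed reading that pools instances over all times, and invented individuals tom and ann. Neither function is claimed to be the author's implementation.

```python
# Hypothetical time-indexed inst facts: (instance, class, time).
INST = {
    ("tom", "child", 1), ("tom", "adult", 2),
    ("ann", "child", 1), ("ann", "adult", 2),
    ("tom", "human being", 1), ("tom", "human being", 2),
    ("ann", "human being", 1), ("ann", "human being", 2),
}


def ever_instances(cls):
    """Everything that instantiates cls at some time or other."""
    return {x for x, c, _ in INST if c == cls}


def is_a_untimed(a, b):
    """Assumed untimed reading: pool instances across all times."""
    return ever_instances(a) <= ever_instances(b)


def is_a_timed(a, b):
    """D1*: at every time, whatever instantiates a then also instantiates b then."""
    return all((x, b, t) in INST for x, c, t in INST if c == a)


print(is_a_untimed("adult", "child"))      # false positive: every adult was once a child
print(is_a_timed("adult", "child"))        # D1* rejects it
print(is_a_timed("adult", "human being"))  # genuine subsumption survives
```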
We can prove: is_a is reflexive and antisymmetric

Axiom: part_of is irreflexive

We can prove that part_of is asymmetric

We can prove that both is_a and part_of are
 transitive
GO’s curators accordingly now consider removing the corresponding assertion
   from its Usage Guide.
As concerns (3), consider:
   plastid part_ofGO cytoplasm
   cytoplasm part_ofGO cell (sensu Animalia)
But not: plastid part_ofGO cell (sensu Animalia).
While ‘cell (sensu Animalia)’ is not a term in GO, it does conform to GO’s rules
   for term formation, and this suggests reason for some uncertainty also as to
   the validity of (3).
GO justifies its rejection of (4) with the following example:
meiotic chromosome is_a chromosome
          synaptonemal complex part_ofGO meiotic chromosome
But not necessarily:
   synaptonemal complex part_ofGO chromosome.
On the reading of GO’s ‘part of’ as meaning ‘can be part
  of’, however, it seems that synaptonemal complex is
  ‘part of’ chromosome. And if the reading of GO’s ‘part
  of’ given in D11 is correct, then (4) can indeed be proved
  as a matter of logic.
We suggest that it is only by appeal to formal definitions
  that these and related uncertainties (detailed in [[19]])
  can be resolved. Formal definitions would help to ensure
  also that when the terms of controlled vocabularies like
  GO are mapped into the UMLS Metathesaurus then this
  is done in ways that support the drawing of reliable
  inferences concerning relations between these terms
  and existing terms in the Metathesaurus.
The growth of bioinformatics has led to an increasing
  number of evolving ontologies which must be correlated
  with the existing terminology systems developed for
  clinical medicine. A critical requirement for such
  correlations is the alignment of the fundamental
  ontological relations used in such systems, above all the
  relations of class subsumption (is_a) and partonomic
  inclusion (part_of). To achieve this end, however,
  existing clinical and evolving bioinformatics terminologies
  need to call upon formalisms whose significance was not
  evident at the time these resources were originally
  conceived.
Both is_a and part_of are ubiquitous in bioinformatics ontologies and
    terminologies. Yet their treatment is inconsistent and problematic,
    and in some cases the two relations are not clearly distinguished at
    all. SNOMED-RT, for example, has: both testes is_a testis. UMLS
    has: plant leaves is_a plant.
In this communication we argue that a coherent treatment of is_a and
    part_of must be based on explicit formal definitions which take into
    account not only the classes involved as terms of these relations but
    also the instances of these classes. We base our arguments on the
    lessons we learned during the evolution of the Digital Anatomist
    Foundational Model of Anatomy (FMA, for short [[1]]) in which we
    have refined the treatment of these relations over time and
    distinguished between classes and instances in terms of canonical
    and instantiated anatomy. [[2]]
Our objectives are to define canonical and instantiated anatomy before
  giving formal definitions of is_a and part_of in terms of a theory of
  instantiation. We then discuss in this light issues of universal
  relevance to ontologies, such as classes vs. wholes and sets,
  granularity, idealization, and the role of time and change. After
  illustrating problematic usage of is_a and part_of we draw
  conclusions for ontology alignment, pointing to the need for
  supplementing Description Logic-based reasoning implementations
  with rigorous manual auditing of underlying data-sources based on
  formal analyses in terms of instance-level relations and on clear and
  intuitive principles of curation.
Canonical and Instantiated Anatomy
Canonical anatomy is a field of anatomy (science) that comprises the
  synthesis of generalizations based on anatomical observations that
  describe idealized anatomy (structure). These generalizations have
  been implicitly sanctioned by their usage in anatomical discourse.
  Instantiated anatomy is a field of anatomy (science) that comprises
  anatomical data pertaining to individual instances of organisms and
  their parts. Instantiated anatomy is needed to support the application
  of biomedical knowledge in clinical care and in fields such as image
  analysis. The corresponding instance-data is not incorporated into
  the FMA, which deals with idealizations at a higher level of
  abstraction. In introducing the relation between canonical and
  instantiated anatomy, however, the FMA provides the key to an
  adequate formal treatment of is_a and part_of, for the latter can be
  defined and formally interrelated only when the relation of
  instantiation between instances and classes is taken into account.
Formal Theory of Is_a and Part_of
We use the term entity as a universal ontological term of art
  embracing objects, processes, functions, structures,
  times and places, and we distinguish among entities in
  general two special sub-totalities, called instances and
  classes, respectively. Instances are individuals
  (particulars, tokens) of special sorts. Thus each is a
  simply located entity, bound to a specific (normally
  topologically connected) location in space and time. [[3]]
  Classes (also called universals, kinds, types) are multiply
  located; they exist in their respective instances.
To formalize these notions we use standard first-order logic with variables x, y, x1, etc. ranging over
    instances, and A, B, A1, etc. ranging over classes. Our system rests on two primitive relations of
    inst and part. Inst is the relation of instantiation between instances and classes, illustrated by:
    Jane is an instance of human being. Part is the relation of parthood among instances, illustrated
    by: Jane’s heart is part of Jane’s body. We define a class as anything that is instantiated; an
    instance as anything (any individual) that instantiates some class.
The principal axioms governing inst are: (1) that it holds in every case between an instance and a
    class, in that order; and (2) that nothing can be both an instance and a class.
The axioms governing part (also called ‘proper part’) can be specified as follows [[4]]. It is (1)
    irreflexive (no entity is part of itself), (2) asymmetric (if part(x, y) then not-part(y, x)), and (3)
    transitive (if part(x, y) and part(y, z), then part(x, z)). In addition, it satisfies: (4) a principle
    governing the formation of sums of parts (for example of binary sums x+y), and (5) a remainder
    axiom, to the effect that if part(x, y) then there is some part z of y which does not share parts in
    common with x.
We use the standard quantifiers of first-order logic: ∃, abbreviating ‘for some value of’, and ∀,
    abbreviating ‘for all values of’. The device of quantification allows us to take account of instantiation
    in generic fashion, i.e. without the need to take specific instances into account. The full formalism
    requires general axioms specifying the properties of classes as natural kinds (rather than arbitrary
    collections) [[5]], together with more specific axioms dealing with the different sorts of classes (of
    objects, functions, processes, pathways, sites, etc.) in the different domains of biomedical
    ontology. It also requires an axiom of extensionality, to the effect that classes which share
    identical instances are themselves identical.
We can now define is_a, the relation of class subsumption:
D1 A is_a B =def ∀x ( inst(x, A) → inst(x, B) )
where ‘→’ abbreviates: if ... then .... To say that A is_a B is to say that
   every instance of A is an instance of B.
To define part_of is more tricky. We start by defining:
D2 A part_for B =def
         ∀x ( inst(x, A) → ∃y ( inst(y, B) & part(x, y) ) ).
D2 provides information primarily about As; it tells us that As do not
   exist except as instance-level parts of Bs. Conversely:
D3 B has_part A =def
         ∀y ( inst(y, B) → ∃x ( inst(x, A) & part(x, y) ) )
provides information primarily about Bs; it tells us that Bs do not exist
   except with As as instance-level parts.
Because there are female as well as male human beings, we can state: human
   testis part_for human being, but we cannot state: human being has_part
   human testis. Because non-human vertebrates also have hearts, we can
   state human being has_part heart, but not: heart part_for human being.
We now define the relation part_of by combining D2 and D3:
D4 A part_of B =def A part_for B & B has_part A
Thus A part_of B if and only if: (i) for any instance x of A there is some instance
   y of B which is such that x stands to y in the instance-level part relation, and
   vice versa: (ii) for any instance y of B there is some instance x of A which is
   such that x stands to y in this same relation. This yields a strong structural
   mereological tie between the classes A and B (defining a so-called Egli-
   Milner order [[6]]). It guarantees that As exist only as parts of Bs and that Bs
   are structurally organized in such a way that As must appear in them as
   parts. That partonomies like those associated with the FMA are structured
   by the full part_of relation is ensured by the fact that here all terms for body
   parts are assumed to have an implicit prefix designating the type of
   organism involved.
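
D4 and the role of the implicit organism prefix can be illustrated on a small model. The sketch assumes invented individuals and the class names 'human heart' and 'human testis'. It is not drawn from the FMA itself.

```python
# Hypothetical inst facts (instance, class) and part facts (part, whole).
INST = {
    ("adam", "human being"), ("eve", "human being"),
    ("h1", "human heart"), ("h2", "human heart"),
    ("t1", "human testis"),
}
PART = {("h1", "adam"), ("h2", "eve"), ("t1", "adam")}


def instances_of(cls):
    return {x for x, c in INST if c == cls}


def part_for(a, b):
    """D2: every instance of a is part of some instance of b."""
    return all(
        any((x, y) in PART for y in instances_of(b))
        for x in instances_of(a)
    )


def has_part(b, a):
    """D3: every instance of b has some instance of a as part."""
    return all(
        any((x, y) in PART for x in instances_of(a))
        for y in instances_of(b)
    )


def part_of(a, b):
    """D4: the conjunction of D2 and D3."""
    return part_for(a, b) and has_part(b, a)


print(part_of("human heart", "human being"))
print(part_of("human testis", "human being"))
```

With the prefix made explicit, human heart satisfies both halves of D4, while human testis satisfies only the part_for half.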
Sometimes we need to capture mereological
  relations involving specific numbers of instances.
  Thus in a case like human being has_part brain,
  we need to express that each instance of human
  being has exactly one instance of brain as part:
inst(x, human) → ∃y ∀z ((inst(z, brain) & part(z,
  x)) ↔ z = y)
with generalizations to represent a human being’s
  canonical organization as having two lungs, ten
  fingers, and so on.
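
Cardinality constraints of this kind reduce to counting instance-level parts. The sketch below uses invented individuals, and the helper has_exactly is an assumed name rather than an established relation.

```python
# Hypothetical inst facts (instance, class) and part facts (part, whole).
INST = {
    ("jane", "human being"), ("john", "human being"),
    ("b1", "brain"), ("b2", "brain"),
    ("l1", "lung"), ("l2", "lung"), ("l3", "lung"), ("l4", "lung"),
}
PART = {
    ("b1", "jane"), ("b2", "john"),
    ("l1", "jane"), ("l2", "jane"), ("l3", "john"), ("l4", "john"),
}


def parts_of_kind(whole, part_cls):
    """All instances of part_cls that are parts of the given individual."""
    return {z for z, c in INST if c == part_cls and (z, whole) in PART}


def has_exactly(n, whole_cls, part_cls):
    """Every instance of whole_cls has exactly n instances of part_cls as parts."""
    return all(
        len(parts_of_kind(y, part_cls)) == n
        for y, c in INST
        if c == whole_cls
    )


print(has_exactly(1, "human being", "brain"))
print(has_exactly(2, "human being", "lung"))
```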
Both is_a and part_of are standardly treated as relations between
   classes. The formal structure of D4 makes it clear, however, that the
   latter does not signify that classes stand in some special class-level
   mereological inclusion relation. Rather, it expresses more
   fundamental part-relations – captured in D2 and D3 – between the
   underlying instances.
A distinction analogous to that between D2, D3 and D4 is indispensable
   to the formal definition of many other foundational relations of
   biomedical ontologies – including 53 of the 54 relations contained in
   the UMLS Semantic Network (UMLS-SN, Version 2003AB) [[7]]. In
   particular, reference to instances is a necessary first step in the
   rigorous implementation, in systems like the FMA, of
   mereotopological relations such as spatial occupation and spatial
   adjacency, as also of concepts such as junction, boundary, cluster,
   and the like. [[1],[8],[9]]
We can then prove that is_a is reflexive (for every class A we have: A is_a A), and
   antisymmetric (if A is_a B and B is_a A, then A and B are identical).
We need to add as axiom that part_of is irreflexive (that no class is part_of itself). From
   this we can prove that part_of is also asymmetric (if A part_of B then not-B part_of A).
We can prove also that both is_a and part_of are transitive: thus if A is_a B and B is_a C,
   then A is_a C, and if A part_of B and B part_of C then A part_of C.
… has been allowed to mask underlying incompatibilities. Matters have not been helped by
   the fact that description logic, the prevalent framework for terminology-based
   reasoning systems, has with some recent exceptions (e.g. [[20]]) been oriented
   primarily around reasoning with classes.
Certainly if we are to produce information systems with the requisite computational
   properties, then this entails recourse to a logical framework like that of description
   logic. At the same time, however, we must ensure that the data that serves as input
   to such systems is organized formally in a way that sustains rather than hinders
   successful alignment with other systems. The way forward is to recognize, as does
   the FMA, that these are two distinct tasks, both of which are equally important to the
   construction of biomedical ontologies and terminologies.
Classes vs. Wholes: Granularity and Idealization
A rigorous system of formal definitions to support biomedical ontology
   alignment must clarify also the relations between the concept of
   class, mereological whole and set. Here, too, the reference to
   instances is indispensable. For classes are distinguished by the fact
   that they capture their instances in a way which involves the factor
   of granularity, which means: in such a way as to divide up the
   corresponding domain into whole units or members, whose interior
   parts and structure are traced over. [[10]] A mereological sum is not
   granular in this sense. The mereological sum of human beings
   comprehends also all instance-level parts (including organs, cells,
   molecules, and so on). The class of human beings, in contrast, is
   instantiated only by human beings as single, whole units.
The instances (units, members) in a class are marked out by the fact
  that, in the Aristotelian terms used by the FMA, they share a
  common essence. [[11],[12]] Which classes exist in a given domain
  is a matter for empirical research. Hence a good first clue to the
  existence of a class is provided by the fact that there exists a
  corresponding term that has either been sanctioned by (in our case
  anatomical) science or can be inferred from terms so sanctioned by
  the need to fill gaps in the taxonomy or partonomy (for example
  terms for higher-level classes and for not previously named classes
  instantiated by macroscopic parts of the body) [[13]]. In anatomy and
  related disciplines a supplementary clue may be provided through
  the association of given classes with the structural genes whose
  coordinated expression gives rise to the corresponding instances.
Each class-definition in the FMA specifies the essence shared by the
  corresponding instances via the specification of (i) a genus, which is
  some wider class to which the given class belongs, together with (ii)
  the differentiae which mark out its instances within this wider class.
Biological classes are marked always by an opposition between standard or prototypical
    instances and a surrounding penumbra of non-standard instances (not all instances
    of the class human being are marked by the presence of amputation stumps or
    pituitary tumors). To do justice to these matters FMA introduces the factor of
    idealization, which means (in first approximation) that the classes of the FMA’s
    Anatomy Taxonomy AT include only those instances to which canonical anatomy
    applies.
This means that we need to revise definitions D1–D4 by restricting the range of variables
    x, y, ... to the realm of individuals which satisfy the generalizations of canonical
    anatomy, so that the same abstraction of anatomy (structure) will be represented in
    all the instances of any given AT-class. This device of specifying different ranges of
    variables gives us the means also to represent the generalizations belonging to the
    different branches of canonical anatomy, for example to canonical anatomy for male
    vs. female human beings, for human beings at various developmental stages, and for
    organisms in other species. It can allow us also to represent the generalizations
    governing the anatomical variants yielded by the presence of, for example, coronary
    arteries or bronchopulmonary segments, which deviate from canonical anatomical
    patterns of organization.
Classes vs. Sets: Granularity and Time
Sets in the mathematical sense, too, are marked by the factor of granularity, which means that each
    set comprehends its members as single, whole units. A class or set is laid across reality like a grid
    consisting (1) of a number of slots or pigeonholes each (2) occupied by some member. (This
    informal talk of grids and slots is formalized in [[14]] in terms of the theory of granular partitions.)
    Classes are distinguished from sets, however, by the fact that a set is determined by its members.
    This means that it is (1) associated with a specific number of slots, each of which (2) must be
    occupied by some specific member. A set is thus specified in a double sense. A class, in contrast,
    survives the turnover in its instances, and so it is specified in neither of these senses, since both
    (1) the number of associated slots and (2) the individuals occupying these slots may vary with
    time.
Sets are distinguished from classes also in this: a set with n members has in every case exactly 2^n
    subsets, constituted by all the combinations of these members. The subclasses of a class, on the
    other hand, are limited in number, and which classes are subsumed by a larger class is a matter
    for empirical science to determine. Leaves (lowest nodes) in the taxonomy are (changing)
    collections of instances. As we move up the taxonomy we encounter in succession collections of
    such collections of instances, collections of collections of such collections, etc., organized in a
    nested hierarchy reaching up to the maximal class or ‘root’. We can visualize the classes at
    different levels as being analogous to geopolitical entities (towns, counties, states) as represented
    on a map. Instances correspond in this analogy to the corresponding populations: a class is not
    determined by its instances, just as a state is not determined by its citizens.
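The contrast between sets and classes can be illustrated with a minimal Python sketch. All names here (subsets, BiologicalClass, the individuals John, Jane, Mary, Paul and the time labels) are hypothetical and chosen only for illustration. A frozenset stands in for a mathematical set; modeling a class as an object whose identity is independent of its current instances is an assumption of the sketch, not a construction taken from the FMA.

```python
from itertools import combinations


def subsets(members):
    """Return all subsets of a finite collection, each as a frozenset."""
    items = list(members)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


# Hypothetical stock of human beings at two times (names are illustrative only).
humans_at_t1 = frozenset({"John", "Jane", "Mary"})
humans_at_t2 = frozenset({"Mary", "Paul"})  # John and Jane have died, Paul was born

# A set with n members has exactly 2**n subsets.
assert len(subsets(humans_at_t1)) == 2 ** len(humans_at_t1)

# A set is determined by its members: a different stock is a different set.
assert humans_at_t1 != humans_at_t2


class BiologicalClass:
    """A class keeps its identity (here: its name) while its instances turn over."""

    def __init__(self, name):
        self.name = name
        self._extension_at = {}  # time label -> frozenset of instances

    def record(self, time, instances):
        self._extension_at[time] = frozenset(instances)

    def extension(self, time):
        return self._extension_at.get(time, frozenset())


human_being = BiologicalClass("human being")
human_being.record("t1", humans_at_t1)
human_being.record("t2", humans_at_t2)
```

The single object human_being persists through the change in its extension, whereas the two frozensets are simply two different sets.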
Classes are distinguished from sets also by their relation to time. A set is an abstract
    structure, existing outside time and space, and this is so even when its members are
    parts of concrete reality. Since each set is determined by its members, the set of
    human beings existing at t is (timelessly) a different entity from the set of human
    beings existing at t′ because of births and deaths.
Matters are different with regard to classes. The class human being can survive the
    change in the stock of its instances which occurs when John and Jane die, because
    classes exist in time. John and Jane themselves can similarly survive changes in the
    stock of cells or molecules by which they are constituted.
To do justice to the fact that classes in the biological domain endure even when their
    extensions change, a full definition of the is_a relation must involve a temporally
    indexed reading of inst (with variables t, t′, etc., ranging over times):
D1* A is_a B =def ∀t ∀x ( inst(x, A, t) → inst(x, B, t) ),
so that A is_a B means: at all times t, if x is an instance of A at t then x is an instance of
    B at t. D1* will also take care of false positives such as adult is_a child, which an
    untensed reading of D1 would otherwise allow. In general, all statements of inst and
    part relations involving objects in biomedical ontologies, like all the data of
    instantiated anatomy, are indexed by times.
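The effect of the temporal index can be seen in a minimal Python sketch. The fact base below is hypothetical toy data, and the function names are illustrative. Two assumptions are made: quantification over all times is approximated by the finitely many recorded times, and the untensed reading of inst(x, A) is taken to mean that x is an instance of A at some time or other.

```python
# Hypothetical toy data: inst facts as (individual, class, time) triples.
FACTS = {
    ("john", "child", 1), ("john", "human being", 1),
    ("john", "adult", 2), ("john", "human being", 2),
    ("jane", "child", 1), ("jane", "human being", 1),
    ("jane", "adult", 2), ("jane", "human being", 2),
}


def is_a_tensed(a, b, facts=FACTS):
    """D1*: at every recorded time t, each instance of a at t is an instance of b at t."""
    return all((x, b, t) in facts for (x, cls, t) in facts if cls == a)


def is_a_untensed(a, b, facts=FACTS):
    """Assumed untensed reading: whatever is ever an a is at some time a b."""

    def ever(c):
        return {x for (x, cls, _) in facts if cls == c}

    return ever(a) <= ever(b)


# The untensed reading lets 'adult is_a child' through,
# because every adult was at some time a child.
print(is_a_untensed("adult", "child"))      # True (a false positive)
# The tensed reading rejects it: no adult is a child at the same time.
print(is_a_tensed("adult", "child"))        # False
print(is_a_tensed("adult", "human being"))  # True
```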
Taxonomy and Partonomy
A taxonomy such as AT is formally speaking a tree in the mathematical sense.
    It satisfies axioms to the effect that (1) it has a root or unique maximal
    genus (here: anatomical entity) and (2) all other classes are connected to
    this root via finite chains of is_a relations satisfying a principle of single
    inheritance. A partonomy, in contrast, is a partial order in the mathematical
    sense, with top (here: organism – the class instantiated by mereologically
    maximal entities), to which all other classes are connected via chains of
    part_of relations.
We can then define the concepts of root and leaf of a taxonomy and top and
    bottom of a partonomy as follows.
D5 root(A) =def ∀B (B is_a A)
D6 leaf(A) =def ∀B (B is_a A → A = B)
D7 top(A) =def ∀B (A = B or B part_of A) & not-∃B (A part_of B)
D8 bottom(A) =def not-∃B (B part_of A).
We can then postulate axioms to the effect that every class includes some leaf
   as subclass, and that every instance of every class instantiates some leaf:
          ∀A ∃B ( leaf(B) & B is_a A )
          ∀A ∀x ( inst(x, A) → ∃B (leaf(B) & inst(x, B) ) )
The taxonomic union A∪B of classes A and B is defined as the minimal class
   satisfying the condition that it contains both A and B as subclasses. Such a
   class always exists, since A and B are in any case subclasses of the root.
   The taxonomic union of femur and liver, for example, is organ. The
   partonomic union of two classes A+B is the class, if it exists, whose
   instances are sums x+y of instances of classes A and B respectively. While
   every pair of classes has a taxonomic union, only some classes have a
   partonomic union, since entities of the form x+y are instances of classes
   only in some highly restricted cases, for example: left lung = upper-lobe-of-
   left-lung + lower-lobe-of-left-lung. Such examples characteristically involve
   the phenomenon of fiat boundaries. [[15],[16]]
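In a tree, the taxonomic union of two classes is their nearest common superclass, which can be found by walking up from one class until a class above the other is reached. The Python sketch below uses the same kind of hypothetical toy taxonomy (class names other than femur, liver and organ are illustrative assumptions) and does not attempt to model partonomic union.

```python
# Hypothetical fragment of a taxonomy: each class maps to its immediate genus.
PARENT = {
    "anatomical entity": None,
    "organ": "anatomical entity",
    "femur": "organ",
    "liver": "organ",
    "cell": "anatomical entity",
}


def superclasses(a):
    """Return a and all classes above it, nearest first, ending at the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = PARENT[a]
    return chain


def taxonomic_union(a, b):
    """Minimal class that has both a and b as subclasses."""
    above_b = set(superclasses(b))
    for candidate in superclasses(a):  # nearest superclass of a first
        if candidate in above_b:
            return candidate
    raise ValueError("classes do not belong to the same tree")


print(taxonomic_union("femur", "liver"))  # organ
print(taxonomic_union("femur", "cell"))   # anatomical entity
```

Because every class is connected to the root, the loop always finds a common superclass in a well-formed tree; the ValueError only guards against malformed input.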
As concerns taxonomic intersection, a class is never immediately subordinated to more
    than one higher class within a tree. This means that if two classes overlap in sharing
    some common subclass, then this is because one is a subclass of the other. A∩B,
    the taxonomic intersection of A and B, if it exists, is then simply the smaller of these
    two classes. We can add further an axiom to the effect that, if two classes are such
    as to overlap in sharing some common instances, then this, too, is because one is a
    subclass of the other:
∃x (inst(x, A) ∧ inst(x, B)) → A is_a B or B is_a A.
Classes can overlap partonomically, on the other hand, in such a way that there is a
    class which stands in the part_of relation to both, though neither stands in this
    relation to the other:
D9 A1 partonomic_overlap A2 =def ∃A (A part_of A1 & A part_of A2).
For example: pelvis and vertebral column overlap in the sacrum and coccyx. Most
    classes in the biomedical domain do not overlap partonomically in this sense, yet it is
    this difference in behavior between taxonomic and partonomic overlap which
    captures the essential difference between the tree structure of taxonomies and the
    partial order structure of partonomies.
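Partonomic overlap, together with the top and bottom of a partonomy, can be sketched in Python as follows. The part_of facts are hypothetical toy data built around the pelvis and vertebral column example; part_of is assumed here to be proper (irreflexive) parthood between classes, closed under transitivity.

```python
# Hypothetical class-level part_of facts, stated as (part, whole) pairs.
DIRECT_PART_OF = {
    ("sacrum", "pelvis"), ("sacrum", "vertebral column"),
    ("coccyx", "pelvis"), ("coccyx", "vertebral column"),
    ("pelvis", "organism"), ("vertebral column", "organism"),
    ("femur", "organism"),
}


def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


PART_OF = transitive_closure(DIRECT_PART_OF)
CLASSES = {c for pair in PART_OF for c in pair}


def part_of(a, b):
    return (a, b) in PART_OF


def partonomic_overlap(a1, a2):
    """D9: some class stands in part_of to both a1 and a2."""
    return any(part_of(a, a1) and part_of(a, a2) for a in CLASSES)


def top(a):
    """D7: every other class is part_of a, and a is part_of nothing."""
    return (all(a == b or part_of(b, a) for b in CLASSES)
            and not any(part_of(a, b) for b in CLASSES))


def bottom(a):
    """D8: nothing is part_of a."""
    return not any(part_of(b, a) for b in CLASSES)


# Pelvis and vertebral column overlap, yet neither is part of the other.
print(partonomic_overlap("pelvis", "vertebral column"))  # True
print(part_of("pelvis", "vertebral column"))             # False
print(part_of("vertebral column", "pelvis"))             # False
```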
Conclusion
Practitioners in the biomedical sciences move easily between the realm of classes and the realm of
    instances existing in time and space. For historical reasons, however, work on biomedical
    ontologies and terminologies – which grew out of work on medical dictionaries and nomenclatures
    – has focused almost exclusively on classes (or ‘concepts’) atemporally conceived. This class-
    orientation is common in knowledge representation, and its predominance has led to the
    entrenchment of an assumption according to which all that need be said about classes can be
    said without appeal to formal features of instantiation of the sorts described above. This, however,
    has fostered an impoverished regime of definitions in which the use of identical terms in different
    systems has been allowed to mask underlying incompatibilities. Matters have not been helped by
    the fact that description logic, the prevalent framework for terminology-based reasoning
    systems, has with some recent exceptions (e.g. [[i]]) been oriented primarily around reasoning
    with classes.
Certainly if we are to produce information systems with the requisite computational properties, then
    this entails recourse to a logical framework like that of description logic. At the same time,
    however, we must ensure that the data that serves as input to such systems is organized formally
    in a way that sustains rather than hinders successful alignment with other systems. The way
    forward is to recognize, as does the FMA, that these are two distinct tasks, both of which are
    equally important to the construction of biomedical ontologies and terminologies.
The problem of ontology alignment
GO
SCOP
SWISS-PROT
SNOMED
MeSH
FMA
   …
all remain at the level of TERMINOLOGY (two reasons:
   legacy of dictionaries + DL)
What we need is a REFERENCE ONTOLOGY = a
   formal theory of the foundational relations which
   hold TERMINOLOGY ONTOLOGIES and
   APPLICATION ONTOLOGIES together
  Analogous distinctions required for nearly all
foundational relations of ontologies and semantic
                    networks:
 A causes B
 A is associated with B
 A is located in B
 etc.

 Reference to instances is necessary in
 defining mereotopological relations such
 as spatial occupation and spatial
 adjacency
Instances are elite individuals
Which classes (and thus which instances)
exist in a given domain is a matter for
empirical research.

Cf. Lewis/Armstrong “sparse theory of
universals”
D   extension(A) = {x | inst(x, A)}

D9 differentia(A) =def ∃B ∃C (
 nearestspecies(B, C) & A ≠ B & A ≠ C &
 extension(C) = extension(B) ∩
 extension(C) )
The genus together with the differentia of a
 species constitutes the essence of the
 species.

differentia(A) → not-class(A)
    Mathematical Structure
Each class hierarchy constitutes a
supremum-semilattice with respect to
is_a.
              Axioms (Berg)
A1 lowestspecies(A) → ∃x inst(x, A)
A2 lowestspecies(A) & lowestspecies(B) & A ≠ B
          → not-∃x (inst(x, A) & inst(x, B))
A3 nearestspecies(A, B) & nearestspecies(A, C)
         → B = C
A4 genus(A) & inst(x, A)
         → ∃B (nearestspecies(B, A) & inst(x, B))
A5 nearestspecies(A, B) → the extension of A
 is a subset of the extension of B
           Axioms (Berg)
genus(A) & inst(x, A)
    → ∃B (nearestspecies(B, A) & inst(x, B))
EVERY GENUS HAS AN INSTANTIATED
  SPECIES

nearestspecies(A, B) → the extension of A
  is a subset of the extension of B
EACH SPECIES HAS A SMALLER CLASS
  OF INSTANCES THAN ITS GENUS
               Axioms (Berg)
nearestspecies(B, A)
   → ∃C (nearestspecies(C, A) & B ≠ C)
EVERY GENUS HAS AT LEAST TWO CHILDREN

(nearestspecies(B, A) & nearestspecies(C, A) & B ≠ C)
   → not-∃x (inst(x, B) & inst(x, C))
SPECIES OF A COMMON GENUS SHARE NO INSTANCES
A8 There is no infinite sequence <A1, A2, …> such
  that nearestspecies(Ai, Ai+1) for all i ≥ 1
A9 There is no infinite sequence <A1, A2, …> such
  that nearestspecies(Ai+1, Ai) for all i ≥ 1
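A finite toy model makes it possible to check several of these axioms mechanically. The Python sketch below is a rough illustration only: the hierarchy and individuals are hypothetical, genus(A) is assumed to mean that A has at least one nearest species, and lowestspecies(A) that it has none. Extensions of higher classes are computed by collecting the instances of the classes below them, which builds A5 into the model. A8 and A9 hold trivially in any finite acyclic hierarchy and are not checked.

```python
# Hypothetical toy hierarchy: nearestspecies(species, genus) pairs.
NEAREST = {
    ("mammal", "animal"), ("bird", "animal"),
    ("dog", "mammal"), ("cat", "mammal"),
}
# Hypothetical individuals, each recorded under a lowest species.
LOWEST_INST = {"rex": "dog", "tom": "cat", "tweety": "bird"}

CLASSES = {c for pair in NEAREST for c in pair}


def genus(a):
    """Assumption: a genus is a class with at least one nearest species."""
    return any(g == a for (_, g) in NEAREST)


def lowestspecies(a):
    """Assumption: a lowest species is a class with no species below it."""
    return a in CLASSES and not genus(a)


def extension(a):
    """Instances of a: those recorded under a or under any class below a."""
    ext = {x for x, c in LOWEST_INST.items() if c == a}
    for (s, g) in NEAREST:
        if g == a:
            ext |= extension(s)
    return ext


def a1():  # every lowest species is instantiated
    return all(extension(a) for a in CLASSES if lowestspecies(a))


def a2():  # distinct lowest species share no instances
    return all(not (extension(a) & extension(b))
               for a in CLASSES for b in CLASSES
               if a != b and lowestspecies(a) and lowestspecies(b))


def a3():  # a class has at most one nearest genus
    return all(g1 == g2 for (s1, g1) in NEAREST for (s2, g2) in NEAREST if s1 == s2)


def a4():  # each instance of a genus instantiates one of its nearest species
    return all(any(x in extension(s) for (s, g) in NEAREST if g == a)
               for a in CLASSES if genus(a) for x in extension(a))


def two_children():  # every genus has at least two nearest species
    return all(len({s for (s, g) in NEAREST if g == a}) >= 2
               for a in CLASSES if genus(a))


print(all(check() for check in (a1, a2, a3, a4, two_children)))  # True
```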
           Theorems (Berg)
T1 nearestspecies(A, B) → the extension of A is
  a proper subset of the extension of B
T2 ∀A ∃x inst(x, A)
T3 nearestspecies(A, B) → not-∃C
  (nearestspecies(A, C) & nearestspecies(C, B))
T4 lowestspecies(A1) & lowestspecies(A2) &
  nearestspecies(A1, B)
       → not-∃C (nearestspecies(B, C) &
  nearestspecies(C, A2))
           Theorems (Berg)
T5 (genus(A) & inst(x, A)) → ∃B
  (lowestspecies(B) & B is_a A & inst(x, B))
T6 (genus(A) & lowestspecies(B) & ∃x (inst(x,
  A) & inst(x, B))) → B is_a A
T7 A is_a B & A is_a C
             → (B = C or B is_a C or C is_a B)
T8 (genus(A) & genus(B) & ∃x (inst(x, A) &
  inst(x, B))) → ∃C (C is_a A & C is_a B)
T9 class(A) & class(B) → (A = B or A is_a B or
  B is_a A or not-∃x (inst(x, A) & inst(x, B)))
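T7 and T9 together say that classes in a tree are either nested or disjoint. A minimal Python sketch, using a hypothetical toy hierarchy and hypothetical individuals, shows how T9 can be tested against a body of instance data and how it fails once an individual is recorded under two unrelated classes. Reading is_a reflexively is an assumption of the sketch.

```python
# Hypothetical tree: each class maps to its nearest genus.
PARENT = {
    "animal": None,
    "mammal": "animal", "bird": "animal",
    "dog": "mammal", "cat": "mammal",
}


def is_a(a, b):
    """Reflexive, transitive is_a along the chain of nearest genera."""
    while a is not None:
        if a == b:
            return True
        a = PARENT[a]
    return False


def inst(x, a, facts):
    """x instantiates a if x is recorded under a or under some class below a."""
    return any(y == x and is_a(c, a) for (y, c) in facts)


def t7():
    """Any two classes above a common class are identical or nested."""
    return all(b == c or is_a(b, c) or is_a(c, b)
               for a in PARENT for b in PARENT for c in PARENT
               if is_a(a, b) and is_a(a, c))


def t9(facts):
    """Any two classes are identical, nested, or share no instances."""
    individuals = {x for (x, _) in facts}
    return all(a == b or is_a(a, b) or is_a(b, a)
               or not any(inst(x, a, facts) and inst(x, b, facts)
                          for x in individuals)
               for a in PARENT for b in PARENT)


# Hypothetical instance data: (individual, class under which it is recorded).
GOOD = {("rex", "dog"), ("tom", "cat"), ("tweety", "bird")}
BAD = GOOD | {("rex", "bird")}  # rex recorded under two unrelated classes

print(t7())      # True
print(t9(GOOD))  # True
print(t9(BAD))   # False: dog and bird overlap without one subsuming the other
```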
                      WordNet
NOT: wheel PART OF car

  WordNet represents part-of quite sparingly.
  (It normally gives trivial holonymic relations which are just
  true by definition.)

wheel PART OF wheeled vehicle
steering wheel PART OF steering system
                     WordNet
With has_part relations it is more generous:

car, auto, automobile, machine, motorcar --
  HAS PART: air bag
  HAS PART: glove compartment
  etc.
            Circular definitions

and associated problems in general endemic in biomedical
  terminology systems
Confusion of use and mention
Confusion of concepts and objects
Confusion of concepts and classes
Confusion of terms and objects
Confusion of knowledge with what is known
Confusion of object-level with machine-level
Simple stupidity
… all of which lead to poor coding
UMLS-SN
 UMLS-SN Semantic Relations
Semantic Relation: affects
  TUI: T151
  Definition: Produces a direct effect on. Implied
  here is the altering or influencing of an existing
  condition, state, situation, or entity. This includes
  has a role in, alters, influences, predisposes,
  catalyzes, stimulates, regulates, depresses,
  impedes, enhances, contributes to, leads to, and
  modifies.
  Inverse: affected_by
 UMLS-SN Semantic Relations
Semantic Relation: carries_out
 TUI: T141
 Definition: Executes a function or
 performs a procedure or activity. This
 includes transacts, operates on, handles,
 and executes.
 Inverse: carried_out_by
 UMLS-SN Semantic Relations
Semantic Relation: causes
 TUI: T147
 Definition: Brings about a condition or an
 effect. Implied here is that an agent, such
 as for example, a pharmacologic
 substance or an organism, has brought
 about the effect. This includes induces,
 effects, evokes, and etiology.
 Inverse: caused_by
 UMLS-SN Semantic Relations
Semantic Relation: consists_of
 TUI: T172
 Definition: Is structurally made up of in
 whole or in part of some material or
 matter. This includes composed of, made
 of, and formed of.
 Inverse: constitutes
 UMLS-SN Semantic Relations
Semantic Relation: contains
 TUI: T134
 Definition: Holds or is the receptacle for
 fluids or other substances. This includes is
 filled with, holds, and is occupied by.
 Inverse: contained_in
 UMLS-SN Semantic Relations
Semantic Relation: derivative_of
 TUI: T178
 Definition: In chemistry, a substance
 structurally related to another or that can
 be made from the other substance. This is
 used only for structural relationships. This
 does not include functional relationships
 such as metabolite of, by product of, nor
 analog of.
 Inverse: has_derivative
 UMLS-SN Semantic Relations
Semantic Relation: developmental_form_of
 TUI: T179
 Definition: An earlier stage in the
 individual maturation of.
 Inverse: has_developmental_form
 UMLS-SN Semantic Relations
Semantic Relation: evaluation_of
 TUI: T161
 Definition: Judgment of the value or
 degree of some attribute or process.
 Inverse: has_evaluation
 UMLS-SN Semantic Relations
Semantic Relation: exhibits
 TUI: T145
 Definition: Shows or demonstrates.
 Inverse: exhibited_by
 UMLS-SN Semantic Relations
Semantic Relation: functionally_related_to
 TUI: T139
 Definition: Related by the carrying out of
 some function or activity.
 Inverse: functionally_related_to
 UMLS-SN Semantic Relations
Semantic Relation: indicates
 TUI: T156
 Definition: Gives evidence for the
 presence at some time of an entity or
 process.
 Inverse: indicated_by
 UMLS-SN Semantic Relations
Semantic Relation: ingredient_of
 TUI: T202
 Definition: Is a component of, as in a
 constituent of a preparation.
 Inverse: has_ingredient
 UMLS-SN Semantic Relations
Semantic Relation: issue_in
 TUI: T165
 Definition: Is an issue in or a point of
 discussion, study, debate, or dispute.
 Inverse: has_issue
 UMLS-SN Semantic Relations
Semantic Relation: manifestation_of
 TUI: T150
 Definition: That part of a phenomenon
 which is directly observable or concretely
 or visibly expressed, or which gives
 evidence to the underlying process. This
 includes expression of, display of, and
 exhibition of.
 Inverse: has_manifestation
 UMLS-SN Semantic Relations
Semantic Relation: property_of
 TUI: T159
 Definition: Characteristic of, or quality of.
 Inverse: has_property
 UMLS-SN Semantic Relations
Semantic Relation: result_of
 TUI: T157
 Definition: The condition, product, or
 state occurring as a consequence, effect,
 or conclusion of an activity or process.
 This includes product of, effect of, sequel
 of, outcome of, culmination of, and
 completion of.
 Inverse: has_result
 UMLS-SN Semantic Relations
Semantic Relation: surrounds
 TUI: T176
 Definition: Establishes the boundaries for,
 or defines the limits of another physical
 structure. This includes limits, bounds,
 confines, encloses, and circumscribes.
 Inverse: surrounded_by
 UMLS-SN Semantic Relations
Semantic Relation: traverses
 TUI: T177
 Definition: Crosses or extends across
 another physical structure or area. This
 includes crosses over and crosses
 through.
 Inverse: traversed_by
 UMLS-SN Semantic Relations
Semantic Relation: performs
 TUI: T188
 Definition: Executes, accomplishes, or
 achieves an activity.
 Inverse: performed_by
 UMLS-SN Semantic Relations
Semantic Relation: physically_related_to
 TUI: T132
 Definition: Related by virtue of some
 physical attribute or characteristic.
 Inverse: physically_related_to
  UMLS-SN Semantic Relations
Semantic Relation: conceptually_related_to
 Definition: Related by some abstract
 concept, thought, or idea.
 Inverse: conceptually_related_to
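The relation and inverse pairs listed on these slides can be collected in a simple lookup table. The Python sketch below copies the pairs exactly as given above; the function name invert and the example triple are hypothetical. It shows only the bookkeeping of inverses: nothing in such a table fixes what the relations mean at the level of instances.

```python
# Relation -> inverse, as listed on the UMLS-SN slides above.
INVERSE = {
    "affects": "affected_by",
    "carries_out": "carried_out_by",
    "causes": "caused_by",
    "consists_of": "constitutes",
    "contains": "contained_in",
    "derivative_of": "has_derivative",
    "developmental_form_of": "has_developmental_form",
    "evaluation_of": "has_evaluation",
    "exhibits": "exhibited_by",
    "functionally_related_to": "functionally_related_to",
    "indicates": "indicated_by",
    "ingredient_of": "has_ingredient",
    "issue_in": "has_issue",
    "manifestation_of": "has_manifestation",
    "property_of": "has_property",
    "result_of": "has_result",
    "surrounds": "surrounded_by",
    "traverses": "traversed_by",
    "performs": "performed_by",
    "physically_related_to": "physically_related_to",
    "conceptually_related_to": "conceptually_related_to",
}


def invert(triple):
    """Turn (subject, relation, object) into the equivalent inverse triple."""
    subject, relation, obj = triple
    if relation in INVERSE:
        return (obj, INVERSE[relation], subject)
    for forward, backward in INVERSE.items():
        if backward == relation:
            return (obj, forward, subject)
    raise KeyError(relation)


# Relations that are their own inverse, i.e. symmetric relations.
SELF_INVERSE = sorted(r for r, inv in INVERSE.items() if r == inv)

# Hypothetical example triple, for illustration only.
print(invert(("aspirin", "affects", "pain")))  # ('pain', 'affected_by', 'aspirin')
print(SELF_INVERSE)
```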
            Prototypicality
Biological classes are marked always by an
opposition between standard or prototypical
instances and a surrounding penumbra of non-
standard instances
How to solve this problem: restrict the range of
instance variables x, y, ... to standard instances?
Recognize degrees of instancehood? (Impose
topology/theory of vagueness on classes?)

				