        Principles of
(Biomedical) Ontology Design

                Barry Smith
Department of Philosophy, University at Buffalo
   National Center for Biomedical Ontology
                (http://ncbo.us)




                                                  1
A methodology for building and
    evaluating ontologies
applied thus far in the biomedical domain to:
  – FMA
  – GO + other OBO Ontologies
  – NCI Thesaurus
  – UMLS Semantic Network
  – FuGO
  – SNOMED
  – ICF (International Classification of Functioning,
    Disability and Health)
  – BirnLex, RadioLex, Neuronames
  – ISO Terminology Standards
  – HL7-RIM
                                                        2
Some Examples




                3
  Foundational Model of Anatomy
              (FMA)
Pro
  Clear statement of scope: structural human
  anatomy, at all levels of granularity, from the whole
  organism to the biological macromolecule.
  Powerful treatment of definitions
  Single inheritance is_a hierarchy
Con
  Some unfortunate artifacts in the ontology deriving
  from its specific computer representation (Protégé)
                                                     4
[Figure: fragment of the FMA hierarchy. Node labels, from top to bottom:
  Anatomical Space, Anatomical Structure;
  Organ Cavity Subdivision, Organ Cavity, Organ, Organ Part;
  Serous Sac Cavity Subdivision, Serous Sac Cavity, Serous Sac,
  Organ Component, Organ Subdivision, Tissue;
  Pleural Cavity, Pleural Sac, Pleura (Wall of Sac);
  Interlobar recess, Parietal Pleura, Visceral Pleura,
  Mediastinal Pleura, Mesothelium of Pleura.]

                                                                                               5
    FMA follows formal rules for
       Aristotelian definitions

When A is_a B, the definition of ‘A’ takes the
 form:

          an A =Def. a B which C s...

   a human being =Def. an animal which is
                  rational
                                                  6
             Examples
Cell =Def. an anatomical structure which
 consists of cytoplasm surrounded by a
 plasma membrane




                                       7
   The FMA regimentation
brings the advantage that circular definitions
  are avoided
each definition reflects the position in the
  hierarchy to which a defined term belongs
the position of a term within the hierarchy
  enriches its own definition by incorporating
  automatically the definitions of all the terms
  above it.
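
The way a term's position in the hierarchy feeds its definition can be sketched in a few lines of Python. This is a minimal illustration, not the FMA's own machinery: the dictionaries and function names are assumptions, the "human being" entry comes from the slide on Aristotelian definitions, and the "philosopher" entry is a hypothetical extra level added only to show inheritance across two steps.

```python
# Each term has exactly one is_a parent (single inheritance).
PARENT = {
    "animal": None,                  # treated as the root of this toy hierarchy
    "human being": "animal",         # from the slides
    "philosopher": "human being",    # hypothetical, for illustration only
}

# The differentia C in "an A =Def. a B which Cs".
DIFFERENTIA = {
    "human being": "is rational",            # from the slides
    "philosopher": "practises philosophy",   # hypothetical
}

def aristotelian_definition(term):
    """Render 'A =Def. B which Cs' for a term that has a parent."""
    parent = PARENT[term]
    if parent is None:
        return None  # the root is left undefined in this sketch
    return f"{term} =Def. {parent} which {DIFFERENTIA[term]}"

def inherited_differentiae(term):
    """Collect the differentiae of the term and of every term above it.

    With a single parent per term the upward chain is unique, so the
    walk ends at the root and cannot branch.
    """
    chain = []
    while PARENT[term] is not None:
        chain.append(DIFFERENTIA[term])
        term = PARENT[term]
    return chain

print(aristotelian_definition("human being"))
print(inherited_differentiae("philosopher"))
```

Because each definition mentions only the immediate parent, a definition can never refer back to the term being defined, which is the sense in which circularity is avoided.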


                                                   8
    The FMA regimentation
The entire information content of the FMA’s
 term hierarchy can be translated very
 cleanly into a computer representation
But the definitions encapsulate this
 information in a modular form which is of
 maximal advantage to human beings



                                              9
The FMA regimentation ensures
  intelligibility of definitions
The terms used in a definition should be
simpler (more intelligible) than the term to
be defined; otherwise the definition provides
no assistance
 – to human understanding
 – to machine processing



                                           10
                  FMA
  organized in a graph-theoretical structure
  involving two sorts of links or edges:
is-a (= is a subtype of )
  (pleural sac is-a serous sac)
part-of
 (cervical vertebra part-of vertebral column)
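
A graph with two sorts of edges can be held as a list of typed triples. The sketch below is only one possible encoding; the names `EDGES` and `targets` are assumptions, and the two triples are the examples given on the slide.

```python
# Edges are (source, relation, target) triples.
EDGES = [
    ("pleural sac", "is_a", "serous sac"),
    ("cervical vertebra", "part_of", "vertebral column"),
]

def targets(node, relation, edges=EDGES):
    """Return the nodes reached from `node` along edges of one relation type."""
    return [t for (s, r, t) in edges if s == node and r == relation]

print(targets("pleural sac", "is_a"))
print(targets("cervical vertebra", "part_of"))
```

Keeping the relation as an explicit label on every edge means is-a and part-of links are never conflated when the graph is traversed.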



                                               11
[Figure repeated from slide 5: fragment of the FMA hierarchy, from
 Anatomical Space and Anatomical Structure down to Pleural Cavity,
 Parietal Pleura, Visceral Pleura and Mesothelium of Pleura.]

                                                                                               12
at every level of granularity




                            13
     The FMA is a Structural
           Anatomy

Plasma membrane =Def. a cell part that
  surrounds the cytoplasm




                                     14
       The Gene Ontology
Pro
Open Source
Cross-Species
Impressive annotation resource
Impressive policies for maintenance
Has recognized the need for reform



                                      15
         The Gene Ontology

Con
Poor formal architecture (Mk I.)
Poor support for automatic reasoning and
  error-checking
No cross-ontology relations
Not (yet) transgranular


                                           16
   GO:0019836 hemolysis of red
           blood cells

=Def. The processes by which an organism
 effects hemolysis ...

              X =Def. the Y of X

 This sort of definition is worse than circular

                                              17
       Gene Ontology now adopting
     structured definitions built out of
           genus and differentiae




Species =Def Genus + Differentiae

neuron cell differentiation =Def
differentiation by which a cell acquires features of a neuron




                                                           18
      National Cancer Institute
         Thesaurus (NCIT)
Pro
NCIT is open source
NCIT has broad coverage
NCIT has some formal structure (OWL-DL)
NCIT has realized the errors of its ways

Con
Full of errors (many inherited from UMLS)
Bad realization of formal structure
                                            19
          Goals of NCIT

to make use of current terminology best
practices to relate relevant concepts to
one another in a formal structure, e.g. to
support automatic reasoning;




                                             20
      Formal Definitions
of 37,261 nodes, 33,720 remain formally
undefined
Thus only a small portion of the NCIT
ontology can be used for purposes of
automatic classification and error-checking
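
The size of that "small portion" follows directly from the two figures on the slide. A short worked calculation (variable names are illustrative):

```python
total_nodes = 37_261       # figure from the slide
undefined_nodes = 33_720   # figure from the slide

defined_nodes = total_nodes - undefined_nodes
defined_share = defined_nodes / total_nodes

print(defined_nodes, f"{defined_share:.1%}")
```

That is 3,541 formally defined nodes, or roughly 9.5% of the total.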




                                          21
       Verbal Definitions
About half the NCIT terms are assigned
 verbal definitions for human use
Unfortunately some are assigned more than
 one




                                            22
      Disease Progression
Definition1
 Cancer that continues to grow or spread.
Definition2
 Increase in the size of a tumor or spread
 of cancer in the body.
Definition3
 The worsening of a disease over time.

                                             23
              Cancer
 a process (of getting better or worse)
an object (which can grow and spread)

       occurrent vs. continuant




                                          24
               Disease
Definition1
 A disease is any abnormal condition of the
 body or mind that causes discomfort,
 dysfunction, or distress to the person
 affected or those in contact with the
 person. ...
Definition2
 A definite pathologic process with a
 characteristic set of signs and symptoms.
 ...
                                             25
    Confuses definitions with
          descriptions
Tuberculosis =Def.
 A chronic, recurrent infection caused by the bacterium
 Mycobacterium tuberculosis. Tuberculosis (TB) may affect almost
 any tissue or organ of the body with the lungs being the most
 common site of infection. The clinical stages of TB are primary or
 initial infection, latent or dormant infection, and recrudescent or
 adult-type TB. Ninety to 95% of primary TB infections may go
 unrecognized. Histopathologically, tissue lesions consist of
 granulomas which usually undergo central caseation necrosis. Local
 symptoms of TB vary according to the part affected; acute
 symptoms include hectic fever, sweats, and emaciation; serious
 complications include granulomatous erosion of pulmonary bronchi
 associated with hemoptysis. If untreated, progressive TB may be
 associated with a high degree of mortality. This infection is
 frequently observed in immunocompromised individuals with AIDS
 or a history of illicit IV drug use.


                                                                   26
        A better definition
Tuberculosis
Definition:
A chronic, recurrent infection caused by the
  bacterium Mycobacterium tuberculosis.




                                               28
  Duratec, Lactobutyrin, Stilbene
            Aldehyde

are classified by the NCIT as Unclassified
Drugs and Chemicals




                                             29
 NCIT recognizes three
disjoint classes of plants
 Vascular Plant
 Non-vascular Plant
 Other Plant




                             30
     and three kinds of cells

Abnormal Cell is a top-level class (thus not
 subsumed by Cell )
Normal Cell is a subclass of Microanatomy.
Cell is a subclass of Other Anatomic Concept
 (so that cells themselves are concepts)



                                               31
 NCIT as now constituted will
  block automatic reasoning

Neither Normal Cells nor Abnormal Cells are
 Cells within the context of the NCIT
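
Why a reasoner cannot conclude that a normal or abnormal cell is a cell can be seen by walking the is_a links described on the previous slide. This is a minimal sketch: the parents of Microanatomy and Other Anatomic Concept are not given in the slides and are treated here as roots purely for illustration.

```python
# Child -> is_a parent, as described on the "three kinds of cells" slide.
IS_A = {
    "Abnormal Cell": None,                # top-level class
    "Normal Cell": "Microanatomy",
    "Cell": "Other Anatomic Concept",
    "Microanatomy": None,                 # assumption: parent not stated
    "Other Anatomic Concept": None,       # assumption: parent not stated
}

def ancestors(cls):
    """Follow is_a links upward and return every class passed on the way."""
    found = []
    parent = IS_A.get(cls)
    while parent is not None:
        found.append(parent)
        parent = IS_A.get(parent)
    return found

def subsumed_by(child, parent):
    """True if `parent` is reachable from `child` via is_a links."""
    return parent in ancestors(child)

print(subsumed_by("Normal Cell", "Cell"))
print(subsumed_by("Abnormal Cell", "Cell"))
```

Both checks come out false, so nothing asserted about Cell is inherited by either class.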




                                          32
UMLS Semantic Network

Alexa McCray, “An upper level ontology
for the biomedical domain”. Comp
Functional Genomics 2003; 4: 80-84.




                                    33
   UMLS Semantic Network

Pros
Broad coverage; no multiple inheritance
Cons
Incoherent use of ‘conceptual entities’
(e.g. the digestive system as a conceptual
  part of the organism)


                                             34
 UMLS Semantic Network

Edges in the graph represent merely
“possible significant relations” :
– Bacterium causes Experimental Model of
  Disease
– Experimental Model of Disease affects
  Fungus
– Experimental model of disease is_a
  Pathologic Function


                                           35
a hodgepodge of ‘concepts’
                                           36
             location_of

Tissue location_of Mental or Behavioral
  Dysfunction

Fungus location_of Vitamin




                                          37
  Fungus location_of Vitamin
Every instance of fungus is located in some
  vitamin?
Every instance of fungus is located in every
  vitamin?
Some instances of fungus are located in
  some vitamins?
Some instances of vitamin have instances of
  fungi located in them?
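
The candidate readings differ only in their quantifiers, and they can give different answers on the same data. The sketch below evaluates three of them over invented instance data (the individuals `f1`, `f2`, `v1` and the function name are hypothetical). At the level of instances the fourth question on the slide is the converse phrasing of the third, so it is not computed separately.

```python
from itertools import product

def readings(fungi, vitamins, located_in):
    """Evaluate quantifier readings of 'Fungus location_of Vitamin'.

    `located_in` is a set of (fungus, vitamin) instance pairs.
    """
    def loc(f, v):
        return (f, v) in located_in

    return {
        "every fungus in some vitamin":
            all(any(loc(f, v) for v in vitamins) for f in fungi),
        "every fungus in every vitamin":
            all(loc(f, v) for f, v in product(fungi, vitamins)),
        "some fungus in some vitamin":
            any(loc(f, v) for f, v in product(fungi, vitamins)),
    }

# Hypothetical data: two fungus instances, one vitamin instance.
result = readings({"f1", "f2"}, {"v1"}, {("f1", "v1")})
for reading, holds in result.items():
    print(reading, holds)
```

With this data only the some-some reading holds, which shows that an edge in the graph says nothing definite until its quantifier structure is fixed.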
                                           38
what are the nodes in this graph?
                                           39
  UMLS Semantic Network

A is_a B =Def.
A is narrower in meaning than B


A disrupts B
A contained_in B
                                  40
 UMLS Semantic Network

Drug Delivery Device contains Clinical
Drug

Drug Delivery Device
narrower_in_meaning_than Manufactured
Object


                                         41
General Ontological Overview




                          42
   Good ontologies require:

Consistent use of terms, supported by
 logically coherent (non-circular)
 definitions, in equivalent human-
 readable and computable formats
Coherent shared treatment of relations
 to allow cascading inference both
 within and between ontologies

                                   43
 Three fundamental dichotomies

 continuants vs. occurrents
 dependent vs. independent
 types vs. instances

     ONTOLOGIES ARE
REPRESENTATIONS OF TYPES
                                 44
    ONTOLOGIES ARE
REPRESENTATIONS OF TYPES

    aka kinds, universals,
categories, species, genera, ...


                                   45
Molecules, cell components , organisms
 are independent continuants which have
 functions

Functions are dependent continuants which
 become realized through special sorts of
 processes we call functionings

Processes (occurrents) include: functionings,
 side-effects, stochastic processes


                                          46
Continuants (aka endurants)
  – have continuous existence in time
  – preserve their identity through change
  – exist in toto whenever they exist at all

Occurrents (aka processes)
  – have temporal parts
  – unfold themselves in successive phases
  – exist only in their phases
                                               47
You are a continuant
Your life is an occurrent

 You are 3-dimensional
Your life is 4-dimensional

                             48
       Dependent entities


require independent continuants as their
  bearers

There is no grin without a cat




                                           49
  Dependent vs. independent
        continuants
Independent continuants (organisms, cells,
   molecules, environments)

Dependent continuants (qualities, shapes,
   roles, propensities, functions)




                                             50
 All occurrents are dependent
            entities

They are dependent on those independent
 continuants which are their participants
 (agents, patients, media ...)




                                            51
         Top-Level Ontology

Continuant
  Independent Continuant
  Dependent Continuant
Occurrent (always dependent on one or more independent continuants)

                                                52
 = A representation of top-level types

Continuant
  Independent Continuant
    cell component
  Dependent Continuant
    molecular function
Occurrent
  biological process

                                                            53
         Top-Level Ontology

Continuant
  Independent Continuant
  Dependent Continuant
    Function
Occurrent
  Functioning
  Side-Effect, Stochastic Process, ...

                                                   54
         Top-Level Ontology

Continuant
  Independent Continuant
  Dependent Continuant
    Quality
    Function
  Spatial Region
Occurrent
  Functioning
  Side-Effect, Stochastic Process, ...

     instances (in space and time)                    56
Smith B, Ceusters W, Kumar A, Rosse C. On Carcinomas and
Other Pathological Entities, Comp Functional Genomics, Apr.
2006                                                    57
everything here is an
independent continuant




                         58
        Functions, etc.

Some dependent continuants are
            realizable
  expression of a gene
  application of a therapy
  course of a disease
  execution of an algorithm
  realization of a protocol

                                 59
Functions vs Functionings

the function of your heart = to pump
blood in your body
this function is realized in processes of
pumping blood
not all functions are realized (consider the
function of this sperm ...)


                                            60
The OBO Foundry




                  61
High quality shared ontologies
      build communities

General trend on the part of NIH, FDA and
other bodies to consolidate ontology-
based standards for the communication
and processing of biomedical data.

caBIG / NECTAR / BIRN / BRIDG / OBO ...

                                          62
   Responses to this trend

Old style: UMLS (Unified Medical Language
  System) – rooted in faithfulness to the
  ways language is used by different
  medical communities
New style: OBO Foundry – pre-emptive
  regimentation of language, structure and
  format


                                             63
    Two strategies for creating
terminologies and database schemas

 Ad hoc creation by each clinical or
 research community
 vs.
 Pre-established reference ontologies
 upon which specific local
 applications can draw

                                       64
   We know that high-quality
      ontologies can help
in creating better mappings between human
  and model organism phenotypes

 S Zhang, O Bodenreider, “Alignment of Multiple
 Ontologies of Anatomy: Deriving Indirect
 Mappings from Direct Mappings to a Reference
 Ontology”, AMIA 2005



                                                  65
      The solution
   The OBO Foundry




http://ontology.buffalo.edu/obofoundry
                                         66
     The OBO Foundry


Goals:
to create the conditions for a step-by-step
evolution towards robust gold standard
reference ontologies in the biomedical
domain
to introduce some of the features of
scientific peer review into biomedical
ontology development
                                          67
      The OBO Foundry

Goal:
to create controlled vocabularies for use
by clinical trial banks, clinical guidelines
bodies, scientific journals, ...




                                               68
            The OBO Foundry
              OBO Foundry

A subset of OBO ontologies whose developers agree
in advance to accept a common set of principles
designed to assure
–   intelligibility to biologist curators, annotators, users
–   formal robustness
–   stability
–   compatibility
–   interoperability
–   support for logic-based reasoning

                                                               69
    The OBO Foundry
      OBO Foundry

– OBO-UBO / Ontology of Biomedical Reality
– OBO Relation Ontology
– Gene Ontology
– Sequence Ontology
– RNA Ontology
– PATO Phenotype Ontology
– FuGO Functional Genomics Investigation
  Ontology
– Mk. II NCI Thesaurus
– FMA (?)
                                             70
       A reference ontology


is analogous to a scientific theory; it seeks
to optimize representational adequacy to
its subject matter to the maximal degree
that is compatible with the constraints of
computational usefulness.



                                            71
   An application ontology
 is comparable to an engineering artifact
 such as a software tool. It is constructed
 for a specific practical purpose.

Examples:
  NCIT
  FuGO Functional Genomics Investigation
  Ontology
                                              72
Reference Ontology vs. Application
            Ontology
Currently application ontologies are often built
afresh for each new task; commonly introducing
not only idiosyncrasies of format or logic, but also
simplifications or distortions of their subject-
matters.
To solve this problem, application ontology
development should always take place against the
background of a formally robust reference
ontology framework

                                                  73
    The OBO Foundry

           CRITERIA
http://ontology.buffalo.edu/obofoundry




                                         74
The ontology is open and available to be used by all.

The developers of the ontology agree in advance to
   collaborate with developers of other OBO Foundry
   ontologies where domains overlap.

The ontology is in, or can be instantiated in, a
   common formal language.

The ontology possesses a unique identifier space
   within OBO.

The ontology provider has procedures for identifying
   distinct successive versions.

                                                   75
The ontology has a clearly specified and clearly
   delineated content.

The ontology includes textual definitions for all
   terms.

The ontology is well-documented.

The ontology has a plurality of independent users.

The ontology uses relations which are
   unambiguously defined following the pattern of
   definitions laid down in the OBO Relation
   Ontology.
                                                    76
      The OBO Foundry

             CRITERIA
Further criteria will be added over time in
order to bring about a gradual
improvement in the quality of the
ontologies in the Foundry



                                              77
Advantages of the methodology
 of shared coherently defined
          definitions
 promotes quality assurance (better coding)
 guarantees automatic reasoning across
  ontologies and across data at different
  granularities
 yields direct connection to temporally
  indexed instance data


                                          78
Rules for Good Ontologies




                            79
  A basic distinction
        type vs. instance

science text vs. clinical document

       ‘man’ vs. ‘Michael’




                                     80
    Instances are not
represented in an ontology

For ontology, it is the scientific
generalizations that are important

(but instances must still be taken into
account)


                                          81
A   515287   DC3300 Dust Collector Fan
B   521683   Gilmer Belt
C   521682   Motor Drive Belt
                                         82
[Figure: Ontology, Types, Instances]
                         83
       Ontology =
A Representation of Types
                                         84
Each node of an ontology consists of:
  • preferred term (aka term)
  • term identifier (TUI, aka CUI)
  • synonyms
  • definition, glosses, comments
                                         85
Nodes in an ontology are connected by relations:
  primarily: is_a (= is subtype of) and part_of
  designed to support search, reasoning and annotation
                                         86
[Figure: a hierarchy of types (substance, organism, animal, mammal,
 cat, siamese, with frog on a separate branch) drawn above the level
 of instances.]
                                                 87
    Motivation: To capture
            reality
Inferences and decisions we make are
  based upon what we know of reality.
An ontology is a computable representation
  of biological reality, which is designed to
  enable a computer to reason over the data
  we collect about this reality in (some of)
  the ways that we do.


                                            88
              Concepts

Biomedical ontology integration will never
  be achieved through integration of
  meanings or concepts
The problem is precisely that different user
  communities use different concepts
Concepts are in your head and will change
  as your understanding changes

                                               89
              Concepts

Ontologies represent types: not concepts,
  meanings, ideas ...
Types exist, with their instances, in
  objective reality
– including types of image, of imaging
  process, of brain region, of clinical
  procedure, etc.

                                            90
          Rules on types
Don’t confuse types with words
Don’t confuse types with concepts
Don’t confuse types with ways of getting to
 know types
Don’t confuse types with ways of talking
 about types
Don’t confuse types with data about types


                                              91
Some other simple rules for
  high quality ontologies




                              92
               Univocity

Terms should have the same meanings on
 every occasion of use.
They should refer to the same kinds of
 entities in reality
Basic ontological relations such as is_a and
 part_of should be used in the same way
 by all ontologies


                                               93
               Positivity

Complements of types are not themselves
 types.
Hence terms such as
 non-mammal
 non-membrane
 other metalworker in New Zealand
do not designate types in reality


                                          94
Ontology of types ≠ logic of terms

There are no conjunctive and disjunctive
 types:

 anatomic structure, system, or substance
 musculoskeletal and connective tissue
 disorder
 rheumatism, excluding the back
                                            95
              Objectivity

Which types exist in reality is not a function
 of our knowledge.
Terms such as
  unknown
  unclassified
  unlocalized
  arthropathies not otherwise specified
do not designate types in reality.
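
The Positivity and Objectivity rules lend themselves to a simple lexical check on term labels. The sketch below is a crude heuristic under stated assumptions, not an established tool: the rule names, patterns and function name are invented for illustration, and the patterns cover only the kinds of labels quoted on these slides.

```python
import re

# Hypothetical patterns derived from the example terms on the slides.
RULES = {
    # Complements of types: "non-mammal", "other metalworker in New Zealand"
    "positivity": re.compile(r"^(non-|other\b)", re.IGNORECASE),
    # States of our knowledge: "unknown", "unclassified", "unlocalized", "... not otherwise specified"
    "objectivity": re.compile(
        r"\b(unknown|unclassified|unlocalized|not otherwise specified)\b",
        re.IGNORECASE,
    ),
}

def violations(label):
    """Return the names of the rules that a term label appears to break."""
    return [name for name, pattern in RULES.items() if pattern.search(label)]

for label in ["non-mammal", "Other Plant",
              "arthropathies not otherwise specified", "cell"]:
    print(label, violations(label))
```

A check like this can only flag candidates for a curator to review; whether a label really fails to designate a type is a matter of content, not spelling.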
                                                 96
  Keep Epistemology Separate from
             Ontology
If you want to say that
   We do not know where A’s are located
do not invent a new class of
 A’s with unknown locations
 (A well-constructed ontology should grow
 linearly; it should not need to delete classes
 or relations because of increases in
 knowledge)
                                              97
    Syntactic Separateness
   Do not confuse sentences with terms

If you want to say

 I surmise that this is a case of pneumonia

do not invent a new class of surmised
 pneumonias

                                              98
    Single Inheritance

No kind in a classificatory hierarchy
should have more than one is_a
parent on the immediate higher
level
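
Whether a hierarchy obeys this rule can be checked mechanically. A minimal sketch follows, using the "blue car" example from the multiple-inheritance slides; the function name and the edge-list encoding are assumptions.

```python
from collections import defaultdict

def multiple_parents(is_a_edges):
    """Return every term that has more than one direct is_a parent.

    `is_a_edges` is an iterable of (child, parent) pairs.
    """
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}

edges = [
    ("blue thing", "thing"),
    ("car", "thing"),
    ("blue car", "blue thing"),
    ("blue car", "car"),
]
print(multiple_parents(edges))
```

An empty result means the hierarchy is a tree (or forest) under is_a, which is what the Aristotelian pattern of definition presupposes.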




                                        99
      Multiple Inheritance

                     thing



blue thing                        car

             is_a              is_a

                    blue car
                                        100
       Multiple Inheritance


is a source of errors
encourages laziness
serves as obstacle to integration with
   neighboring ontologies
hampers use of Aristotelian methodology
   for defining terms


                                          101
      Multiple Inheritance

                     thing



blue thing                       car

             is_a1           is_a2

                     blue car
                                       102
      is_a Overloading

The success of ontology alignment
demands that ontological relations (is_a,
part_of, ...) have the same meanings in
the different ontologies to be aligned.




                                            103
Example: is_a is pressed into service
   by the GO to express location
 is-located-at and similar relations are
 expressed by creating special compound
 terms using:
    site of …
    … within …
    … in …
    extrinsic to …
 yielding associated errors
                                           104
       e.g. errors with ‘within’

lytic vacuole within a protein storage vacuole

lytic vacuole within a protein storage vacuole
  is-a protein storage vacuole

Compare:
embryo within a uterus is-a uterus

                                            105
 similar problems with part_of

extrinsic to membrane part_of membrane




                                         106
        Compositionality

The meanings of compound terms should be
 determined
 1. by the meanings of component terms
together with
 2. the rules governing syntax




                                      107
Why do we need rules/standards
      for good ontology?
Ontologies must be intelligible both to humans (for
  annotation and curation) and to machines (for
  reasoning and error-checking): the lack of rules
  for classification leads to human error and
  blocks automatic reasoning and error-checking
Intuitive rules facilitate training of curators and
  annotators
Common rules allow alignment with other
  ontologies

                                                 108
OBO Relation Ontology




                        109
               First step

Alignment of OBO Foundry ontologies
  through a common system of formally
  defined relations in the OBO Relation
  Ontology
See “Relations in Biomedical Ontologies”,
  Genome Biology Apr. 2005



                                            110
Judith Blake:

 “The use of bio-ontologies … ensures
 consistency of data curation, supports
 extensive data integration, and enables
 robust exchange of information between
 heterogeneous informatics systems. ..
 ontologies … formally define relationships
 between the concepts.”


                                          111
"Gene Ontology: Tool for the
   Unification of Biology"

an ontology "comprises a set of well-
defined terms with well-defined
relationships"
(Ashburner et al., 2000, p. 27)




                                        112
          is_a (sensu UMLS)
A is_a B =def

‘A’ is narrower in meaning than ‘B’

grows out of the heritage of dictionaries
(which ignore the basic distinction between
  types and instances)

                                              113
                   is_a

congenital absent nipple is_a nipple
cancer documentation is_a cancer
disease prevention is_a disease
Nazism is_a social science




                                       114
           is_a (sensu logic)
A is_a B =def

For all x, if x instance_of A then x
  instance_of B

cell division is_a biological process

adult is_a child ???
                                        115
      Two kinds of entities

occurrents (processes, events, happenings)
 cell division, ovulation, death

continuants (objects, qualities, ...)
 cell, ovum, organism, temperature of
 organism, ...


                                        116
        is_a (for occurrents)
A is_a B =def

For all x, if x instance_of A then x
  instance_of B

cell division is_a biological process


                                        117
       is_a (for continuants)
A is_a B =def

For all x, t if x instance_of A at t then x
  instance_of B at t

 abnormal cell is_a cell
 adult human is_a human
 but not: adult is_a child
                                              118
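The difference between the untimed and the time-indexed definitions can be sketched in a few lines of Python. This is only an illustration: the individuals, times and the `FACTS` table are invented, and the two function names are hypothetical, not part of the OBO Relation Ontology.

```python
from collections import defaultdict

# Hypothetical instance data: (individual, time) -> types instantiated at that time.
FACTS = {
    ("mary", 1): {"child", "human"},
    ("mary", 2): {"adult", "human"},
    ("john", 1): {"child", "human"},
    ("john", 2): {"adult", "human"},
}


def is_a_untimed(a, b, facts):
    """Untimed reading: whatever is ever an A is also, at some time, a B."""
    ever = defaultdict(set)
    for (x, _t), types in facts.items():
        ever[x] |= types  # collapse all times into one set per individual
    return all(b in types for types in ever.values() if a in types)


def is_a_timed(a, b, facts):
    """Time-indexed reading: whatever is an A at t is a B at that same t."""
    return all(b in types for types in facts.values() if a in types)


print(is_a_untimed("adult", "child", FACTS))  # True: the unwanted result
print(is_a_timed("adult", "child", FACTS))    # False
print(is_a_timed("adult", "human", FACTS))    # True
```

On this toy data the untimed reading licenses "adult is_a child", because every adult was once a child; indexing both instantiations to the same time t blocks that inference.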
  Part_of as a relation between
 types is more problematic than is
        standardly supposed


heart part_of human being ?
human heart part_of human being ?
human being has_part human testis ?
human testis part_of human being ?


                                      119
      two kinds of parthood
1.   between instances:
     Mary's heart part_of Mary
     this nucleus part_of this cell

2.   between types
     human heart part_of human
     cell nucleus part_of cell

                                      120
  Definition of part_of as a
   relation between types

A part_of B =Def all instances of A are
instance-level parts of some instance of B


  ALL–SOME STRUCTURE


                                          121
      part_of (for occurrents)
A part_of B =Def

For all x, if x instance_of A then there is
  some y, y instance_of B and x part_of y
  where 'part_of' is the instance-level part
  relation



                                          122
     part_of (for continuants)
A part_of B =def.

For all x, t if x instance_of A at t then
  there is some y, y instance_of B at t and
  x part_of y at t
where 'part_of' is the instance-level part
  relation

ALL-SOME STRUCTURE
                                         123
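A minimal Python sketch of the ALL-SOME reading of type-level part_of, applied to the testis example raised earlier. The data, the two individuals and the function names are assumptions made for illustration; `type_has_part` is a hypothetical inverse check, not a definition taken from the slides.

```python
# Hypothetical instance data at a single time t = 1: (individual, time) -> type.
INSTANCE_OF = {
    ("testis1", 1): "human testis",
    ("john", 1): "human being",
    ("mary", 1): "human being",
}
# Instance-level parthood, stored as (part, whole, time) triples.
PART_OF = {("testis1", "john", 1)}


def type_part_of(a, b, instance_of, part_of):
    """A part_of B: every A at t is part, at t, of some B at t."""
    return all(
        any(p == x and pt == t and instance_of.get((w, t)) == b
            for (p, w, pt) in part_of)
        for (x, t), typ in instance_of.items() if typ == a
    )


def type_has_part(b, a, instance_of, part_of):
    """B has_part A (assumed inverse form): every B at t has some A at t as part."""
    return all(
        any(w == y and pt == t and instance_of.get((p, t)) == a
            for (p, w, pt) in part_of)
        for (y, t), typ in instance_of.items() if typ == b
    )


print(type_part_of("human testis", "human being", INSTANCE_OF, PART_OF))   # True
print(type_has_part("human being", "human testis", INSTANCE_OF, PART_OF))  # False
```

The quantifier order does the work here: "all testes are part of some human being" holds on this data, while "all human beings have some testis as part" fails as soon as one human being lacks one.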
 How to use the OBO Relation
          Ontology
Ontologies are representations of types and of
 the relations between types
The definitions of these relations involve
 reference to times and instances, but these
 references are washed out when we get to
 the assertions (edges) in the ontology
But curators should still be aware of the
 underlying definitions when formulating such
 assertions
                                          124
      part_of (for occurrents)
A part_of B =Def

For all x, if x instance_of A then there is
  some y, y instance_of B and x part_of y
  where 'part_of' is the instance-level part
  relation



                                          125
    A part_of B, B part_of C ...

The all-some structure of such definitions
  allows
cascading of inferences (true path rule)
  (i) within ontologies
  (ii) between ontologies
  (iii) between ontologies and repositories
  of instance-data

                                              126
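Why the all-some structure lets inferences cascade can be spelled out as a short derivation for the occurrent case. It is a sketch that assumes the instance-level part relation (written here as a lower-case `part_of` between individuals) is transitive; the predicate name `inst` is shorthand for instance_of.

```latex
\begin{align*}
\text{(1)}\quad & \forall x\,\bigl(\mathrm{inst}(x,A) \rightarrow
   \exists y\,(\mathrm{inst}(y,B) \wedge x\ \mathrm{part\_of}\ y)\bigr)
   && \text{$A$ part\_of $B$} \\
\text{(2)}\quad & \forall y\,\bigl(\mathrm{inst}(y,B) \rightarrow
   \exists z\,(\mathrm{inst}(z,C) \wedge y\ \mathrm{part\_of}\ z)\bigr)
   && \text{$B$ part\_of $C$} \\
\text{(3)}\quad & \forall x\,y\,z\,\bigl(x\ \mathrm{part\_of}\ y \wedge
   y\ \mathrm{part\_of}\ z \rightarrow x\ \mathrm{part\_of}\ z\bigr)
   && \text{assumed: instance-level transitivity} \\
\text{(4)}\quad & \forall x\,\bigl(\mathrm{inst}(x,A) \rightarrow
   \exists z\,(\mathrm{inst}(z,C) \wedge x\ \mathrm{part\_of}\ z)\bigr)
   && \text{from (1), (2), (3): $A$ part\_of $C$}
\end{align*}
```

Step (4) takes an arbitrary instance x of A, uses (1) to obtain a y, uses (2) on that y to obtain a z, and applies (3) to conclude that x is part of z.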
Strengthened true path rule

Whichever A you choose, the instance of B
 of which it is a part will be included in
 some C, which will include as part also the
 A with which you began
The same principle applies to the other
 relations in the OBO-RO:

 located_in, transformation_of,
 derives_from, adjacent_to, etc.
                                           127
         Kinds of relations

Between types:
  – is_a, part_of, ...
Between an instance and a type
  – this explosion instance_of the type
    explosion
Between instances:
  – Mary's heart part_of Mary


                                          128
        In every ontology
  some terms and some relations are
  primitive = they cannot be defined (on
  pain of infinite regress)
Examples of primitive relations:
   – identity
   – instantiation
   – (instance-level) part_of
   – (instance-level) continuous_with
                                           129
Fiat and
bona fide
boundaries




             130
Continuity
Attachment
Adjacency




             131
everything here is an
independent continuant




                         132
structures vs. formations
= bona fide vs. fiat
boundaries




                            133
       Modes of Connection


The body is a highly connected entity.
Exceptions: cells floating free in blood.




                                            134
      Modes of Connection

Modes of connection:
 attached_to (muscle to bone)
 synapsed_with (nerve to nerve, nerve
   to muscle)
 continuous_with (= share a fiat
   boundary)

                                   135
  articular (glenoid)fossa   articular eminence




                                                  ANTERIOR




Attachment, location,
    containment

                                                             136
Containment involves relation to a
         hole or cavity




  1: cavity
  2: tunnel, conduit (artery)
  3: mouth; a snail’s shell

                                     137
Fiat vs. Bona Fide Boundaries




                                138
Double Hole Structure

              Retainer
              (a boundary of some
              surrounding structure)

              Medium
              (filling the environing hole)


              Tenant
              (occupying the central hole)




                                              139
     fossa

    head of
    condyle

fiat boundary

neck of condyle




      THE TEMPOROMANDIBULAR
               JOINT

                              140
        continuous_with
  (a relation between instances
   which share a fiat boundary)
is always symmetric:

if x continuous_with y , then y
continuous_with x




                                  141
          continuous_with
     (relation between types)

A continuous_with B =Def.

for all x, if x instance_of A then there is
some y such that y instance_of B and x
continuous_with y



                                              142
continuous_with (between types)
     is not always symmetric

Consider lymph node and lymphatic
vessel:

  Each lymph node is continuous with
  some lymphatic vessel, but there are
  lymphatic vessels (e.g. lymphs and
  lymphatic trunks) which are not
  continuous with any lymph nodes
                                         143
            Adjacent_to
   as a relation between types
         is not symmetric
Consider
 seminal vesicle adjacent_to urinary
 bladder
Not: urinary bladder adjacent_to
 seminal vesicle

                                       144
instance level
  this nucleus is adjacent to this cytoplasm
implies:
  this cytoplasm is adjacent to this nucleus

type level
  nucleus adjacent_to cytoplasm
  Not: cytoplasm adjacent_to nucleus


                                               145
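The contrast between the two levels can be reproduced on toy data. The sketch below is illustrative only: the three individuals are invented (`cytoplasm2` stands for the cytoplasm of a cell without a nucleus), and the function names are hypothetical.

```python
# Hypothetical individuals and their types.
TYPE_OF = {
    "nucleus1": "nucleus",
    "cytoplasm1": "cytoplasm",
    "cytoplasm2": "cytoplasm",  # cytoplasm of a cell that has no nucleus
}
# Instance-level adjacency as asserted, in one direction only.
ASSERTED = {("nucleus1", "cytoplasm1")}


def symmetric_closure(pairs):
    """Instance-level adjacency is symmetric, so add every reversed pair."""
    return pairs | {(y, x) for (x, y) in pairs}


ADJACENT = symmetric_closure(ASSERTED)


def type_adjacent_to(a, b, type_of, adjacent):
    """A adjacent_to B: every instance of A is adjacent to some instance of B."""
    return all(
        any((x, y) in adjacent for y, ty in type_of.items() if ty == b)
        for x, tx in type_of.items() if tx == a
    )


print(("cytoplasm1", "nucleus1") in ADJACENT)                      # True
print(type_adjacent_to("nucleus", "cytoplasm", TYPE_OF, ADJACENT))  # True
print(type_adjacent_to("cytoplasm", "nucleus", TYPE_OF, ADJACENT))  # False
```

The instance-level relation is symmetric by construction, yet the type-level assertion holds in one direction only, because one instance of cytoplasm has no nucleus next to it.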
             Applications

Expectations of symmetry e.g. for protein-
   protein interactions may hold only at the
   instance level
if A interacts with B, it does not follow that
   B interacts with A

if A is expressed simultaneously with B, it
   does not follow that B is expressed
   simultaneously with A
                                                 146
    transformation_of
         same instance
C                        C1
c at t                   c at t1
                                   time

    pre-RNA        mature RNA
       child       adult


                                          147
      transformation_of
A transformation_of B =Def.
Every instance of A was at some earlier
time an instance of B

         adult transformation_of child




                                          148
tumor development
C              C1
c at t          c at t1




                          149
             derives_from
    C                        C1
    c at t                   c1 at t1
                                        time
   C'
   c' at t       instances

                    ovum
zygote derives_from
                    sperm
                                               150
two continuants fuse to form a
new continuant

  C                      C1
 c at t                  c1 at t1



 C'
 c' at t   fusion



                                    151
one initial continuant is replaced by two
successor continuants

  C                                C1
  c at t                       c1 at t1

                                   C2
                                c2 at t1

fission


                                            152
one continuant detaches itself from an
initial continuant, which itself continues
to exist
  C
  c at t                        c at t1

                                    C1
                                c1 at t1

budding


                                             153
one continuant absorbs a
second continuant while itself
continuing to exist
  C
 c at t                   c at t1



 C'
 c' at t   capture



                                    154
A suite of defined relations
      between types
Foundational is_a
              part_of
Spatial       located_in
              contained_in
              adjacent_to
Temporal      transformation_of
              derives_from
              preceded_by
Participation has_participant
              has_agent
                                  155
To be added to the Relation Ontology

 lacks (between an instance and a type, e.g.
   this fly lacks wings)
 dependent_on (between a dependent
   entity and its carrier or bearer)
 quality_of (between a dependent and an
   independent continuant)
 functioning_of (between a process and an
   independent continuant)

                                          156
          Low Hanging Fruit
Ontologies should include only those
 relational assertions which hold universally
 (= have the ALL-SOME form)
Often, order will matter here:
We can include
 adult transformation_of child
but not
 child transforms_into adult
                                           157
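Why order matters can be checked mechanically. The following Python sketch uses invented life histories (`tom` is a hypothetical individual who never reaches adulthood); `transforms_into` is implemented under the assumption that it means "every instance of the first type is, at some later time, an instance of the second".

```python
# Hypothetical life histories: individual -> list of (time, type) stages.
HISTORY = {
    "mary": [(1, "child"), (2, "adult")],
    "tom": [(1, "child")],  # never becomes an adult
}


def transformation_of(a, b, history):
    """A transformation_of B: every A was, at some earlier time, a B."""
    return all(
        any(t0 < t and typ0 == b for (t0, typ0) in stages)
        for stages in history.values()
        for (t, typ) in stages if typ == a
    )


def transforms_into(b, a, history):
    """Assumed reading: every B is, at some later time, an A."""
    return all(
        any(t1 > t and typ1 == a for (t1, typ1) in stages)
        for stages in history.values()
        for (t, typ) in stages if typ == b
    )


print(transformation_of("adult", "child", HISTORY))  # True
print(transforms_into("child", "adult", HISTORY))    # False
```

Only the first assertion has the ALL-SOME form on this data: every adult was a child, but not every child goes on to become an adult.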
The Gene Ontology




                    158
     GO’s three ontologies


molecular                biological
functions                processes




              cellular
            components



                                      159
When a gene is identified
three types of questions need to be
  addressed:
1. Where is it located in the cell?
2. What functions does it have on the
  molecular level?
3. To what biological processes do these
  functions contribute?


                                           160
    Three granularities:

Cellular (for components)
Molecular (for functions)
Organ + organism (for processes)




                                   161
            GO has cells
but it does not include terms for molecules
or organisms within any of its three
ontologies
except e.g. GO:0018995 host
=Def. Any organism in which another
organism spends part or all of its life cycle



                                           162
    Are the relations between
 functions and processes a matter
          of granularity?

Molecular activities are the 'building blocks'
 of biological processes?
But they are not allowed to be represented
 in GO as parts of biological processes




                                                 163
     GO’s three ontologies


molecular                biological
functions                processes




              cellular
            components



                                      164
What does “function” mean?

an entity has a biological function if and
only if it is part of an organism and has a
disposition to act reliably in such a way
as to contribute to the organism's
survival

the function is this disposition


                                          165
      Improved version
an entity has a biological function if
and only if it is part of an organism
and has a disposition to act reliably in
such a way as to contribute to the
organism's realization of the canonical
life plan for an organism of that type


                                      166
This canonical life plan might
           include
  canonical embryological development
  canonical growth
  canonical reproduction
  canonical aging
  canonical death



                                        167
The function of the heart is to pump
                  blood
 Not every activity (process) in an
 organism is the exercise of a function –
 there are
  – malfunctionings
  – side-effects (heart beating)
  – accidents (external interference)
  – background stochastic activity

                                            168
Kidney




         169
Nephron




          170
Functional Segments




                      171
Functions




            172
       Functions
This is a screwdriver
This is a good screwdriver
This is a broken screwdriver

This is a heart
This is a healthy heart
This is an unhealthy heart



                               173
    Functions are associated with
certain characteristic process shapes

 Screwdriver: rotates and moves forward,
   simultaneously transferring torque from
   hand and arm to screw
 Heart: performs a contracting movement
   inwards and an expanding movement
   outwards


                                          174
      Not functioning at all
leads to death, modulo
internal factors:
   plasticity
   redundancy (2 kidneys)
   criticality of the system involved
external factors:
   prosthesis (dialysis machines, oxygen tent)
   special environments
   assistance from other organisms
                                                 175
What clinical medicine is for
 to eliminate malfunctioning by fixing
 broken body parts
 (or to prevent the appearance of
 malfunctioning by intervening e.g. at the
 molecular level)




                                             176
Hypothesis: there are no 'bad'
         functions
It is not the function of an oncogene to
cause cancer
Oncogenes were in every case proto-
oncogenes with functions of their own
They become oncogenes because of bad
(non-prototypical) environments



                                           177
     Is there an exception for
       molecular functions?

Does this apply only to functions on
   biological levels of granularity
(= levels of granularity coarser than the
   molecule) ?
If pathology is the deviation from (normal)
   functioning, does it make sense to talk of
   a pathological molecule?

                                                178
   Is there an exception for
     molecular functions?
A molecular function is a propensity of a gene
product instance to perform actions on the
molecular level of granularity.
Hypothesis 1: these actions must be reliably
such as to contribute to biological processes.
Hypothesis 2: these actions must be reliably
such as to contribute to the organism's
realization of the canonical life plan for an
organism of that type.

                                                 179
       The Gene Ontology
is a canonical ontology – it represents only
   what is normal in the realm of molecular
   functioning




                                               180
       The GO is a canonical
          representation

 “The Gene Ontology is a computational
 representation of the ways in which gene
 products normally function in the
 biological realm”

Nucl. Acids Res. 2006: 34.


                                            181
    The FMA is a canonical
        representation
It is a computational representation of
types and relations between types
deduced from the qualitative observations
of the normal human body, which have
been refined and sanctioned by successive
generations of anatomists and presented
in textbooks and atlases of structural
anatomy.

                                       182
  The importance of pathways
     (successive causality)

Each stage in the history of a disease
  presupposes the earlier stages
Therefore need to reason across time,
  tracking the order of events in time, using
  relations such as derives_from,
 transformation_of ...
Need pathway ontologies on every level of
 granularity
                                            183
 The importance of granularity
   (simultaneous causality)
Networks are continuants
At any given time there are networks existing
  in the organism at different levels of
  granularity
Changes in one cause simultaneous changes in
  all the others
(Compare Gay-Lussac's law: at fixed volume, a rise
  in temperature causes a simultaneous increase in pressure)

                                            184
   The Granularity Gulf
most existing data-sources are of
 fixed, single granularity
many (all?) clinical phenomena cross
 granularities
Therefore need to reason across
 time, tracking the order of events
 in time

                                       185
     GO’s three ontologies


molecular                 biological
function    dependent      process




              cellular
            component    independent



                                       186
     GO’s three ontologies

                         organism-
molecular    cellular       level
function     process     biological
                          process




              cellular
            component



                                      187
        Normalization of Granular Levels


                                           organism-
molecular            cellular                 level
function             process               biological
                                            process




molecule              cellular
                    component              organism



                                                        188
molecular     cellular      organism-level
process       process       biological process

molecular     cellular      organism-level
function      function      biological function

molecule      cellular      organism
              component


                                       189
molecular     cellular      organism-level
process       process       biological process

functioning   functioning   functioning

molecular     cellular      organism-level
function      function      biological function

molecule      cellular      organism
              component


                                          190
molecular      cellular       organism-level
process        process        process

functionings   functionings   functionings

molecular      cellular       organism-level
function       function       biological function

molecule       cellular       organism
               component

molecular      cellular       organism-level
location       location       location
                                             191
       The GO is a canonical
          representation

 “The Gene Ontology is a computational
 representation of the ways in which gene
 products normally function in the
 biological realm”

Nucl. Acids Res. 2006: 34.


                                            192
molecular      cellular       organism-level
process        process        process

functionings   functionings   functionings

molecular      cellular       organism-level
function       function       biological function

molecule       cellular       organism
               component


         everything here is typical

                                               193
  The Methodology of Annotations
Scientific curators use experimental observations
  reported in the biomedical literature to link gene
  products with GO terms in annotations.
The gene annotations taken together yield a slowly
  growing computer-interpretable map of
  biological reality.
The process of annotating literature also leads to
  improvements and extensions of the ontology,
  which institutes a virtuous cycle of improvement
  in the quality and reach of both future
  annotations and the ontology itself.
                                                  194
  When we annotate the record
       of an experiment
we use terms representing types to capture what we
learn about:
 – this experiment (instance), performed here and
   now, in this laboratory
 – the instances experimented upon
These instances are typical = they are representatives
of types
 – of experiment (described in FuGO)
 – of gene product molecules, molecular functions,
   cellular components, biological processes (described
   in GO)
                                                  195
   Experimental records
document a variety of instances (particular
real-world examples or cases), ranging
from instances of gene products (including
individual molecules) to instances of
biochemical processes, molecular
functions, and cellular locations




                                         196
      Experimental records
  provide evidence that gene products of given
  types have molecular functions of given types by
  documenting occurrences in the real world that
  involve corresponding instances of functioning.
They document the existence of real-world
  molecules that have the potential to execute
  (carry out, realize, perform) the types of
  molecular functions that are involved in these
  occurrences.


                                                197
                Glossary

Instance: A particular entity in spatio-
  temporal reality.
Type: A general kind instantiated by an
  open-ended totality of instances which
  share certain qualities and propensities in
  common of the sort that can be
  documented in scientific literature


                                                198
              Glossary
Gene product instance: A molecule that
 is generated by the expression of a DNA
 sequence and which plays some
 significant role in the biology of the
 organism.
Gene product type: A type of gene
 product instance.


                                           199
               Glossary
Biological process instance (aka
 “occurrence”): A change or complex of
 changes on the level of granularity of the
 cell or organism, mediated by one or more
 gene products.
Biological process type: A type of
 biological process instance.


                                         200
                  Glossary
Cellular component instance: A part of a cell,
  including cellular structures, macromolecular
  complexes and spatial locations identified in
  relation to the cell
Cellular component type: A type of cellular
  component.




                                                  201
               Glossary
Molecular function instance: The
 propensity of a gene product instance to
 perform actions, such as catalysis or
 binding, on the molecular level of
 granularity.
Molecular function type: A type of
 molecular function instance.


                                            202
                   Glossary
Molecular function execution instance (aka
 “functioning”): A process instance on the
 molecular level of granularity that is the result of
 the action of a gene product instance.
Molecular function execution type: A type of
 molecular function execution instance (aka “a
 type of functioning”)




                                                   203
Should 'activity' be dropped from
   Molecular Function terms?

Pro:
Functions are never activities (they are propensities)
Many functions are never realized
The current remedy is ugly
The current remedy is not universally acceptable
  (structural constituent of bone)
Con:
Much renaming work would be needed to advance
  clarity
                                                   204
Should the Molecular Function
   ontology be renamed?
Pro
Could keep 'activity'
Functionings are observable, functions are
  not
The GO is interested precisely in
  functionings (not in side effects,
  malfunctionings, accidents, stochastic
  processes)
The GO is interested in how functionings
  contribute to biological processes
                                             205
 Should the Molecular Function
    ontology be renamed?
Biological science is marked precisely by the
  dominance of the functional orientation (cf.
  classifications of functions in neuroscience)

Conclusion
Keep 'Molecular Function', drop 'activity'; rename
  terms where necessary; but in such a way as to
  avoid double counting of both molecular
  functions and molecular functionings

                                                  206
What will be the structure of
    the OBO Foundry?




                                207
molecular      cellular       organism-level
process        process        process

functionings   functionings   functionings

molecular      cellular       organism-level
function       function       biological function

molecule       cellular       organism
               component

molecular      cellular       organism-level
location       locations      locations
                                             208
molecular process     cellular physiology   organism-level physiology
molecular function
(GO)

ChEBI, Sequence,      cell (types)          species
RNA ...               cellular anatomy      anatomy (fly, fish, human...)
                                            209
molecular process     cellular physiology   organism-level physiology
molecular function
(GO)

ChEBI, Sequence,      cell (types)          species
RNA ...               cellular anatomy      anatomy (fly, fish, human...)

normal (functionings)


                                                            210
                     pathophysiology
                        (disease)




pathological
(malfunctionings)


                       pathoanatomy
                    (fly, fish, human ...)




                                             211
molecular process     cellular physiology   organism-level physiology    pathophysiology
molecular function                                                       (disease)
(GO)

ChEBI, Sequence,      cell (types)          species                      pathoanatomy
RNA ...               cellular anatomy      anatomy (fly, fish,          (fly, fish, human ...)
                      (GO)                  human...)

                                                                    212
molecular process     cellular physiology   organism-level physiology    pathophysiology
molecular function                                                       (disease)
(GO)

ChEBI, Sequence,      cell (types)          species                      phenotype
RNA ...               cellular anatomy      anatomy (fly, fish,          pathoanatomy
                                            human...)                    (fly, fish, human ...)


                                                                    213
molecular process     cellular physiology   organism-level physiology    pathophysiology
molecular function                                                       (disease)
(GO)

ChEBI, Sequence,      cell (types)          species                      phenotype
RNA ...               cellular anatomy      anatomy (fly, fish,          pathoanatomy
                                            human...)                    (fly, fish, human ...)

                         investigation
                            (FuGO)
                                                                     214
End




      215

								