
Automating Generation of Textual Class Definitions from OWL to English

Robert Stevens, James Malone, Sandra Williams,
Richard Power
    Summary

    •      Motivation
    •      Use Case
    •      Methods and Description Generator
    •      Results
    •      Evaluation
    •      Open Questions (still)




2   01.03.2011     Automating Generation of Textual Class Definitions from
                   OWL to English
    Motivation
    • Textual definitions are
           • cornerstone of good practice in ontology delivery
           • a requirement of the OBO process
           • hard work to produce
    • Logical definitions
           • make meaning explicit to the computer
           • help with maintenance of the ontology's structure, querying, and so on
           • are also hard to produce, and harder to understand
    • The information in one form should reflect the information in the other
    • Need to keep textual and logical definitions synchronised
    • Aim to produce fluent textual definitions from logical
      definitions/description in OWL




    OWL Smackdown: Computer vs Human




        Our Hypotheses

    • Textual definitions are for humans
    • Logical definitions are for computers (and future human-computer hybrids)
    • Textual definitions ≈ logical definitions
    • Textual definitions tend to be more lossy than logical
      (cardinalities are often dropped, specific roles not
      mentioned, etc.)
    • Logical definitions are often more explicit than natural
      language and therefore should contain sufficient content to
      produce a textual definition.




    EFO Use Case
    www.ebi.ac.uk/efo
    • The Experimental Factor Ontology (EFO) is an application
      ontology which consumes domain ontologies to satisfy
      specific application-focused use cases
    • Primarily Gene Expression data from ArrayExpress @ EBI




    EFO @ Gene Expression Atlas
    www.ebi.ac.uk/gxa




    Related Work

    • Generating descriptions from ontologies is often called
      'ontology verbalisation'
    • A number of systems are concerned only with ABox verbalisation
      (Hielkema, 2009; Galanis and Androutsopoulos, 2007)
    • Others produce only separate sentences, one for each
      OWL axiom (Kaljurand, 2007)
    • Our approach has much in common with these but differs in two ways:
           • only a subset of OWL is considered (the simple description logic
             EL++)
           • instead of realising axioms in isolation, we apply some rules for
             organisation and aggregation to give a more natural feel



    Method Overview
    • An OWL ontology is just a “pile of axioms”
    • We can produce individual sentences based on a grammar that
      guides transformation from OWL to English (or other natural
      language)
    • Need to group sentences (group axioms with the same subject
      together)
    • Need to aggregate axioms (collapse axioms with the same
      relationship together)
    • Once grouped and aggregated, a paragraph of text can be produced
      sentence by sentence.


                 hasPart some leg
                 hasPart some body
                 hasPart some head
                   =>
                 Has parts leg, body and head
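
The aggregation step above can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' implementation (which, per the next slide, works on Prolog terms); the function names and the (property, filler) tuple representation are assumptions made for the example.

```python
def aggregate_by_property(axioms):
    """Collect the fillers of axioms that share a property, in first-seen order."""
    grouped = {}
    for prop, filler in axioms:
        grouped.setdefault(prop, []).append(filler)
    return grouped


def join_fillers(items):
    """Render ['a'] as 'a', ['a', 'b'] as 'a and b', ['a', 'b', 'c'] as 'a, b and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


# The three restrictions from the slide, as (property, filler) pairs.
axioms = [("hasPart", "leg"), ("hasPart", "body"), ("hasPart", "head")]
grouped = aggregate_by_property(axioms)

# The wording "Has parts" is hard-coded here to match the slide.
phrase = "Has parts " + join_fillers(grouped["hasPart"])
print(phrase)  # Has parts leg, body and head
```

Keeping first-seen order means the generated phrase lists fillers in the order the axioms were read, which is one simple choice among several.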


     Processing stages
     •    Transcode OWL/XML to Prolog
     •    Construct a lexicon for atomic entities – (next slide)
     •    Group axioms by atomic entity
     •    Aggregate axioms with similar structure
     •    Generate sentences from aggregated axioms.

                  class(animal).
                  subClassOf(class(cat), class(animal)).
                  subClassOf(class(dog), class(animal)).
                    =>
                  class(animal).
                  subClassOf([class(cat), class(dog)], class(animal)).
                   =>
                  ANIMAL.
                  A cat and a dog are both kinds of animals.
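
The grouping, aggregation and sentence-generation stages can be mimicked with a short Python sketch. It is a hedged illustration rather than the system itself: the (subclass, superclass) tuples stand in for the Prolog terms, the `plurals` dictionary is a hypothetical stand-in for the lexicon, and the article choice is a naive first-letter check.

```python
def with_article(noun):
    """Prefix 'a' or 'an' using a naive first-letter check (an assumption)."""
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun


def group_by_superclass(axioms):
    """Group (subclass, superclass) pairs by superclass, in first-seen order."""
    grouped = {}
    for sub, sup in axioms:
        grouped.setdefault(sup, []).append(sub)
    return grouped


def realise_subclasses(sup, subs, plurals):
    """Produce one sentence covering all listed subclasses of `sup`."""
    names = [with_article(s) for s in subs]
    subject = names[0][0].upper() + names[0][1:]
    if len(names) == 1:
        return f"{subject} is {with_article(sup)}."
    if len(names) == 2:
        return f"{subject} and {names[1]} are both kinds of {plurals[sup]}."
    listing = ", ".join(names[:-1]) + ", and " + names[-1]
    return f"The following are kinds of {plurals[sup]}: {listing}."


axioms = [("cat", "animal"), ("dog", "animal")]
plurals = {"animal": "animals"}  # hypothetical lexicon entry

grouped = group_by_superclass(axioms)
for sup, subs in grouped.items():
    print(sup.upper() + ".")
    print(realise_subclasses(sup, subs, plurals))
```

The three sentence templates are taken from wordings that appear in the deck; which template the real system picks for a given number of subclasses is not stated, so the thresholds here are a guess.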




     Description Generator
     •    Input: OWL/XML ontology
     •    Output: Text describing atomic entities
     •    Lexicon generation from label/URL
     •    It is assumed that the syntax of each phrase will be severely
          constrained as follows:
            • individuals are expressed by proper names
            • classes by common nouns (with singular and plural forms)
            • properties by transitive verbs (simple or compound) with slots for
               a subject and an object.

                  ANIMAL.
                  The following are kinds of animals: a cat, a duck, a giraffe, a person, a
                  sheep, and a tiger.
                  An animal eats a thing.
                  If X has as pet Y then necessarily Y is an animal.
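
One way to derive noun entries from labels or URLs is sketched below in Python. The normalisation rules (split camelCase, replace hyphens and underscores, lower-case) and the naive "+s" plural are assumptions chosen because they reproduce forms seen elsewhere in the deck, such as "he la" and "22rv1s"; the function names and the example.org URL are hypothetical.

```python
import re


def normalise_label(label):
    """Lower-case a label, splitting camelCase and replacing hyphens/underscores."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)
    spaced = re.sub(r"[-_]+", " ", spaced)
    return spaced.lower().strip()


def label_from_url(url):
    """Fall back to the last path segment or fragment of a URL."""
    return re.split(r"[/#]", url.rstrip("/#"))[-1]


def noun_entry(entity_id, label=None, url=None):
    """Build (id, 'noun', singular, plural) with a naive '+s' plural."""
    singular = normalise_label(label if label else label_from_url(url))
    return (entity_id, "noun", singular, singular + "s")


print(noun_entry("EFO_0000322", label="cell line"))
print(noun_entry("EFO_0002095", label="22rv1"))
print(normalise_label("HeLa"))
print(normalise_label("Ara-C-resistant murine leukemia"))
```

A rule-based plural like this is cheap but produces odd forms for names that are not ordinary common nouns, which is why a hand-checked lexicon may still be needed.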



     Results
      Class label: 22rv1
      OWL axioms (Manchester syntax):
          bearer_of some 'prostate carcinoma'
          derives_from some 'Homo sapiens'
          derives_from some prostate
      Natural language definition generated:
          A 22rv1 is a cell line. A 22rv1 is all of the following: something
          that is bearer of a prostate carcinoma, something that derives from a
          homo sapiens, and something that derives from a prostate.

      Class label: HeLa
      OWL axioms (Manchester syntax):
          bearer_of some 'cervical carcinoma'
          derives_from some 'Homo sapiens'
          derives_from some cervix
          derives_from some 'epithelial cell'
      Natural language definition generated:
          A he la is a cell line. A he la is all of the following: something
          that is bearer of a cervical carcinoma, something that derives from a
          homo sapiens, something that derives from an epithelial cell, and
          something that derives from a cervix.

      Class label: Ara-C-resistant murine leukemia
      OWL axioms (Manchester syntax):
          has subclass b117h*
          has subclass b140h*
      Natural language definition generated:
          A ara c resistant murine leukemia is a cell line. A b117h, and a
          b140h are kinds of ara c resistant murine leukemias.

      Class label: GM18507
      OWL axioms (Manchester syntax):
          derives_from some 'Homo sapiens'
          derives_from some lymphoblast
          has_quality some male
      Natural language definition generated:
          A gm18507 is all of the following: something that has as quality a
          male, something that derives from a homo sapiens, and something that
          derives from a lymphoblast.

          *axioms placed on subclasses
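
The "all of the following" pattern in these definitions can be approximated with the Python sketch below. It is an illustration under stated assumptions, not the authors' generator: the `PROPERTY_PHRASES` mapping is a hypothetical property lexicon inferred from the generated text above, and the article choice is a naive first-letter check.

```python
def with_article(noun):
    """Prefix 'a' or 'an' using a naive first-letter check (an assumption)."""
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun


# Hypothetical property lexicon, inferred from the generated definitions.
PROPERTY_PHRASES = {
    "bearer_of": "is bearer of",
    "derives_from": "derives from",
    "has_quality": "has as quality",
}


def verbalise_restrictions(class_noun, restrictions):
    """Render 'property some filler' restrictions as one aggregated sentence."""
    if not restrictions:
        raise ValueError("at least one restriction is required")
    parts = [
        f"something that {PROPERTY_PHRASES[prop]} {with_article(filler)}"
        for prop, filler in restrictions
    ]
    subject = with_article(class_noun)
    subject = subject[0].upper() + subject[1:]
    if len(parts) == 1:
        return f"{subject} is {parts[0]}."
    listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"{subject} is all of the following: {listing}."


sentence = verbalise_restrictions(
    "22rv1",
    [
        ("bearer_of", "prostate carcinoma"),
        ("derives_from", "homo sapiens"),
        ("derives_from", "prostate"),
    ],
)
print(sentence)
```

The fillers are passed in already normalised (lower-cased), so the sketch covers only the aggregation and realisation of a single class description.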

     Results
     • Online survey of ontology users at EBI
     • 10 of the 50 verbalisations were selected for evaluation because they
       covered the widest range of axioms


     [Figure: bar chart of total responses (y-axis, 0 to 60) by judgement
     (x-axis, from 1 = meaning is clear to 5 = meaning not clear)]


     Findings
     • Discovery of a suspect class:
            • the definition for Ara-C-resistant murine leukemia stated that
              its subclasses b117h and b140h are kinds of it, implying that
              they are diseases rather than cell lines
     • Desire amongst this user group for simplicity of language
       – avoid ontological formality
            • e.g. bearer of
     • Especially property names for qualities
            • e.g. has as quality male
     • The initial verbalisation, which made the semantics explicit, was not liked
     • Plural forms are occasionally an issue:
                  lex(class(EFO_0000322),noun, 'cell line', 'cell lines').
                  lex(class(EFO_0002095),noun, '22rv1', '22rv1s').
     Conclusion
     • Initial results were largely well received and considered useful in
       most cases
     • Discovery of incorrect class definition demonstrates potential as tool
       for class validation
     • Preference for text definitions was for 'clear and simple' over 'precise
       and complex'
     • Dependent entities could become adjectival forms of the independent
       entities in which they inhere ('cell has quality female' becomes 'female
       cell')
     • Formal relations/class labels reduce understanding and should be
       brought closer to domain language
     • Many ontologies are not amenable to text mining – this is an
       important use case neglected by most
     • Definitions now being imported into EFO
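
The adjectival rewriting suggested above ('cell has quality female' becomes 'female cell') could look roughly like the following Python sketch. This is a speculative illustration of a proposal, not something the deck reports as implemented; the function name, the `has_quality` default and the tuple representation are all assumptions.

```python
def fold_qualities(noun, restrictions, quality_property="has_quality"):
    """Move quality fillers in front of the noun; return (noun_phrase, remaining)."""
    adjectives = [f for p, f in restrictions if p == quality_property]
    remaining = [(p, f) for p, f in restrictions if p != quality_property]
    return " ".join(adjectives + [noun]), remaining


phrase, rest = fold_qualities("cell", [("has_quality", "female")])
print(phrase)  # female cell
```

Treating every quality label as a usable adjective is a simplification; deciding which fillers really are qualities would need information from the ontology itself.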



     Next Steps

     • Systematic study of acceptable wordings
     • Different wording styles for different users
     • Adjectival forms for qualities, etc.; the role of an upper-level
       ontology
     • Moving beyond EL++
     • Parsing for OBO




     Next Steps: Round Tripping




     Open Questions

     • Should textual descriptions ≡ logical descriptions?
     • Are discrepancies acceptable?




     Acknowledgements

     • Sandra Williams, Richard Power and Robert Stevens are
       funded by the SWAT project (EPSRC grants
       EP/G033579/1 and EP/G032459/1).
     • James Malone is funded by EMBL and EMERALD
       (project number LSHG-CT-2006-037686).
     • We would like to thank the members of the EBI's ontology
       interest group, functional genomics group and Dr Helen
       Parkinson for comments and survey participation.



