What is an ontology

      Multilingual Information Exchange

      APAN, Bangkok
      27 January 2005
   Margherita.sini@fao.org
          The general problem
• Searching for multilingual resources is not easy:
   – on the web
   – in metadata catalogues / bibliographic databases
   – in full-text documents
• Results are generally returned only in the language
  used in the search query
=> We need a multilingual approach and
  multilingual tools (thesauri, ontologies, etc.)
   What we can achieve (1): Multilingual concept resolution
• With a multilingual thesaurus or ontology
  we can find resources in any language,
  because we can perform multilingual concept resolution
 What we can achieve (2): Brokering
With a multilingual thesaurus or ontology we can find
resources from several sources even if we do not know the
terminology or the language used in those sources
[Figure: terms in several languages (ships, navio, navire, 船舶; fishing vessel, fishing vessels, vessels, crafts, fishing boat, bateau de pêche) linked across sources. Caption: Results in multiple languages from multiple databases]
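
The brokering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label table, the two source catalogues, their records, and the function name `broker_search` are all hypothetical and do not describe any FAO system.

```python
# Hypothetical concept: every label, in any language, that lexicalizes it.
FISHING_VESSEL_LABELS = {
    "en": ["fishing vessel", "fishing boat", "fishing craft"],
    "fr": ["bateau de pêche", "navire de pêche"],
    "zh": ["捕捞渔船"],
}

# Hypothetical sources, each indexed with its own terminology and language.
SOURCES = {
    "catalogue_a": {"fishing boat": ["Record A1"], "harbour": ["Record A2"]},
    "catalogue_b": {"bateau de pêche": ["Notice B1"]},
}


def broker_search(query, labels_by_lang, sources):
    """Expand the query to all labels of its concept, then search every source."""
    all_labels = {label for labels in labels_by_lang.values() for label in labels}
    if query not in all_labels:
        all_labels = {query}  # unknown term: fall back to a plain search
    results = {}
    for name, index in sources.items():
        hits = [rec for label in sorted(all_labels) for rec in index.get(label, [])]
        if hits:
            results[name] = hits
    return results


# A French query reaches both the English-indexed and the French-indexed source.
results = broker_search("navire de pêche", FISHING_VESSEL_LABELS, SOURCES)
```

The user does not need to know which term or language each source uses; the shared concept carries the query across them.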
    How to build a multilingual Thesaurus / Ontology
• Lexicalizations of concepts in multiple
  languages:
  – {… fishing boat; bateau de pêche; 捕捞渔船 … }
• For every language we can have
  synonyms:
  – { … fishing vessel, fishing boat, fishing craft … }
  – { … bateau de pêche, navire de pêche, … }
  – { … 捕捞渔船, … }
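
One possible in-memory shape for these lexicalizations is a mapping from a concept to its synonyms per language, plus a reverse index from label to concept. The concept identifier `c_fishing_vessel` and the function names below are assumptions made for this sketch.

```python
from collections import defaultdict

# concept id -> language code -> synonyms in that language
CONCEPTS = {
    "c_fishing_vessel": {  # hypothetical identifier
        "en": ["fishing vessel", "fishing boat", "fishing craft"],
        "fr": ["bateau de pêche", "navire de pêche"],
        "zh": ["捕捞渔船"],
    },
}


def build_label_index(concepts):
    """Map every label (case-insensitively) to the concepts it lexicalizes."""
    index = defaultdict(set)
    for concept_id, by_lang in concepts.items():
        for labels in by_lang.values():
            for label in labels:
                index[label.casefold()].add(concept_id)
    return index


def translate(term, target_lang, concepts, index):
    """Return the target-language synonyms of each concept the term names."""
    concept_ids = sorted(index.get(term.casefold(), ()))
    return {cid: concepts[cid].get(target_lang, []) for cid in concept_ids}


index = build_label_index(CONCEPTS)
french = translate("Fishing Boat", "fr", CONCEPTS, index)
```

Resolving the term to a concept first, and only then picking labels in the target language, is what distinguishes this from word-by-word translation.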
      FAO activities (ongoing)
• Food safety ontology (English, Spanish, French)
• Fishery ontology (English, Chinese)
• Food and Nutrition ontology-based portal
  (English, Spanish, French)
• Extensive work with AGROVOC
  –   RDFS / OWL version
  –   Semantic refinements
  –   Expand multilingual coverage
  –   Expand subject coverage
 The multilingual vocabulary...
• Must cover all concepts of interest to the
  users in the various languages,
• ... at a minimum all domain concepts
  lexicalized in any of the participating
  languages
• Must accommodate hierarchical structures
  suggested by different languages
                                   (Dr. Soergel)
               Problems (1)
• Translation of an English thesaurus into
  German does not make a German
  thesaurus
  => whenever possible we need to consider the
   concept as a whole (many languages,
   definitions, “surrounding context”, etc.)
• Equivalence of terms holds only in some
  contexts
• Non-specialized terms are more difficult
  to translate                       (Dr. Soergel)
                   Problems (2)
• Two terms mean almost the same thing but differ slightly
  in meaning or connotation:
   – English: alcoholism
   – French: alcoolisme

   – English: vegetable (includes potatoes)
   – German: Gemüse (does not include potatoes)

• If the difference is big enough, one needs to introduce
  two separate concepts under a broader term; otherwise
  a scope note needs to clearly instruct indexers in all
  languages how the term is to be used so that the
  indexing stays, as far as possible, free from cultural bias
  or reflects multiple biases by assigning several
  descriptors.
                                                   (Dr. Soergel)
Available resources: example
• SuperThes, ...
• SWAD-Europe initiative: thesaurus
  activities
  – RDF encoding of multilingual thesauri
    • Multilingual labelling approach
      (mirroring relations for every language)
    • Interlingual mapping approach
      (different structures to be mapped)
 SWAD-Europe: Inter-Thesaurus Mapping
• SKOS mapping:
  –   Exact
  –   Inexact
  –   Major
  –   Minor
  –   Partial
  –   Broad
  –   Narrow
  –   AND
  –   OR
  –   NOT
       Inter-Thesaurus Mapping: example
<!-- Fragment: namespace declarations (ag:, aat:, map:, rdf:) are omitted -->
<ag:Concept>
  <descriptor xml:lang="fr">Academie</descriptor>
  <map:exactMatch>
    <map:AND>
      <map:memberList rdf:parseType="Collection">
        <aat:Concept>
          <descriptor xml:lang="en">Academy</descriptor>
        </aat:Concept>
        <aat:Concept>
          <descriptor xml:lang="en">Buildings</descriptor>
        </aat:Concept>
      </map:memberList>
    </map:AND>
  </map:exactMatch>
</ag:Concept>
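
To make the AND / OR / NOT combinators concrete, the following Python sketch evaluates a mapping expression against the set of descriptors assigned to a record in the target thesaurus. The class names and the `matches` function are hypothetical, and reading AND as "the record carries both descriptors" is an assumption about how a retrieval system might apply the mapping, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str


@dataclass(frozen=True)
class And:
    members: tuple


@dataclass(frozen=True)
class Or:
    members: tuple


@dataclass(frozen=True)
class Not:
    member: object


def matches(expr, descriptors):
    """Return True if the descriptor set satisfies the mapping expression."""
    if isinstance(expr, Term):
        return expr.label in descriptors
    if isinstance(expr, And):
        return all(matches(m, descriptors) for m in expr.members)
    if isinstance(expr, Or):
        return any(matches(m, descriptors) for m in expr.members)
    if isinstance(expr, Not):
        return not matches(expr.member, descriptors)
    raise TypeError(f"unsupported expression: {expr!r}")


# The French descriptor "Academie" mapped to (Academy AND Buildings).
ACADEMIE_MAPPING = And((Term("Academy"), Term("Buildings")))

both = matches(ACADEMIE_MAPPING, {"Academy", "Buildings"})
only_one = matches(ACADEMIE_MAPPING, {"Academy"})
```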
        Available resources: another possibility
• Use OWL
 – Define concepts
 – Define terms
 – Define strings
 – Define relationships between these 3
   elements:
   • <similarTo>, <equivalentTo>, (+ SKOS suggestions)
   • <hasSynonym>, <hasAntonym>, <hasCognate>
   • <hasSpellingVariant>, <hasTranslation>
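
The three-level model (concept, term, string) can be illustrated with plain Python data classes instead of OWL. Everything here is an assumption made for illustration: the class and field names, the helper functions, and the sample data (harbour / harbor / port); only the relationship names echo the list above.

```python
from dataclasses import dataclass, field


@dataclass
class String:
    text: str


@dataclass
class Term:
    lang: str
    preferred: String
    spelling_variants: list = field(default_factory=list)  # hasSpellingVariant


@dataclass
class Concept:
    ident: str
    terms: list = field(default_factory=list)


def has_translation(concept, term):
    """Terms of the same concept in a different language."""
    return [t for t in concept.terms if t.lang != term.lang]


def has_synonym(concept, term):
    """Other terms of the same concept in the same language."""
    return [t for t in concept.terms if t.lang == term.lang and t is not term]


# Illustrative data only.
harbour = Term("en", String("harbour"), [String("harbor")])
haven = Term("en", String("haven"))
port = Term("fr", String("port"))
concept = Concept("c_harbour", [harbour, haven, port])

translations = [t.preferred.text for t in has_translation(concept, harbour)]
synonyms = [t.preferred.text for t in has_synonym(concept, harbour)]
```

Keeping strings separate from terms lets spelling variants attach to a term without becoming synonyms in their own right.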
        Available resources: other techniques (NLP)
• Knowledge discovery: helps with the
  creation of ontologies in a specific
  language
• Used to build good information systems (IS)
  – Concept extraction
  – Multilingual search engine
• …
              Conclusion
• We need multilingual tools
  – Ontologies are better than traditional thesauri
• The task is not easy
  – Subject experts are essential
  – NLP could help
• We need tools
  – To help experts carry out the mapping
  – To do annotations
  –…
     Live demo


http://www.fao.org
Thank you.

				