Introduction to Topic Maps (2) Understanding the details by gregoria


ONTOPEDIA
 The Identity of Everything


  Introduction to Topic Maps
              (2)
                Understanding the details

                              Steve Pepper
                              pepper.steve@gmail.com
                              Oslo University College, 2007-09-22




                                                               www.ontopedia.net




   Course agenda

        Week 37 – 09-08                Introduction to Topic Maps – Part 1
        Week 38 – 09-15                Creating a topic map
        Week 39 – 09-22                Introduction to Topic Maps – Part 2
        Week 42 – 10-13                Ontology-driven editing
        Week 43 – 10-20                The machinery of Topic Maps
        Week 46 – 11-10                (Semantic Web)
        Week 48 – 11-24                (Ontologies)


                   Terminology:
                      –   Topic Maps: The technology and the standard
                      –   topic maps: The artefacts (documents) we create




   Today’s agenda

        Advanced modeling issues
           –    Types and type hierarchies
           –    Association roles and arity
           –    Variant names and name types
           –    Scope
           –    Identity
        Q&A on your personal topic maps








             More about topic types








   Topic types

        A topic type defines a class of things
           –    It’s a particular kind of category that has instances
           –    You can also think of it as a set of things that have
                one or more properties in common
        Rule #1: If it doesn’t have instances, it isn’t a type!
           –    “Music” is a category, but not a type (there are no instances)
                      nothing “is a” music
           –    “Opera” is a type, because there are things which are operas
                      Tosca “is an” opera

        A diagnostic for deciding if ‘foo’ is a type:
           –    If you can think of things which are ‘foos’, the answer is yes
                      But be careful: Is ‘wine’ a type?
           –    If the answer is no, ask what kind of thing ‘foo’ is
                      Now, that really is a type!




   ISA and type-instance

        The relationship between a type and its instance is actually
         a special kind of association
           –    We call it (guess what): a type-instance relationship
           –    It’s also often called an ISA relationship

                               tosca  --(is a)-->  opera

        It can be represented as an association in XTM or LTM
           –    But there’s no real point
           –    Use the syntactic shortcut instead:
                      [tosca : opera]
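
The shortcut and the full association carry the same information. A minimal Python sketch of that equivalence, assuming a hypothetical in-memory model: the class Association, the function expand_isa and the role names "instance" and "type" are illustrative only and are not part of LTM, XTM or any Topic Maps API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A typed relationship between role-playing topics."""
    type: str
    roles: tuple  # tuple of (role_type, player) pairs


def expand_isa(instance: str, topic_type: str) -> Association:
    """Spell out the shortcut [instance : topic_type] as a full association."""
    return Association(
        type="type-instance",
        roles=(("instance", instance), ("type", topic_type)),
    )


# [tosca : opera] written out in full
isa = expand_isa("tosca", "opera")
print(isa)
```

The point of the sketch is only that nothing is lost by using the shortcut: the type-instance relationship is still there, with tosca and opera as its two role players.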







   Rules of thumb for topic types

        Choose an appropriate level of generality
           –    “Countries” is better than “Countries in South-East Asia”
           –    The domain of the topic map tells you which countries it includes
           –    If it doesn’t, an association would be a better solution
                   located-in(Thailand, South-East_Asia)
        But don’t make it so general as to be useless
           –    “Places” instead of “countries” would mix countries and cities
        Keep the name short
           –    That makes it easier to display
        Use the singular form
           –    Experience shows this to be most useful, so “Country”, not “Countries”
        Use initial capitals
           –    A matter of taste, but I think it looks tidiest






   Type hierarchies

          Some topic types can be arranged in hierarchies
           –     Type hierarchies are a natural way to order parts of the world
           –     Humans are quite familiar with tree structures
          Type hierarchies provide
           –     more user-friendly navigation
           –     more powerful querying/inferencing
           –     more compact schemas and ontologies
           –     greater clarity about the relationships between types
          Use hierarchies, but beware of two pitfalls:
           1) Not all hierarchies are type hierarchies...
           2) It’s easy to confuse your ISAs and your AKOs…





   Type hierarchies: AKO

        Mammal
           –    Primate
                      Chimp
                      Homo sapiens
           –    Canine
                      Dog
                      Wolf

        A dog is A Kind Of canine, a canine is A Kind Of mammal, etc.




   Dragon #1: Mixing ISAs and AKOs


              Steve is a homo sapiens          type-instance (ISA)
              A homo sapiens is a mammal       supertype-subtype (AKO)
              Therefore: Steve is a mammal


              Steve is a homo sapiens          type-instance (ISA)
              Homo sapiens is a species        type-instance (ISA)
              Therefore: *Steve is a species   (invalid: ISA is not transitive)








   Types, subtypes and instances

        Types (linked by supertype-subtype)
           –    Mammal
                      Primate: Chimp, Homo sapiens
                      Canine: Dog, Wolf
        Instances (linked to their types by type-instance)
           –    Steve and Nils are instances of Homo sapiens





   How type hierarchies work

        The superclass-subclass relationship has defined
         semantics
           –    Therefore: make sure you use it correctly
           –    Software (tolog, for example) will assume you mean what you say
           –    If you abuse the semantics you will get incorrect results!

        If A is a superclass of B, then
           –    Both A and B must be classes
           –    If C is an instance of B, it must also be an instance of A
           –    If C is a subclass of B, it must also be a subclass of A
                (in which case an instance of C is also an instance of B
                and an instance of A)

        If in doubt, define your own association type
           –    Merging it with superclass/subclass later is trivial
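
The rules above can be sketched in a few lines of Python. This is an illustration of the semantics only, not how tolog or any Topic Maps engine is implemented. The dictionaries SUPERTYPE_OF and INSTANCE_OF and both function names are hypothetical, and the sketch assumes each type has at most one direct supertype.

```python
# Toy data taken from the mammal example in these slides.
SUPERTYPE_OF = {           # subtype -> direct supertype (AKO)
    "homo_sapiens": "primate",
    "chimp": "primate",
    "primate": "mammal",
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
}
INSTANCE_OF = {            # instance -> direct types (ISA)
    "steve": {"homo_sapiens"},
    "homo_sapiens": {"species"},
}


def supertypes(topic_type):
    """All supertypes of a type, following AKO links transitively."""
    found = []
    while topic_type in SUPERTYPE_OF:
        topic_type = SUPERTYPE_OF[topic_type]
        found.append(topic_type)
    return found


def all_types(topic):
    """Direct types of a topic plus everything those types are a kind of."""
    result = set()
    for direct in INSTANCE_OF.get(topic, ()):
        result.add(direct)
        result.update(supertypes(direct))
    return result


print(sorted(all_types("steve")))         # homo_sapiens, mammal, primate
print(sorted(all_types("homo_sapiens")))  # species only
```

Note that all_types follows exactly one ISA link and then only AKO links. That is why Steve comes out as a mammal but never as a species.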





   Being both type and instance

        Most modelling paradigms distinguish between “type”
         and “instance”
           –    In most paradigms something cannot be both

        In Topic Maps something can be both type and instance
           –    (or class/category and individual)

        For example, homo sapiens can be both
           –    a type (supertype=primate, instance=Steve), and
           –    an instance (type=species)

        So be careful!







   Representing a type hierarchy

        Use associations between typing topics
           –    subtypeOf(homo_sapiens : subtype, primate : supertype)
           –    subtypeOf(primate : subtype, mammal : supertype)

        XTM 1.0 defined identifiers for these three subjects
           –    subtypeOf (or superclass-subclass):
                http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass

           –    supertype (or superclass):
                http://www.topicmaps.org/xtm/1.0/core.xtm#superclass

           –    subtype (or subclass):
                http://www.topicmaps.org/xtm/1.0/core.xtm#subclass

        Topic Maps software understands these and
         implements the semantics for you





   Type hierarchies in LTM

          /* Techquila hierarchy PSIs */
          [hierarchical-relation-type = "Hierarchical relation type"
            @"http://www.techquila.com/psi/hierarchy/#hierarchical-relation-type"]
          [superordinate-role-type = "Superordinate role type"
            @"http://www.techquila.com/psi/hierarchy/#superordinate-role-type"]
          [subordinate-role-type = "Subordinate role type"
            @"http://www.techquila.com/psi/hierarchy/#subordinate-role-type"]

          /* XTM superclass-subclass PSIs */
          [subtypeOf : hierarchical-relation-type
            = "Subtype of" = "Supertype of" / supertype
            @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass"]
          [subtype : subordinate-role-type = "Subtype"
            @"http://www.topicmaps.org/xtm/1.0/core.xtm#subclass"]
          [supertype : superordinate-role-type = "Supertype"
            @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass"]

          /* An example type hierarchy */
          subtypeOf( composer : subtype , musician : supertype )
          subtypeOf( conductor : subtype , musician : supertype )
          subtypeOf( cellist : subtype , musician : supertype )




   Dragon #2: Non-type hierarchies

        Not all hierarchies are type hierarchies
        For example:
           –    geographical containment (located in)
                      Europe: Norway (Oslo, Bergen), Sweden (Stockholm, Göteborg)
           –    part of relationships (part of)
                      Submarine: Engine (Piston, Pump), Body (Hatch, Turret)
           –    subject classifications (subtopic of)
                      Music: Classical music (Opera, Choral music),
                      Popular music (Rock music, Reggae)
        These relationships are not supertype-subtype
           –    Norway is NOT a kind of Europe...
           –    A piston is NOT a kind of submarine...
           –    An opera is NOT a kind of music...
        So again, be careful!




  More about association types –
     and all about role types








   Topics play roles in associations

        Associations have no direction
           –    They represent relationships and
                are inherently multidirectional
                      “Puccini was born in Lucca”
                      “Lucca was the birthplace of Puccini”
                      Two ways to express the same relationship
           –    Impression of direction caused by use of natural language
                      One of the topics viewed as the subject and the other as the object
        Instead of direction, associations use roles
           –    Puccini plays the role of person and Lucca plays the role of place
           –    person and place are association role types (or “role types”, for short)
           –    Labels are assigned based on role perspective
                      seen from puccini (person): “born in”
                      seen from lucca (place): “birthplace of”
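
To make “labels based on role perspective” concrete, here is a minimal Python sketch. The data layout, the LABELS table and the function describe are assumptions made for illustration; the sketch handles only binary associations whose two role types differ.

```python
# One association, no direction: just a type and role-typed players.
born_in = {
    "type": "born-in",
    "roles": {"person": "puccini", "place": "lucca"},
}

# Hypothetical display labels, keyed by the role type of the topic
# from whose perspective the association is being viewed.
LABELS = {
    "born-in": {"person": "born in", "place": "birthplace of"},
}


def describe(association, viewpoint):
    """Render the association as seen from the given player."""
    roles = association["roles"]
    # Find the role played by the viewpoint topic, then the other player.
    own_role = next(r for r, p in roles.items() if p == viewpoint)
    other = next(p for r, p in roles.items() if r != own_role)
    label = LABELS[association["type"]][own_role]
    return f"{viewpoint} {label} {other}"


print(describe(born_in, "puccini"))  # puccini born in lucca
print(describe(born_in, "lucca"))    # lucca birthplace of puccini
```

The association itself is stored once. Only the label changes with the perspective, which is why no direction needs to be recorded.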





   Anatomy of an association
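        [Diagram: the topic Puccini (type: composer) plays the role person and the
         topic Lucca (type: city) plays the role place in an association of type
         born-in. T = topic, R = role, A = association.]

        Role types characterize the nature of the
         subject’s involvement in the relationship
           –    They are also topics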





   Role type and topic type

        They are NOT the same thing!
           –    Different constructs
           –    Different purposes
        Topic type
           –    Expresses something universal or
                essential about the subject
           –    e.g. Puccini is a composer: [puccini : composer]
        Role type
           –    Expresses the nature of the subject’s
                involvement in a particular relationship
           –    e.g. Puccini plays the role of pupil

        [Diagram: puccini plays the role person (born in lucca), the role pupil
         (pupil of ponchielli) and the role composer (tosca composed by puccini)]

        Sometimes they are “the same”
           –    More usually they are different




   LTM syntax for role types

        Complete syntax:
         born-in( puccini : person, lucca : place )
         pupil-of( puccini : pupil, ponchielli : teacher )
         composed-by( tosca : work, puccini : composer )

        Abbreviated syntax:
         [lucca : city]
         [puccini : composer]

         born-in( puccini, lucca )

        The role type is “inherited” from the topic type, giving:
         born-in( puccini : composer, lucca : city )
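
The expansion described on this slide can be sketched in Python. This mimics the behaviour as the slide states it and is not taken from any LTM parser. TOPIC_TYPES and expand_roles are hypothetical names, and the sketch assumes every topic has exactly one type.

```python
# Topic types declared with [lucca : city] and [puccini : composer]
TOPIC_TYPES = {"lucca": "city", "puccini": "composer"}


def expand_roles(assoc_type, players):
    """Fill in missing role types from each player's topic type.

    players is a list of (topic, role_type_or_None) pairs.
    """
    expanded = []
    for topic, role_type in players:
        if role_type is None:
            role_type = TOPIC_TYPES[topic]  # "inherited" from topic type
        expanded.append((topic, role_type))
    args = ", ".join(f"{t} : {r}" for t, r in expanded)
    return f"{assoc_type}( {args} )"


# Abbreviated form: no role types given
print(expand_roles("born-in", [("puccini", None), ("lucca", None)]))
# Complete form: explicit role types are left untouched
print(expand_roles("born-in", [("puccini", "person"), ("lucca", "place")]))
```

Comparing the two outputs shows the trade-off: the abbreviated form saves typing, but the roles come out as composer and city rather than person and place.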




   Symmetric associations

        Some associations are the same in both directions
           –    E.g., if A is a friend of B, then B is (presumably) a friend of A
        In this case the role type is the same
           –    We call this a symmetric association

        [Diagram: puccini and mascagni both play the role friend in an
         association of type friend-of]






   N-ary associations

        Associations can have any number of roles
        Two roles is by far the most common
           –    such associations are called binary associations
        However, sometimes you need more than two roles
           –    for example, to express parenthood

           parenthood (
             steve : child,
             edna : mother,
             harry : father )

        [Diagram: steve, edna and harry play the roles child, mother and father
         in a single parenthood association]




   Unary associations

        You can even have associations with just one role
         player
        Unary associations represent yes/no conditions
           –    cf. binary properties (true/false)
           –    e.g. expressing that an opera is unfinished
                unfinished( turandot : work )

        [Diagram: turandot plays the role work in an association of type
         unfinished; there is no second role player]




   The arity of associations: summary

        Unary associations are not common
           –    Useful for representing properties that have boolean values
                      e.g., the property of being “unfinished”
        Binary associations are the most common
           –    Often correspond to verb ( subject, object ) constructs
        Ternary associations are quite common
           –    Often correspond to verb( subject, direct-object, indirect-object ) constructs
        N-ary associations (where n > 3)
           –    Less common but sometimes useful
           –    Many n-ary associations are better represented as (n-1) binary
                associations...








   Rules of thumb for roles

        Keep the number of roles as low as possible
           –    Consider whether introducing an intermediate topic makes sense
        Avoid repeating roles
           –    If one role can be played multiple times in the same association
                this indicates that the association represents a group
           –    In these cases, you should probably have a topic for the group
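
The “avoid repeating roles” rule can be checked mechanically. The following Python sketch is one possible way to do it. All names in it (the functions, the member-of association type, the group role and the topics anna, ben, carla and trio) are invented for illustration.

```python
from collections import Counter


def repeated_roles(roles):
    """Return role types that occur more than once in one association.

    roles is a list of (role_type, player) pairs.
    """
    counts = Counter(role_type for role_type, _ in roles)
    return sorted(r for r, n in counts.items() if n > 1)


def introduce_group(group_topic, roles, member_role, assoc_type="member-of"):
    """Replace a repeated role by binary associations to a group topic."""
    return [
        (assoc_type, [(member_role, player), ("group", group_topic)])
        for role_type, player in roles
        if role_type == member_role
    ]


# A hypothetical association in which the same role is played three times
roles = [("member", "anna"), ("member", "ben"), ("member", "carla")]
print(repeated_roles(roles))
for association in introduce_group("trio", roles, "member"):
    print(association)
```

After the rewrite there is a topic for the group itself, so names, occurrences and further associations can be attached to it.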







   Naming association types

        Nouns
           –    expressing the nature of the relationship, e.g., “first performance”
           –    compounds created from the role names, e.g., “teacher/pupil”
        Verbs
           –    very natural, but they imply direction (subject-verb-object)
        Steve’s recommendation
           –    use verbs
           –    choose the most natural as the default
                      ‘composed by’ is more natural than ‘composer of’
           –    use additional names scoped by role type for the ‘object’
                      the corresponding active/passive form is often the best choice






             More about names and
                 occurrences








   More about names

        Names are essentially labels that are used to
         communicate with humans via a user interface
           –    Different from identifiers used by computers (see “Identity”, below)
           –    Topics can have multiple names
           –    Names may be typed (new in XTM 2.0)
           –    Each name can be scoped
           –    Names can have variants
        The question often arises
           –    When is it appropriate to use which?
        The answer is by no means clear
           –    The Topic Maps community is still gaining experience in this
           –    The following contains some pointers





   Variant names

        Variant names are essentially variant forms of
         the same name
        Examples of variants are
           –    sort key
           –    plural form
           –    pronunciation
           –    common misspellings/alternative spellings
           –    transliterations into other scripts (or original forms)








   Name types

        A name type is a set of names that have
         something in common
        Examples of name types are
           –    first name
           –    last name
           –    country code
           –    language code








   Scoped names

        Names which are qualified to be used in a certain
         context
        Typical examples:
           –    Names in foreign languages
           –    Names relevant for a certain kind of user
                (e.g. technical vs. non-technical)








   Name type or scoped name?

        Rules of thumb:
           –    For names in different natural languages use scope
                (because language is a kind of context)
           –    If a name of a certain kind is to be found systematically on (almost)
                every topic of a certain type, use a name type, e.g.
                      Every person might have a given name and a user name
                      Every language might have a language code

        Your application may leave you no choice!
           –    LTM does not currently support typed names
                      So you have to use scope
           –    Ontopoly does not currently support scoped names
                      So you have to use name types





   The default name

        Another rule of thumb:
           –    Always have exactly one name that is neither typed nor scoped
           –    This is effectively the default name for the topic

        Never have more than one name that is both
         unscoped and untyped
           –    Applications will have no way of consistently choosing one name

        In general
           –    Keep names as short as possible
           –    Or at least do not make them longer than necessary
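
Why exactly one default name matters becomes clear once an application has to pick a name for display. Here is a minimal Python sketch of one possible selection rule. The function display_name, the triple layout and the theme id "norwegian" are assumptions for illustration, not the behaviour of any particular Topic Maps engine.

```python
def display_name(names, context=frozenset()):
    """Pick a name for display.

    names is a list of (value, name_type, scope) triples, where name_type
    is None for untyped names and scope is a frozenset of topic ids.
    """
    # Prefer an untyped name whose scope fits the requested context.
    if context:
        for value, name_type, scope in names:
            if name_type is None and scope and scope <= context:
                return value
    # Fall back to the default name: neither typed nor scoped.
    defaults = [v for v, t, s in names if t is None and not s]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one default name, found {len(defaults)}"
        )
    return defaults[0]


names = [
    ("Norway", None, frozenset()),               # default name
    ("Norge", None, frozenset({"norwegian"})),   # scoped by language
    ("NO", "country-code", frozenset()),         # typed name
]
print(display_name(names))                            # Norway
print(display_name(names, frozenset({"norwegian"})))  # Norge
```

With zero or two default names the fallback raises an error, which is the situation the rule of thumb is meant to prevent.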







                   Scope and identity








   Scope: A few more details

        The context within which a statement is valid
           –    (Statement = name, variant, occurrence or association)
           –    Expressed as a set of (zero or more) topics
           –    Scope with no topics (the default) is called the unconstrained scope
        General use is for
           –    Provenance (“where from”)
           –    Opinion (“who says”)
        Names
           –    Natural language
        Variants
           –    Context of use (e.g. acronym, alternative transliteration)





   Identity: The all-important issue

        What makes merging possible?
           –    NOT the use of names, which are notoriously unreliable
           –    Names are not unambiguous (the homonym problem)
           –    Many topics have multiple names (the synonym problem)
        Achievement of the collocation objective
           –    Only possible through the use of unique global identifiers
        The issue of identification of subjects is therefore
         crucial
           –    If subjects have unique identifiers, people can be free to use
                whatever names they like – and machines can still aggregate
                information
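
A minimal Python sketch of merging driven by identifiers rather than names. The function merge_by_identifier and the dictionary layout are hypothetical, and the example.org URIs are made up for the example. Real Topic Maps engines apply further merging rules that are not shown here.

```python
def merge_by_identifier(topics):
    """Merge topics that share at least one subject identifier.

    Each topic is a dict with "identifiers" and "names" sets.
    Names are never used to decide whether two topics are the same.
    """
    merged = []
    for topic in topics:
        current = {"identifiers": set(topic["identifiers"]),
                   "names": set(topic["names"])}
        remaining = []
        for other in merged:
            if other["identifiers"] & current["identifiers"]:
                # Same subject: pool identifiers and names.
                current["identifiers"] |= other["identifiers"]
                current["names"] |= other["names"]
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


composer_a = {"identifiers": {"http://psi.example.org/puccini"},
              "names": {"Puccini"}}
composer_b = {"identifiers": {"http://psi.example.org/puccini"},
              "names": {"Giacomo Puccini"}}           # synonym
restaurant = {"identifiers": {"http://psi.example.org/cafe-puccini"},
              "names": {"Puccini"}}                   # homonym

result = merge_by_identifier([composer_a, composer_b, restaurant])
for topic in result:
    print(sorted(topic["names"]))
```

The two composer topics merge despite having different names, and the homonym stays separate despite sharing a name. That is the collocation objective in miniature.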






   Subjects and Topics

        Topics are surrogates, or “proxies”
         (inside the computer) for the
         ineffable subjects that you want to
         talk about, such as Puccini, love,
         these slides, or the second law of
         thermodynamics
        [Diagram: a subject in the real world, represented by a topic (T)
         in the computer domain]






   The identity of subjects

        Topics exist in order to
         allow us to talk about
         subjects
           –    The relationship between the two
                is sometimes called intentionality

        We need to know exactly
         which subject a topic
         represents
           –    That is, we need to establish its
                subject identity
           –    The collocation objective depends
                on knowing when applications are
                talking about the same thing

        [Diagram: topics for Puccini, Lucca, Tosca and Madame Butterfly]





   Subject identifiers

        The identity of most subjects can only
         be established indirectly
           –    An information resource can provide an
                indication of the subject’s identity to a human
           –    Such a resource is called a subject descriptor
        A subject descriptor has an address,
         even though the subject it indicates
         does not
           –    Computers can use the address of the
                subject descriptor to establish identity
           –    Such addresses are called
                subject identifiers
        Subject descriptors and subject
         identifiers are the two sides of
         the human-computer dichotomy

        [Diagram: in the topic map domain, the Puccini topic carries a subject
         identifier. The identifier addresses a subject descriptor in the computer
         domain, which describes the subject (out in “Life, the Universe and
         Everything”) to a human: “Giacomo Puccini, Italian composer, b. Lucca
         22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas,
         of which Tosca is the most . . .”]




   Published Subjects

        In order for identifiers to be reused, they must
         be made publicly available
           –    A subject identifier that has been made available for use outside
                one particular application is called a published subject identifier
                (PSI)
           –    Its descriptor is called a published subject descriptor (PSD)
        Anyone can publish PSI sets
           –    Adoption of PSI sets will be an evolutionary process based on trust
           –    It will lead to greater and greater interoperability – between topic
                map applications, between Topic Maps and RDF, and across
                information and knowledge management in general
           –    Check out http://psi.ontopedia.net (under development)





   PSIs for machines and humans








   Advice on subject identifiers

        Always use them for your typing topics
           –    Makes your ontology more portable

        The more serious your application, the more extensively
         you should use them for instances
           –    Merging with other topic maps will not be successful without identifiers

        LTM code for subject identifiers
           –    See previous lecture and opera.ltm
           –    Example:
                      [composer = "Composer"
                         @"http://psi.ontopedia.net/Composer"]








   Steve’s conventions for PSIs

        URI prefix:
           –    http://psi.ontopedia.net/
           –    Note: Not all my identifiers have corresponding descriptors

        URI suffix:
           –    Initial cap for topic types and role types (e.g. Composer)
           –    Lower case for association, occurrence and name types (e.g. born_in)
           –    Wikipedia conventions for instances
           –    Replace spaces with underscores

        Check Norwegian Opera for examples
           –    Do not use the Italian Opera Topic Map – its conventions are outdated
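
The naming conventions above are regular enough to apply by program. A minimal Python sketch, with some assumptions: the function make_psi and the kind labels are invented for illustration, “Wikipedia conventions” is read here as “keep the capitalisation of the Wikipedia article title”, and the URIs produced are not claimed to exist as published identifiers.

```python
PREFIX = "http://psi.ontopedia.net/"

# Kinds of topic whose suffix starts with a capital letter.
CAPITALISED = {"topic type", "role type"}
# Kinds of topic whose suffix is written in lower case.
LOWER_CASE = {"association type", "occurrence type", "name type"}


def make_psi(name, kind):
    """Build an identifier from a display name, following the slide's rules."""
    suffix = name.strip().replace(" ", "_")
    if kind in CAPITALISED:
        suffix = suffix[:1].upper() + suffix[1:]
    elif kind in LOWER_CASE:
        suffix = suffix.lower()
    # Instances keep their case, e.g. "Giacomo_Puccini".
    return PREFIX + suffix


print(make_psi("composer", "topic type"))
print(make_psi("born in", "association type"))
print(make_psi("Giacomo Puccini", "instance"))
```

Generating identifiers this way keeps them consistent, but remember the note above: an identifier built by rule does not automatically have a descriptor behind it.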








                              Wrap Up








   Home assignment

        Finish your LTM topic map
           –    Read through the slides from this lecture
           –    Consider whether your modelling is appropriate
           –    Consider whether you have followed recommended
                conventions
           –    Send the final result to pepper.steve@gmail.com by
                Monday September 29








   Next lecture

        Monday October 13
        Same time, same place
        Agenda
           –    Ontology-driven editing



