Universal Networking Language


          Shalini Gupta - 07305R02
The Problem
   Large explosion of data
   Linguistic barriers (multilingualism)
       Web content is mostly in English and
        cannot be accessed without some proficiency
        in that language
       Although India accounts for a large share of the
        world's population, the proportion of its people
        with Internet access is very low
   Need for high-speed translation between
    languages
Solution: Machine Translation
   2 approaches:
       Transfer based
           Works on specific pairs of languages
           Some text analysis on the source language
           Some on the target language
       Interlingua based
           Build a universal language
           Convert data to the universal language
           Deconvert it back
           Needs only 2N converters, as opposed to
            N*(N-1) translators for transfer based
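
The two counts above can be checked with a short calculation. This is a minimal sketch; the function names are illustrative and do not come from the slides.

```python
def transfer_translators(n: int) -> int:
    """One translator per ordered (source, target) language pair."""
    return n * (n - 1)


def interlingua_converters(n: int) -> int:
    """One converter into the interlingua and one back out, per language."""
    return 2 * n


# Compare the two approaches for a few language counts.
for n in (2, 3, 4, 10):
    print(n, transfer_translators(n), interlingua_converters(n))
```

The interlingua approach needs fewer components only once more than three languages are involved, since N*(N-1) > 2N exactly when N > 3.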
UNL: An Interlingua
   Language independent Knowledge
    Representation
   Vehicle for machine translation
   UNL solves “Information Monopolies”
    problem

     English                      Hindi
                  Interlingua
                     (UNL)
     French                      Chinese
Outline
   Introduction
   UNL Components
   Some Controversial Issues in UNL
   Language Divergences between Hindi
    and English
   Conclusion
Introduction to UNL
   Proposed by the United Nations University
   Enables computers to process information
    and knowledge across language
    barriers
   Replicates functions of natural languages in
    human communication
   Enables distributing, receiving and
    understanding multilingual information
   Represents information sentence by
    sentence
UNL Graph
   Each sentence is converted into a
    hypergraph
       Concepts as nodes
       Relations as directed arcs
   Concepts are called Universal Words
   Word Knowledge represented by Universal
    Words (UWs) which are language
    independent
   Conceptual Knowledge captured by relating
    UWs through relations
  Example:
  John eats rice with a spoon
  [Figure: UNL graph of the sentence, with callouts marking a Universal Word, an Attribute and the Semantic Relations]
UNL Expression
   John eats rice with a spoon
       {unl}
       agt(eat(icl>do).@entry.@present,
        John(iof>person))
       obj(eat(icl>do).@entry.@present,
        rice(icl>food))
       ins(eat(icl>do).@entry.@present,
        spoon(icl>artifact).@indef)
       {/unl}
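
One way to hold this expression in memory is as a list of labeled directed arcs. The sketch below is an assumption about representation only: the triple layout and the variable names are illustrative and are not part of UNL.

```python
from collections import defaultdict

# Each arc is (relation, source node, target node); nodes carry their attributes.
arcs = [
    ("agt", "eat(icl>do).@entry.@present", "John(iof>person)"),
    ("obj", "eat(icl>do).@entry.@present", "rice(icl>food)"),
    ("ins", "eat(icl>do).@entry.@present", "spoon(icl>artifact).@indef"),
]

# Adjacency list: source node -> list of (relation, target node).
graph = defaultdict(list)
for relation, source, target in arcs:
    graph[source].append((relation, target))

# Nodes marked with the @entry attribute (simple substring check for this sketch).
entry_nodes = {node for _, s, t in arcs for node in (s, t) if ".@entry" in node}
```

All three arcs leave the same node, the verb, which is also the only node marked @entry.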
Universal Word
    Types of Universal Word
   Syntactic and semantic unit of UNL
   Represents a concept
   Represents a node in the graph of a UNL
    expression
   2 classes:
       Unit concepts
           Basic UWs
           Restricted UWs
           Extra UWs
       Compound concepts: Scopes
Types of Universal Words (UWs)

   Basic UWs
       Bare headwords with no constraint list
       E.g. :
           house
           drink
   Restricted UWs
       Headwords with a constraint list
       Represent a more specific concept, or a subset
        of concepts
Types of UWs (contd..)

       The constraint list restricts the range of the
        concept that a Basic UW represents
         E.g. :

               state(icl>country)
               state(icl>abstract thing)
   Extra UWs
       Special type of Restricted UW
       Denote concepts that are not present in English
       Foreign-language words are used as
        headwords
       E.g. :
           Bharatnatyam(icl>dance)
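
The headword and constraint list of a UW can be separated mechanically. The following is a rough sketch with hypothetical helper names (`split_uw`, `uw_kind`); it only tells Basic from Restricted UWs. An Extra UW cannot be recognized from its syntax alone, because that depends on whether the headword is an English word.

```python
def split_uw(uw: str):
    """Split a UW into (headword, list of constraints)."""
    head, sep, rest = uw.partition("(")
    if not sep:
        return head, []  # bare headword, no constraint list
    constraints = rest.rstrip(")").split(",")
    return head, [c.strip() for c in constraints]


def uw_kind(uw: str) -> str:
    """Classify a UW by whether it carries a constraint list."""
    return "restricted" if split_uw(uw)[1] else "basic"


print(split_uw("state(icl>country)"))
print(uw_kind("house"))
```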
Compound Concepts
   Raju said that [he had opened the window]

   [Figure: UNL graph of the sentence]
       say(icl>do).@entry.@past
           agt: Raju(iof>person)
           obj: scope :01
       Scope :01
           open(icl>do).@entry.@past.@complete
               agt: he
               obj: window(icl>obj)
Compound Concepts (contd..)

   Set of binary relations that are grouped
    together to express a compound concept
   Interpreted as a whole
   Expressed by a scope in UNL expressions
   Raju said that [he had opened the window].
     Part of the sentence within square brackets
       should be grouped
     Only when they are grouped together and
       considered as a whole unit can the correct
       interpretation be obtained.
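
The grouping can be sketched by tagging each arc with the scope it belongs to. This is an illustrative data layout, not UNL syntax: the tuple format, `MAIN` and `expand` are assumptions, while the scope ID `:01` and the UWs follow the Raju example.

```python
from collections import defaultdict

MAIN = None  # marker for arcs that lie outside any scope

# (scope, relation, source node, target node)
arcs = [
    (MAIN, "agt", "say(icl>do).@entry.@past", "Raju(iof>person)"),
    (MAIN, "obj", "say(icl>do).@entry.@past", ":01"),
    (":01", "agt", "open(icl>do).@entry.@past.@complete", "he"),
    (":01", "obj", "open(icl>do).@entry.@past.@complete", "window(icl>obj)"),
]

by_scope = defaultdict(list)
for scope, relation, source, target in arcs:
    by_scope[scope].append((relation, source, target))


def expand(node: str):
    """Return the arcs a scope ID stands for, or None for an ordinary node."""
    return by_scope.get(node)
```

The object of `say` is the scope `:01` as a whole, so `expand(":01")` yields both arcs of the embedded clause together.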
    Relations
    A UNL relation is expressed as:
        <relation>(<uw1>, <uw2>)
        <relation> is one of the relations defined in UNL
        <uw1>, <uw2> are universal words
    E.g. John broke the window
        agt(break(icl>do).@entry.@past,
         John(iof>person))
        obj(break(icl>do).@entry.@past,
         window(icl>thing))
    41 such relations have been defined
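
A relation written in this form can be split into its three parts by finding the comma that is not nested inside a UW's own parentheses. This is a minimal sketch; `parse_relation` is a hypothetical helper, and it does not check the relation name against the defined set.

```python
def parse_relation(expr: str) -> tuple[str, str, str]:
    """Parse '<relation>(<uw1>, <uw2>)' into (relation, uw1, uw2)."""
    name, sep, rest = expr.strip().partition("(")
    if not sep or not rest.endswith(")"):
        raise ValueError(f"not a binary relation: {expr!r}")
    body = rest[:-1]  # drop the relation's closing parenthesis
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # First comma outside any UW constraint list separates uw1 and uw2.
            return name.strip(), body[:i].strip(), body[i + 1:].strip()
    raise ValueError(f"missing second argument: {expr!r}")


print(parse_relation("agt(break(icl>do).@entry.@past,\n John(iof>person))"))
```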
Attributes
   Describe subjectivity of sentence
   Enrich the description given by UWs and
    relations
   E.g. time with respect to the speaker
       happened in the past : @past
       happening at present : @present
       will happen in the future : @future
       John broke the window
           agt(break(icl>do).@entry.@past,
            John(iof>person))
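
Attributes can be peeled off a node by splitting on the `.@` separator. This is a small sketch; `split_attributes` and the `TENSE` table are illustrative names, and the table covers only the three time attributes listed above.

```python
TENSE = {"@past": "past", "@present": "present", "@future": "future"}


def split_attributes(node: str):
    """Split 'uw.@a.@b' into the UW and its list of attributes."""
    uw, *attrs = node.split(".@")
    return uw, ["@" + a for a in attrs]


uw, attrs = split_attributes("break(icl>do).@entry.@past")
tenses = [TENSE[a] for a in attrs if a in TENSE]
print(uw, attrs, tenses)
```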
UNL Knowledge Base

   Defines every possible relation between
    concepts
   Two important roles
       Defines semantics of Universal Words
       Gives linguistic knowledge of concepts
   E.g. The anchor wrote the script
       Linguistic knowledge tells us that the anchor is a person
       Semantics tells us that only a person can write a script
        (a ship's anchor cannot do so)
Controversial Issues

   Meaning Representation Language:
       Should provide sufficient means to express
        knowledge.
       Should be simple.
   Main expressive device of UNL is Restrictions
   New expressive means for describing UWs have
    been proposed.
    Semantic Restriction
   UW: operator(icl>thing)
   Does not effectively separate the meanings
   2 meanings
        long distance operator(icl>human)
        addition operator(icl>abstract thing)
   Hypernymy and meronymy are mostly
    used for expressing restrictions
   Synonymy and antonymy can also be used
        E.g. wealth(equ>richness), poor(ant>rich)
    Argument Frame Restriction
   X borrows Y from Z for W
   All four arguments are needed to define the
    action of borrowing completely
   Example
     John borrowed $10000 for 3 years
       ("for 3 years" is the argument W, the term of the loan)
     John has been borrowing money for 3 years
       ("for 3 years" is not an argument; it gives the duration of the activity)

   UNL as a meaning representation language
    should be able to distinguish
    between the argument and non-argument links
    of predicates
    Weakly Differentiated Relations
    Some relations seem to be weakly
     differentiated and therefore difficult to
     use consistently.
        E.g. gol (final state) – plt (final place)
        E.g. src (initial state) – plf (initial place)
    John went to Brussels
        can be described with either gol or plt
        the difference is that gol characterizes Brussels
         as the final state of John, while plt characterizes it
         as the final place of the whole event
    Redundant Relations
   Some relations seem to be based more on
    the semantic class of the UWs than on the relation itself
   E.g. mod (modification) – man (manner)
   Difference between them boils down to the
    semantic class of the starting point of the
    relation
     answered politely (man) [to answer]

     a polite answer (mod) [an answer]

   Relations 'man' and 'mod' can be merged
    Divergences between English and Hindi
   Constituent Order Divergence
       Jim is playing tennis.       जिम टेनिस खेल रहा है
        (S)     (V)      (O)           (S) (O) (V)
   Adjunction Divergence
       The [living in Delhi] boy
       दिल्ली में रहनेवाला लड़का
   Preposition-Stranding Divergence
       Which shop did John go to?
       किस दुकान जॉन गया में (word-for-word rendering; not acceptable in Hindi)
    Divergences (contd..)
    Null Subject Divergence
        जा रहा हूं             going-am
    Pleonastic Divergence
      It is raining.          यह बारिश हो रही है
    Conflational Divergence
      Jim stabbed him.
      जिम उसको छुरे से मारा
    Promotional Divergence
      The play is on.         खेल चल रहा है
Conclusion

   UNL is an Interlingua for Machine
    Translation
   Studied Components of UNL
   Controversial Issues in UNL
   Divergences between English and Hindi
    References
   Igor Boguslavsky. Some controversial issues
    of UNL: linguistic aspects. 2004.
   Shachi Dave and Pushpak Bhattacharyya.
    Knowledge extraction from Hindi text. 2001.
   Shachi Dave, Jignashu Parikh, and Pushpak
    Bhattacharyya. Interlingua-based English-Hindi
    machine translation and language divergence.
    Machine Translation, 16(4):251–304, 2001.
   The Universal Networking Language manual.
    www.undl.org, 2006.
   M. Zhu and H. Uchida. The Universal Networking
    Language (UNL) specifications. Technical
    report, 2005.
Thank You
UNL System
    Knowledge Extraction from Hindi Text
   EnConverter is a language-independent
    parser
   Provides a framework for analysis
   Needs to be supplied with a lexicon and analysis rules
   Analysis rule: (<PRE>)... <LNODE>
    <RNODE> (<SUF1>) (<SUF2>) (<SUF3>)...
    <PRI>
   Lexicon entry: [HW] {ID} "UW" (ATTRIB1,
    ATTRIB2, ...) <FLG,FRE,PRI>;
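
An entry in the lexicon format shown above can be read with a regular expression. This is a sketch under assumptions: the sample entry (headword `khaa`, attributes `V` and `VOA`, flags `H,0,0`) is hypothetical and not taken from any actual lexicon, and `parse_entry` is an illustrative helper, not part of EnConverter.

```python
import re

# [HW] {ID} "UW" (ATTRIB1, ATTRIB2, ...) <FLG,FRE,PRI>;
LEXICON_ENTRY = re.compile(
    r'\[(?P<hw>[^\]]*)\]\s*'
    r'\{(?P<id>[^}]*)\}\s*'
    r'"(?P<uw>[^"]*)"\s*'
    r'\((?P<attrs>[^)]*)\)\s*'
    r'<(?P<flg>[^,>]*),(?P<fre>[^,>]*),(?P<pri>[^,>]*)>;'
)


def parse_entry(line: str) -> dict:
    """Parse one lexicon entry into a dict of its fields."""
    match = LEXICON_ENTRY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a lexicon entry: {line!r}")
    fields = match.groupdict()
    # Turn the comma-separated attribute list into a Python list.
    fields["attrs"] = [a.strip() for a in fields["attrs"].split(",") if a.strip()]
    return fields


# Hypothetical entry, for illustration only.
entry = parse_entry('[khaa] {} "eat(icl>do)" (V, VOA) <H,0,0>;')
print(entry)
```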
    Knowledge Extraction from Hindi Text
   Each step:
       Morphological analysis
       Decision
           Relation
           Lexical attribute
           UNL attribute
    Verbal Concepts
   Classes of predicates
     actions (have an active initiator, e.g. kill)
     activities (sets of heterogeneous actions with
       a common goal, e.g. trade)
     events (have no agent, e.g. the bridge broke)
     processes (denote a situation that occupies
       a certain time span, e.g. the tree grows)
     states (homogeneous, do not denote a
       change, e.g. hear, ache)
     properties (differ from states in that they
       are atemporal, e.g. blind, red)
     relations (specify a relation between two or more
       things, e.g. love, hate)
    In UNL, all verbal concepts are grouped into three
     classes
      (icl>do) contains actions and activities
      (icl>occur) consists of events and processes
      (icl>be) is composed of states, properties and
        relations
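
The grouping of the seven predicate classes into three UNL restrictions amounts to a lookup table. This is a minimal sketch; the dictionary and the `restrict` helper are illustrative names, not part of UNL.

```python
# Predicate class -> UNL restriction, following the three-way grouping.
UNL_VERB_CLASS = {
    "action": "icl>do",
    "activity": "icl>do",
    "event": "icl>occur",
    "process": "icl>occur",
    "state": "icl>be",
    "property": "icl>be",
    "relation": "icl>be",
}


def restrict(headword: str, predicate_class: str) -> str:
    """Build a Restricted UW from a headword and its predicate class."""
    return f"{headword}({UNL_VERB_CLASS[predicate_class]})"


print(restrict("kill", "action"))
print(restrict("grow", "process"))
```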
    Adjectival Concepts
    All adjectival concepts are divided into two
     classes:
        predicative (aoj>thing)
        restrictive (mod>thing)
    This does not work well in some situations
        E.g. Wise Greeks diluted wine with water
            Restrictive interpretation: ‘Those Greeks who were
             wise diluted wine with water. Silly ones didn’t’.
            Non-restrictive (qualificative) interpretation:
             ‘Greeks were wise. They diluted wine with water’.
            The real opposition is restrictive vs. qualificative
    Should be applied to other modifiers also
   The students sitting in the corner are waiting
    for the professor
       The students(,) who are sitting in the corner(,)
        are waiting for the professor.
       The students in the corner are waiting for the
        professor
       The phrase 'who are sitting in the corner' can be restrictive
        (‘those of the students who are sitting in the
        corner are waiting for the professor; others are
        not’)
       or non-restrictive (‘the students are waiting for the
        professor; they are sitting in the corner’)

								