Sense clusters versus sense relations

Irina Chugur, Julio Gonzalo
UNED (Spain)
Sense clusters vs. sense relations

Arguments for sense clustering:
• Subtle distinctions produce noise in applications
• WN too fine-grained
• Remove predictable sense extensions?
Sense clusters vs. sense relations

Arguments for sense clustering → But...
• Subtle distinctions produce noise in applications → Clusters are not absolute (e.g. metaphors in IR/MT); polysemy relations are more informative and predictive.
• WN too fine-grained → Not really! WN's rich sense distinctions permit empirical/quantitative studies of polysemy phenomena.
• Remove predictable sense extensions? → Use them to infer and study systematic polysemy.
Sense clusters vs. sense relations

Arguments for sense clustering → But...
• Subtle distinctions produce noise in applications → Clusters are not absolute (e.g., are metaphors close?); polysemy relations are more informative and predictive.
• WN too fine-grained → Not really!
• Remove predictable sense extensions? → Use them to infer and study systematic polysemy.

→ Annotation of semantic relations in 1000 WN nouns
1) Cluster evidence from SemCor

Hypothesis: if two senses tend to co-occur in the same documents, they are not good IR discriminators.
Criterion: cluster senses that co-occur frequently in the IR-SemCor collection.
Example: fact 1 and fact 2 co-occur in 13 out of 171 documents.
– Fact 1: a piece of information about circumstances that exist or events that have occurred.
– Fact 2: a statement or assertion of verified information about something that is the case or has happened.
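This co-occurrence criterion can be sketched as follows. A minimal illustration, assuming a corpus represented as one set of sense labels per document; the threshold and the toy data are invented for the example and are not the authors' exact settings.

```python
# Sketch of the SemCor co-occurrence clustering criterion.
# (Hypothetical data structures; thresholds are illustrative only.)
from itertools import combinations

def cooccurrence_clusters(doc_senses, threshold):
    """Propose clustering sense pairs that co-occur in at least
    `threshold` documents.

    doc_senses: list of sets, one set of sense labels per document.
    Returns the set of (sense, sense) pairs proposed for clustering.
    """
    all_senses = set().union(*doc_senses)
    clusters = set()
    for s1, s2 in combinations(sorted(all_senses), 2):
        # Count documents where both senses appear.
        co_docs = sum(1 for doc in doc_senses if s1 in doc and s2 in doc)
        if co_docs >= threshold:
            clusters.add((s1, s2))
    return clusters

# Toy example: fact_1 and fact_2 co-occur in 2 of 3 documents.
docs = [{"fact_1", "fact_2"}, {"fact_1", "fact_2", "band_1"}, {"band_2"}]
print(cooccurrence_clusters(docs, threshold=2))  # {('fact_1', 'fact_2')}
```

In the slide's example, fact 1 and fact 2 co-occur in 13 of 171 documents, so with a suitable threshold the criterion would propose merging them.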
2) Cluster evidence from parallel polysemy

English                                       Spanish      French     German
Band 1 (instrumentalists not                  Orquesta 4   Groupe 9   Band 2
  including string players)
Band 2 (a group of musicians playing          Orquesta 4   Groupe 6   Band 2
  popular music for dancing)
Parallel polysemy in EuroWordNet

English                    Spanish               French             German
{child, kid}               {niño, crío, menor}   {enfant, mineur}   {Kind}
{male child, boy, child}   {niño}                {enfant}           {Kind, Spross}
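The parallel-polysemy criterion from the tables above can be sketched like this: two senses are candidates for clustering when every language checked lexicalizes them with at least one shared word. The translation tables and sense labels below are illustrative, not actual EuroWordNet entries.

```python
# Sketch of the parallel-polysemy clustering criterion over
# EuroWordNet-style translation data (hypothetical entries).
def parallel_polysemy_cluster(translations, s1, s2):
    """Propose clustering senses s1 and s2 if they share a translation
    word in every language in `translations`.

    translations: {language: {sense: set of translation words}}
    """
    return all(
        bool(translations[lang][s1] & translations[lang][s2])
        for lang in translations
    )

translations = {
    "spanish": {"band_1": {"orquesta"}, "band_2": {"orquesta"}},
    "french":  {"band_1": {"groupe"},   "band_2": {"groupe"}},
    "german":  {"band_1": {"Band"},     "band_2": {"Band"}},
}
print(parallel_polysemy_cluster(translations, "band_1", "band_2"))  # True
```

This mirrors the Band example: both English senses map to "orquesta", "groupe", and "Band", so the criterion proposes merging them.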
Comparison of clustering criteria

Clusters vs. semantic relations
[figure: clusters vs. semantic relations]

Polysemy relations are more predictive!
Characterization of sense inventories for WSD

Given two senses of a word:
– How are they related? (polysemy relations)
– How closely? (sense proximity)
– In what applications should they be distinguished?
Given an individual sense of a word:
– Should it be split into subsenses? (sense stability)
Cross-linguistic evidence

Fine 40129:
"Mountains on the other side of the valley rose from the mist like islands, and here and there flecks of cloud, as pale and <tag>fine</tag> as sea-spray, trailed across their sombre, wooded slopes."
TRANSLATION: * *
Sense proximity (Resnik & Yarowsky)

P_L(same lexicalization | w_i, w_j) =
    (1 / (|w_i| · |w_j|)) · Σ_{x ∈ examples(w_i)} Σ_{y ∈ examples(w_j)} [tr_L(x) = tr_L(y)]

Proximity(w_i, w_j) = (1 / |languages|) · Σ_{L ∈ languages} P_L(same lexicalization | w_i, w_j)

where |w_i| is the number of examples of sense w_i, tr_L(x) is the translation of example x into language L, and [·] is 1 when the translations coincide and 0 otherwise.
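The two formulas above can be computed directly. A minimal sketch, assuming each sense's examples are given as a list of their translations per language; the function names, data layout, and the toy translations are my own illustration, not the authors' code.

```python
# Sketch of the Resnik & Yarowsky sense-proximity measure,
# following the formulas above (data layout is an assumption).
def p_same_lexicalization(trans_i, trans_j):
    """P_L(same lexicalization | w_i, w_j) for one language:
    the fraction of example pairs translated by the same word."""
    matches = sum(1 for x in trans_i for y in trans_j if x == y)
    return matches / (len(trans_i) * len(trans_j))

def proximity(per_language):
    """Average P_L over languages.

    per_language: {language: (translations of w_i's examples,
                              translations of w_j's examples)}
    """
    probs = [p_same_lexicalization(ti, tj)
             for ti, tj in per_language.values()]
    return sum(probs) / len(probs)

# Toy data for two senses of "fine" in two languages.
data = {
    "spanish": (["fino", "fino"], ["fino", "delgado"]),  # P = 2/4
    "french":  (["fin"], ["fin"]),                       # P = 1/1
}
print(round(proximity(data), 2))  # 0.75
```

Two senses always lexicalized by the same word in every language get proximity 1; senses never sharing a translation get 0.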
Experiment design

Main set:
– 44 Senseval-2 words (nouns and adjectives): 182 senses, 508 examples
– 11 native/bilingual speakers of 4 languages: Bulgarian, Russian, Spanish, Urdu
(Control set: 12 languages, 5 families, 28 subjects)
Results: distribution of proximity indexes

[figure: histogram of proximity indexes]

Average proximity = 0.29: same as Hector in Senseval-1!
Distribution of homonyms

[figure: proximity distribution for homonyms]
Distribution of metaphors

[figure: proximity distribution for metaphors]

Distribution of metonymy

[figure: proximity distribution for metonymy]
Average proximity: target-in-source 0.64, source-in-target 0.37.
Annotation of 1000 WN nouns

Relation         % sense pairs
Homonymy              41.2
Metonymy              32.5
Metaphor              13.0
Specialization         7.7
Generalization         1.7
Equivalence            3.1   ← need for a cluster here!
Fuzzy                  0.8
Typology of sense relations

– Homonymy
– Metonymy
– Metaphor
– Specialization
– Generalization
– Equivalence
– Fuzzy
Typology of sense relations: metonymy

Target in source: animal–meat, animal–fur, tree–wood, object–color, plant–fruit, people–language, action–duration, recipient–quantity, ...
Source in target: action–object, action–result, shape–object, plant–food/beverage, material–product, ...
Co-metonymy: substance–agent
Typology of sense relations: metaphors

Metaphor: action/state/entity (source domain) → action/state/entity (target domain)

Metaphor (182 cases):
– object → object / person (47)
– person → person (21)
– physical action → abstract action (16)
– physical property → abstract property (11)
– animal → person (10)
– ...
Typology of sense relations: metaphors

Source: historical, mythological, or biblical character...; profession, occupation, position...
Target: prototype person
E.g. Adonis (Greek mythology / handsome man)
Conclusions

Let's annotate semantic relations between WN word senses!