   Artificial Intelligence
Communication by natural language
• Fall 2008
• Professor: Luigi Ceccaroni
            Communication
• Communication is the intentional
  exchange of information
  – brought about by the production and perception
    of signs drawn from a shared system of
    conventional signs.
• What sets humans apart from other animals
  and machines is the complex system of
  structured messages known as natural
  language.
  – It enables us to communicate most of what we
    know about the world.
  Natural language processing
• In contrast with formal languages, natural
  languages, such as Spanish, French and
  English, have no strict definition.
  – They are used by a community of speakers.
• Natural language processing (NLP) treats
  natural languages as if they were formal
  languages
  – to build computational systems able to
    understand and generate human language in
    all its forms.
   Understanding speech acts
• The action of producing language is called
  a speech act.
• The problem of understanding speech
  acts is much like other understanding
  problems
  – such as understanding images or diagnosing
    illnesses.
• We are given a set of ambiguous inputs,
  – from them we have to work backwards to
    decide what state of the world could have
    created these inputs.
   Fundamentals of language
• A formal language is defined as a
  (possibly infinite) set of strings.
• Each string is a concatenation of terminal
  symbols, sometimes called words.
• Formal languages such as first-order logic
  and Java have strict mathematical
  definitions.
• A grammar is a finite set of rules that
  specifies a language.
   Fundamentals of language
• Formal languages always have an official
  grammar, specified in some document.
• Natural languages have no official
  grammar.
  – Linguists strive to discover properties of the
    language and then to codify their discoveries
    in a grammar.
  – To date, no linguist has succeeded
    completely.
   Fundamentals of language
• Linguists attempt to define a language as
  it is.
• Prescriptive grammarians try to dictate
  how a language should be.
• They create rules which are sometimes
  printed in style guides, but have little
  relevance to actual language usage.


    Fundamentals of language
• Both formal and natural languages associate
  a meaning, or semantics, with each valid string.
• In natural languages, it is also important to
  understand the pragmatics of a string:
  – the actual meaning of the string as it is spoken in
    a given situation:
     • There are very different ways to say “please”.
• The meaning is not just in the words
  themselves, but in the interpretation of the
  words in situ.
   Fundamentals of language
• Most grammar rule formalisms are based
  on the idea of phrase structure:
  – Strings are composed of substrings called
    phrases, which come in different categories.
• Examples of the category noun phrase,
  or NP:
  – “the king”
  – “the agent in the corner”

   Fundamentals of language
1. Phrases usually correspond to natural
  semantic elements
  – from which the meaning of an utterance can
    be constructed; for example:
    • Noun phrases refer to objects in the world.
2. Categorizing phrases helps us to describe
  the allowable strings of the language.
  – Any of the noun phrases can combine with a
    verb phrase (or VP) such as “is dead” to form
    a phrase of category sentence (or S).
    Fundamentals of language
• Without the intermediate notions of NP and
  VP, it would be difficult to explain why “the
  king is dead” is a sentence whereas “king the
  dead is” is not.
• Category names such as NP, VP and S are
  called nonterminal symbols.
• Grammars define nonterminals using rewrite
  rules:
S → NP VP
An S may consist of any NP followed by any VP.
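
The rewrite rule above can be sketched in a few lines of Python. This is only a toy illustration: the grammar dictionary, the `expand` helper and the two noun phrases (taken from the earlier NP examples) are assumptions made for this sketch, and the approach works only because the toy grammar is non-recursive, so its language is finite.

```python
from itertools import product

# Toy grammar (an assumption for illustration): each nonterminal maps to a
# list of alternative right-hand sides. Any symbol that is not a key of the
# dictionary is treated as a terminal symbol (a word).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "king"], ["the", "agent", "in", "the", "corner"]],
    "VP": [["is", "dead"]],
}


def expand(symbol):
    """Return every word sequence (as a tuple) derivable from `symbol`."""
    if symbol not in GRAMMAR:  # terminal symbol: a single word
        return [(symbol,)]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each symbol of the right-hand side, then concatenate
        # one expansion per symbol, preserving the order of the rule.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results


# The language defined by the grammar: the set of strings derivable from S.
LANGUAGE = set(expand("S"))


def is_sentence(text):
    """True if `text` is one of the strings the toy grammar generates."""
    return tuple(text.split()) in LANGUAGE


print(is_sentence("the king is dead"))  # True
print(is_sentence("king the dead is"))  # False
```

Because S is defined only through NP and VP, the word order "king the dead is" cannot be derived, which is the point made about intermediate categories.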
      Levels of analysis in NLP
• Lexico-morphological
   • Detecting lexical units and their morphological
     information
• Syntactic
   • Checking if a sentence is syntactically valid
• Semantic
   • Extracting global meaning from individual
     meanings and from relations
• Pragmatic
   • Relating a sentence to the line of discussion
• Illocutionary
   • Relating a sentence to intentions
   Problems in NLP: examples
• Lexical ambiguity
  • “reinventing the front wheel”
     • “wheel” can be a noun or a verb (part-of-speech
       tagging or POS-tagging)
  • “she saw the bank”
     • Building of a financial institution? Sloping land?
       Supply held in reserve for future use? (word sense
       disambiguation or WSD)
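
A minimal sketch of why lexical ambiguity multiplies the work of a tagger. The mini-lexicon below is hypothetical (in particular, listing "front" as both noun and adjective is an assumption for this example); a real POS tagger would use a much larger lexicon and choose among the candidate sequences using context.

```python
from itertools import product

# Hypothetical mini-lexicon: each word maps to the parts of speech it may take.
LEXICON = {
    "reinventing": ["VERB"],
    "the": ["DET"],
    "front": ["ADJ", "NOUN"],
    "wheel": ["NOUN", "VERB"],
}


def ambiguous_words(sentence):
    """Words of the sentence that have more than one possible tag."""
    return [w for w in sentence.split() if len(LEXICON[w]) > 1]


def tag_sequences(sentence):
    """Every combination of tags the lexicon allows for the sentence."""
    words = sentence.split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]


phrase = "reinventing the front wheel"
print(ambiguous_words(phrase))     # ['front', 'wheel']
print(len(tag_sequences(phrase)))  # 4
```

The number of candidate taggings is the product of the per-word options, so it grows quickly with sentence length; POS-tagging is the task of picking the right one.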
   Problems in NLP: examples
• Syntactic ambiguity
  • “He saw a man on the mountain top with
    binoculars”
     • Who’s got the binoculars?
  • “The seller of newspapers of the
    neighborhood”
     • What is the prepositional phrase attached to?
       (prepositional-phrase attachment or PP-attachment)
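
PP-attachment ambiguity can be made concrete by counting parse trees. The sketch below uses a CKY-style chart over a small hand-written grammar; the grammar, the lexicon and the simplified sentences ("mountain top" is shortened to "mountain") are assumptions made for this illustration, not a realistic grammar of English.

```python
from collections import defaultdict

# Toy grammar in binary form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> categories.
LEXICAL = {
    "he": ["NP"], "saw": ["V"], "a": ["Det"], "the": ["Det"],
    "man": ["N"], "mountain": ["N"], "on": ["P"], "with": ["P"],
    "binoculars": ["NP"],
}


def count_parses(sentence, root="S"):
    """Count the distinct parse trees of `sentence` rooted in `root`."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][category] = number of trees of that category over words i..j-1
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICAL[word]:
            chart[(i, i + 1)][category] += 1
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    n_left = chart[(start, mid)].get(left, 0)
                    n_right = chart[(mid, end)].get(right, 0)
                    if n_left and n_right:
                        chart[(start, end)][parent] += n_left * n_right
    return chart[(0, n)].get(root, 0)


print(count_parses("he saw a man with binoculars"))                  # 2
print(count_parses("he saw a man on the mountain with binoculars"))  # 5
```

With one prepositional phrase there are two readings (the man has the binoculars, or the seeing was done with them); adding a second prepositional phrase already gives five.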
   Problems in NLP: examples
• Semantic ambiguity
  • “He gave the children a cake”
    • A cake in total or one to each child? (scope of the
      quantification)
  • “Colorless green ideas sleep furiously”
    • Sentence composed by Noam Chomsky in 1957
      as an example of a sentence that is grammatically
      correct but semantically nonsensical.
    • It was used to show the inadequacy of the
      then-popular probabilistic models of grammar, and
      the need for more structured models.
   Problems in NLP: examples
• References, ellipsis, pragmatics
  • “She gave him a book”
  • "We gave the monkeys the bananas because
    they were hungry"
  • "We gave the monkeys the bananas because
    they were over-ripe"
     • Same surface grammatical structure. However, the
       pronoun they refers to monkeys in one sentence
       and bananas in the other, and it is impossible to tell
       which without a knowledge of the properties of
       monkeys and bananas.
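
The role of world knowledge in the monkeys-and-bananas example can be sketched as a lookup. The `CAN_BE` table and the `resolve_they` function are hypothetical names invented for this illustration; real reference resolution needs far richer knowledge than a table of properties.

```python
# Hypothetical world knowledge: which properties each kind of thing can have.
CAN_BE = {
    "monkeys": {"hungry"},
    "bananas": {"over-ripe"},
}


def resolve_they(candidates, predicate):
    """Return the candidate antecedents that can plausibly be `predicate`."""
    return [c for c in candidates if predicate in CAN_BE.get(c, set())]


antecedents = ["monkeys", "bananas"]
print(resolve_they(antecedents, "hungry"))     # ['monkeys']
print(resolve_they(antecedents, "over-ripe"))  # ['bananas']
```

The two sentences have the same structure, so nothing in the syntax separates them; only the knowledge table does.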
    Problems in NLP: examples
• Illocution (Where is the stress? What are the intentions?)
   • "I never said she stole my money" (stress on "I") - Someone else
     said it, but I didn't.
   • "I never said she stole my money" (stress on "never") - I simply
     didn't ever say it.
   • "I never said she stole my money" (stress on "said") - I might have
     implied it in some way, but I never explicitly said it.
   • "I never said she stole my money" (stress on "she") - I said someone
     took it; I didn't say it was she.
   • "I never said she stole my money" (stress on "stole") - I just said
     she probably borrowed it.
   • "I never said she stole my money" (stress on "my") - I said she stole
     someone else's money.
   • "I never said she stole my money" (stress on "money") - I said she
     stole something, but not my money.
     Statistical natural-language
              processing
• Statistical NLP uses stochastic, probabilistic and statistical
  methods to resolve some of the difficulties discussed above,
  especially those that arise because longer
  sentences are highly ambiguous when processed
  with realistic grammars, yielding thousands or millions
  of possible analyses.
• Methods for disambiguation often involve the use of
  corpora and Markov models.
• Statistical NLP comprises all quantitative approaches
  to automated language processing, including
  probabilistic modeling and information theory.
• The technology for statistical NLP comes mainly from
  machine learning and data mining, both of which are
  fields of artificial intelligence that involve learning from
  data.
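
As a rough illustration of a corpus-based Markov model, the sketch below estimates bigram probabilities from three made-up sentences and uses them to score word sequences. The corpus, the function names and the choice of add-one smoothing are assumptions for this example; real systems are trained on large corpora and use more careful smoothing.

```python
from collections import Counter

# Tiny made-up corpus (an assumption for illustration only).
CORPUS = [
    "the king is dead",
    "the agent is in the corner",
    "the king is in the corner",
]


def train_bigrams(sentences):
    """Count contexts and bigrams, with <s> and </s> as sentence boundaries."""
    contexts, bigrams, vocabulary = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
        vocabulary.update(tokens[1:])           # every token that can be predicted
    return contexts, bigrams, len(vocabulary)


def sentence_probability(sentence, contexts, bigrams, vocab_size):
    """First-order Markov probability of the sentence, add-one smoothed."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    probability = 1.0
    for previous, current in zip(tokens, tokens[1:]):
        probability *= (bigrams[(previous, current)] + 1) / (
            contexts[previous] + vocab_size
        )
    return probability


contexts, bigrams, vocab_size = train_bigrams(CORPUS)
good = sentence_probability("the king is dead", contexts, bigrams, vocab_size)
bad = sentence_probability("king the dead is", contexts, bigrams, vocab_size)
print(good > bad)  # True
```

The model prefers the word order it has seen in the corpus, which is the basic mechanism a statistical system can use to rank competing analyses.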
    Major tasks and applications in
                 NLP
•   Automatic summarization
•   Foreign language reading aid
•   Foreign language writing aid
•   Information extraction
•   Information retrieval (IR)
    • IR is concerned with storing, searching and
      retrieving information.
    • It is a separate field within computer science
      (closer to databases), but IR relies on some NLP
      methods (for example, stemming).
    • Some current research and applications seek to
      bridge the gap between IR and NLP.
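
Stemming, mentioned above as an NLP method used in IR, can be sketched with a naive suffix stripper. This is not the Porter algorithm or any standard stemmer; the suffix list, the minimum stem length of three characters and the function names are assumptions chosen to keep the example short.

```python
# Suffixes are tried in this order; the first one that matches is removed.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, document):
    """True if every query word shares a stem with some document word."""
    document_stems = {naive_stem(w) for w in document.split()}
    return all(naive_stem(w) in document_stems for w in query.split())


print(naive_stem("retrieving"), naive_stem("retrieved"))  # retriev retriev
print(matches("searching", "he searches the archive"))    # True
```

Mapping "searching" and "searches" to the same stem lets a query match documents that use a different inflection of the same word.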
 Major tasks and applications in
              NLP
• Machine translation
  • Automatically translating from one human
    language to another.
• Named entity recognition (NER)
  • Given a stream of text, determining which
    items in the text map to proper names, such
    as people or places.
  • Although named entities in English are
    usually capitalized, many other
    languages do not use capitalization to
    distinguish named entities.
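
The capitalization cue, and its limits, can be shown with a deliberately naive heuristic. The function name and the example sentences are assumptions for this sketch; it is not how practical NER systems work, and it ignores the sentence-initial word because that word is capitalized regardless.

```python
def capitalized_spans(sentence):
    """Group consecutive capitalized words (except the first word) into spans."""
    spans, current = [], []
    for index, token in enumerate(sentence.split()):
        word = token.strip(".,;:!?\"'")
        if index > 0 and word[:1].isupper():
            current.append(word)
        elif current:
            # A non-capitalized word closes the span collected so far.
            spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


english = "Yesterday Luigi Ceccaroni visited Carnegie Mellon University in Pittsburgh."
print(capitalized_spans(english))
# ['Luigi Ceccaroni', 'Carnegie Mellon University', 'Pittsburgh']

# German capitalizes all nouns, so the same cue produces false positives.
german = "Der Hund sah die Katze im Garten"
print(capitalized_spans(german))
# ['Hund', 'Katze', 'Garten']
```

The heuristic does reasonably on the English sentence but marks ordinary nouns as entities in the German one, which is why capitalization alone is not a reliable signal across languages.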
    Major tasks and applications in
                 NLP
•   Natural language generation
•   Natural language understanding
•   Optical character recognition (OCR)
•   Question answering
    • Given a human-language question, the task of
      producing a human-language answer.
    • The question may be closed-ended (such as
      "What is the capital of Canada?") or
      open-ended (such as "What is the meaning of
      life?").
    Major tasks and applications in
                 NLP
• Speech recognition
    • Given a sound clip of a person or people
      speaking, the task of producing a text
      transcription of what was said.
    • (The inverse of text-to-speech.)
•   Spoken dialogue system
•   Text simplification
•   Text-to-speech
•   Text-proofing
              Resources
• Natural language processing (in Spanish)
  [http://es.geocities.com/lenguajenatural/]
• Introductory book
  [http://www.gelbukh.com/clbook/]
• Resources for text, speech and language
  processing
  [http://www.cs.technion.ac.il/~gabr/resources/resources.html]
• Natural language processing blog
  [http://nlpers.blogspot.com/]
               Resources
• About Opinion, Language, and Blogs
  [http://opinlab.wordpress.com/]
• A comprehensive list of resources,
  classified by category
  [http://www.proxem.com/]
• ACL Wiki for natural language processing
  and computational linguistics
  [http://aclweb.org/aclwiki/index.php?title=Main_Page]
   Research and development
            groups
• IBM NLP Research Area
  [http://domino.watson.ibm.com/comm/research.nsf/pages/r.nlp.html]
• Microsoft Research: NLP
  [http://research.microsoft.com/nlp/]
• Language Technologies Institute at Carnegie Mellon
  University [http://www.lti.cs.cmu.edu/]
• Natural Language Group at the Information Sciences
  Institute [http://www.isi.edu/natural-language/]
• Natural Language Generation Group at the Open
  University [http://mcs.open.ac.uk/nlg/]
   Research and development
            groups
• Survey of the State of the Art in Human Language
  Technology [http://cslu.cse.ogi.edu/HLTsurvey/]
• University of Edinburgh Natural Language Processing
  Group [http://www.iccs.informatics.ed.ac.uk/]
• Natural Language and Information Processing Group at
  the University of Cambridge
  [http://www.cl.cam.ac.uk/research/nl/]
• Stanford Natural Language Processing Group
  [http://nlp.stanford.edu/]
• UPC center for research and technology development
  on language and speech processing (TALP)
  [http://www.talp.cat/talp/]

				