NLP - PowerPoint by liwenting


            Introduction to Natural Language Processing




Intro to Natural Language Processing   1
               What is Natural Language
                     Processing?
• The study of human languages and how they can be
  represented computationally and analyzed and
  generated algorithmically
   – The cat is on the mat. --> on (mat, cat)
   – on (mat, cat) --> The cat is on the mat

• Studying NLP involves studying natural language,
  formal representations, and algorithms for their
  manipulation
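
The two mappings above (sentence to representation, and back) can be sketched for this one sentence shape. This is only an illustration: the pattern and the function names analyze and generate are assumptions made here, not part of the course material, and real analysis needs far more than a regular expression.

```python
import re

# Hypothetical pattern: covers only sentences shaped like "The X is REL the Y."
PATTERN = re.compile(r"^The (\w+) is (\w+) the (\w+)\.$")

def analyze(sentence):
    """Map 'The cat is on the mat.' to the tuple ('on', 'mat', 'cat')."""
    match = PATTERN.match(sentence)
    if match is None:
        raise ValueError(f"unsupported sentence: {sentence!r}")
    figure, relation, ground = match.groups()
    # Argument order follows the slide: on(mat, cat)
    return (relation, ground, figure)

def generate(predicate):
    """Map ('on', 'mat', 'cat') back to 'The cat is on the mat.'"""
    relation, ground, figure = predicate
    return f"The {figure} is {relation} the {ground}."

print(analyze("The cat is on the mat."))
print(generate(("on", "mat", "cat")))
```

Anything outside the single pattern raises an error, which is exactly the coverage problem the later slides discuss.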



                            NLP: Grand Challenges
The Ultimate Goal – For computers to use NL as effectively as
  humans do….

Reading and writing text
• Abstracting
• Monitoring
• Extraction into Databases

Interactive Dialogue: Natural, effective access to computer systems
• Informal Speech Input and Output

Translation: Input and Output in Multiple Languages


    CIS 391 - Intro to AI                                         3
               What is Natural Language
                     Processing?

Building computational models of natural language
  comprehension and production

Other Names:
• Computational Linguistics (CL)
• Human Language Technology (HLT)
• Natural Language Engineering (NLE)
• Speech and Text Processing


               Engineering Perspective
 Use CL as part of a larger application:
       – Spoken dialogue systems for telephone based information
         systems
       – Components of web search engines or document retrieval
         services
             • Machine translation
             • Question/answering systems
             • Text Summarization
       – Interface for intelligent tutoring/training systems
 Emphasis on
       – Robustness (doesn't collapse on unexpected input)
       – Coverage (does something useful with most inputs)
       – Efficiency (speech; large document collections)
     Cognitive Science Perspective

Goal: gain an understanding of how people
  comprehend and produce language.

Goal: a model that explains actual human behaviour


Solution must:
   explain psycholinguistic data
   be verified by experimentation


   Theoretical Linguistics Perspective
• In principle, coincides with the Cognitive Science
  Perspective
• CL can potentially help test the empirical adequacy of
  theoretical models.
• Linguistics is typically a descriptive enterprise.
• Building computational models of the theories allows
  them to be empirically tested. E.g., does your
  grammar correctly parse all the grammatical
  examples in a given test suite, while rejecting all the
  ungrammatical examples?
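
A test suite of this kind can be run mechanically once the grammar is written down. The sketch below assumes a toy grammar in Chomsky normal form and a four-sentence suite, both invented here for illustration, and uses a standard CYK recognizer to check that grammatical examples are accepted and ungrammatical ones rejected.

```python
from itertools import product

# Toy grammar in Chomsky normal form (an assumption for illustration only).
BINARY_RULES = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
LEXICON = {"the": {"Det"}, "cat": {"N"}, "dog": {"N"}, "chased": {"V"}, "saw": {"V"}}

def recognizes(words):
    """CYK recognition: True if the grammar derives the word sequence as an S."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that span words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return "S" in table[0][n]

# (sentence, should the grammar accept it?)
TEST_SUITE = [
    ("the cat chased the dog", True),
    ("the dog saw the cat", True),
    ("cat the chased dog the", False),
    ("the cat the dog", False),
]

def run_suite():
    """Return (sentence, passed) pairs for every item in the suite."""
    return [(s, recognizes(s.split()) == expected) for s, expected in TEST_SUITE]

for sentence, passed in run_suite():
    print("PASS" if passed else "FAIL", sentence)
```

A failing line points at either a gap in the grammar (a grammatical example rejected) or overgeneration (an ungrammatical example accepted).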


   Language as Goal-Oriented Behaviour

• We speak for a reason, e.g.,
     – get hearer to believe something
     – get hearer to perform some action
     – impress hearer
• Language generators must determine how to use
  linguistic strategies to achieve desired effects
• Language understanders must use linguistic
  knowledge to recognise speaker's underlying
  purpose



                                       Examples
(1) It's hot in here, isn't it?

(2) Can you book me a flight to London
    tomorrow morning?

(3) P: What time does the train for Washington,
       DC leave?
    C: 6:00 from Track 17.

  Knowledge needed to understand
      and produce language
• Phonetics and phonology: how words are related to sounds
  that realize them
• Morphology: how words are constructed from more basic
  meaning units
• Syntax: how words can be put together to form correct
  utterances
• Lexical semantics: what words mean
• Compositional semantics: how word meanings combine to
  form larger meanings
• Pragmatics: how situation affects interpretation of utterance
• Discourse structure: how preceding utterances affect
  processing of next utterance

              What can we learn about
                    language?
• Phonetics and Phonology: speech sounds, their
  production, and the rule systems that govern their
  use
      –   tap, butter
      –   nice white rice; height/hot; kite/cot; night/not...
      –   city hall, parking lot, city hall parking lot
      –   The cat is on the mat. The cat is on the mat?




                                Morphology
• How words are constructed from more basic units,
  called morphemes


                     friend + ly = friendly

           friend is a noun; the suffix -ly turns a noun into an
           adjective (and an adjective into an adverb, e.g.
           quick + ly = quickly)

• Morphology: words and their composition
      – cat, cats, dogs
      – child, children
      – undo, union
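
The examples above show why affix stripping needs a lexicon: "children" is an irregular plural, and "union" must not be split the way "undo" is. A minimal sketch, assuming a tiny hand-written lexicon (STEMS and IRREGULAR_PLURALS are invented for this illustration):

```python
# Toy lexicon, assumed for illustration only.
IRREGULAR_PLURALS = {"children": "child"}
STEMS = {"cat", "dog", "child", "do", "union"}

def analyze(word):
    """Return the morphemes of `word` according to the toy lexicon."""
    if word in IRREGULAR_PLURALS:
        return [IRREGULAR_PLURALS[word], "+PL"]
    if word in STEMS:
        return [word]                      # known stem: no further splitting
    if word.endswith("s") and word[:-1] in STEMS:
        return [word[:-1], "+PL"]          # regular plural
    if word.startswith("un") and word[2:] in STEMS:
        return ["un+", word[2:]]           # prefix un- on a known stem
    return [word]                          # unknown word: left unanalyzed

for w in ["cat", "cats", "dogs", "children", "undo", "union"]:
    print(w, analyze(w))
```

The stem check comes before the prefix rule, so "union" is returned whole while "undo" is split into un + do.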




                   Syntactic Knowledge
• how words can be put together to form legal
  sentences in the language
• what structural role each word plays in the sentence
• what phrases are subparts of other phrases

    The white book by Jurafsky and Martin is fascinating.

       noun phrase:   The white book by Jurafsky and Martin
       modifier:      white
       modifier:      by Jurafsky and Martin (prepositional phrase)

• Syntax: the structuring of words into larger phrases
     –   John hit Bill
     –   Bill was hit by John (passive)
     –   Bill, John hit (preposing)
     –   Who John hit was Bill (wh-cleft)




                   Semantic Knowledge
• What words mean
• How word meanings combine in sentences to form
  sentence meanings
              The sole died.                  (selectional restrictions)

       shoe part                       fish

Syntax and semantics work together!
                          (1) What does it taste like?
                          (2) What taste does it like?

N.B. Context-independent meaning
• Semantics: the (truth-functional) meaning of words
  and phrases
     –   gun(x) & holster(y) & in(x,y)
     –   fake (gun (x)) (compositional semantics)
     –   The king of France is bald (presupposition violation)
     –   bass fishing, bass playing (word sense disambiguation)




• Pragmatics and Discourse: the meaning of words and
  phrases in context
    –   George got married and had a baby.
    –   George had a baby and got married.
    –   Some people left early.
    –   Prosodic Variation
          •   German teachers
          •   Bill doesn't drink because he's unhappy.
          •   John only introduced Mary to Sue.
          •   John called Bill a Republican and then he insulted him.
          •   John likes his mother, and so does Bill.




                  Pragmatic Knowledge
• What utterances mean in different contexts


     Jon was hot and desperate for a dunk in the river.
     Jon suddenly realised he didn’t have any cash.
     He rushed to the bank.
         financial institution         river bank
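
One simple way to pick between the two senses is to count how many context words overlap with cue words for each sense, in the spirit of Lesk-style disambiguation. The cue sets below are assumptions made for this sketch, not a real sense inventory.

```python
# Hypothetical cue words for each sense of "bank" (illustration only).
SENSES = {
    "financial institution": {"cash", "money", "account", "loan", "withdraw"},
    "river bank": {"river", "water", "swim", "dunk", "shore"},
}

def tokenize(text):
    """Lower-case the text and strip simple punctuation from each word."""
    return {word.strip(".,!?").lower() for word in text.split()}

def disambiguate(context):
    """Return the sense whose cue words overlap most with the context, plus all scores."""
    words = tokenize(context)
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    return max(scores, key=scores.get), scores

hot = "Jon was hot and desperate for a dunk in the river. He rushed to the bank."
broke = "Jon suddenly realised he didn't have any cash. He rushed to the bank."
print(disambiguate(hot)[0])
print(disambiguate(broke)[0])
```

With no overlapping cue words the function falls back to the first sense listed, which is one reason a real system needs richer context and world knowledge.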



                     Discourse Structure
Much meaning comes from simple conventions that we
 generally follow in discourse
• How we refer to entities
     – Indefinite NPs used to introduce new items into the
       discourse
                         A woman walked into the cafe.
     – Definite NPs can be used for subsequent references to an item
                        The woman sat by the window.
     – Pronouns used to refer to items already known in discourse
                         She ordered a cappuccino.


                     Discourse Relations
• Relationships we infer between discourse entities
• Not expressed in either of the propositions, but from
  their juxtaposition

    1. (a) I’m hungry.
       (b) Let’s go to the Fuji Gardens.

    2. (a) Bush supports big business.
       (b) He’ll vote no on House Bill 1711.


Discourse and Temporal Interpretation


                Max fell. John pushed him.

                                 explanation

        Syntax and semantics: “him” refers to Max
        Lexical semantics and discourse: the pushing
        occurred before the falling.


 Discourse and Temporal Interpretation

         John and Max were struggling at
         the edge of the cliff.
         Max fell. John pushed him.

         Here discourse knowledge tells us the
         pushing event occurred after the falling event




                        World knowledge
• What we know about the world and what we can
  assume our hearer knows about the world is
  intimately tied to our ability to use language



            I took the cake from the plate and ate it.




                                   Ambiguity
                                       I made her duck.

• The categories of knowledge of language can be
  thought of as ambiguity-resolving components
• How many different interpretations does the above
  sentence have?
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more
  ambiguous?



                       Basic Process of NLU

 Spoken input (for speech understanding)
      |
 PHONOLOGICAL / MORPHOLOGICAL ANALYSER   uses: phonological & morphological rules
      |
 Sequence of words:  “He loves Mary.”
      |
 SYNTACTIC COMPONENT                     uses: grammatical knowledge
      |
 Syntactic structure (parse tree), indicating relns (e.g., mod) between words:
      [He] [loves Mary]
      |
 SEMANTIC INTERPRETER                    uses: semantic rules, lexical semantics
                                               (thematic roles, selectional restrictions)
      |
 Logical form:  x loves(x, Mary)
      |
 CONTEXTUAL REASONER                     uses: pragmatic & world knowledge
      |
 Meaning representation:  loves(John, Mary)
                       It's not that simple
• Syntax affects meaning
       1. (a) Flying planes is dangerous.
          (b) Flying planes are dangerous.


• Meaning and world knowledge affect syntax
       2.  (a) Flying insects is dangerous.
           (b) Flying insects are dangerous.

       3. (a) I saw the Grand Canyon flying to LA.
          (b) I saw a condor flying to LA.


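 Input side:      Words (Input) -> Parsing -> Syntactic Structure and Logical Form
                  -> Contextual Interpretation -> Final Meaning

 Centre:          Application Reasoning

 Response side:   Meaning of Response -> Utterance Planning -> Syntactic Structure
                  and Logical Form of Response -> Realisation -> Words (Response)

 Shared knowledge sources:
                  Lexicon and Grammar (parsing, realisation)
                  Discourse Context (contextual interpretation, utterance planning)
                  Application Context (application reasoning)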
                          Can machines think?
• Alan Turing: the Turing test (language as test for
  intelligence)
• Three participants: a computer and two humans (one is
  an interrogator)
• Interrogator's goal: to tell the machine and human apart
• Machine's goal: to fool the interrogator into believing
  that a person is responding
• Other human's goal: to help the interrogator reach his
  goal



                                       Examples
Q: Please write me a sonnet on the topic of the Forth
   Bridge.
A: Count me out on this one. I never could write
   poetry.



Q: Add 34957 to 70764.
A: 105621 (after a pause)



  Example (from a famous movie)


        Dave Bowman: Open the pod bay doors, HAL.
        HAL: I'm sorry Dave, I'm afraid I can't do that.




                    Deconstructing HAL
   •   Recognizes speech and understands language
   •   Decides how to respond and speaks reply
   •   With personality
   •   Recognizes the user's goals, adopts them, and
       helps to achieve them
   •   Remembers the conversational history
   •   Customizes interaction to different individuals
   •   Learns from experience
   •   Possesses vast knowledge, and is autonomous


  The state of the art and the near-
             term future
• World-Wide Web (WWW)
• Sample scenarios:
     –   generate weather reports in two languages
     –   provide tools to help people with SSI to communicate
     –   translate Web pages into different languages
     –   speak to your appliances
     –   find restaurants
     –   answer questions
     –   grade essays (?)
     –   closed-captioning in many languages
     –   automatic description of a soccer game

                        NLP Applications
• Speech Synthesis, Speech Recognition, IVR
  Systems (TOOT: more or less succeeds)
• Information Retrieval (SCANMail demo)
• Information Extraction
      – Question Answering (AQUA)
• Machine Translation (SYSTRAN)
• Summarization (NewsBlaster)
• Automated Psychotherapy (Eliza)




                                  Web demos
 • Dialogue
       – ELIZA http://www.peccavi.com/eliza/
       – DiaLeague 2001 http://www.csl.sony.co.jp/SLL/dialeague/
 • Machine Translation (Systran & Altavista)
       – Systran http://w3.systranlinks.com/systran/cgi
       – Babel Fish http://babelfish.altavista.com/translate.dyn
 • Question-answering
       – Ask Jeeves             http://www.ask.co.uk
 • Summarization (IBM)
       – http://www4.ibm.com/software/data/iminer/fortext/summarize/
         summarizeDemo.html
 • Speech synthesis (CSTR at Edinburgh)
       – Festival http://festvox.org/voicedemos.html

           The Past: Human Machine Interfaces
• SHRDLU (Winograd, 1969)
  – A fragile demonstration of the fundamental vision
• PUNDIT (Hirschman, Palmer, … 1989)
  – Information Extraction from Real World texts




    SHRDLU: A demonstration
                        proof
Person: PICK UP A BIG RED BLOCK.
Computer: OK. (does it)
Person: GRASP THE PYRAMID.
Computer: I DON'T UNDERSTAND
  WHICH PYRAMID YOU MEAN.
Person: FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU
  ARE HOLDING AND PUT IT INTO THE BOX.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS
  TALLER THAN THE ONE I AM HOLDING.
Computer: OK. (does it)
Person: WHAT DOES THE BOX CONTAIN?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: WHAT IS THE PYRAMID SUPPORTED BY?
Computer: THE BOX.

                  The Past: Information Extraction
 PUNDIT (Palmer et al. 1987)
        Sample CASREP & automatic summary
 FAILURE OF ONE OF TWO SACS. UNIT HAD LOW OUTPUT AIR
    PRESSURE. RESULTED IN SLOW GAS TURBINE START.
    TROUBLESHOOTING REVEALED NORMAL SAC LUBE OIL
    PRESSURE AND TEMPERATURE. EROSION OF IMPELLOR BLADE
    TIP EVIDENT. CAUSE OF EROSION OF IMPELLOR BLADE
    UNDETERMINED. NEW SAC RECEIVED.

Status of Sac:           Part: sac                    State: inoperative
Finding:                 Part: air pressure           State: low
Finding:                 Part: lube oil pressure      State: normal
Finding:                 Part: lube oil temperature   State: normal
Damage:                  Part: blade tip              State: eroded
Finding:                 Agent: ship’s force          State: has new sac
         The Past: Crucial flaws in the paradigm
These systems worked well, BUT

1. Usually, only for a small set of examples
2. Person-years of work to port to new applications and,
   often, to extend coverage on a single application
3. Very limited and inconsistent coverage of English




         Interactive systems often worked well…

…because of a magical fact:
  People automatically adapt and limit their
  language given a small set of exemplars if the
  underlying linguistic generalizations are
  HABITABLE

This won't handle non-interactive language




             Statistical NLP approach
• A Statistical NLP approach seeks to solve
  these problems by automatically learning
  lexical and structural preferences from
  corpora. In particular, Statistical NLP
  recognizes that there is a lot of information in
  the relationships between words.
• The use of statistics offers a good solution to
  the ambiguity problem: statistical models are
  robust, generalize well, and behave gracefully
  in the presence of errors and new data.


                      The State of NLP
NLP Past:
• Rich Representations

NLP Present:
• Powerful Statistical Disambiguation




     An Early Robust Statistical NLP Application

•A Statistical Model For Etymology (Church ’85)
•Etymology is the study of the history of words and how their
form and meaning have changed over time.

                Italian           English
             AldriGHetti       lauGH, siGH
              IannuCCi              aCCept
               ItaliAno              hAte
   •Determining etymology is crucial for text-to-speech



     An Early Robust Statistical NLP Application
                     Angeletti      100%    Italian
                     Iannucci       100%    Italian
                      Italiano      100%    Italian
                    Lombardino      58%     Italian
                     Asahara        100%   Japanese
                     Fujimaki       100%   Japanese
                      Umeda         96%    Japanese
                  Anagnostopoulos   100%    Greek
                    Demetriadis     100%    Greek

                     Dukakis        99%    Russian
                      Annette       75%    French
                     Deneuve        54%    French
                    Baguenard       54%    Middle
                                           French
•Etymology can be determined reasonably accurately from
statistics computed over letter sequences (trigrams)!
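
A letter-trigram classifier of this general kind can be sketched in a few lines. The version below is an assumption-laden toy, not the model from the cited work: it trains only on names that appear on these slides, uses add-one smoothing, and picks the origin with the highest log-probability.

```python
import math
from collections import Counter

# Toy training set built only from names shown on the slides (illustration only).
TRAINING = {
    "Italian": ["Aldrighetti", "Angeletti", "Iannucci", "Italiano", "Lombardino"],
    "Japanese": ["Asahara", "Fujimaki", "Umeda"],
    "Greek": ["Anagnostopoulos", "Demetriadis"],
}

def trigrams(name):
    """Letter trigrams of a name, with '^' and '$' marking start and end."""
    padded = f"^^{name.lower()}$"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def train(training):
    """Count trigrams per origin and collect the overall trigram vocabulary."""
    counts = {
        label: Counter(t for name in names for t in trigrams(name))
        for label, names in training.items()
    }
    vocab = {t for counter in counts.values() for t in counter}
    return counts, vocab

def score(name, label, counts, vocab):
    """Add-one smoothed log-probability of the name's trigrams under one origin."""
    total = sum(counts[label].values())
    return sum(
        math.log((counts[label][t] + 1) / (total + len(vocab) + 1))
        for t in trigrams(name)
    )

def classify(name, counts, vocab):
    """Return the origin whose trigram statistics best explain the name."""
    return max(counts, key=lambda label: score(name, label, counts, vocab))

counts, vocab = train(TRAINING)
for name in ["Iannucci", "Fujimaki", "Demetriadis"]:
    print(name, classify(name, counts, vocab))
```

With so little training data the sketch is only reliable on names close to the ones it has seen; the percentages on the slide come from a model trained on far more data.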
Information Extraction &
Named Entity Recognition
              Information Extraction
• Information extraction is the identification, in text, of specified
  classes of
         • names / entities
         • relations
         • events
• For relations and events, this includes finding the participants
  and modifiers (date, time, location, etc.).
• In other words, we build a database with the information on a
  given relation or event:
         •   people's jobs
         •   people's whereabouts
         •   merger and acquisition activity
         •   disease outbreaks
         •   genomics relations

          Extraction Example

   George Garrick, 40 years old, president of the London-based
   European Information Services Inc., was appointed chief
   executive officer of Nielsen Marketing Research, USA.

Position    Company                               Location   Person           Status
President   European Information Services, Inc.   London     George Garrick   Out
CEO         Nielsen Marketing Research            USA        George Garrick   In




                             Named Entities
The who, where, when & how much in a sentence

The task: identify atomic elements of information in text

•   person names
•   company/organization names
•   locations
•   dates & times
•   percentages
•   monetary amounts
      Won't simple lists solve the
              problem?
•   too numerous to include in dictionaries
•   changing constantly
•   appear in many variant forms
•   subsequent occurrences might be abbreviated

 List search/matching doesn't perform well
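
The weakness of plain list matching is easy to reproduce. In the sketch below the gazetteer holds the names from the extraction example; the second test sentence is made up here to show an abbreviated later mention.

```python
# Gazetteer built from the extraction example (illustration only).
GAZETTEER = {
    "George Garrick",
    "European Information Services Inc.",
    "Nielsen Marketing Research",
}

def list_lookup(text):
    """Return the gazetteer entries that occur verbatim in the text."""
    return sorted(name for name in GAZETTEER if name in text)

first_mention = (
    "George Garrick, 40 years old, president of the London-based "
    "European Information Services Inc., was appointed chief executive "
    "officer of Nielsen Marketing Research, USA."
)
# Hypothetical follow-up sentence using shortened forms of the same names.
later_mention = "Mr. Garrick will join Nielsen in January."

print(list_lookup(first_mention))
print(list_lookup(later_mention))
```

The full forms are found, but the shortened forms "Mr. Garrick" and "Nielsen" match nothing, which is the variant and abbreviation problem listed above.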




                              Applications
•   Information Extraction
•   Summary generation
•   Machine Translation
•   Document organization/classification
•   Automatic indexing of books
•   Improve Internet search results
    (the location Clinton, South Carolina vs. President
    Clinton)



                    Levels of BBN Statistical
                            Analysis
   Yugoslav President Slobodan Milosevic received on Thursday the
   representatives of the Association of Yugoslav Banks, headed by its president
   Milos Milosavljevic, who is also the general director of JugoBanka.

   [Figure: the sentence annotated at three levels]
   Name finding:   GPE (Yugoslav), Person (Slobodan Milosevic), ORG (Association of
                   Yugoslav Banks), Person (Milos Milosavljevic), ORG (JugoBanka)
   Parsing:        parse tree over the sentence with S, SBAR, VP, NP, NPA, PP and WHNP nodes
   Co-reference
            Information Extraction from
                             Propositions
         Propositions are normalized connections from the parse trees.
         Entities and relations are extracted statistically from propositions.

Person: Slobodan Milosevic     Person: Milos Milosavljevic     Person: Milos Milosavljevic
Position: president            Position: president             Position: general director
Organization: Yugoslavia       Organization: Association       Organization: JugoBanka
                               of Yugoslav Banks

   [Figure: propositions over the same sentence]
   received:   subj = president, obj = representatives, on = Thursday
   headed:     obj, subj = president
   is:         arg, arg = director
   Entity labels under the words: GPE, Person, Date, ORG
Text Summarization
     MILAN, Italy, April 18. A small airplane crashed into a gover
     building in heart of Milan, setting the top floors on fire, Italian
     police reported. There were no immediate reports on casualtie
     rescue workers attempted to clear the area in the city's financia
     district. Few details of the crash were available, but news repo
     about it immediately set off fears that it might be a terrorist act
     akin to the Sept. 11 attacks in the United States. Those fears sen
     U.S. stocks tumbling to session lows in late morning trading.

     Witnesses reported hearing a loud explosion from the 30-story
     office building, which houses the administrative offices of the
     Lombardy region and sits next to the city's central train station.
     Italian state television said the crash put a hole in the 25th flo
     of the Pirelli building. News reports said smoke poured from th
     opening. Police and ambulances rushed to the building in down
     Milan. No further details were immediately available.

     Questions annotated on the text:
     What happened? When, where? How many victims? Says who?
     Was it a terrorist act? What was the target?
                        (Includes work by Prof. Nenkova)
Machine Translation
                Why use computers in
                    translation?
•   Too much translation for humans
•   Technical materials too boring for humans
•   Greater consistency required
•   Need results more quickly
•   Not everything needs to be top quality
•   Reduce costs

• Any one of these may justify machine translation or computer
  aids




       Statistical Machine Translation Technology

   Spanish/English Bilingual Text  ->  Statistical Analysis
   English Text                    ->  Statistical Analysis

   Spanish  ->  Broken English  ->  English

   Spanish:          Que hambre tengo yo
   Broken English:   What hunger have I,
                     Hungry I am so,
                     I am so hungry,
                     Have I that hunger …
   English:          I am so hungry
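
The last step of the diagram, choosing fluent English among the "broken English" candidates, can be sketched with a bigram language model estimated from English text. Everything below is a toy under stated assumptions: the four-sentence ENGLISH_CORPUS is invented here, the candidates are taken as given from the slide, and add-one smoothing stands in for whatever estimation a real system uses.

```python
import math
from collections import Counter

# Hypothetical English-only corpus standing in for "English Text" in the diagram.
ENGLISH_CORPUS = ["i am so tired", "i am hungry", "he is so hungry", "what have i done"]

# Candidate outputs listed on the slide for "Que hambre tengo yo".
CANDIDATES = ["What hunger have I", "Hungry I am so", "I am so hungry", "Have I that hunger"]

def padded(sentence):
    """Lower-case, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram_model(corpus):
    """Collect bigram counts, context counts and the vocabulary size."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = padded(sentence)
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, len(vocab)

def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a sentence."""
    context_counts, bigram_counts, vocab_size = model
    tokens = padded(sentence)
    return sum(
        math.log((bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )

model = train_bigram_model(ENGLISH_CORPUS)
best = max(CANDIDATES, key=lambda candidate: log_prob(candidate, model))
print(best)
```

All four candidates have the same length, so their scores are directly comparable; on this toy corpus the model prefers "I am so hungry", matching the slide.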
            How A Statistical MT System Learns




                Translating a New Document




   [Figure: outputs labelled v.2.0, v.2.4 and v.3.0]

Source: Aljazeera, January 8, 2005
Language Weaver (Al Jazeera
         8/2007)




                        LanguageWeaver Demo Website

       Broadcast Monitoring
      BBN MAPS & Language
           Weaver MT




            Language Weaver Hybrid Translation Technology

•   Chinese Source Text
    Sample 1:
    车展,一向是衡量一个国家汽车消费现状和市场潜力的“晴雨表”。本届北京国际车展有24个国家的1200余家厂商参展,8天接待40余万名参观者
    ,创下了中国车展的新纪录,让人深切地感受到汽车市场启动的信号。
     “中国是世界最后一个最大的汽车市场”。多年来,这句话更多地包含着汽车商人的一种希冀。然而如今,越来越多的事实预示着它正在变为现
    实。
     来自本届车展的一组数据很有说服力。《北京青年报》的一份现场调查显示,35岁以下参观者约占35%;62.1%的被访者表示,参观车展主要
    是为近期买车搜集信息,甚至在展会上就有可能购买或预订合适的产品;76%的被访者表示最近两年会购买私家车。
      今年以来,国内轿车市场的强劲增长让厂家喜上眉梢。据国家统计局公布的数字,前4个月,全国共生产轿车26.79万辆,增长27.6%;特别是
    4月份,生 产轿车9万辆,同比增长50.5%,创造了十几年来轿车月产增长的最高纪录。从销售看,一季度,全国轿车生产企业共销售轿车18.8万
    辆,同比增长 22%,产销率达105%;轿车库存比年初下降1.1万辆,下降幅度近25%。
    Language Weaver Experimental Syntax MT
    Sample 1 :
    The motor show, has always been the' barometer' of a national car consumption and market potential. The Beijing International Auto Show has
    more than 1,200 exhibitors from 24 countries and 8 days of receiving more than 40 million visitors, setting a new record in China's auto show,
    are deeply aware of the automobile market signals.
    "China is one of the largest automobile market in the world. Over the years, this phrase implies more auto businessmen. But now, more and
    more facts indicates that it is to become a reality.
    Data from the Motor Show is very convincing. The Beijing Qingnian Bao Report on-the-spot investigation showed that about 35 percent of 35-
    year-old visitors, 62.1 percent of the respondents said that the truck was mainly to buy a car in the near future to collect information, even at the
    exhibition may purchase or suitable products; 76% of respondents indicated in the past two years to buy private cars.
    Since the beginning of this year, the strong growth of the domestic car market. According to the figures released by the National Bureau of
    Statistics, in the first four months, the country produced 267,900 vehicles, up 27.6 percent; in particular, in April, the production of 90,000
    vehicles, an increase of 50.5% over the same period last year, setting a record high for the monthly output growth over the past 10-odd years.
    In terms of sales in the first quarter, manufacturing enterprises in the country sold 188,000 cars, up 22 percent over the same period of last
    year, up 10.5 percent; 11,000 vehicles, dropping by nearly 25 percent lower than the beginning of the year.




      MT Challenges: Ambiguity
• Syntactic Ambiguity
  I saw the man on the hill with the telescope

• Lexical Ambiguity
   E: book
   S: libro, reservar

• Semantic Ambiguity
   – Homography:
     ball(E) = pelota, baile(S)
   – Polysemy:
     kill(E) = matar, acabar(S)
   – Semantic granularity
     esperar(S) = wait, expect, hope (E)
     be(E) = ser, estar(S)
     fish(E) = pez, pescado(S)


New Technologies….
     Speech Transcription + Information Retrieval:
             PodZinger (Now EveryZing)

“When you type in a word or terms, PODZINGER not only finds the
  relevant podcasts, but also highlights the segment of the audio
  in which they occurred. By clicking anywhere on the results, the
  audio will begin to play just where you clicked. There are also
  controls that let you back up, pause, or forward through the
  podcast. Or you can download the entire podcast.
PODZINGER, powered by 30 years of speech recognition research
  from BBN Technologies, Cambridge, Massachusetts, transforms
  the audio into words, unlocking the information inside
  podcasts. Using PODZINGER you open up a previously
  untapped source of content via a simple web search. So when
  it comes to podcasts, instead of searching for it ... just ZING IT!”

