					 Research on Intelligent Text
      Information Management
                          ChengXiang Zhai
                       Department of Computer Science
            Graduate School of Library & Information Science
                         Institute for Genomic Biology
                                      Statistics
                   University of Illinois, Urbana-Champaign
       http://www.cs.uiuc.edu/homes/czhai, czhai@cs.uiuc.edu
Contains joint work with Xuehua Shen, Bin Tan, Qiaozhu Mei, Yue Lu, Hongning, Vinod,
                       and other members of the TIMan group

                                                                                      1
                 Research Roadmap
[Diagram: layered research roadmap, from text at the bottom to applications at the top]
• Applications: search applications and mining applications (Web, Email, and Bioinformatics)
• Information access: search, filtering
• Information organization: categorization, clustering
• Knowledge acquisition: extraction, mining
• Also shown: summarization, visualization
• Foundation: natural language content analysis of text, entity/relation extraction
• Current focus (access side): personalized, retrieval models, topic map, recommender
• Current focus (mining side): contextual text mining, opinion integration, information quality
                                                                          2
             Sample Projects
• Optimization of Retrieval Models
• User-Centered Adaptive Information Retrieval
• Multi-Resolution Topic Map for Browsing
• Contextual Text Mining
• Opinion Integration and Summarization
• Information Trustworthiness


                                                 3
               Project 1:
    Optimization of Retrieval Models
• Content-based matching is a critical component in
  any search system
• Developed a number of retrieval models to optimize
  content matching
  – Language models: various LMs supporting proximity,
    word translations, feedback, …
  – Axiomatic framework: theoretical analysis of retrieval
    models
  – Recently looking into optimal interactive retrieval and
    domain-specific retrieval models (feedback,
    exploitation-exploration tradeoff, medical case
    retrieval, forum retrieval, … )
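
As a point of reference for the language-model bullet above, here is a minimal sketch of basic query-likelihood scoring with Dirichlet prior smoothing. It is a standard baseline, not one of the specific proximity, translation, or feedback models the slide refers to. The function name, the toy documents, and the value of `mu` are all assumptions made for illustration.

```python
import math
from collections import Counter

def dirichlet_query_likelihood(query, doc, collection_tf, collection_len, mu=2000.0):
    """Return log p(query | doc) under a unigram model with Dirichlet smoothing."""
    doc_tf = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        p_wc = collection_tf.get(w, 0) / collection_len  # collection language model
        if p_wc == 0.0:
            continue  # ignore query words never seen in the collection
        p_wd = (doc_tf[w] + mu * p_wc) / (doc_len + mu)  # smoothed document model
        score += math.log(p_wd)
    return score

# Toy collection (hypothetical), echoing the ambiguous "jaguar" example.
docs = {
    "d1": "jaguar car dealer sells new jaguar cars".split(),
    "d2": "the jaguar is a big cat living in the rainforest".split(),
}
collection_tf = Counter(w for d in docs.values() for w in d)
collection_len = sum(collection_tf.values())
query = "jaguar car".split()

ranked = sorted(
    docs,
    key=lambda d: dirichlet_query_likelihood(
        query, docs[d], collection_tf, collection_len, mu=10.0
    ),
    reverse=True,
)
print(ranked)
```

A small `mu` is used here only because the toy documents are tiny; on real collections the smoothing parameter is usually tuned.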
                                                              4
            Project 2:
User-Centered Adaptive IR (UCAIR)
• A novel retrieval strategy emphasizing
  – user modeling (“user-centered”)
  – search context modeling (“adaptive”)
  – interactive retrieval
• Implemented as a personalized search agent that
  – sits on the client-side (owned by the user)
  – integrates information around a user (1 user vs. N
    sources as opposed to 1 source vs. N users)
  – collaborates with other agents
  – goes beyond search toward task support

                                                         5
       Non-Optimality of
Document-Centered Search Engines
Query = Jaguar (as of Oct. 17, 2005)
[Screenshot: top results, labeled in rank order: Car, Car, Software, Car, Animal, Car]
Mixed results, unlikely optimal for any particular user
                                                        6
The UCAIR Project (NSF CAREER)

[Diagram: each user has a personalized search agent that takes the query (“jaguar”), draws on viewed Web pages, query history, Email, and Desktop Files, and talks to multiple search engines on the Web]



                                                               7
Potential Benefit of Personalization

[Screenshot: results for “Jaguar”, labeled in rank order: Car, Car, Software, Car, Animal, Car]

Suppose we know:
  1. Previous query = “racing cars” vs. “Apple OS”
  2. “car” occurs far more frequently than “Apple” in pages browsed
     by the user in the last 20 days
  3. User just viewed an “Apple OS” document

                                                               8
Intelligent Re-ranking of Unseen Results




   When a user clicks on the “back” button after viewing a document,
                   UCAIR reranks unseen results to
      pull up documents similar to the one the user has viewed
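
The re-ranking behavior described above can be sketched roughly as follows. This is a minimal illustration using bag-of-words cosine similarity, not UCAIR's actual implementation. The function names, the result snippets, and the choice of similarity measure are assumptions.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank_unseen(results, seen_ids, viewed_id):
    """Keep seen results in place; reorder unseen ones by similarity to the viewed one."""
    vecs = {doc_id: Counter(text.lower().split()) for doc_id, text in results}
    viewed = vecs[viewed_id]
    seen = [r for r in results if r[0] in seen_ids]
    unseen = [r for r in results if r[0] not in seen_ids]
    # list.sort is stable, so ties keep the original engine ranking.
    unseen.sort(key=lambda r: cosine(vecs[r[0]], viewed), reverse=True)
    return seen + unseen

# Hypothetical result list for the query "jaguar" (id, snippet).
results = [
    ("r1", "jaguar cars official site"),
    ("r2", "jaguar mac os x software"),
    ("r3", "jaguar animal facts big cat"),
    ("r4", "apple mac os jaguar update software"),
]
# The user has seen r1 and r2, viewed r2, then pressed "back".
reranked = rerank_unseen(results, seen_ids={"r1", "r2"}, viewed_id="r2")
print([doc_id for doc_id, _ in reranked])
```

In this toy case the software-related result r4 moves ahead of the animal result r3, because it shares more terms with the document the user just viewed.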
                                                                       9
      UCAIR Outperforms Google
                        [Shen et al. 05]

            Precision at N documents
Ranking Method   prec@5   prec@10   prec@20   prec@30
Google           0.538    0.472     0.377     0.308
UCAIR            0.581    0.556     0.453     0.375
Improvement      8.0%     17.8%     20.2%     21.8%
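
For readers unfamiliar with the metric, the sketch below shows how precision at N and the relative improvement row can be computed. The precision values are copied from the table; the helper names and the sample relevance judgments are assumptions.

```python
def precision_at(ranked_relevance, n):
    """Fraction of the top-n results judged relevant (1 = relevant, 0 = not)."""
    return sum(ranked_relevance[:n]) / n

def relative_improvement(baseline, system):
    """Relative gain of `system` over `baseline`, in percent."""
    return (system - baseline) / baseline * 100.0

# Precision values reported in the table above.
google = {5: 0.538, 10: 0.472, 20: 0.377, 30: 0.308}
ucair = {5: 0.581, 10: 0.556, 20: 0.453, 30: 0.375}

improvements = {
    n: round(relative_improvement(google[n], ucair[n]), 1) for n in google
}
print(improvements)

# Hypothetical judgments for a single ranked list.
example_p5 = precision_at([1, 0, 1, 1, 0], 5)
print(example_p5)
```

The rounded values match the Improvement row, which indicates that the row is a relative (not absolute) gain over Google.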


UCAIR toolbar available at http://sifaka.cs.uiuc.edu/ir/ucair/



                                                                 10
 Future: Personal Information Agent
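[Diagram: a personal information agent at the center, connected to surrounding information sources]
• Sources: Desktop, WWW, Email, Intranet, IM, E-COM, Blog, Sports, Literature, …
• Agent components: User Profile, Active Info Service, Security Handler, Task Support, Personal Content Index, Frequently Accessed Info, …
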
                                                              11
              Ongoing Work
• UCAIR system
• Recommendation and advertising on social networks




                                                  12
Project 3: Multi-Resolution Topic Map
            for Browsing
 • Promoting browsing as a “first-class citizen”
 • Multi-resolution topic map for browsing
   – Enable a user to find information through navigation
   – Very useful when a user can’t formulate effective
     queries or uses a small screen device
 • Search log as information footprints
   – Organize search log into a topic map
   – Allow a user to follow information footprints of
     previous users
   – Enable social surfing
                                                            13
Querying vs. Browsing




                        14
Information Seeking as Sightseeing
 • Know the address of an attraction site?
   – Yes: take a taxi and go directly to the site
   – No: walk around or take a taxi to a nearby place
     then walk around
 • Know what exactly you want to find?
   – Yes: use the right keywords as a query and find
     the information directly
   – No: browse the information space or start with a
     rough query and then browse

 When querying fails, browsing comes to the rescue…
                                                        15
Current Support for Browsing is Limited
 • Hyperlinks
   – Only page-to-page
   – Mostly manually constructed
   – Browsing step is very small
   Beyond hyperlinks?
 • Web directories (ODP)
   – Manually constructed
   – Fixed categories
   – Only support vertical navigation
   Beyond fixed categories?

 How to promote browsing as a “first-class citizen”?
                                                   16
Sightseeing Analogy Continues…




                             17
  Topic Map for Touring Information
                Space
[Diagram: topic regions at multiple resolutions; “Zoom in” and “Zoom out” move between levels, “Horizontal navigation” moves within a level]
  Level 3: insurance, auto, car, cars, rental, loan
  Level 2: car::parts, car::used, car::rental, car::blue+book, rental::boat, car::pictures
  Level 1: national+car+rental, enterprise+car+rental, alamo+car+rental, exotic+car+rental, advantage+car+rental
                                                                                           18
   Topic-Map based Browsing
[Demo screenshot: interface combining querying with a multi-resolution topic map; labeled parts are the topic region, parents, current position, and horizontal neighbors]

                                    19
How can we construct such a
 multi-resolution topic map?

       Multiple possibilities…




                                 20
  Search Logs as Information Footprints
Footprints in information space
 User 2722 searched for "national car rental" [!] at 2006-03-09 11:24:29
 User 2722 searched for "military car rental benefits" [!] at 2006-03-10 09:33:37 (found http://www.valoans.com)
 User 2722 searched for "military car rental benefits" [!] at 2006-03-10 09:33:37 (found http://benefits.military.com)
 User 2722 searched for "military car rental benefits" [!] at 2006-03-10 09:33:37 (found http://www.avis.com)
 User 2722 searched for "enterprise rent a car" [!] at 2006-04-05 23:37:42 (found http://www.enterprise.com)
 User 2722 searched for "meineke car care center" [!] at 2006-05-02 09:12:49 (found http://www.meineke.com)
 User 2722 searched for "car rental" [!] at 2006-05-25 15:54:36
 User 2722 searched for "autosave car rental" [!] at 2006-05-25 23:26:54 (found http://eautosave.com)
 User 2722 searched for "budget car rental" [!] at 2006-05-25 23:29:53
 User 2722 searched for "alamo car rental" [!] at 2006-05-25 23:56:13
 ……
                                                                  21
Information Footprints → Topic Map
• Challenges
  – How to define/construct a topic region
  – How to control granularities/resolutions of topic
    regions
  – How to connect topic regions to support effective
    browsing
• Two approaches
  – Multi-granularity clustering
  – Query editing
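
The slide does not spell out how query editing works. One plausible reading, sketched below purely as an assumption, is to link logged queries that differ by a single word: dropping a word gives a more general "parent" query (zoom out), and swapping a word gives a "horizontal neighbor". The function name and the linking rules are hypothetical; the queries are taken from the search log shown earlier.

```python
from collections import defaultdict

def build_query_edit_map(queries):
    """Link logged queries that differ by exactly one word."""
    term_sets = {q: frozenset(q.split()) for q in queries}
    parents = defaultdict(set)    # q -> queries obtained by dropping one word
    neighbors = defaultdict(set)  # q -> queries obtained by swapping one word
    for q, terms in term_sets.items():
        for other, other_terms in term_sets.items():
            if other == q:
                continue
            if other_terms < terms and len(terms) - len(other_terms) == 1:
                parents[q].add(other)
            elif len(terms) == len(other_terms) and len(terms ^ other_terms) == 2:
                neighbors[q].add(other)
    return parents, neighbors

log_queries = [
    "national car rental",
    "military car rental benefits",
    "car rental",
    "autosave car rental",
    "budget car rental",
    "alamo car rental",
]
parents, neighbors = build_query_edit_map(log_queries)
print(sorted(parents["national car rental"]))
print(sorted(neighbors["alamo car rental"]))
```

A real system would presumably also weight links by how often queries and clicks co-occur in the log; this sketch only captures the structural idea.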


                                                        22
             Collaborative Surfing

• New queries become new footprints
• Navigation trace enriches map structures
• Clickthroughs become new footprints
• Browse logs offer more opportunities to understand user interests and intents

                                              23
                Project 4:
          Contextual Text Mining
• Documents are often associated with context (meta-
  data)
  – Direct context: time, location, source, authors,…
  – Indirect context: events, policies, …
• Many applications require “contextual text analysis”:
  – Discovering topics from text in a context-sensitive way
  – Analyzing variations of topics over different contexts
  – Revealing interesting patterns (e.g., topic evolution,
    topic variations, topic communities)


                                                             24
               Example 1:
          Comparing News Articles
         Vietnam War            Afghan War                 Iraq War

             CNN                   Fox                      Blog
         Before 9/11 During Iraq war                       Current

            US blog         European blog                  Others

Common Themes     “Vietnam” specific   “Afghan” specific   “Iraq” specific

United nations    …                    …                   …
Death of people   …                    …                   …
…                 …                    …                   …

           What’s in common? What’s unique?

                                                                             25
More Contextual Analysis Questions
• What positive/negative opinions did people express about
  X (e.g., a person, an event)? Trends?
• How does an opinion/topic evolve over time?
• What are emerging topics? What topics are fading
  away?
• How can we characterize a social network?



                                                     26
          Research Questions
• Can we model all these problems generally?
• Can we solve these problems with a unified
  approach?
• How can we bring humans into the loop?




                                               27
        Contextual Probabilistic
  Latent Semantic Analysis ([KDD 2006]…)
[Diagram: generative process for a document with context]
• Themes (word distributions), each with multiple views (View1, View2, View3):
  – government: government 0.3, response 0.2, …
  – donation: donate 0.1, relief 0.05, help 0.02, …
  – New Orleans: city 0.2, new 0.1, orleans 0.05, …
• Document context: Time = July 2005, Location = Texas, Author = xxx, Occup. = Sociologist, Age Group = 45+
• Views are associated with contexts (e.g., Texas, July 2005, sociologist)
• Theme coverages are associated with contexts (e.g., Texas, July 2005, …, the document)
• Steps shown for generating each word: choose a view, choose a coverage, choose a theme θi, draw a word from θi
• Example document: “Criticism of government response to the hurricane primarily consisted of criticism of its response to … The total shut-in oil production from the Gulf of Mexico … approximately 24% of the annual production and the shut-in gas production … Over seventy countries pledged monetary donations or other assistance. …”
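
The generative steps named on this slide (choose a view, choose a coverage, choose a theme, draw a word) can be sketched as a small sampler. This is only an illustration of the sampling order, not the estimation procedure from the KDD 2006 work. The "default" word weights echo the partial numbers on the slide; the "Texas" view, the coverage weights, and all selection probabilities are hypothetical.

```python
import random

# Word weights per (view, theme). Weights need not sum to 1; sampling normalizes them.
views = {
    "default": {
        "government": {"government": 0.3, "response": 0.2},
        "donation": {"donate": 0.1, "relief": 0.05, "help": 0.02},
        "new_orleans": {"city": 0.2, "new": 0.1, "orleans": 0.05},
    },
    "Texas": {  # hypothetical context-specific variation of the same themes
        "government": {"government": 0.2, "response": 0.3},
        "donation": {"donate": 0.1, "relief": 0.1, "help": 0.05},
        "new_orleans": {"city": 0.1, "new": 0.1, "orleans": 0.1},
    },
}
# Theme coverage (mixing weights over themes) tied to a context; values are hypothetical.
coverages = {
    "July 2005": {"government": 0.5, "donation": 0.2, "new_orleans": 0.3},
    "document": {"government": 0.2, "donation": 0.6, "new_orleans": 0.2},
}
view_probs = {"default": 0.7, "Texas": 0.3}
coverage_probs = {"July 2005": 0.4, "document": 0.6}

def pick(weights, rng):
    """Sample one key with probability proportional to its weight."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

def generate_word(rng):
    view = pick(view_probs, rng)            # choose a view
    coverage = pick(coverage_probs, rng)    # choose a coverage
    theme = pick(coverages[coverage], rng)  # choose a theme
    word = pick(views[view][theme], rng)    # draw a word from that theme
    return view, coverage, theme, word

rng = random.Random(0)
sample = [generate_word(rng) for _ in range(10)]
for row in sample:
    print(row)
```

Fitting such a model runs this process in reverse, estimating the views and coverages from observed words and their contexts.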

                                                                                                           28
             Comparing News Articles
     Iraq War (30 articles) vs. Afghan War (26 articles)
The common theme indicates that “United Nations” is involved in both wars

                 Cluster 1             Cluster 2          Cluster 3
Common Theme     united 0.042          killed 0.035       …
                 nations 0.04          month 0.032
                 …                     deaths 0.023
                                       …
Iraq Theme       n 0.03                troops 0.016       …
                 Weapons 0.024         hoon 0.015
                 Inspections 0.023     sanches 0.012
                 …                     …
Afghan Theme     Northern 0.04         taleban 0.026      …
                 alliance 0.04         rumsfeld 0.02
                 kabul 0.03            hotel 0.012
                 taleban 0.025         front 0.011
                 aid 0.02              …
                 …
Collection-specific themes indicate different roles of “United Nations” in the two wars

                                                                                          29
Spatiotemporal Patterns in Blog Articles
•   Query= “Hurricane Katrina”
•   Topics in the results:
Government Response    New Orleans      Oil Price        Praying and Blessing    Aid and Donation   Personal
bush 0.071            city 0.063        price 0.077      god 0.141              donate 0.120        i 0.405
president 0.061       orleans 0.054     oil 0.064        pray 0.047             relief 0.076        my 0.116
federal 0.051         new 0.034         gas 0.045        prayer 0.041           red 0.070           me 0.060
government 0.047      louisiana 0.023   increase 0.020   love 0.030             cross 0.065         am 0.029
fema 0.047            flood 0.022       product 0.020    life 0.025             help 0.050          think 0.015
administrate 0.023    evacuate 0.021    fuel 0.018       bless 0.025            victim 0.036        feel 0.012
response 0.020        storm 0.017       company 0.018    lord 0.017             organize 0.022      know 0.011
brown 0.019           resident 0.016    energy 0.017     jesus 0.016            effort 0.020        something 0.007
blame 0.017           center 0.016      market 0.016     will 0.013             fund 0.019          guess 0.007
governor 0.014        rescue 0.012      gasoline 0.012   faith 0.012            volunteer 0.019     myself 0.006




•   Spatiotemporal patterns




                                                                                                                 30
Theme Life Cycles (“Hurricane Katrina”)
[Plot: life cycles of the “Oil Price” and “New Orleans” themes]
Oil Price: price 0.0772, oil 0.0643, gas 0.0454, increase 0.0210, product 0.0203, fuel 0.0188, company 0.0182, …
New Orleans: city 0.0634, orleans 0.0541, new 0.0342, louisiana 0.0235, flood 0.0227, evacuate 0.0211, storm 0.0177, …


                                                                   31
Theme Snapshots (“Hurricane Katrina”)
[Maps: weekly snapshots of one theme across the states]
 Week1: The theme is the strongest along the Gulf of Mexico
 Week2: The discussion moves towards the north and west
 Week3: The theme distributes more uniformly over the states
 Week4: The theme is again strong along the east coast and the Gulf of Mexico
 Week5: The theme fades out in most states
                                                                                                                                      32
                         Theme Life Cycles (KDD Papers)
[Plot: normalized strength of theme (0 to 0.02) vs. time (year, 1999 to 2004); legend: Biology Data, Web Information, Time Series, Classification, Association Rule, Clustering, Business]
Biology Data: gene 0.0173, expressions 0.0096, probability 0.0081, microarray 0.0038, …
Business: marketing 0.0087, customer 0.0086, model 0.0079, business 0.0048, …
Association Rule: rules 0.0142, association 0.0064, support 0.0053, …
                                                                                                                              33
        Theme Evolution Graph: KDD
[Graph: themes placed along a timeline T from 1999 to 2004 and linked to show how they evolve; themes listed roughly left to right]
• decision 0.006, tree 0.006, classifier 0.005, class 0.005, Bayes 0.005, …
• SVM 0.007, criteria 0.007, classification 0.006, linear 0.005, …
• classification 0.015, text 0.013, unlabeled 0.012, document 0.008, labeled 0.008, learning 0.007, …
• web 0.009, classification 0.007, features 0.006, topic 0.005, …
• mixture 0.005, random 0.006, cluster 0.006, clustering 0.005, variables 0.005, …
• information 0.012, web 0.010, social 0.008, retrieval 0.007, distance 0.005, networks 0.004, …
• topic 0.010, mixture 0.008, LDA 0.006, semantic 0.005, …

                                                                                                     34
    Multi-Faceted Sentiment Summary
         (query=“Da Vinci Code”)
Facet 1: Movie
  Neutral
    – ... Ron Howards selection of Tom Hanks to play Robert Langdon.
    – Directed by: Ron Howard Writing credits: Akiva Goldsman ...
    – After watching the movie I went online and some research on ...
  Positive
    – Tom Hanks stars in the movie,who can be mad at that?
    – Tom Hanks, who is my favorite movie star act the leading role.
    – Anybody is interested in it?
  Negative
    – But the movie might get delayed, and even killed off if he loses.
    – protesting ... will lose your faith by ... watching the movie.
    – ... so sick of people making such a big deal about a FICTION book and movie.
Facet 2: Book
  Neutral
    – I remembered when i first read the book, I finished the book in two days.
    – I’m reading “Da Vinci Code” now.
    – …
  Positive
    – Awesome book.
    – So still a good book to past time.
  Negative
    – ... so sick of people making such a big deal about a FICTION book and movie.
    – This controversy book cause lots conflict in west society.




                                                                                                               35
Separate Theme Sentiment Dynamics
      “book”     “religious beliefs”




                                       36
Event Impact Analysis: IR Research
[Diagram: the theme “retrieval models” in SIGIR papers along a year axis, with word distributions shown around two events]
Theme: retrieval models (SIGIR papers)
  term 0.1599, relevance 0.0752, weight 0.0660, feedback 0.0372, independence 0.0311, model 0.0310, frequent 0.0233, probabilistic 0.0188, document 0.0173, …
Event (1992): Starting of the TREC conferences
  – vector 0.0514, concept 0.0298, extend 0.0297, model 0.0291, space 0.0236, boolean 0.0151, function 0.0123, feedback 0.0077, …
  – xml 0.0678, email 0.0197, model 0.0191, collect 0.0187, judgment 0.0102, rank 0.0097, subtopic 0.0079, …
Event (1998): Publication of the paper “A language modeling approach to information retrieval”
  – probabilist 0.0778, model 0.0432, logic 0.0404, ir 0.0338, boolean 0.0281, algebra 0.0200, estimate 0.0119, weight 0.0111, …
  – model 0.1687, language 0.0753, estimate 0.0520, parameter 0.0281, distribution 0.0268, probable 0.0205, smooth 0.0198, markov 0.0137, likelihood 0.0059, …
                                                                                                              37
Topic Modeling + Social Networks
Authors writing about the same topic form a community
    Separation of 3 research communities: IR, ML, Web

  Topic Model Only           Topic Model + Social Network




                                                            38
Next Step in Contextual Text Mining
• Combining contextual text analysis with visualization
• More detailed semantic modeling (entities, relations,…)
• Integration of search and contextual text analysis to develop
   an analyst’s workbench:
   – Interactive semantic navigation and probing
   – Synthesis of information/knowledge
   – Personalized/customized service




                                                                  39
              Project 5:
Opinion Integration and Summarization
• Increasing popularity of Web 2.0 applications
  – more people express opinions on the Web
                  How to digest all?
                                       190,451
                                        posts



     4,773,658
       results




                                                  40
                Motivation:
           Two kinds of opinions
                                190,451 posts   4,773,658 results




             How to benefit from both?
Expert opinions               Ordinary opinions
•CNET editor’s review         •Forum discussions
•Wikipedia article            •Blog articles
•Well-structured              •Represent the majority
•Easy to access               •Up to date
•May be biased                •Hard to access
•Soon outdated                •Fragmented
                                                                    41
                      Problem Definition
Input
  Topic: iPod
  Expert review with aspects: Design, Battery, Price, ...
  Text collection of ordinary opinions, e.g. Weblogs

Output: Integrated Summary
                  Aspect     Similar opinions    Supplementary opinions
Review Aspects    Design     cute… tiny…         ..thicker..
                  Battery    last many hrs       die out soon
                  Price      could afford it     still expensive

Extra Aspects     iTunes     … easy to use…
                  warranty   …better to extend..
                                                                                               42
                    Methods
• Semi-Supervised Probabilistic Latent Semantic
  Analysis (PLSA)
  – The aspects extracted from expert reviews serve as
    clues to define a conjugate prior on topics
  – Maximum a Posteriori (MAP) estimation
  – Repeated applications of PLSA to integrate and align
    opinions in blog articles to expert review
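
A minimal sketch of the MAP idea above, under stated assumptions: PLSA is fitted with EM, and each review aspect supplies a word distribution that acts as a Dirichlet (conjugate) prior, which in the M-step amounts to adding `mu` pseudo-counts to that topic. The function name `plsa_map`, the parameter `mu`, the toy documents, and the omission of a background model are all illustrative choices, not details taken from the slides. It shows a single PLSA run, not the repeated applications mentioned in the last bullet, and it assumes every document is non-empty.

```python
import random
from collections import Counter

def plsa_map(docs, priors, mu=10.0, iters=50, seed=0):
    """PLSA fitted by EM with a Dirichlet prior on some topics (MAP estimate).

    docs   : list of non-empty token lists
    priors : one entry per topic; a dict word -> probability, or None for a free topic
    mu     : prior strength (total pseudo-count mass added in the M-step)
    """
    rng = random.Random(seed)
    counts = [Counter(d) for d in docs]
    vocab = sorted({w for c in counts for w in c} | {w for p in priors if p for w in p})
    k = len(priors)

    def normalize(weights):
        total = sum(weights.values())
        return {key: val / total for key, val in weights.items()}

    # Random positive initialisation of topic word distributions; uniform mixing weights.
    theta = [normalize({w: rng.random() + 0.1 for w in vocab}) for _ in range(k)]
    pi = [[1.0 / k] * k for _ in docs]

    for _ in range(iters):
        word_mass = [{w: 0.0 for w in vocab} for _ in range(k)]
        topic_mass = [[0.0] * k for _ in docs]
        # E-step: posterior over topics for every (document, word) pair.
        for d, c in enumerate(counts):
            for w, n in c.items():
                post = [pi[d][j] * theta[j][w] for j in range(k)]
                z = sum(post)
                for j in range(k):
                    r = n * post[j] / z
                    word_mass[j][w] += r
                    topic_mass[d][j] += r
        # M-step: a topic with a prior receives mu * p(w | prior) pseudo-counts.
        for j in range(k):
            if priors[j]:
                for w, p in priors[j].items():
                    word_mass[j][w] += mu * p
            theta[j] = normalize(word_mass[j])
        pi = [[m / sum(row) for m in row] for row in topic_mass]
    return theta, pi

# Hypothetical toy input: two review aspects as priors, plus one free "extra aspect" topic.
priors = [
    {"battery": 0.5, "life": 0.5},
    {"price": 0.5, "expensive": 0.5},
    None,
]
docs = [
    "battery life lasts many hours".split(),
    "battery dies soon".split(),
    "price is still expensive".split(),
    "could afford the price".split(),
    "itunes is easy to use".split(),
]
theta, pi = plsa_map(docs, priors, mu=10.0)
```

With a large `mu` the first two topics stay close to their review aspects, while the topic without a prior is free to absorb opinions on aspects the review does not cover.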
          Results: Product (iPhone)
• Opinion Integration with review aspects

Aspect: Activation
  Review article: You can make emergency calls, but you can't use any other functions…
  Similar opinions: N/A
  Supplementary opinions: … methods for unlocking the iPhone have emerged on the Internet in the past few weeks, although they involve tinkering with the iPhone hardware…
    (Callout: Unlock/hack iPhone)

Aspect: Battery
  Review article: rated battery life of 8 hours talk time, 24 hours of music playback, 7 hours of video playback, and 6 hours on Internet use.
  Similar opinions: iPhone will Feature Up to 8 Hours of Talk Time, 6 Hours of Internet Use, 7 Hours of Video Playback or 24 Hours of Audio Playback
    (Callout: Confirm the opinions from the review)
  Supplementary opinions: Playing relatively high bitrate VGA H.264 videos, our iPhone lasted almost exactly 9 freaking hours of continuous playback with cell and WiFi on (but Bluetooth off).
    (Callout: Additional info under real usage)
                                                                                     44
          Results: Product (iPhone)
• Opinions on extra aspects
Support   Supplementary opinions on extra aspects
15        You may have heard of iASign … an iPhone Dev Wiki tool that
          allows you to activate your phone without going through the
          iTunes rigamarole.
          (Callout: Another way to activate iPhone)
13        Cisco has owned the trademark on the name "iPhone" since
          2000, when it acquired InfoGear Technology Corp., which
          originally registered the name.
          (Callout: iPhone trademark originally owned by Cisco)
13        With the imminent availability of Apple's uber cool iPhone, a
          look at 10 things current smartphones like the Nokia N95 have
          been able to do for a while and that the iPhone can't currently
          match...
          (Callout: A better choice for smart phones?)


                                                                             45
     Results: Product (iPhone)
• Support statistics for review aspects
  – People care about price
  – Controversy: activation requires contract with AT&T
  – People comment a lot about the unique wi-fi feature




                                                              46
Summarization of Contradictory Opinions
                                  [Kim & Zhai CIKM 09]

How can we help analysts digest and interpret contradictory opinions?

Facet 1: Movie
  Neutral
    – ... Ron Howards selection of Tom Hanks to play Robert Langdon.
    – Directed by: Ron Howard Writing credits: Akiva Goldsman ...
    – After watching the movie I went online and some research on ...
  Positive
    – Tom Hanks stars in the movie,who can be mad at that?
    – Tom Hanks, who is my favorite movie star act the leading role.
    – Anybody is interested in it?
  Negative
    – But the movie might get delayed, and even killed off if he loses.
    – protesting ... will lose your faith by ... watching the movie.
    – ... so sick of people making such a big deal about a FICTION book and movie.

Facet 2: Book
  Neutral
    – I remembered when i first read the book, I finished the book in two days.
    – I’m reading “Da Vinci Code” now.
    – …
  Positive
    – Awesome book.
    – So still a good book to past time.
  Negative
    – ... so sick of people making such a big deal about a FICTION book and movie.
    – This controversy book cause lots conflict in west society.




                                                                                                                47
Contrastive Opinion Summarization
(Diagram: two sets of sentences, X = {x1, x2, …, xn} and Y = {y1, y2, …, ym})
                                       48
Contrastive Opinion Summarization
     X = {x1, x2, …, xn}              Y = {y1, y2, …, ym}
     U ⊂ X, U = {u1, u2, …, uk}       V ⊂ Y, V = {v1, v2, …, vk}
     Contrastive Opinion Summary: the aligned pairs (u1, v1), (u2, v2), …, (uk, vk)


                                                        49
             Problem Formulation
     X = {x1, …, xn}, Y = {y1, …, ym}; U = {u1, …, uk} ⊂ X, V = {v1, …, vk} ⊂ Y
     Representativeness: how well U covers X and V covers Y
     Contrastiveness: how well each ui is matched by its paired vi
                                                    50
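               Problem Formulation
Representativeness:
\[ r(S) = \frac{1}{|X|} \sum_{x \in X} \max_{i \in [1,k]} \phi(x, u_i) + \frac{1}{|Y|} \sum_{y \in Y} \max_{i \in [1,k]} \phi(y, v_i) \]
Contrastiveness:
\[ c(S) = \frac{1}{k} \sum_{i=1}^{k} \psi(u_i, v_i) \]
Here X = {x1, …, xn} and Y = {y1, …, ym} are the two sentence sets, and the summary S consists of U = {u1, …, uk} ⊂ X and V = {v1, …, vk} ⊂ Y.
                                                                 51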
  Summarization as Optimization
\[ S^* = \arg\max_{S} \big( \lambda\, r(S) + (1 - \lambda)\, c(S) \big) \]
\[ = \arg\max_{S} \Big( \lambda \Big[ \frac{1}{|X|} \sum_{x \in X} \max_{i \in [1,k]} \phi(x, u_i) + \frac{1}{|Y|} \sum_{y \in Y} \max_{i \in [1,k]} \phi(y, v_i) \Big] + \frac{1 - \lambda}{k} \sum_{i=1}^{k} \psi(u_i, v_i) \Big) \]

1. Define an appropriate content similarity function φ
2. Define an appropriate contrastive similarity function ψ
3. Solve the optimization problem efficiently.

                                                                         52
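
The three steps above can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: φ is taken to be Jaccard word overlap, ψ is the same overlap after dropping a small hypothetical list of sentiment words, and the "solver" is an exhaustive search that only works for tiny inputs (the slides ask for an efficient method but do not specify one here). The sentences are made-up toy data.

```python
from itertools import combinations, permutations

# Hypothetical sentiment words, removed before measuring contrastive similarity.
SENTIMENT_WORDS = {"fast", "slow", "long", "short", "great", "poor", "good", "bad"}

def tokens(sentence):
    return set(sentence.lower().split())

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def phi(s, t):
    """Content similarity (assumed: Jaccard overlap of words)."""
    return jaccard(tokens(s), tokens(t))

def psi(u, v):
    """Contrastive similarity (assumed: overlap once sentiment words are dropped)."""
    return jaccard(tokens(u) - SENTIMENT_WORDS, tokens(v) - SENTIMENT_WORDS)

def representativeness(X, Y, U, V):
    # r(S): average best match of each x in U, plus average best match of each y in V.
    rx = sum(max(phi(x, u) for u in U) for x in X) / len(X)
    ry = sum(max(phi(y, v) for v in V) for y in Y) / len(Y)
    return rx + ry

def contrastiveness(U, V):
    # c(S): average similarity of the aligned pairs (u_i, v_i).
    return sum(psi(u, v) for u, v in zip(U, V)) / len(U)

def objective(X, Y, U, V, lam=0.5):
    return lam * representativeness(X, Y, U, V) + (1 - lam) * contrastiveness(U, V)

def best_summary(X, Y, k, lam=0.5):
    """Exhaustive search over all U, V of size k and all alignments (tiny inputs only)."""
    best = None
    for U in combinations(X, k):
        for V in permutations(Y, k):
            score = objective(X, Y, U, V, lam)
            if best is None or score > best[0]:
                best = (score, list(U), list(V))
    return best

# Toy data, invented for this sketch.
X = ["battery life is long", "file transfer is fast", "the screen is great"]
Y = ["battery life is short", "file transfer is slow"]
score, U, V = best_summary(X, Y, k=2)
for u, v in zip(U, V):
    print(u, "<->", v)
```

The search space grows combinatorially in k, which is why step 3 calls for an efficient approximation in practice; the exhaustive loop is only there to make the objective concrete.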
                   Sample Results
No                  Positive                                 Negative
1    oh ... and file transfers are fast &   you need the software to actually
     easy .                                 transfer files
2    i noticed that the micro adjustment    the adjustment knob seemed ok, but
     knob and collet are well made and      when lowering the router, i have to
     work well too.                         practically pull it down while turning
                                            the knob.
3    the navigation is nice enough , but    difficult navigation - i wo n’t
     scrolling and searching through        necessarily say " difficult ,“ but i do n’t
     thousands of tracks ,                  enjoy the scrollwheel to navigate .
     hundreds of albums or artists , or
     even dozens of genres is not
     conducive to save driving
4    i imagine if i left my player          there are 2 things that need fixing first
     untouched (no backlight) it could      is the battery life.
     play for considerably more than 12     it will run for 6 hrs without problems
     hours at a low volume level.           with medium usage of the buttons.
                                                                                 53
                      Sample Result
Callout on the sample results table: different polarities of opinions made from different perspectives.
                                                                                  54
                     Sample Result
Callout on the sample results table: positive vs. negative, but not much disagreement.
                                                                                 55
                     Sample Result
Callout on the sample results table: judgments revealing detailed conditions.
                                                                                 56
Open Opinion Search System

 http://timan1.cs.uiuc.edu/cgi-bin/hkim277/COSDemo/lemur.cgi




                                                               57
    Latent Aspect Rating Analysis
How to infer aspect ratings?   (Value, Location, Service, …)
How to infer aspect weights?   (Value, Location, Service, …)
Solution: Latent Rating Regression Model
Reviews + overall ratings → Aspect Segmentation → Aspect segments → Latent Rating Regression → Term weights, Aspect Rating, Aspect Weight

Aspect segment (term:count)   Term weight   Aspect Rating   Aspect Weight
location:1                    0.0
amazing:1                     0.9
walk:1                        0.1           1.3             0.2
anywhere:1                    0.3

room:1                        0.1
nicely:1                      0.7
appointed:1                   0.1           1.8             0.2
comfortable:1                 0.9

nice:1                        0.6
accommodating:1               0.8
smile:1                       0.7           3.8             0.6
friendliness:1                0.8
attentiveness:1               0.9

       Topic model for aspect discovery

                                                                                            59
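
The arithmetic behind the example above can be checked with a short script. The term weights, counts, and aspect weights are copied from the slide; the segment labels `segment 1` to `segment 3` are placeholders, and reading the overall rating as the aspect-weighted sum of aspect ratings is an assumption about the model (in latent rating regression the observed overall rating is usually treated as a noisy version of that sum), since the slide shows only the numbers.

```python
# (term, count, term weight) triples per aspect segment, as shown on the slide.
segments = {
    "segment 1": [("location", 1, 0.0), ("amazing", 1, 0.9),
                  ("walk", 1, 0.1), ("anywhere", 1, 0.3)],
    "segment 2": [("room", 1, 0.1), ("nicely", 1, 0.7),
                  ("appointed", 1, 0.1), ("comfortable", 1, 0.9)],
    "segment 3": [("nice", 1, 0.6), ("accommodating", 1, 0.8), ("smile", 1, 0.7),
                  ("friendliness", 1, 0.8), ("attentiveness", 1, 0.9)],
}
aspect_weights = {"segment 1": 0.2, "segment 2": 0.2, "segment 3": 0.6}

# Aspect rating: sum of count * term weight over the words in the segment.
aspect_ratings = {
    name: sum(count * weight for _, count, weight in terms)
    for name, terms in segments.items()
}

# Assumed reading: overall rating is the aspect-weighted sum of aspect ratings.
overall = sum(aspect_weights[name] * aspect_ratings[name] for name in segments)

for name in segments:
    print(name, round(aspect_ratings[name], 1), aspect_weights[name])
print("overall", round(overall, 2))
```

The computed aspect ratings reproduce the slide's 1.3, 1.8, and 3.8, and the aspect weights sum to 1.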
Aspect-Based Opinion Summarization
         Reviewer Behavior Analysis &
        Personalized Ranking of Entities
• People like expensive hotels because of good service
• People like cheap hotels because of good value
• Query: 0.9 value, 0.1 others
  (Result panels: Non-Personalized vs. Personalized)
Project 6: Information Trustworthiness
• How to assess information quality?
• Solution: trust propagation framework




                                          62
   Trust Propagation Framework
(Diagram nodes: Source, Evidence, Claim)




                                   63
Sample Result: Trusted Sources on
        Different Topics




                                    64
Trusted Sources on Different Genres




                                  65
Toward Next-Generation Search Engines
Starting point: Current Search Engine (Keyword Queries, Bag of words, Search)
Three directions of extension:
• Personalization (User Modeling): Keyword Queries → Search History → Complete User Model
• Large-Scale Semantic Analysis (Vertical Search Engines): Bag of words → Entities-Relations → Knowledge Representation
• Full-Fledged Text Info. Management: Search → Access → Mining → Task Support
                                                  66
               The End
         Thank You!
More information about our research can be found at
           http://timan.cs.uiuc.edu/




                                                      67

				