SP09 CS188 Lecture 29: Machine Learning

Unsupervised & Semi-supervised Learning
John Blitzer, John DeNero, Dan Klein


Recap: Classification
!! Classification systems:
   !! Supervised learning
   !! Make a prediction given evidence
   !! We’ve seen several methods for this
   !! Useful when you have labeled data




Clustering
!! Clustering systems:
   !! Unsupervised learning
   !! Detect patterns in unlabeled data
      !! E.g. group emails or search results
      !! E.g. find categories of customers
      !! E.g. detect anomalous program executions
   !! Useful when you don’t know what you’re looking for
   !! Requires data, but no labels
   !! Often get gibberish


Clustering
!! Basic idea: group together similar instances
!! Example: 2D point patterns
!! What could “similar” mean?
   !! One option: small (squared) Euclidean distance (see the sketch below)
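A minimal Python sketch of that squared Euclidean distance option (illustrative only; the point values are made up):

# Squared Euclidean distance between two points (e.g. 2D points as above).
def sq_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

print(sq_euclidean((0.0, 0.0), (3.0, 4.0)))  # 25.0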




K-Means
!! An iterative clustering algorithm (a minimal sketch follows after this list)
   !! Pick K random points as cluster centers (means)
   !! Alternate:
      !! Assign data instances to closest mean
      !! Assign each mean to the average of its assigned points
   !! Stop when no points’ assignments change


K-Means Example
!! [figure: example run of k-means on 2D points]
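A minimal Python sketch of this loop (not the lecture's own code), assuming numpy and an (n, d) array of points; the function name and defaults are illustrative:

import numpy as np

def kmeans(points, K, rng=None):
    """Cluster an (n, d) array of points into K groups; returns (assignments, means)."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Pick K random points as the initial cluster centers (means).
    means = points[rng.choice(len(points), size=K, replace=False)].astype(float)
    assignments = None
    while True:
        # Assign each data instance to its closest mean (squared Euclidean distance).
        dists = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_assignments = dists.argmin(axis=1)
        # Stop when no points' assignments change.
        if assignments is not None and np.array_equal(new_assignments, assignments):
            return assignments, means
        assignments = new_assignments
        # Move each mean to the average of its assigned points.
        for k in range(K):
            if (assignments == k).any():
                means[k] = points[assignments == k].mean(axis=0)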




Example: K-Means
!! [web demo]
   !! http://www.cs.washington.edu/research/imagedatabase/demo/kmcluster/


K-Means as Optimization
!! Consider the total distance to the means:

      phi(a, c) = sum_i dist(x_i, c_{a_i})

   where the x_i are the points, the a_i are the assignments, and the c_k are the means
!! Each iteration reduces phi
!! Two stages each iteration:
   !! Update assignments: fix means c, change assignments a
   !! Update means: fix assignments a, change means c
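A matching sketch of the objective phi, under the same assumptions as the kmeans() sketch above (numpy arrays: points (n, d), assignments (n,), means (K, d)):

import numpy as np

def phi(points, assignments, means):
    # Total squared Euclidean distance of every point to its assigned mean.
    return ((points - means[assignments]) ** 2).sum()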




Phase I: Update Assignments
!! For each point, re-assign to closest mean:

      a_i = argmin_k dist(x_i, c_k)

!! Can only decrease total distance phi!


Phase II: Update Means
!! Move each mean to the average of its assigned points:

      c_k = ( sum_{i : a_i = k} x_i ) / |{i : a_i = k}|

!! Also can only decrease total distance… (Why?)
!! Fun fact: the point y with minimum squared Euclidean distance to a set of points {x} is their mean
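A small illustrative check, using the phi() sketch above on synthetic data, that each phase never increases the total distance:

import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
K = 3
means = points[rng.choice(len(points), size=K, replace=False)].copy()
history = []
for _ in range(10):
    # Phase I: update assignments a with the means c held fixed.
    a = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    history.append(phi(points, a, means))
    # Phase II: update the means c with the assignments a held fixed.
    for k in range(K):
        if (a == k).any():
            means[k] = points[a == k].mean(axis=0)
    history.append(phi(points, a, means))

# Each phase can only decrease (never increase) phi.
assert all(later <= earlier + 1e-9 for earlier, later in zip(history, history[1:]))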




Initialization
!! K-means is non-deterministic
   !! Requires initial means
   !! It does matter what you pick!
   !! What can go wrong? (illustrated below)
   !! Various schemes for preventing this kind of thing: variance-based split / merge, initialization heuristics


K-Means Getting Stuck
!! A local optimum:
   [figure: a clustering stuck at a local optimum]
!! Why doesn’t this work out like the earlier example, with the purple taking over half the blue?
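An illustration of the non-determinism, reusing the kmeans() and phi() sketches above on synthetic blobs; keeping the best of several random restarts is just one simple heuristic, not necessarily one of the schemes the slide has in mind:

import numpy as np

# Three well-separated blobs of 2D points (synthetic, for illustration).
gen = np.random.default_rng(42)
points = np.vstack([gen.normal(loc, 0.3, size=(50, 2)) for loc in [(0, 0), (4, 0), (2, 3)]])

best = None
for seed in range(10):
    assignments, means = kmeans(points, K=3, rng=np.random.default_rng(seed))
    score = phi(points, assignments, means)
    if best is None or score < best[0]:
        best = (score, assignments, means)
print("best phi over 10 restarts:", best[0])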




K-Means Questions
!! Will K-means converge?
   !! To a global optimum?
!! Will it always find the true patterns in the data?
   !! If the patterns are very very clear?
!! Will it find something interesting?
!! Do people ever use it?
!! How many clusters to pick?


Agglomerative Clustering
!! Agglomerative clustering:
   !! First merge very similar instances
   !! Incrementally build larger clusters out of smaller clusters
!! Algorithm (a minimal sketch follows after this list):
   !! Maintain a set of clusters
   !! Initially, each instance in its own cluster
   !! Repeat:
      !! Pick the two closest clusters
      !! Merge them into a new cluster
      !! Stop when there’s only one cluster left
!! Produces not one clustering, but a family of clusterings represented by a dendrogram
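A minimal Python sketch of this algorithm (not the lecture's code), assuming numpy points and single-link distance between clusters:

import numpy as np

def agglomerative(points):
    """Return the merge history: a list of clusterings, from n singletons down to one cluster."""
    # Initially, each instance is in its own cluster.
    clusters = [frozenset([i]) for i in range(len(points))]
    history = [list(clusters)]
    while len(clusters) > 1:
        # Pick the two closest clusters (here: distance between closest members, i.e. single link).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(((points[a] - points[b]) ** 2).sum()
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge them into a new cluster.
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append(list(clusters))
    return history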




Agglomerative Clustering
!! How should we define “closest” for clusters with multiple elements?
!! Many options (sketched after this list)
   !! Closest pair (single-link clustering)
   !! Farthest pair (complete-link clustering)
   !! Average of all pairs
   !! Ward’s method (min variance, like k-means)
!! Different choices create different clustering behaviors


Clustering Application
!! [screenshot of a news aggregation front page]
!! Top-level categories: supervised classification
!! Story groupings: unsupervised clustering
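Minimal sketches of these cluster-distance options, assuming numpy and two clusters given as arrays A and B of their member points:

import numpy as np

def pairwise_sq_dists(A, B):
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)

def single_link(A, B):      # closest pair
    return pairwise_sq_dists(A, B).min()

def complete_link(A, B):    # farthest pair
    return pairwise_sq_dists(A, B).max()

def average_link(A, B):     # average of all pairs
    return pairwise_sq_dists(A, B).mean()

def ward(A, B):             # Ward's method: increase in within-cluster squared distance after merging
    merged = np.vstack([A, B])
    var = lambda X: ((X - X.mean(axis=0)) ** 2).sum()
    return var(merged) - var(A) - var(B)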




Step 1: Agglomerative Clustering
!! Separate clusterings for each global category
!! Represent documents as vectors
   !! Millions of dimensions (1 for each proper noun)
!! How do we know when to stop?
!! Example clusters:
   !! Warren Buffet, Berkshire Hathaway, S&P 500, GEICO
   !! Warren Buffet, Dow Jones, GEICO, Charlie Munger
   !! Chrysler, S&P 500, General Motors, Fiat
   !! Chrysler, Barack Obama, Fiat, United Auto Workers


Step 2: K-means clustering
!! Initialize means to centers from agglomerative step (see the sketch after this list)
!! Why might this be a good idea?
   !! Guaranteed to decrease squared-distance from cluster means
   !! Helps to “clean up” points that may not be assigned appropriately
!! Example clusters after refinement:
   !! Warren Buffet, Berkshire Hathaway, S&P 500, GEICO
   !! Warren Buffet, Dow Jones, GEICO, Charlie Munger
   !! Chrysler, S&P 500, General Motors, Fiat
   !! Chrysler, Barack Obama, Fiat, United Auto Workers
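A hypothetical sketch of the two-step pipeline, reusing the agglomerative() sketch above; doc_vectors is an assumed (n, d) float array of document vectors and K is the desired number of story clusters:

import numpy as np

def cluster_documents(doc_vectors, K):
    """Two-step pipeline: agglomerative initialization, then k-means refinement."""
    # Step 1: run agglomerative clustering and cut the dendrogram at K clusters.
    history = agglomerative(doc_vectors)
    initial = next(level for level in history if len(level) == K)
    # Step 2: initialize the k-means means to the centers of those clusters ...
    means = np.array([doc_vectors[list(c)].mean(axis=0) for c in initial])
    # ... then alternate assignment and mean updates until no assignment changes.
    assignments = None
    while True:
        dists = ((doc_vectors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_assignments = dists.argmin(axis=1)
        if assignments is not None and np.array_equal(new_assignments, assignments):
            return assignments, means
        assignments = new_assignments
        for k in range(K):
            if (assignments == k).any():
                means[k] = doc_vectors[assignments == k].mean(axis=0)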




Semi-supervised learning
!! For a particular task, labeled data is always better than unlabeled
   !! Get a correction for every mistake
!! But labeled data is usually much more expensive to obtain
   !! Google News: manually label news story clusters every 15 minutes?
   !! Other examples? Exceptions?
!! Combine labeled and unlabeled data to build better models


Sentiment Analysis
!! [screenshots of commercial sentiment analysis products]
!! Other companies: http://www.jodange.com , http://www.brandtology.com , …




Sentiment Classification
!! Product review: Running with Scissors: A Memoir
   Title: Horrible book, horrible.
   "This book was horrible. I read half of it, suffering from a headache the entire time, and eventually i lit it on fire. One less copy in the world...don't waste your money. I wish i had the time spent reading this book back so i could use it for better purposes. This book wasted my life"
!! Linear classifier (perceptron)
!! Labels: Positive / Negative
!! Supervised learning problem


Features for Sentiment Classification
!! The same review, with the words and phrases used as features highlighted
!! Recall perceptron classification rule:

      y = sign(w · f(x))

!! Features: counts of particular words & bigrams (see the sketch below)
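A minimal sketch of word & bigram count features and the perceptron decision rule; the tokenizer, feature names, and weights are illustrative only, not the lecture's:

import re
from collections import Counter

def features(text):
    # Counts of particular words & bigrams.
    words = re.findall(r"[a-z']+", text.lower())
    f = Counter(words)
    f.update(" ".join(pair) for pair in zip(words, words[1:]))
    return f

def classify(weights, text):
    # Perceptron rule: predict Positive if the weighted feature count sum is >= 0.
    score = sum(weights.get(feat, 0.0) * count for feat, count in features(text).items())
    return "Positive" if score >= 0 else "Negative"

# Hypothetical weights, for illustration only (a real perceptron would learn them).
weights = {"horrible": -2.0, "waste your": -1.5, "excellent": 2.0}
print(classify(weights, "This book was horrible. Don't waste your money."))  # Negative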




Domain Adaptation
!! Training data: labeled book reviews
!! Test data: unlabeled kitchen appliance reviews
!! Semi-supervised problem: Can we build a good classifier for kitchen appliances?


Books & Kitchen Appliances
!! Book review: Running with Scissors: A Memoir
   Title: Horrible book, horrible.
   "This book was horrible. I read half of it, suffering from a headache the entire time, and eventually i lit it on fire. One less copy in the world...don't waste your money. I wish i had the time spent reading this book back so i could use it for better purposes. This book wasted my life"
!! Kitchen appliance review: Avante Deep Fryer, Chrome & Black
   Title: lid does not work well...
   "I love the way the Tefal deep fryer cooks, however, I am returning my second one due to a defective lid closure. The lid may close initially, but after a few uses it no longer stays closed. I will not be purchasing this one again."
!! Error increase when a classifier trained on books is tested on kitchen appliances: 13% → 26%




Handling Unseen Words
(1) Unsupervised: Cluster words based on context
(2) Supervised: Use clusters in place of the words themselves
!! Clustering intuition: Contexts for the word defective
!! Unlabeled kitchen contexts:
   !! "Do not buy the Shark portable steamer …. Trigger mechanism is defective."
   !! "the very nice lady assured me that I must have a defective set …. What a disappointment!"
!! Unlabeled books contexts:
   !! "The book is so repetitive that I found myself yelling …. I will definitely not buy another."
   !! "A disappointment …. Ender was talked about for <#> pages altogether."


Clustering: Feature Vectors for Words
!! Approximately 1000 pivots, 1 million feature words (a small sketch of building such vectors follows below)
!! [matrix figure: feature words as columns (fascinating, defective, repetitive, …); pivots as rows (excellent, …, not buy, awful, terrible)]
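A minimal sketch of building such pivot-space vectors from unlabeled reviews; the pivot list and the choice of "co-occurrence within the same review" are assumptions for illustration:

import re
from collections import Counter

# A hypothetical handful of pivots; the slide's actual set has ~1000.
PIVOTS = ["excellent", "awful", "terrible", "not buy"]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def pivot_vectors(reviews):
    """Map each word to counts of how often it co-occurs (in the same review) with each pivot."""
    counts = {}
    for review in reviews:
        words = tokenize(review)
        present = [p for p in PIVOTS if p in " ".join(words)]
        for w in set(words):
            vec = counts.setdefault(w, Counter())
            for p in present:
                vec[p] += 1
    # Dense vectors in a fixed pivot order, ready for k-means or a linear projection.
    return {w: [vec[p] for p in PIVOTS] for w, vec in counts.items()}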




K-Means in Pivot Space
!! [figure: words such as excellent, fantastic, works_well, the, blender, novel, defective, repetitive, not_buy grouped into clusters in pivot space]


Real-valued Linear Projections
!! [figure: the same words arranged along a single line]
!! Position along the line gives a real-valued soft notion of "polarity" for each word (one possible projection is sketched below)
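Clustering these word vectors is just the earlier kmeans() sketch applied in pivot space. For the real-valued projection, the sketch below uses the first principal direction of the vectors, which is one simple choice and not necessarily the lecture's exact method:

import numpy as np

def polarity_scores(word_vecs):
    """word_vecs: dict mapping word -> list of pivot co-occurrence counts (as above)."""
    words = list(word_vecs)
    X = np.array([word_vecs[w] for w in words], dtype=float)
    X -= X.mean(axis=0)
    # First right singular vector = direction of greatest variance in pivot space.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[0]
    # Position along this line is a real-valued, soft notion of polarity.
    return dict(zip(words, scores))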




Some results
!! [bar chart: supervised vs. semi-supervised classification accuracy, Train: Books / Test: Kitchen Appliances; values shown: 74.5, 78.9, 87.7]


Learned Polarity Clusters
!! Books:
   !! negative: plot, <#>_pages, predictable
   !! positive: fascinating, engaging, must_read, grisham
!! Kitchen:
   !! negative: poorly_designed, awkward_to, the_plastic, leaking
   !! positive: espresso, years_now, are_perfect, a_breeze



