Machine Learning

Spring 2010



Rong Jin

                   1
CSE847 Machine Learning
   Instructor: Rong Jin
   Office Hour:
       Tuesday 4:00pm-5:00pm
       Thursday 4:00pm-5:00pm
   Textbooks
       Machine Learning
       The Elements of Statistical Learning
       Pattern Recognition and Machine Learning
       Many topics are drawn from research papers
   Web site: http://www.cse.msu.edu/~cse847
                                                   2
Requirements
   6–10 homework assignments
   One course project
       Team: no more than 2 people
       Topics: either assigned by the instructor or
        proposed by the students themselves
       Deliverables: a project proposal, a progress report,
        and a final report
   Midterm exam & final exam
                                                             3
Goal
   Familiarize you with the state of the art in
    Machine Learning
       Breadth: many different techniques
       Depth: project
       Hands-on experience
   Develop a machine-learning way of thinking
       Learn how to model real problems using machine
        learning techniques
       Learn how to deal with real problems in practice
                                                           4
Course Outline

 Theoretical Aspects            Practical Aspects
 • Information Theory           • Supervised Learning Algorithms
 • Optimization Theory          • Unsupervised Learning Algorithms
 • Probability Theory           • Important Practical Issues
 • Learning Theory              • Applications




                                                            5
Today’s Topics
   Why machine learning?
   Example: learning to play backgammon
   General issues in machine learning




                                           6
Why Machine Learning?
   Past: most computer programs were written
    by hand
   Future: computers should be able to program
    themselves by interacting with their
    environment




                                                  7
Recent Trends
   Recent progress in algorithms and theory
   Growing flood of online data
   Growing availability of computational power
   Growing industry




                                              8
Three Niches for Machine Learning
   Data mining: using historical data to improve
    decisions
        Medical records → medical knowledge
   Software applications that are difficult to program by
    hand
       Autonomous driving
       Image Classification
   User modeling
       Automatic recommender systems
                                                         9
Typical Data Mining Task




Given:
   • 9,147 patient records, each describing a pregnancy and birth
   • Each record contains 215 features
Task:
   • Identify classes of future patients at high risk for Emergency Cesarean Section
Data Mining Results




One of 18 learned rules:
        If       no previous vaginal delivery,
                 and abnormal 2nd trimester ultrasound,
                 and malpresentation at admission
        Then     probability of Emergency C-Section is 0.6
                                                             11
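Such a rule reads naturally as code. A minimal sketch in Python, where the dict keys are illustrative names rather than the actual feature encoding from the study:

```python
def emergency_csection_risk(patient):
    """Apply the learned rule above to one patient record.

    `patient` is a dict of boolean features; the key names are
    illustrative, not the original dataset's encoding.
    """
    if (not patient["previous_vaginal_delivery"]
            and patient["abnormal_2nd_trimester_ultrasound"]
            and patient["malpresentation_at_admission"]):
        return 0.6  # predicted probability of Emergency C-Section
    return None     # the rule does not fire for this patient

print(emergency_csection_risk({
    "previous_vaginal_delivery": False,
    "abnormal_2nd_trimester_ultrasound": True,
    "malpresentation_at_admission": True,
}))  # prints 0.6
```

A rule learner outputs many such predicates (18 here), each paired with an estimated probability.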
Credit Risk Analysis




Learned Rules:
        If       Other-Delinquent-Accounts > 2,
                 and Number-Delinquent-Billing-Cycles > 1
        Then     Profitable-Customer? = no

        If       Other-Delinquent-Accounts = 0,
                 and (Income > $30K or Years-of-Credit > 3)
        Then     Profitable-Customer? = yes
Programs too Difficult to Program By Hand
   ALVINN drives 70mph on highways




                                        13
Programs too Difficult to Program By Hand
   Image Classification
   [Figure: classifying bird images. Positive and negative example
    images are used to train a statistical model, which is then
    applied to test images.]
                                                                             15
Image Retrieval Using Text




                              16
Automatic Image Annotation
   Automatically annotate images with textual
    words
   Retrieve images with textual queries




                                                 17
Software that Models Users
   History (movies the user has rated):
       Description: A homicide detective and a fire marshal must stop
        a pair of murderers who commit videotaped crimes to become
        media darlings.  Rating: …
       Description: A biography of sports legend Muhammad Ali, from
        his early days to his days in the ring.  Rating: …
       Description: Benjamin Martin is drawn into the American
        Revolutionary War against his will when a brutal British
        commander kills his son.  Rating: …
   What to Recommend?
       Description: A high-school boy is given the chance to write a
        story about an up-and-coming rock band as he accompanies it
        on its concert tour.  Recommend? No
       Description: A young adventurer named Milo Thatch joins an
        intrepid group of explorers to find the mysterious lost
        continent of Atlantis.  Recommend? Yes
Netflix Contest




                  19
Where is this Headed?
   Today: the tip of the iceberg
       First generation algorithms
       Applied to well-formatted databases
       Budding industry
   Opportunities for Tomorrow
       Multimedia
       Databases
       Robots
       Autonomic computing
       Bioinformatics
       …
                                              20
Relevant Disciplines
   Artificial Intelligence
   Statistics (particularly Bayesian statistics)
   Computational complexity theory
   Information theory
   Optimization theory
   Philosophy
   Psychology
   …
                                               21
Today’s Topics
   Why machine learning?
   Example: learning to play backgammon
   General issues in machine learning




                                           22
What is the Learning Problem?
   Learning = improving with experience at some task
       Improve over task T,
       with respect to performance measure P,
       based on experience E
   Example: Learning to Play Backgammon
       T: Play backgammon
       P: % of games won in world tournament
       E: opportunity to play against itself


                                                        23
Backgammon




   More than 10^20 states (boards)
   Best human players see only a small fraction of all
    boards during their lifetime
   Search is hard because of the dice (branching factor > 100)
                                                                 24
TD-Gammon by Tesauro (1995)




   Trained by playing with itself
   Now approximately equal to the best human
    player
                                                25
Learn to Play Chess
   Task T: Play chess
   Performance P: Percent of games won in the
    world tournament
   Experience E:
       What experience?
       How shall it be represented?
       What exactly should be learned?
       What specific algorithm to learn it?
                                                 26
Choose a Target Function
   Goal:
       Policy π: b → m          (B = set of boards)
   Choice of value function     (ℝ = real values)
       V: b, m → ℝ
       V: b → ℝ
                                             28
Value Function V(b): Example Definition

   If b is a final board that is won:    V(b) = 1
   If b is a final board that is lost:   V(b) = -1

   If b is not a final board:            V(b) = E[V(b*)],
    where b* is the final board reached by playing optimally from b



                                                  29
Representation of Target Function V(b)

A spectrum of representations:
   One extreme: the same value for every board (no learning)
   Other extreme: a lookup table with one entry per board
    (no generalization)
   In between: summarize experience into a parametric model, e.g.
       Polynomials
       Neural networks
                                                                       30
Example: Linear Feature
Representation
   Features:
       pb(b), pw(b) = number of black (white) pieces on board b
       ub(b), uw(b) = number of unprotected black (white) pieces
       tb(b), tw(b) = number of black (white) pieces threatened by the opponent
   Linear function:
       V(b) = w0·pb(b) + w1·pw(b) + w2·ub(b) + w3·uw(b) + w4·tb(b) +
        w5·tw(b)
   Learning:
       Estimation of the parameters w0, …, w5
                                                                   31
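A minimal sketch of this linear value function in Python, assuming a board has already been reduced to its six-feature vector [pb, pw, ub, uw, tb, tw]; the weights and feature counts below are illustrative, not learned:

```python
def V(features, w):
    """Linear board value: V(b) = w0*pb + w1*pw + w2*ub + w3*uw
    + w4*tb + w5*tw, for a board summarized by its feature vector."""
    return sum(wi * fi for wi, fi in zip(w, features))

# Illustrative weights: own pieces help, opponent pieces and
# threatened pieces hurt (signs chosen for the black player).
w = [0.5, -0.5, -0.2, 0.2, -0.3, 0.3]

# A board with 8 black / 7 white pieces, 1 vs. 2 unprotected,
# 0 vs. 1 threatened (made-up numbers).
b = [8, 7, 1, 2, 0, 1]
print(V(b, w))  # roughly 1.0
```

In a real implementation the feature vector would be computed from the actual backgammon board state; only the dot product above is the learned part.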
Tuning Weights
   Given:
       board b
       predicted value V(b)
       desired value V*(b)
   Calculate the error
       error(b) = (V*(b) – V(b))^2
    and for each board feature fi update
       wi ← wi + c·(V*(b) – V(b))·fi
   This stochastically minimizes
       Σb (V*(b) – V(b))^2
                                  (gradient descent optimization)

                                                                 32
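This update is the classic LMS gradient step: move each weight in the direction that reduces the squared error, scaled by the signed residual V*(b) − V(b). A sketch with a made-up feature vector and step size; repeated updates on the same board drive V(b) toward the desired value:

```python
def predict(w, f):
    """Linear value V(b) for a board with feature vector f."""
    return sum(wi * fi for wi, fi in zip(w, f))

def lms_step(w, f, target, c=0.01):
    """One step of w_i <- w_i + c * (V*(b) - V(b)) * f_i."""
    err = target - predict(w, f)          # signed error V*(b) - V(b)
    return [wi + c * err * fi for wi, fi in zip(w, f)]

w = [0.0] * 6
f = [8, 7, 1, 2, 0, 1]                    # illustrative features
for _ in range(200):
    w = lms_step(w, f, target=1.0)        # desired value V*(b) = 1
print(round(predict(w, f), 6))            # prints 1.0
```

Note that although the quantity being minimized is the squared error, the gradient carries the signed residual, which is what the update uses.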
    Obtain Boards




   Random boards
   Games played by beginners
   Games played by professionals
                          33
Obtain Target Values
   A person provides the value V(b)
   Play until termination. If the outcome is a
       win:  V(b) ← 1    for all boards in the game
       loss: V(b) ← -1   for all boards in the game
       draw: V(b) ← 0    for all boards in the game
   Play one move: b → b′
       V(b) ← V(b′)
   Play n moves: b → b′ → … → b(n)
       V(b) ← V(b(n))
                                            34
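The bootstrapping above (use the current estimate of a successor board as the training target for the current board) can be sketched as follows; boards are opaque keys and the game dynamics are stubbed out:

```python
def game_targets(trajectory, outcome, V):
    """Training targets for one self-play game.

    trajectory: boards b, b', ..., b(n) in order of play
    outcome:    +1 win, -1 loss, 0 draw (target for the final board)
    V:          current value estimates, mapping board -> float
    """
    targets = {trajectory[-1]: float(outcome)}
    for b, b_next in zip(trajectory, trajectory[1:]):
        targets[b] = V.get(b_next, 0.0)   # V(b) <- V(b')
    return targets

V = {"b1": 0.2, "b2": 0.7}
print(game_targets(["b0", "b1", "b2"], outcome=1, V=V))
# {'b2': 1.0, 'b0': 0.2, 'b1': 0.7}
```

This one-step target V(b) ← V(b′) is the temporal-difference idea behind TD-Gammon; the n-step variant on the slide substitutes V(b(n)) instead.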
A General Framework
   Mathematical                Finding Optimal
     Modeling                    Parameters


    Statistics         +        Optimization



                 Machine Learning

                                              35
Today’s Topics
   Why machine learning?
   Example: learning to play backgammon
   General issues in machine learning




                                           36
Important Issues in Machine Learning
   Obtaining experience
       How do we obtain experience?
            Supervised learning vs. unsupervised learning
       How many examples are enough?
            PAC learning theory
   Learning algorithms
       Which algorithms approximate the target function well, and when?
       How does the complexity of a learning algorithm affect learning accuracy?
       Is the target function learnable at all?
   Representing inputs
       How should the inputs be represented?
       How do we remove irrelevant information from the input representation?
       How do we reduce redundancy in the input representation?



                                                                                  37

				