Tangible User Interfaces and Reinforcement Learning _Smart Toys_

					 Tangible User Interfaces and
   Reinforcement Learning
        (Smart Toys)


      An honours thesis presentation by…
Trent Apted <tapted@it.usyd.edu.au>
 Supervised by A/Prof Bob Kummerfeld
   Smart Internet Technology Research Group
   Tangible User Interfaces

• Not just a mouse
   – Although he can advance my slides
• Facilitate a more intimate interaction with the
  user
   –   Mainly targeted towards children
   –   Huggable, cute and cuddly
   –   Develop a relationship with the user
   –   Play games
           Toys - Motivation

• Plush (soft and furry) toys account
  for around 25% of toy store sales
• Over 17 million Furby toys were
  sold between October 1998 and
  December 1999
   – They had primitive learning
     capabilities
   – Mostly robot-like in appearance
   – They were also relatively cheap
     (unlike Sony’s Aibo ~$2,000+)
  Toys - Challenges

• Want to (cheaply) make a Smart Toy,
  derived from a plush doll
• Don’t want to adversely affect the
  original function
   – Namely, being soft, cute and cuddly
• Also want to be able to detect the
  usual ‘plush toy’ interactions
   – E.g. squeeze, carry, lie down with
• I am not an engineer…
  Reinforcement Learning
• Like training a dog with a ‘clicker’
• Need to associate the reward (click) with
  behaviour in a nearby temporal window
   – How to represent the behaviour
   – How to determine the window
• Apply learning that attempts to maximise
  the expected future reward
• Many techniques
   – Q-learning, TD(λ), Bayesian models, Markov
     models, neural networks, actor-critic,
     hierarchical
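
The reward-association idea on this slide can be illustrated with a minimal sketch, assuming a tabular Q-learning-style update in which a user's reward is credited to the last few (state, action) pairs. The class name `WindowedQLearner`, the parameter values, and the decay of credit with age are illustrative assumptions, not the method used in the thesis.

```python
import random
from collections import defaultdict, deque


class WindowedQLearner:
    """Tabular Q-learning-style learner with a short reward window (hypothetical sketch)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, window=3, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # probability of a random (exploratory) action
        self.q = defaultdict(float)          # Q[(state, action)] -> value
        self.recent = deque(maxlen=window)   # the temporal window of behaviour
        self.rng = random.Random(seed)

    def choose(self, state):
        """Pick an action epsilon-greedily and remember it in the window."""
        if self.rng.random() < self.epsilon:
            action = self.rng.choice(self.actions)
        else:
            action = max(self.actions, key=lambda a: self.q[(state, a)])
        self.recent.append((state, action))
        return action

    def reward(self, r, next_state):
        """Credit reward r to every pair in the window, newest pair first."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        for age, (s, a) in enumerate(reversed(self.recent)):
            # Assumption: older behaviour receives geometrically less credit.
            target = r * (self.gamma ** age) + self.gamma * best_next
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
        self.recent.clear()
```

For example, a toy that waves when squeezed and is then "clicked" would call `choose("squeezed")` followed by `reward(1.0, "idle")`, raising the value of waving in the squeezed state.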
       Reinforcement Learning - Challenges

• Not all techniques can be applied to this scenario
   –   Infinite: no end to training examples
   –   Interactive: need to wait for the user to determine the reward
   –   Discrete: few training examples
   –   Future use: a (cheap) toy cannot hold a lot of state
   –   Sensors are unsophisticated (Boolean)
• Also needs to be fun
   – Non-determinism
   – Anticipate possible actions without stimuli
• It may also not be possible to punish the model
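
Two of the constraints above (Boolean sensors with little room for state, and the need for non-determinism) can be sketched as follows. This is only an illustration under stated assumptions: the sensor names in `SENSORS` are hypothetical, and softmax (Boltzmann) action selection is one common way to obtain non-deterministic behaviour, not necessarily the one used in the thesis.

```python
import math
import random

# Hypothetical sensor names; the actual toy's sensors are not listed on the slides.
SENSORS = ("squeeze", "tilt", "light")


def encode_state(readings):
    """Pack Boolean sensor readings into one small integer (one bit per sensor)."""
    state = 0
    for bit, name in enumerate(SENSORS):
        if readings.get(name, False):
            state |= 1 << bit
    return state


def softmax_choice(values, temperature=1.0, rng=random):
    """Return an action index, favouring higher values but never fully deterministic
    unless the temperature is very low."""
    top = max(values)
    # Subtracting the maximum keeps exp() from overflowing.
    weights = [math.exp((v - top) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights, k=1)[0]
```

With three Boolean sensors there are only 2**3 = 8 states, so a value table of 8 rows by the number of actions fits comfortably in a small memory budget.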
My Contributions – Hardware / Systems

       • Design and implementation of
         the circuitry and sensors
       • Integration into a plush toy
       • A hardware ↔ software
         interface (via parallel port) and
         event model
       • Many lessons learnt
          – E.g. limitations of high-level
            hardware (PDA)
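
As a rough illustration of what an event model over Boolean sensors might look like, the sketch below turns two successive snapshots of the input lines into press and release events. `SensorEvent`, `diff_events`, and the default of five input lines are all assumptions for illustration; the slides do not describe the actual interface, and reading the port itself is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    sensor: int    # index of the input line that changed
    pressed: bool  # True if the line went high, False if it went low


def diff_events(previous, current, n_sensors=5):
    """Compare two bitmask snapshots and report one event per changed line."""
    changed = previous ^ current  # bits that differ between the snapshots
    events = []
    for bit in range(n_sensors):
        if changed & (1 << bit):
            events.append(SensorEvent(bit, bool(current & (1 << bit))))
    return events
```

A polling loop would call `diff_events(last_snapshot, new_snapshot)` on each read and pass the resulting events to the learner, so that software only reacts to changes rather than to every sample.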
         My Contributions – Software

• Reinforcement learning in the context of a
  Smart Toy
• Flexible learning architecture for further
  research and exploration (in other contexts)
• Evaluation of the reinforcement learning
  techniques implemented
• Implementation of a number of simple games
  to motivate learning of the toy (fun?)
 Some Results and Analysis




• Increasing the state space and re-presenting examples
  does not help interactive learning
• ‘Snapshot’ environments perform poorly and do not
  benefit from increasing the learner complexity
• Q-Learning combined with Markov models performs well
                    Future Work
• Improve the abilities of the toy
    – There are spare wires, so a speaker would be easy to add
    – Speech recognition would be harder
• Wireless
    – Remove the tether for more natural interaction
    – But needs a power source and increases expense
• Collaboration
    – ‘talking’ to other Smart Toys, collaborating in games
    – Collaborative learning
• Examine more learning models
• Psychological / Sociological aspects

				