Tangible User Interfaces and Reinforcement Learning (Smart Toys)
An honours thesis presentation by… Trent Apted <firstname.lastname@example.org>
Supervised by A/Prof Bob Kummerfeld
Smart Internet Technology Research Group

Tangible User Interfaces
• Not just a mouse
  – Although he can advance my slides
• Facilitate a more intimate interaction with the user
  – Mainly targeted towards children
  – Huggable, cute and cuddly
  – Develop a relationship with the user
  – Play games

Toys - Motivation
• Plush (soft and furry) toys account for around 25% of toy store sales
• Over 17 million Furby toys were sold between October 1998 and December 1999
  – They had primitive learning capabilities
  – Mostly robot-like in appearance
  – They were also relatively cheap (unlike Sony’s Aibo ~$2,000+)

Toys - Challenges
• Want to (cheaply) make a Smart Toy, derived from a plush doll
• Don’t want to adversely affect the original function
  – Namely, being soft, cute and cuddly
• Also want to be able to detect the usual ‘plush toy’ interactions
  – E.g. squeeze, carry, lie down with
• I am not an engineer…

Reinforcement Learning
• Like training a dog with a ‘clicker’
• Need to associate the reward (click) with behaviour in a nearby temporal window
  – How to represent the behaviour
  – How to determine the window
• Apply learning that attempts to maximise the total expected future reward
• Many techniques
  – Q-learning, TD(λ), Bayesian models, Markov models, neural networks, actor-critic, hierarchical

Reinforcement Learning - Challenges
• Not all techniques can be applied to this scenario
  – Infinite: no end to training examples
  – Interactive: need to wait for the user to determine the reward
  – Discrete: few training examples
  – Future use: a (cheap) toy cannot hold a lot of state
  – Sensors are unsophisticated (Boolean)
• Also needs to be fun
  – Non-determinism
  – Anticipate possible actions without stimuli
• It may also not be possible to punish the model

My Contributions – Hardware / Systems
• Design and implementation of the circuitry and sensors
• Integration into a plush toy
• A hardware/software interface (via parallel port) and event model
• Many lessons learnt
  – E.g. limitations of high-level hardware (PDA)

My Contributions – Software
• Reinforcement learning in the context of a Smart Toy
• Flexible learning architecture for further research and exploration (in other contexts)
• Evaluation of the reinforcement learning techniques implemented
• Implementation of a number of simple games to motivate learning of the toy (fun?)
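
Illustration: the ‘clicker’ idea from the Reinforcement Learning slides could look roughly like the sketch below. It assumes tabular Q-learning over Boolean sensor readings, with a click credited to the last few steps in a short temporal window. The action names, the window length and the decay of credit for older steps are hypothetical choices made for this example, not the thesis implementation.

```python
import random
from collections import defaultdict, deque

ACTIONS = ["giggle", "purr", "stay_quiet"]  # hypothetical toy behaviours


class ClickerQLearner:
    """Tabular Q-learning over Boolean sensor states with delayed click rewards."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, window=3, seed=0):
        self.q = defaultdict(float)         # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.recent = deque(maxlen=window)  # temporal window of recent steps
        self.rng = random.Random(seed)

    def choose(self, state):
        # Epsilon-greedy: occasional random actions keep the toy non-deterministic.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def step(self, state, action, next_state):
        # Record the transition; no reward is known until the user clicks.
        self.recent.append((state, action, next_state))

    def click(self, reward=1.0):
        # Credit the click to every step in the window, newest first.
        # Older steps receive a smaller share (assumed decay: gamma per step).
        for age, (s, a, s2) in enumerate(reversed(self.recent)):
            best_next = max(self.q[(s2, b)] for b in ACTIONS)
            target = reward * (self.gamma ** age) + self.gamma * best_next
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
        self.recent.clear()


# Usage: the state is a tuple of Boolean sensors, e.g. (squeezed, tilted).
learner = ClickerQLearner(epsilon=0.0)
squeezed = (True, False)
for _ in range(20):
    learner.step(squeezed, "giggle", squeezed)
    learner.click()  # the user rewards giggling while squeezed
print(learner.choose(squeezed))
```

Because the state is only a tuple of Booleans and the table holds one number per (state, action) pair, the memory footprint stays small, which matches the constraint that a cheap toy cannot hold a lot of state.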

Some Results and Analysis
• Increasing the state space and re-presenting examples does not help interactive learning
• ‘Snapshot’ environments perform poorly and do not benefit from increasing the learner complexity
• Q-Learning combined with Markov models performs well

Future Work
• Improve the abilities of the toy
  – There are spare wires - a speaker would be easy to add
  – Speech recognition would be harder
• Wireless
  – Remove the tether for more natural interaction
  – Power source and increased expense
• Collaboration
  – ‘Talking’ to other Smart Toys, collaborating in games
  – Collaborative learning
• Examine more learning models
• Psychological / Sociological aspects