MDPs and distributed reinforcement learning


Rich Sutton

Nothing useful in IEEEXPLORE

His webpage: http://www-anw.cs.umass.edu/~rich/publications.html

       -Reinforcement Learning: An Introduction (book)
       http://www-anw.cs.umass.edu/~rich/book/the-book.html


Manuela Veloso
      -AI planning in supervisory control systems
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=561490&isnumber=12217&punumber=4232&k2dockey=561490@ieeecnfs&query=(veloso%20m.%20%20a.<in>au%20)&pos=0

      -RoboCup 2001
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=1019487&isnumber=21928&punumber=100&k2dockey=1019487@ieeejrns&query=(veloso%20m.<in>au%20)&pos=0

      -Multi-robot team response to a multi-robot opponent team
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=1241933&isnumber=27834&punumber=8794&k2dockey=1241933@ieeecnfs&query=(veloso%20m.<in>au%20)&pos=4

      -Motion control in dynamic multi-robot environments
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=810044&isnumber=17587&punumber=6589&k2dockey=810044@ieeecnfs&query=(veloso%20m.<in>au%20)&pos=13

      -Individual and collaborative behaviors in a team of homogeneous robotic soccer agents
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=699074&isnumber=15164&punumber=5659&k2dockey=699074@ieeecnfs&query=(veloso%20m.<in>au%20)&pos=2


Distributed + Reinforcement Learning

IEEEXPLORE – nothing amazing

http://ieeexplore.ieee.org/search/searchresult.jsp?query1=reinforcement+learning&scope1=ab&op1=and&query2=distributed&scope2=ab&op2=and&query3=&scope3=&caseflg=true&coll1=ieeejrns&coll2=ieejrns&coll3=ieeecnfs&coll4=ieecnfs&coll5=ieeestds&coll6=preprint&py1=1950&py2=2004&SortField=Score&SortOrder=desc&ResultCount=15

Tambe
      -Towards flexible teamwork in persistent teams
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=699065&isnumber=15164&punumber=5659&k2dockey=699065@ieeecnfs&query=(tambe%20m.<in>au%20)&pos=4

Mataric
      -Coordination and learning in multirobot systems
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=671083&isnumber=14787&punumber=5254&k2dockey=671083@ieeejrns&query=(mataric%20m.<in>au%20)&pos=5

      -Collective construction with multiple robots
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=1041677&isnumber=22328&punumber=8071&k2dockey=1041677@ieeecnfs&query=(mataric%20m.<in>au%20)&pos=2

      -Learning from history for adaptive mobile robot control behaviour based
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=724868&isnumber=15658&punumber=5876&k2dockey=724868@ieeecnfs&query=(mataric%20m.<in>au%20)&pos=10

Cooperative + Reinforcement Learning
      -Advantages of cooperation between reinforcement learning agents in difficult stochastic problems
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=839146&isnumber=18092&punumber=6771&k2dockey=839146@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=1

      -A reinforcement learning approach to cooperative problem solving
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=699295&isnumber=15164&punumber=5659&k2dockey=699295@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=2

      -Extended stochastic reinforcement learning for the acquisition of cooperative motion plans for dynamically constrained agents
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=390719&isnumber=8852&punumber=3093&k2dockey=390719@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=3

      -Interactive multiagent reinforcement learning with motivation rules
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=970456&isnumber=20917&punumber=7656&k2dockey=970456@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=6

      -Reinforcement learning approach to cooperation problem in a homogeneous robot group
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=931827&isnumber=20163&punumber=7417&k2dockey=931827@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=10

      -Cooperation and coordination between fuzzy reinforcement learning agents in continuous state partially observable Markov decision processes
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=793014&isnumber=17177&punumber=6417&k2dockey=793014@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=13

      -Multi-agent reinforcement learning: an approach based on the other agent's internal model
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=858456&isnumber=18605&punumber=6917&k2dockey=858456@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=8

      -Expertness based cooperative Q-learning
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=979961&isnumber=21106&punumber=3477&k2dockey=979961@ieeejrns&query=reinforcement+learning%3Cand%3Ecooperative&pos=10

      -Cooperative learning and planning for multiple robots
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=882949&isnumber=19098&punumber=7088&k2dockey=882949@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=5

      -Multi mobile robot navigation using distributed value function reinforcement learning
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=1241716&isnumber=27829&punumber=8794&k2dockey=1241716@ieeecnfs&query=reinforcement+learning%3Cand%3Ecooperative&pos=4

      -Competition and collaboration among fuzzy reinforcement learning agents
      http://ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=687560&isnumber=15060&punumber=5612&k2dockey=687560@ieeecnfs&query=(berenji%20h.<in>au%20)&pos=13
      -NEEDS SUBSCRIPTION

Kaelbling’s intro

http://www-2.cs.cmu.edu/afs/cs/project/jair/pub/volume4/kaelbling96a-html/rl-survey.html

Origins of Q-learning

Watkins, 1989
      Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. PhD thesis,
      Cambridge University, Cambridge, England.

Christopher J. C. H. Watkins and Peter Dayan. Q-learning. Machine Learning, 8(3):279-292, 1992.

John N. Tsitsiklis. Asynchronous stochastic approximation and Q-learning. Machine Learning, 16(3):185-202, September 1994.

Tommi Jaakkola, Michael I. Jordan, and Satinder P. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6), November 1994.

-Learning rates for Q-learning (Even-Dar and Mansour, JMLR vol. 5)
http://www.ai.mit.edu/projects/jmlr//papers/volume5/evendar03a/evendar03a.pdf
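
For context, the learning rate in question is the step size in the standard Q-learning update. A sketch in conventional textbook notation (the symbols s, a, r, s', \gamma and \alpha_t are the usual ones and are assumptions here, not taken from these notes), together with the step-size conditions commonly required in the convergence proofs listed above:

```latex
% Q-learning update after observing the transition (s, a, r, s')
Q_{t+1}(s,a) = \bigl(1-\alpha_t(s,a)\bigr)\,Q_t(s,a)
             + \alpha_t(s,a)\Bigl[r + \gamma \max_{a'} Q_t(s',a')\Bigr]

% Usual stochastic-approximation conditions on the step sizes,
% required for every pair (s,a)
\sum_{t} \alpha_t(s,a) = \infty,
\qquad
\sum_{t} \alpha_t(s,a)^2 < \infty
```

The first condition can only hold for a pair (s,a) that is updated infinitely often. Typical choices that satisfy both conditions are \alpha_t = 1/t (a linear rate) and \alpha_t = 1/t^{\omega} with 1/2 < \omega < 1 (a polynomial rate).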


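-For Q-learning to converge to the correct (optimal) values, every pair (s,a) must continue to be updated, i.e. every state-action pair has to be visited infinitely often.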
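
A minimal sketch of what this requirement looks like in practice, assuming tabular Q-learning with epsilon-greedy exploration on a small chain MDP. The environment, the function name `q_learning`, and all parameter values are hypothetical and chosen only for illustration. The random exploration step is what keeps every non-terminal pair (s,a) being updated.

```python
import random


def q_learning(n_states=4, n_actions=2, episodes=500, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP (illustration only).

    States are 0..n_states-1. Action 1 moves right, action 0 moves left
    (staying put at state 0). Entering the last state gives reward 1 and
    ends the episode, so the last state itself is never updated.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy: with probability epsilon pick a random action,
            # so no pair (s,a) is ever permanently abandoned.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])

            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            done = s_next == goal
            r = 1.0 if done else 0.0

            # Step size 1/n(s,a): an assumed schedule that satisfies the
            # usual stochastic-approximation conditions.
            visits[s][a] += 1
            alpha = 1.0 / visits[s][a]
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])

            if done:
                break
            s = s_next

    return q, visits


if __name__ == "__main__":
    q, visits = q_learning()
    for s, (row, counts) in enumerate(zip(q, visits)):
        print(s, [round(v, 3) for v in row], counts)
```

Setting `epsilon=0` makes the agent purely greedy after the first successful episode, and pairs such as "move left" can then stop being updated, which is the failure mode the note above warns about.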

				