Distributed Evolution for Swarm Robotics

                Suranga Hettiarachchi
             Computer Science Department
                University of Wyoming
Committee Members:
Dr. William Spears – Computer Science (Committee Chair / Research Advisor)
Dr. Diana Spears – Computer Science
Dr. Thomas Bailey – Computer Science
Dr. Richard Anderson-Sprecher – Statistics
Dr. David Thayer – Physics and Astronomy
Outline
• Goals and Contributions
•   Robot Swarms
•   Physicomimetics Framework
•   Offline Evolutionary Learning
•   Novel Distributed Online Learning
•   Obstacle Avoidance with Physical Robots
•   Conclusion and Future Work
Goals
• To improve the state-of-the-art of
  obstacle avoidance in swarm robotics.
• To create a novel real-time learning
  algorithm for swarm robotics, to
  improve performance in changing
  environments.
    Contributions
•   Improved performance in obstacle avoidance:
    •   Scales to far higher numbers of robots and obstacles than
        the norm
•   Invented an online population-based learning
    algorithm:
    •   Demonstrate feasibility of algorithm with obstacle avoidance,
        in environments that change dynamically and are three
        times denser than the norm, with obstructed perception
•   Hardware Implementation
    •   Implemented obstacle avoidance algorithm on real robots
    Diagram: Obstacle Avoidance, Online Learning Algorithm, Hardware Implementation
Robot Swarms
• Robot swarms can act as distributed
  computers, solving problems that a
  single robot cannot
• For many tasks, having a swarm
  maintain cohesiveness while avoiding
  obstacles and performing the task is
  of vital importance
• Example Task: Chemical Plume
  Source Tracing
Chemical Plume Source Tracing
Movie (the link to this movie may not work properly)
Physicomimetics for Robot Control




•Biomimetics: Gain inspiration from biological
systems and ethology.


•Physicomimetics: Gain inspiration from
 physical systems. Good for formations.
Physicomimetics Framework
• Robots have limited sensor range, and friction for stabilization.
• Robots are controlled via “virtual” forces from nearby robots,
  goals, and obstacles.
• F = ma control law.
• Virtual forces F on a robot A by other robots ai and the
  environment cause a displacement d in its behavior.
Figure: Seven robots form a hexagon
Two Classes of Force Laws

$F = \frac{G m_i m_j}{r^{p}}$
The “classic” law

$F = 24\epsilon \left( \frac{2 d \sigma^{12}}{r^{13}} - \frac{c \sigma^{6}}{r^{7}} \right)$
Novel use of LJ force law for robot control



The left “Newtonian” force law, is good for creating
swarms in rigid formations. The right “Lennard-
Jones” force law (LJ) more easily models fluid
behavior, which is potentially better for maintaining
cohesion while avoiding obstacles.
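The two force laws can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: it assumes the generalized LJ form F = 24ε(2dσ^12/r^13 - cσ^6/r^7), treats σ as a distance-scale parameter, uses a "positive means repulsive" sign convention, and caps forces at Fmax by simple clipping. The function names are hypothetical.

```python
def newtonian_force(r, G, p, f_max, m_i=1.0, m_j=1.0):
    """Magnitude of the "Newtonian" law G*m_i*m_j / r**p, capped at f_max."""
    if r <= 0:
        return f_max
    return min(G * m_i * m_j / r**p, f_max)


def lj_force(r, eps, c, d, sigma, f_max):
    """Signed LJ force; positive = repulsive, negative = attractive (assumed convention)."""
    if r <= 0:
        return f_max
    f = 24.0 * eps * (2.0 * d * sigma**12 / r**13 - c * sigma**6 / r**7)
    # Clip to [-f_max, f_max]; clipping is an assumption of this sketch.
    return max(-f_max, min(f, f_max))


# With c = d = 1 the LJ force vanishes at r = sigma * 2**(1/6),
# so this sigma places the equilibrium at a separation of 50.
SIGMA_FOR_50 = 50.0 / 2.0 ** (1.0 / 6.0)
```

Under these assumptions, robots closer than the equilibrium separation repel each other up to Fmax, and robots farther apart attract weakly, which is the behavior the "fluid" description relies on.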
    What do these force laws look like?
Figure: Change in force magnitude with varying distance for robot-robot interactions, with curves for Fmax = 1.0 and Fmax = 4.0 (desired robot separation distance = 50).
  Swarm Learning (Offline)
  • Typically, the interactions between the
    swarm robots are learned via simulation in
    “offline” mode.

  Diagram: Initial Rules -> Offline Learning, such as an Evolutionary Algorithm (EA) -> Final Rules that achieve the desired behavior. The learner passes Rules to the Swarm Simulation and receives Fitness back.
Swarm Simulation Environment
Offline Learning Approach
• An Evolutionary Algorithm (EA) is used to
  evolve the rules for the robots in the swarm.
• A global observer assigns fitness to the rules
  based on the collective behavior of the
  swarm in the simulation.
• Each member of the swarm uses the same
  rules. The swarm is a homogeneous
  distributed system.
•   For physicomimetics, the rules consist of force
    law parameters.
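The offline loop above can be sketched as a small elitist EA over a vector of force law parameters. This is only an illustration under stated assumptions: `toy_fitness` is a hypothetical stand-in for the global observer scoring a swarm simulation, and the truncation selection, Gaussian mutation, and parameter ranges are choices made for the sketch, not details taken from the thesis.

```python
import random


def evolve(fitness, n_params, pop_size=20, generations=50, step=0.1, seed=0):
    """Evolve one shared rule set (parameter vector); returns (best, best-per-generation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(n_params)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]          # keep the better half (elitism)
        children = [[g + rng.gauss(0.0, step) for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness), history


def toy_fitness(rules):
    """Hypothetical fitness: closeness to an arbitrary target parameter vector."""
    target = [0.3, 0.7, 0.5]
    return -sum((g - t) ** 2 for g, t in zip(rules, target))


best_rules, best_per_generation = evolve(toy_fitness, n_params=3)
```

In a real setting the fitness call would run the swarm simulation with every robot using the same candidate rules, which is what makes the evolved swarm homogeneous.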
Force Law Parameters
• Parameters of the “Newtonian” force law
    G - “gravitational” constant of robot-robot interactions
    P - power of the force law for robot-robot interactions
    Fmax - maximum force of robot-robot interactions
  Similar 3-tuples for obstacle/goal-robot interactions.

    G_r-r | P_r-r | Fmax_r-r | G_r-o | P_r-o | Fmax_r-o | G_r-g | P_r-g | Fmax_r-g

• Parameters of the LJ force law
    ε - strength of the robot-robot interactions
    c - non-negative attractive robot-robot parameter
    d - non-negative repulsive robot-robot parameter
    Fmax - maximum force of robot-robot interactions
  Similar 4-tuples for obstacle/goal-robot interactions.

    ε_r-r | c_r-r | d_r-r | Fmax_r-r | ε_r-o | c_r-o | d_r-o | Fmax_r-o | ε_r-g | c_r-g | d_r-g | Fmax_r-g

  (Subscripts: r-r = robot-robot, r-o = robot-obstacle, r-g = robot-goal.)
Measuring Fitness
•   Connectivity (Cohesion) : maximum number of
    robots connected via a communication path.
•   Reachability (Survivability) : percentage of
    robots that reach the goal.
•   Time to Goal : time taken by at least 80% of the
    robots to reach the goal.


High fitness corresponds to high connectivity,
high reachability, and low time to goal.
Figure labels: connectivity, reachability, goal, 4R
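The three metrics above could be computed roughly as follows. This is a sketch under assumptions: two robots are taken to be "connected" when they lie within a communication range of each other (the `comm_range` parameter is hypothetical), and a robot that never reaches the goal is recorded with an arrival time of `None`.

```python
import math
from collections import deque


def connectivity(positions, comm_range):
    """Size of the largest group of robots linked by a communication path."""
    unvisited = set(range(len(positions)))
    largest = 0
    while unvisited:
        queue = deque([unvisited.pop()])
        size = 1
        while queue:  # breadth-first search over robots within range
            i = queue.popleft()
            linked = [j for j in unvisited
                      if math.dist(positions[i], positions[j]) <= comm_range]
            for j in linked:
                unvisited.remove(j)
                queue.append(j)
            size += len(linked)
        largest = max(largest, size)
    return largest


def reachability(arrival_times):
    """Percentage of robots that reached the goal."""
    arrived = [t for t in arrival_times if t is not None]
    return 100.0 * len(arrived) / len(arrival_times)


def time_to_goal(arrival_times, fraction=0.8):
    """Time by which `fraction` of all robots have reached the goal, else None."""
    arrived = sorted(t for t in arrival_times if t is not None)
    needed = math.ceil(fraction * len(arrival_times))
    if needed == 0 or len(arrived) < needed:
        return None
    return arrived[needed - 1]
```

How the three values are combined into a single fitness score is not stated on the slide, so the sketch leaves them separate.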
    Summary of Results
•   We compared the performance of the best “Newtonian”
    force law found by the EA to the best LJ force law.

•   The “Newtonian” force law produces more rigid structures
    making it difficult to navigate through obstacles. This
    causes poor performance, despite high connectivity.

•   Lennard-Jones is superior, because the swarm acts as a
    viscous fluid. Connectivity is maintained while allowing the
    robots to reach the goal in a timely manner.

•   The Lennard-Jones force law demonstrates scalability in
    the number of robots and obstacles.
Connectivity of Robots
Time for 80% of the Robots to Reach the Goal

Force Law   Robots   Obstacles
                     20      40      60      80      100
Newt        20       1160    1260    1290    1530    1920
Newt        100      -       -       -       -       -
LJ          20       470     480     490     510     520
LJ          100      640     650     670     680     690
     A Problem

 •    The simulation assumes a certain environment.
      What happens if the environment changes
      when the swarm is fielded?
      • We can’t go back to the simulation world.
      • Can the swarm adapt “on-line” in the field?
Figure (left): Environment trained on.
Figure (right): Environment changes. Performance degrades.
Frequently Proposed Solution
• Each robot has sufficient CPU power and
  memory to maintain a complete map of the
  environment.
• When environment changes, each robot runs
  an EA internally, on a simulation of the new
  environment.
• Robots wait until new rules are evolved.
  (4 days of simulation time)

•   It is better to learn in the field, in real time.
    Example
 • The maximum velocity is increased by 1.5x.
 • Obstacles are tripled in size.
 • High obstacle density creates cul-de-sacs and
   robots are left behind. Collisions also occur.
 • Obstructed perception is also introduced.
 • The learned offline rules are no longer sufficient.
Figure (left): Environment trained on.
Figure (right): Environment changes. Performance degrades.
Novel Online Learning Approach
• Borrow from evolution.
  •   Each robot in the swarm is an individual in a
      population that interacts with its neighbors.
  •   Each robot contains a slightly mutated copy of
      the best rule set found with offline learning.
  •   When the environment changes, some
      mutations perform better than others.
  •   Better performing robots share their knowledge
      with poorer performing neighbors.
• We call this “Distributed Agent Evolution
  with Dynamic Adaptation to Local
  Unexpected Scenarios” (DAEDALUS).
DAEDALUS for Obstacle Avoidance
• Each robot is initialized with a randomly
  perturbed (via mutation) version of the
  force laws learned with the offline
  simulation.
• Robots are penalized if they collide with
  obstacles and/or are left behind.
• Robots that are most successful and are
  moving will retain the highest worth, and
  share their force laws with neighboring
  robots that were not as successful.
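One way to picture the sharing step is the sketch below. It is a hedged illustration, not the DAEDALUS implementation: the `Robot` class, the fixed penalty of 10, the starting worth of 100, and the choice to mutate rules as they are received are all assumptions introduced here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Robot:
    rules: list                      # force law parameters carried by this robot
    worth: float = 100.0             # hypothetical starting worth
    moving: bool = True
    neighbors: list = field(default_factory=list)  # indices of nearby robots


def mutate(rules, rate, rng, step=0.1):
    """Perturb each parameter with probability `rate` (Gaussian step is assumed)."""
    return [g + rng.gauss(0.0, step) if rng.random() < rate else g for g in rules]


def penalize(robot, collided, left_behind, penalty=10.0):
    """Lower a robot's worth when it collides and/or is left behind."""
    if collided:
        robot.worth -= penalty
    if left_behind:
        robot.worth -= penalty


def share_rules(robots, rate, rng):
    """Each robot adopts a mutated copy of the rules of its best moving, worthier neighbor."""
    updates = {}
    for i, robot in enumerate(robots):
        donors = [robots[j] for j in robot.neighbors
                  if robots[j].moving and robots[j].worth > robot.worth]
        if donors:
            best = max(donors, key=lambda r: r.worth)
            updates[i] = mutate(best.rules, rate, rng)
    for i, rules in updates.items():   # apply after the scan so the order does not matter
        robots[i].rules = rules
```

Because every exchange involves only a robot and its neighbors, no global observer is needed once the swarm is in the field.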
Experimental Setup
• There are five goals to reach in a long corridor.
• Between each goal is a different obstacle
  course.
• Robots that are left behind (due to obstacle
  cul-de-sacs) do not proceed to the next goal.
• The number of robots that survive to reach the
  last goal is low. We want the robots to learn to
  do better, while in the field.
DAEDALUS Results
• DAEDALUS succeeded in dramatically
  reducing the number of collisions and
  improving survivability, despite the
  difficulties caused by obstructed
  perception.
  (20 minutes of simulation time)


• Our results depended on the mutation
  rate. Can DAEDALUS learn that also?
Further DAEDALUS Results
• DAEDALUS also succeeded in learning the
  appropriate mutation rate for the robots.
  Hence, the system is striking a balance
  between exploration and exploitation.
Effect of Mutation Rate on Survival
Number of Robots Surviving with Different Mutation Rates

Stage    Total    1%    3%    5%    7%    9%
start    60       12    12    12    12    12
goal1    53       8     10    11    12    12
goal2    45       9     6     10    9     11
goal3    40       7     6     10    8     9
goal4    34       5     6     9     8     6
goal5    32       5     5     9     7     6
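As a quick check of the table, the per-stage totals and the final survival fraction for each mutation rate can be recomputed from the group counts. The snippet below only restates the numbers in the table; the variable names are illustrative.

```python
rates = ["1%", "3%", "5%", "7%", "9%"]
survivors = {
    "start": [12, 12, 12, 12, 12],
    "goal1": [8, 10, 11, 12, 12],
    "goal2": [9, 6, 10, 9, 11],
    "goal3": [7, 6, 10, 8, 9],
    "goal4": [5, 6, 9, 8, 6],
    "goal5": [5, 5, 9, 7, 6],
}

# Total survivors at each stage, summed over the five groups.
totals = {stage: sum(row) for stage, row in survivors.items()}

# Fraction of each 12-robot group that survives to the last goal.
final_fraction = {rate: n / 12 for rate, n in zip(rates, survivors["goal5"])}
best_rate = max(final_fraction, key=final_fraction.get)
```

The totals match the row labels (60, 53, 45, 40, 34, 32), and the 5% group keeps the largest share of its robots at the final goal (9 of 12).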
Collision Reduction




60 Robots moving towards 5 goals through 90 obstacles in between each goal
    Summary of DAEDALUS
• Creating rapidly adapting robots in changing
  environments is challenging.
• Offline learning can yield initial “seed” rules,
  which must then be perturbed.
• The key is to maintain “diversity” in the rules
  that control the members of the swarm.
• Collective behaviors still arise from the local
  interactions of a diverse population of robots.
Obstacle Avoidance with Robots
• Use three Maxelbot robots
• Use 2D trilateration localization
  algorithm (Not a part of this thesis)
• Design and develop obstacle
  avoidance module (OAM)
• Implement physicomimetics on a real
  outdoor robot
Hardware Architecture of Maxelbot
• MiniDRAGON for trilateration: provides robot coordinates
  (RF and acoustic sensors)
• MiniDRAGON for motor control: executes Physicomimetics
• OAM: AtoD conversion of the IR sensors
• The modules are connected via I2C
Physicomimetics for Obstacle Avoidance
 • Constant “virtual” attractive goal force in
   front of the leader
 • “Virtual” repulsive forces from four sensors
   mounted on the front of the leader, if
   obstacles detected
 • The resultant force creates a change in
   velocity due to F = ma
 • Power supplied to the motors is changed based
   on the forces acting on the leader.
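The control idea in these bullets can be sketched as follows. Everything numeric here is an assumption made for illustration: the sensor mounting angles, the linear repulsion profile, the gains, the unit mass, the time step, and the differential-drive mapping in `motor_powers` are hypothetical and are not taken from the Maxelbot code. The 30-inch cutoff follows the statement elsewhere in the talk that robots ignore obstacles more than 30 inches away.

```python
import math

# Hypothetical bearings of the four front sensors in degrees (negative = left).
SENSOR_BEARINGS_DEG = (-45.0, -15.0, 15.0, 45.0)


def leader_step(velocity, distances, goal_force=1.0, repulse_gain=2.0,
                max_range=30.0, mass=1.0, dt=0.1):
    """Update (forward, rightward) velocity from virtual forces using F = ma."""
    fx, fy = goal_force, 0.0               # constant attractive goal force straight ahead
    for bearing, d in zip(SENSOR_BEARINGS_DEG, distances):
        if d > max_range:
            continue                        # obstacle too far away: ignored
        magnitude = repulse_gain * (max_range - d) / max_range
        theta = math.radians(bearing)
        fx -= magnitude * math.cos(theta)   # push back, away from the obstacle
        fy -= magnitude * math.sin(theta)   # push sideways, away from the obstacle
    vx, vy = velocity
    return (vx + fx / mass * dt, vy + fy / mass * dt)


def motor_powers(velocity, gain=100.0):
    """Map velocity to (left, right) motor power; a leftward push slows the left motor."""
    vx, vy = velocity
    return (gain * (vx + vy), gain * (vx - vy))
```

For example, `motor_powers(leader_step((0.0, 0.0), [100, 100, 100, 10]))` gives a lower left-motor power than right-motor power, so the sketch turns left when the obstacle is on the right.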
Obstacle Avoidance Methodology
• Measure the performance of physicomimetics
  with repulsion from obstacles
• All experiments are conducted outdoors in
  “Prexy’s Pasture”
• Three Maxelbots: One leader and two followers
• Graphs show the correlation between raw
  sensor readings and motor power
• Leader uses the physicomimetics algorithm
  with the obstacle avoidance module
• Focus is on the obstacle avoidance by the
  leader, not the formation control
Figure: Maxelbot Turning Left - Obstacle on the Right. Right-most sensor reading and power to left motor, plotted against time.
If there is an obstacle on the right, power to left motor is reduced
Figure: Maxelbot Turning Right - Obstacle on the Left. Left-most sensor reading and power to right motor, plotted against time.
If there is an obstacle on the left, power to right motor is reduced
Figure: Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle. Average of the two middle sensors and average of the motor power, plotted against time.
If there is an obstacle in front, power to both motors is reduced
Further Analysis of Sensor Reading and Motor Power
• Scatter plots give more information
• Provide a broader picture of data
• Show the correlation of motor power with
  distance to an obstacle in inches (the robots
  ignore obstacles more than 30” away)
Movie of 3 Maxelbots, Leader has OAM
Figure: Maxelbot Turning Right - Obstacle on the Left. Power to right motor plotted against distance to obstacle on the left in inches. Annotations: left sensor sees obstacle; left middle sensor also sees obstacle.
    Contributions
•   Improved performance in obstacle avoidance:
    •   Applied a new force law for robot control, to improve performance
    •   Provided novel objective performance metrics for obstacle avoiding
        swarms
    •   Improved scalability of the swarm in obstacle avoidance
    •   Improved performance of obstacle avoidance with obstructed
        perception
•   Invented a real-time learning algorithm (DAEDALUS):
    •   Demonstrate that a swarm can improve performance by mutating
        and exchanging force laws
    •   Demonstrate feasibility of DAEDALUS with obstacle avoidance, in
        environments three times denser than the norm
    •   Explore the trade-offs of mutation on homogeneous and
        heterogeneous swarm learning
•   Hardware Implementation
    •   Present a novel robot control algorithm that merges
        physicomimetics with obstacle avoidance.
Future Work
• Use DAEDALUS to provide practical solutions to real
world problems
• Provide obstacle avoidance capability to all the robots in
the formation
• Develop robots with greater data exchange capability
• Adapt the physicomimetics framework to incorporate
performance feedback for specific tasks and situational
awareness
• Extend the physicomimetics framework for sensing and
performing tasks in a marine environment (with Harbor
Branch)
• Introduce robot/human roles and interactions to
distributed evolution architecture
Work Published
•   Spears W., Spears D., Heil R., Kerr W. and Hettiarachchi S. An overview of
    physicomimetics. Lecture Notes in Computer Science - State of the Art
    Series Volume 3342, 2004. Springer.
•   Hettiarachchi S. and Spears W., Moving swarm formations through obstacle
    fields. Proceedings of the 2005 International Conference on Artificial
    Intelligence, Volume 1, 97-103, CSREA Press.
•   Hettiarachchi S., Spears W., Green D., and Kerr W., Distributed agent
    evolution with dynamic adaptation to local unexpected scenarios.
    Proceedings of the 2005 Second GSFC/IEEE Workshop on Radical Agent
    Concepts. Springer.
•   Spears, W., D. Zarzhitsky, S. Hettiarachchi, W. Kerr. Strategies for multi-
    asset surveillance. IEEE International Conference on Networking, Sensing
    and Control, 2005, 929-934. IEEE Press.
•   Hettiarachchi, S. and W. Spears (2006). DAEDALUS for agents with
    obstructed perception. In SMCals/06 IEEE Mountain Workshop on Adaptive
    and Learning Systems, pp. 195-200. IEEE Press, Best Paper Award.
•   Hettiarachchi, S. (2006). Distributed online evolution for swarm robotics. In
    Doctoral Mentoring Program AAMAS06, T. Ishida and A. B. Hassine (Eds.),
    Autonomous Agents and Multi Agent Systems, pp. 17-18.
•   Hettiarachchi, S., P. Maxim, and W. Spears (2007). An architecture for
    adaptive swarms. In Robotics Research Trends, X. P Guo (Ed.). Nova
    Publishers (Book Chapter).
Thank You

Questions?
Backup Slides

Next set of slides may be confusing
because they are intended to be placed
between the slides from 1-49.
DAEDALUS for Reducing
Collisions


• Slightly mutate robot-obstacle force
  law interactions.
• Those robots that do not collide give
  their force laws to poorer performing
  robots.
DAEDALUS for Improving
Survival
• Previous experiment did not attempt to
  alleviate the situation where robots are left
  behind.
• This is caused by the large number of cul-de-sacs
  produced by high obstacle density.
• Slightly mutate robot-robot interaction, if
  there is a nearby moving neighbor.
• Rapidly mutate robot-goal interaction, if
  there are no neighbors.
Improved Survival




  The two online experiments are independent of each other.
Task: Obstacle Avoidance with Obstructed Perception
• Robots must organize themselves into a formation and then move
  toward a goal, while avoiding obstacles.
• A robot may not see another robot, due to the presence of obstacles.
• If r > minD, then robot A and robot B have their perception obstructed.
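The slide does not define r and minD. One plausible reading, assumed here and not confirmed by the source, is that r is the radius of a circular obstacle and minD is the shortest distance from the obstacle's center to the line of sight between robots A and B. Under that assumption the test can be sketched as:

```python
import math


def min_distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # A and B coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def perception_obstructed(a, b, obstacle_center, r):
    """True if r > minD, i.e. the obstacle disc cuts the A-B line of sight (assumed meaning)."""
    min_d = min_distance_to_segment(obstacle_center, a, b)
    return r > min_d
```

For instance, with A at (0, 0) and B at (10, 0), an obstacle of radius 2 centered at (5, 1) blocks the view, while the same obstacle centered at (5, 5) does not.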
DAEDALUS Results
• DAEDALUS online learning is improving performance.
• We do not train children on hard problems immediately; instead, we
  train them on easier problems first. This is counter to accepted
  wisdom in the EA community.
Results averaged over 100 independent runs
Homogeneous DAEDALUS
• All robots had the same mutation rate,
  which was 5%.
• The results may depend quite heavily on
  choosing the correct mutation rate.
• The best mutation rate may also depend on
  the environment, and should potentially
  change as the environment changes.
• We decided to explore this effect by
  conducting several experiments with
  different mutation rates.
Heterogeneous DAEDALUS
• We attempted to address the problem of
  choosing the correct mutation rate.
• We divided the 60 robots into five groups of
  12 robots each.
• The groups were assigned mutation rates of
  1%, 3%, 5%, 7%, and 9%, respectively.
• This mimics the behavior of children that
  have different “comfort zones” in their rate
  of exploration.
Heterogeneous Results
• The result at the final goal is essentially identical to the average
  of the five performance curves in the previous graph.
• Can DAEDALUS learn the proper “comfort zone”, instead?
Results averaged over 100 independent runs
Analogy – Children Learning
• Borrowed from the analogy of a “swarm” of
  children learning some task.
• They share useful information as to the
  rules they might use, but they also share
  meta-information as to the level of
  exploration that is actually safe!
• Very bold children might encourage their
  more timid comrades to explore more than
  they would initially.
• If a very bold child has an accident, the
  rest of the children will become more timid.
Extended Heterogeneous DAEDALUS - Results
• DAEDALUS now allows the robots to receive a neighbor’s mutation
  rate, in addition to the neighbor’s rules.
• The results are close to those achieved by the homogeneous
  DAEDALUS with the best mutation rate!
Results averaged over 100 independent runs
Why Physicomimetics?
• Capable of maintaining formations of
  robots
• Designed as a leader-follower
  algorithm
• Allows robots to move quickly, due to
  minimal communication
• Can use theory to set parameters
Physicomimetics for Formation Control
• The leader provides an attractive goal force
  for the followers
• The follower uses F = ma to compute the
  change in velocity that is required to follow
  the leader
• Power supplied to the motors is changed based
  on the changes in velocity
Formation Control Methodology
• Measure the quality of Physicomimetics without
  repulsions from obstacles
• All experiments are conducted outdoors in
  “Prexy’s Pasture”
• Three Maxelbots: One leader and two followers
• Results averaged over 10 runs
• Leader is remotely controlled (NO Physicomimetics)
• Leader does NOT have obstacle avoidance
  capability
• Focus is on the formation control, not the
  obstacle avoidance
Triangular Formation
Triangular Formation Results
Linear Formation
Linear Formation Results
Figure: Maxelbot Turning Right - Obstacle on the Left. Power to right motor plotted against distance to obstacle (inches). Annotations: left sensor sees obstacle; left middle sensor sees obstacle.
• Lag in stopping due to physicomimetic inertia. Helps counteract noisy sensors.
• Lag in starting due to physicomimetic inertia. Helps counteract noisy sensors.
Figure: Maxelbot Turning Left - Obstacle on the Right. Power to left motor plotted against distance to obstacle (inches). Annotations: right sensor sees obstacle; right middle sensor sees obstacle.
• Lag in stopping due to AP inertia. Helps counteract noisy sensors.
• Lag in starting due to AP inertia. Helps counteract noisy sensors.
Figure: Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle. Average of left and right motor power plotted against distance to obstacle (inches).
• Power will be reduced if the outermost sensors see an obstacle when the inner sensors do not.

								