									                                       Heading in the Right Direction

                                     Hagit Shatkay               Leslie P. Kaelbling
                                            Department of Computer Science
                                                   Brown University
                                                 Providence, RI 02912
                                              {hs,lpk}@cs.brown.edu

Abstract

Stochastic topological models, and hidden Markov models in particular, are a useful tool for robotic navigation and planning. In previous work we have shown how weak odometric data can be used to improve the learning of topological models, overcoming the common problems of the standard Baum-Welch algorithm. Odometric data typically contain directional information, which imposes two difficulties: First, the cyclicity of the data requires the use of special circular distributions. Second, small errors in the heading of the robot result in large displacements in the odometric readings it maintains. The cumulative rotational error leads to unreliable odometric readings. In this paper we present solutions to these problems by using a circular distribution and relative coordinate systems. We validate their effectiveness through experimental results from a model-learning application.

1   INTRODUCTION

Directional data is information consisting of magnitude and direction. Such data is an integral part of important applications in various areas of computer science in general and artificial intelligence in particular. In computer graphics, automatic production of pen-and-ink drawings and the production of animation based on magnetic-tracker data require statistical manipulation of directional data. In cognitive science, modeling routes chosen by animals [4] requires a similar kind of statistical manipulation. In the area of machine learning we often use probabilistic models for robot movement. Most aspects of robot movement (arm movement as well as whole-body movement) can be described in terms of location and heading change, requiring the use and manipulation of directional data.

Probabilistic models are widely used within the AI community. Such models may allow continuous probabilities, as demonstrated in work on Bayesian networks [7], hidden Markov models [5, 8], probabilistic clusters [2] and stochastic maps [19], to name a few. However, the assumption underlying all the above work is that continuous distributions are linear, that is, distributions that assign density to each point on the real line so that the area under the density curve, integrated over the whole real line, is 1.¹ Such models do not take into account directional data, which is inherently cyclic. Under circular distributions the density at any point $x$ on the real line is the same as that at $x + k \cdot c$, where $k$ is any integer and $c$ is some real number. The need for circular distributions has long been realized by statisticians [6], but the practice of using them has not found its way into the computer science community, and the machine learning community in particular. One of the goals of this paper is to point out the usefulness of one specific circular distribution in the context of robotics, and to provide a short tutorial on circular distributions.

    ¹ Most often the distribution is Gaussian.

Another special aspect of directional data is its sensitivity to errors. As most navigators, pilots and skippers have experienced, a small angular deviation from the original course causes a big displacement at the final location. This problem is very prominent in mobile robots, where drifts and drags of the wheels and misalignment of both engines and floors can cause a robot to face in the wrong heading with respect to its own odometric readings. Odometric information is recorded by the robot along three dimensions; it consists of the changes along the $x$ and the $y$ axes as well as a change in the heading of the robot within a global coordinate system. In our previous work on learning topological models [17] we made several assumptions about the odometric data:

    •   All odometric measures are normally distributed.
    •   All corridors are perpendicular to each other.
    •   The robot, when collecting the data, is using the perpendicularity assumption, and is collecting the data with respect to one global coordinate system.

This paper demonstrates the problematic aspects of these assumptions and introduces our solution to the problems, together with preliminary results that demonstrate the effectiveness of our solution. The rest of the paper is organized as follows: Section 2 describes our application and motivates the need for circular distributions in the context of machine learning; Section 3 presents the von Mises distribution, which is a circular version of the normal distribution; Section 4 discusses the problems faced due to heading deviations and presents our solution to the problem; Section 5 presents experiments and results to demonstrate the usefulness of our approach; Section 6 concludes the paper.

2   LEARNING TOPOLOGICAL MODELS

Hidden Markov models (HMMs), as well as their generalization to models for partially observable Markov decision processes (POMDP models), are a useful tool for representing environments such as road networks and office buildings, which are typical for robot navigation and planning [1, 14, 18]. Previous work on planning with such models typically assumed that the model is manually provided. Manual acquisition of these models can be very tedious and hard. It is desirable to learn such models automatically, both for robustness and in order to cope with new and changing environments. Since POMDP models are a simple extension of HMMs, they can, theoretically, be learned with a simple extension to the Baum-Welch algorithm [15] for learning HMMs. However, without a strong prior constraint on the structure of the model, the Baum-Welch algorithm does not perform very well: it is slow to converge, requires a great deal of data, and often becomes stuck in local maxima. In previous work [16, 17] we demonstrated how the simple Baum-Welch algorithm can be enhanced with weak local odometric information to learn better models faster, under the assumptions listed above. For the sake of completeness, we briefly review the essentials of this work here.

A robot moves through the corridors in an office environment. Low-level software provides a level of abstraction that allows the robot to move through hallways from intersection to intersection and turn ninety degrees to the left or right. At each intersection, ultrasonic data interpretation lets the robot observe, in each of the four cardinal directions, whether there is an open space, a door, a wall, or something unknown. The robot also has encoders on its wheels that allow it to estimate its current pose (position and orientation) with respect to its pose at the previous intersection. Of course, the action and perception routines and the odometric measures are all subject to error. The learning task is to deduce a model from the recorded observations and odometric information.

Our learning algorithm gets as input an experience sequence $E$ of observations and odometric readings, and produces as output an HMM,² $\lambda$, of the environment, such that the likelihood, $\Pr(E \mid \lambda)$, is locally maximized. Formally, the standard HMM is defined as a tuple $\lambda = \langle S, O, A, B, \pi \rangle$, where:

    •   $S = \{s_1, \ldots, s_N\}$ is a finite set of $N$ states;
    •   $O = \prod_{i=1}^{l} O_i$ is a finite set of observation vectors of length $l$; the $i$th element of an observation vector is chosen from the finite set $O_i$;
    •   $A$ is a stochastic transition matrix, with $A_{i,j} = \Pr(q_{t+1} = s_j \mid q_t = s_i)$; $1 \le i, j \le N$; $q_t$ is the state at time $t$;
    •   $B$ is an array of $l$ stochastic observation matrices, with $B_{i,j,o} = \Pr(V_t[i] = o \mid q_t = s_j)$; $1 \le i \le l$; $1 \le j \le N$; $o \in O_i$; $V_t$ is the observation vector at time $t$;
    •   $\pi$ is a stochastic initial probability vector describing the distribution of the initial state.

    ² We discuss here HMMs rather than POMDP models. Extension to POMDPs is straightforward, but notationally more cumbersome.

Odometric information gathered by the robot is not an inherent part of the topological model, but is used by the learning algorithm to better identify and distinguish states. To facilitate the use of this information we augment the standard model with the odometric relation matrix:

    •   $R$ is a relation matrix, specifying for each pair of states, $s_i$ and $s_j$, the mean and variance of the $D$-dimensional metric relation between them; $\mu(R_{i,j}[m])$ is the mean of the $m$th component of the relation between $s_i$ and $s_j$, and $(\sigma^m_{i,j})^2 \stackrel{\text{def}}{=} \sigma^2(R_{i,j}[m])$ is the variance, where $1 \le m \le D$. Furthermore, $R$ is geometrically consistent: for each component $m$, the relation $R^m(a, b) \stackrel{\text{def}}{=} \mu(R_{a,b}[m])$ must satisfy the following properties for all states $a$, $b$, and $c$:
        ◦   $R^m(a, a) = 0$;
        ◦   $R^m(a, b) = -R^m(b, a)$ (anti-symmetry); and
        ◦   $R^m(a, c) = R^m(a, b) + R^m(b, c)$ (additivity).

The odometric information recorded by the robot at time $t$, $r_t$, consists of the change in the $x$ and $y$ coordinates of the odometric readings when moving from state $q_{t-1}$ to state $q_t$, as well as the change of the robot's heading, $\theta$, between these states.

An arbitrary initial model $\lambda_0$ is assumed. Then an expectation maximization algorithm [3] is executed as follows:
    [Figure 1: Robot changes heading from state a, $\langle x, y, \theta \rangle$, to state b, $\langle x, y, \theta + 180^\circ \rangle$.]

    •   E-step: computes the state-occupation and transition probabilities, $\gamma_t(i) = \Pr(q_t = s_i \mid E, \lambda)$ and $\xi_t(i, j) = \Pr(q_t = s_i, q_{t+1} = s_j \mid E, \lambda)$, respectively, at each time $t$ in the sequence, given $E$ and the current model $\lambda$, and
    •   M-step: finds a new model $\bar{\lambda}$ that maximizes $\Pr(E \mid \bar{\lambda}, \gamma, \xi)$.

Introducing odometric information requires iterative updates of the odometric relations between pairs of states, in the relation matrix, $R$. The updates need to maintain the properties listed above, although currently the update procedure only satisfies the first two.

The learning task is further complicated by the special nature of the heading reading and the rotational errors accrued. The following section discusses in more detail the special issues of handling the heading information. The rest of the paper deals with resolving the problems caused by rotational errors.

3   DIRECTIONAL DATA AND DISTRIBUTIONS

Suppose a robot is in state $a$, which is at location $\langle x, y \rangle$ facing in direction $\theta$, as shown in Figure 1. By turning backwards, it transitions to state $b$, and a respective change of heading of approximately $\pm 180^\circ$ is recorded. Thus the new recorded configuration of the robot is $\langle x + \epsilon_1, y + \epsilon_2, \theta \pm 180^\circ + \epsilon_3 \rangle$, where $\epsilon_i$ is the error due to inaccuracy in both measurement and movement. In earlier work [17], we treated all errors, in both location ($x$, $y$) and heading ($\theta$), as if they were normally distributed. However, the change in heading is different from changes in $x$ and $y$, since angular measurements are cyclic. That is, a change in heading of $\alpha^\circ$ is the same as that of $\alpha \pm k \cdot 360^\circ$, for any integer $k$.

If we knew in advance, for every pair of states, the approximate change in heading ($\Delta\Theta$) between them, we could have modeled it as normal with mean $\Delta\Theta$ and small variance $\sigma^2$. We could have adopted a convention, normalizing all angles to be within a cyclic range, e.g. $[-180^\circ, 180^\circ]$ (similarly we may use radians), and always chosen to take as the angular change between two points $\min(\Delta\Theta, 360^\circ - \Delta\Theta)$, and assigned it the correct sign.

Such an approach of using a non-circular distribution is justified when the estimation of a position is based only on readings known a priori to be taken near this position (see for example work by Thrun et al [20] and Lu et al [12]).

However, we do not know in advance the angles between states. The data is a sequence of measurements recorded at all the states. We estimate the probabilities of the states in which they were recorded, and take a weighted mean of the measurements in order to estimate the angular change between every two states. Thus, we are facing the following problem: What is the interpretation of a "mean angle"?

As an example, suppose we want to estimate the heading change from state $a$ to state $b$ of Figure 1. We adopt the convention of angles being expressed between $-180^\circ$ and $180^\circ$. Also, suppose that the robot recorded two measurements of the angular distance from state $a$ to state $b$, one slightly below $180^\circ$ and the other slightly above $-180^\circ$. The simple average of these measurements yields an estimated mean heading change close to $0^\circ$. Obviously this value does not even approximate the change of heading between the two states. The same problem arises if we use any other convention for expressing angles (e.g. $0^\circ$ to $360^\circ$). The problem lies in the fact that angles that are about $180^\circ$ away from the mean angle indeed greatly deviate from this mean, while angles that deviate about $360^\circ$ are actually very close to it. To capture this idea, the concept of a circular distribution is required. We provide a brief introduction to the concepts and techniques used for handling directional data. In particular we concentrate on the von Mises distribution, a circular version of the normal distribution. Further discussion can be found in the statistical literature [6, 10, 13]. Section 3.3 returns to show how the theory is applied in our model and learning algorithm.

3.1 STATISTICS OF DIRECTIONAL DATA

Directional data in the 2-dimensional space can be represented as a collection of 2-dimensional vectors, $(\langle x_1, y_1 \rangle, \ldots, \langle x_n, y_n \rangle)$, on the unit circle, as shown in Figure 2. The points can also be represented as the corresponding angles between the radii from the center of the unit circle and the x axis, $(\theta_1, \ldots, \theta_n)$, respectively. The relationship between the two representations is:

\[ x_i = \cos(\theta_i), \qquad y_i = \sin(\theta_i) \qquad (1 \le i \le n) \]

The vector mean of the $n$ points, $\langle \bar{x}, \bar{y} \rangle$, is calculated as:

\[ \bar{x} = \frac{\sum_{i=1}^{n} \cos(\theta_i)}{n}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} \sin(\theta_i)}{n} \qquad (1) \]

Using polar coordinates, we can express the mean vector in terms of angle, $\bar{\theta}$, and length, $\bar{a}$, where (except for the case $\bar{x} = \bar{y} = 0$):

\[ \tan(\bar{\theta}) = \frac{\bar{y}}{\bar{x}}, \qquad \bar{a} = (\bar{x}^2 + \bar{y}^2)^{1/2} \qquad (2) \]

The angle $\bar{\theta}$ is the mean angle, while the length $\bar{a}$ is a measure (between 0 and 1) of how concentrated the sample angles are around $\bar{\theta}$. The closer $\bar{a}$ is to 1, the more concentrated the sample is around the mean, which corresponds to a smaller sample variance.
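
The contrast between the simple average and the vector mean can be checked numerically. The following is a minimal sketch, not code from the paper: the function name `circular_mean` and the two sample readings (179 and -179 degrees) are illustrative assumptions. It uses `atan2` instead of a bare arctangent so that the quadrant of the mean angle is resolved.

```python
import math

def circular_mean(angles_deg):
    """Return (mean angle in degrees, mean vector length) for a sample of angles.

    Hypothetical helper: implements the vector-mean computation of
    Equations 1 and 2. The mean angle is undefined when the mean vector
    is (0, 0); atan2 returns 0.0 in that case, so callers should check
    the returned length.
    """
    n = len(angles_deg)
    x_bar = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    y_bar = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    mean_length = math.hypot(x_bar, y_bar)
    # atan2 picks the correct quadrant, which arctan(y/x) alone cannot do.
    mean_angle = math.degrees(math.atan2(y_bar, x_bar))
    return mean_angle, mean_length

# Assumed example readings of a turn of roughly half a circle.
samples = [179.0, -179.0]
arithmetic = sum(samples) / len(samples)          # linear average
mean_angle, mean_length = circular_mean(samples)  # circular statistics
print(arithmetic, mean_angle, mean_length)
```

For these assumed readings the linear average points in the opposite direction from both measurements, while the vector mean lies at the half-turn with a mean length close to 1, indicating a tightly concentrated sample.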


    [Figure 2: Directional data represented as angles and as vectors on the unit circle.]

    [Figure 3: The von Mises distribution with mode 0 and various $\kappa$ values (curves shown for $\kappa = 1$ and $\kappa = 0.5$; horizontal axis in radians).]

A function $f$ is a density function of a continuous circular distribution if and only if: $f(x) \ge 0$ and $\int_0^{2\pi} f(x)\,dx = 1$. A simple example of a circular distribution is the uniform circular distribution, whose density function is $f(\theta) = \frac{1}{2\pi}$ (where $\theta$ is measured in radians).

One way of deriving a circular version of an unlimited linear distribution is through "wrapping" it around the circumference of the unit circle. If $x$ is a random variable on the line with probability density function $f(x)$, the wrapped random variable $x_w = x \ [\mathrm{mod}\ 2\pi]$ is distributed according to a wrapped distribution with the probability density function: $f_w(\theta) = \sum_{k=-\infty}^{\infty} f(\theta + 2\pi k)$. Applying this derivation to the normal distribution results in a circular version of the normal distribution, but estimating its parameters from sample data can be hard [6, 13]. An easier-to-estimate circular version of the normal distribution was derived by von Mises [6, 13]. We use this distribution to model the robot heading in this work, and it is described below.

3.2 THE VON MISES DISTRIBUTION

A circular random variable, $\theta$, $0 \le \theta < 2\pi$, is said to have the von Mises distribution with parameters $\mu$ and $\kappa$, where $0 \le \mu < 2\pi$ and $\kappa > 0$, if its probability density function is:

\[ f_{\mu,\kappa}(\theta) = \frac{1}{2\pi I_0(\kappa)} e^{\kappa \cos(\theta - \mu)} \]

where $I_0(\kappa)$ is the modified Bessel function of the first kind and order 0:

\[ I_0(\kappa) = \sum_{r=0}^{\infty} \frac{1}{r!^2} \left(\frac{1}{2}\kappa\right)^{2r} \]

Similar to the linear normal distribution, this is a unimodal distribution, symmetrical around $\mu$. The mode is at $\theta = \mu$ while the antimode is at $\theta = \mu + \pi$. We observe that the ratio of the density at the mode to the density at the antimode is $e^{2\kappa}$, which indicates that the larger $\kappa$ is, the more concentrated the density is about the mode. Figure 3 shows an "unwrapped" plot of the von Mises distribution for various values of $\kappa$ where $\mu = 0$.

We now describe how to estimate the parameters $\mu$ and $\kappa$ given a set of heading samples (angles $\theta_1, \ldots, \theta_n$) from a von Mises distribution [13]. We are looking for maximum likelihood estimates for $\mu$ and $\kappa$. The likelihood function for the data generated by a von Mises distribution with parameters $\mu$ and $\kappa$ is:

\[ L = \prod_{i=1}^{n} f_{\mu,\kappa}(\theta_i) = \frac{e^{\kappa \sum_{i=1}^{n} \cos(\theta_i - \mu)}}{(2\pi)^n I_0(\kappa)^n} \]

The maximum likelihood estimate for $\mu$, $\hat{\mu}$, is:

\[ \hat{\mu} = \arctan\left(\frac{\bar{y}}{\bar{x}}\right), \]

where $\bar{y}$, $\bar{x}$ are as defined in Equation 1.

The maximum likelihood estimate for $\kappa$ is the $\hat{\kappa}$ that solves the equation:

\[ \frac{I_1(\hat{\kappa})}{I_0(\hat{\kappa})} = \frac{1}{n} \sum_{i=1}^{n} \cos(\theta_i - \hat{\mu}) \qquad (3) \]

If we don't know $\mu$ and are only interested in estimating $\kappa$ with respect to the estimate $\hat{\mu}$, by using trigonometric manipulation and the definition of $\bar{a}$ (Equation 2), we can substitute the right hand side of Equation 3 by $\bar{a}$ and obtain that the maximum likelihood estimate for $\kappa$ is the $\hat{\kappa}$ that satisfies: $\frac{I_1(\hat{\kappa})}{I_0(\hat{\kappa})} = \bar{a}$.

However, if we do have a given $\mu$ and want to find a maximum likelihood estimate for the concentration $\kappa$ of the sample data around that specified $\mu$, we need to use as a maximum likelihood estimate for $\kappa$ the $\hat{\kappa}$ that satisfies:

\[ \frac{I_1(\hat{\kappa})}{I_0(\hat{\kappa})} = \frac{1}{n} \sqrt{ \left( \sum_{i=1}^{n} \cos(\theta_i) \right)^2 + \left( \sum_{i=1}^{n} \sin(\theta_i) \right)^2 - \left( \sum_{i=1}^{n} \sin(\theta_i - \mu) \right)^2 } \]
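
The density and the estimators of this section can be tried out with the standard library alone. The sketch below is illustrative rather than the implementation used in this work: the function names, the truncated power series for the Bessel functions, and the bisection search are all assumptions made for the example (the text itself relies on a lookup table of the quotient of the two Bessel functions). The search range caps the estimated concentration at 50.

```python
import math

def bessel_i(order, x, terms=80):
    """Modified Bessel function of the first kind (integer order),
    evaluated by its truncated power series. Adequate for the modest
    arguments used here (x up to about 50); an assumption of this sketch."""
    return sum((x / 2.0) ** (2 * r + order)
               / (math.factorial(r) * math.factorial(r + order))
               for r in range(terms))

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i(0, kappa))

def estimate_kappa(target, lo=1e-9, hi=50.0, iters=100):
    """Solve I1(k)/I0(k) = target by bisection.

    The quotient increases with k, so bisection converges. Targets outside
    the range reachable on [lo, hi] are clamped to the nearest bound."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bessel_i(1, mid) / bessel_i(0, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def fit_von_mises(thetas):
    """Maximum likelihood estimates (mu_hat, kappa_hat) from angles in radians,
    for the case where the mean direction is unknown."""
    n = len(thetas)
    x_bar = sum(math.cos(t) for t in thetas) / n
    y_bar = sum(math.sin(t) for t in thetas) / n
    mu_hat = math.atan2(y_bar, x_bar)   # quadrant-aware arctan(y_bar / x_bar)
    a_bar = math.hypot(x_bar, y_bar)    # mean vector length (Equation 2)
    return mu_hat, estimate_kappa(a_bar)

# Assumed sample of headings (radians), symmetric about 0.3.
mu_hat, kappa_hat = fit_von_mises([0.1, 0.3, 0.5])
print(mu_hat, kappa_hat)
```

A lookup table, as used in this work, trades the repeated series evaluation for a single interpolation; bisection is used here only because it keeps the example self-contained.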
The above estimation formulae agree with the intuition that the sample is more concentrated ($\hat{\kappa}$ is larger) about the sample mean ($\hat{\mu}$) than about the true distribution mean ($\mu$).

The rest of the section explains how the von Mises parameters are incorporated into the hidden Markov model, and how the learning algorithm is adapted to learn these parameters.

3.3 HANDLING ANGULAR ODOMETRIC READINGS

To model the heading difference between each pair of states, the relation matrix $R$, described in Section 2, is 3-dimensional, consisting of the components $\langle x, y, \theta \rangle$. The component $R_{i,j}[\theta]$ represents the heading change of moving from state $s_i$ to $s_j$, and is assumed to be distributed according to the von Mises distribution. The notation $\mu(R_{i,j}[\theta])$ represents the mean of the distribution for this heading change, while $\kappa(R_{i,j}[\theta])$ represents the concentration parameter around the mean. The three constraints described before for the components of $R$ (ideally) hold for the $\theta$ component as well.

Similarly, every observed relation item, $r_t$, in the experience sequence $E$, has a heading-change component, $r_t[\theta]$, which records the robot's estimated change in heading between the state at time $t$, $q_t$, and the state $q_{t+1}$.

The reestimation formula for the von Mises mean parameter of the heading change between states $s_i$ and $s_j$ is:

\[ \bar{\mu}(R_{i,j}[\theta]) = \arctan\left( \frac{\sum_{t=0}^{T-2} \left[ \sin(r_t[\theta])\,\xi_t(i,j) - \sin(r_t[\theta])\,\xi_t(j,i) \right]}{\sum_{t=0}^{T-2} \left[ \cos(r_t[\theta])\,\xi_t(i,j) + \cos(r_t[\theta])\,\xi_t(j,i) \right]} \right) \]

The fraction denotes the ratio between the expected sine and the expected cosine of the heading change from state $s_i$ to state $s_j$. Since the heading change from $s_i$ to $s_j$ is identical in magnitude but opposite in direction to the heading change from $s_j$ to $s_i$, the transitions from $s_j$ to $s_i$ are also accumulated, with reversed signs. By taking the arctan of this ratio we get an estimate for the mean heading change itself.

To reestimate the concentration parameter, we need to find $\bar{\kappa}$ such that:

Finding $\bar{\kappa}$ that satisfies this equation is done through the use of a lookup table listing values of the quotient $\frac{I_1[x]}{I_0[x]}$.

The above reestimation formulae agree with the maximum likelihood estimator formulae given in Section 3.1. Their correctness can be proved along the lines of the proof provided in our previous document [16].

4   STATE-RELATIVE COORDINATE SYSTEMS

In our previous work we assumed that there is a single global coordinate system within which the robot operates. Moreover, we assumed that the robot collects its data within a perpendicular corridor framework and that it takes advantage of this single perpendicular framework while recording odometric information. This assumption may be troublesome in practice. The rest of the paper discusses the potential problems, presents a method for relaxing the assumptions and addressing the problems, and demonstrates the effectiveness of the solutions through experiments and results.

4.1 MOTIVATION

We tend to think about an environment as consisting of landmarks fixed in a global coordinate system and corridors or transitions connecting these landmarks. However, this view may be problematic when robots are involved. Conceptually, a robot operates at two levels: the abstract level, in which it centers itself through corridors, follows walls and avoids obstacles, and the physical level, in which motors turn the wheels as the robot moves. At the physical level many inaccuracies can occur: unaligned wheels or unsynchronized motors can cause sideways drift, an obstacle under a wheel can cause the robot to slightly rotate around itself, or uneven floors may cause the robot to slip in a certain direction. In addition, the odometric measuring instrumentation may be inaccurate in and of itself. At the abstract level, corrective actions are constantly executed to overcome the physical drift and drag. For example, if the left wheel is misaligned and drags the robot leftwards, a corrective action of moving to the right is constantly taken at the higher level to keep the robot centered in the corridor.

Such phenomena greatly affect the odometry recorded by
                                                                    the robot, if it is interpreted with respect to one global
                      ÈÌ  ¾
       Á½       ℄         Ø ¼   Ø   ´ µ
Ó×´ÖØ ℄           µ℄        framework. For example, consider the robot depicted in
                                                                    Figure 4. It drifts to the left   Æ when moving from one
                                ÈÌ  ¾
       Á¼       ℄                    Ø ¼ Ø´   µ                     state to the next, and corrects for it by moving Æ to the
                                                                    right to maintain itself centered in the corridor, moving
     In contrast, Ü and Ý are normally distributed and have their   along the solid arrow. Let us assume that states are lo-
variance rather than concentration stored in Ê.



                               −φ                                                                                                15000



                                                                                -25000   -20000   -15000   -10000        -5000                 5000   10000

 Figure 4: The robot moves in a corridor along the solid arrow,     Figure 5: A path in a perpendicular environment, plotted based
 correcting for drift in the direction of the dashed arrow.         on odometric readings taken by the robot Ramona.

cated along the center of the corridor, which is aligned
with the Ý axis of the global coordinate system. The robot
steps back and forth in the corridor. Whenever it reaches
a state, its odometry reading changes by Ü Ý      along the
         heading dimensions, respectively. As the robot                                  Si

proceeds, the deviation with respect to the Ü axis becomes                                                                       ∆y
more and more severe. Thus, after going through several
transitions, the odometric changes recorded between every                                                                             x

pair of states, with respect to a global coordinate system,
become larger and larger (especially in the dimension).           Figure 6: Robot in state Ë , facing in the direction of the Ý axis.
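The accumulation described above can be illustrated with a small dead-reckoning sketch. This is a toy model rather than the authors' code: it assumes that every transition is logged as a forward step of fixed length plus a spurious heading change of the same size (the corrective turn), and the function name, the step length, and the sign convention (heading measured from the $y$ axis) are hypothetical choices made only for illustration.

```python
import math


def recorded_steps(num_steps, step_len, drift_deg):
    """Toy dead-reckoning model of a robot that logs a spurious turn per step.

    Returns two lists of (dx, dy) displacements, one entry per transition:
    the first expressed in a fixed global frame, the second in a frame
    anchored at the source state (y axis along the heading logged there).
    """
    drift = math.radians(drift_deg)
    heading = 0.0  # logged heading, measured from the global y axis
    in_global, in_relative = [], []
    for _ in range(num_steps):
        # Global frame: project the forward step onto the fixed axes.
        in_global.append((step_len * math.sin(heading),
                          step_len * math.cos(heading)))
        # Relative frame: the step always points along the local y axis.
        in_relative.append((0.0, step_len))
        # Assumed odometry error: every transition logs an extra -drift turn.
        heading -= drift
    return in_global, in_relative


if __name__ == "__main__":
    g, r = recorded_steps(num_steps=6, step_len=1.0, drift_deg=5.0)
    for k, ((gx, gy), (rx, ry)) in enumerate(zip(g, r)):
        print(f"step {k}: global dx={gx:+.3f} dy={gy:.3f} | "
              f"relative dx={rx:+.3f} dy={ry:.3f}")
```

Under these assumptions the $x$ component of the logged step grows in magnitude with every transition in the global frame, while in the state-anchored frame every step is identical, which is what a state-relative coordinate system is meant to achieve.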

Similar problems of inconsistent odometric changes recorded between pairs of states can arise along any of the odometric dimensions. The problem is especially severe when such inconsistencies arise with respect to the heading, since this can lead to confusion between the $x$ and the $y$ axes, as well as confusion between forwards and backwards movement (when the deviation in the heading is around $90^\circ$ or $180^\circ$ respectively). An example of our robot's view of a perfectly perpendicular office environment, based on its odometric readings within a global coordinate system, is shown in Figure 5. The data was collected by our robot Ramona, while moving along the corridors in an area of our department, depicted in Figure 7.

A solution to such a situation is to model the odometric relations of moving from state $s_i$ to state $s_j$ using a changing coordinate system which is relative to state $s_i$, as opposed to a global coordinate system anchored at the initial state. We formalize this idea and provide the update rules for the odometric information based on this approach in the rest of this section. We have implemented our solution, and demonstrate its effectiveness throughout Section 5.

4.2 LEARNING ODOMETRIC RELATIONS WITH CHANGING COORDINATES

As before, our experience sequence consists of $T$ pairs $\langle r_t, V_t \rangle$ of recorded odometric relations and observation vectors. The odometric relations are still recorded with respect to the robot's global coordinate system. However, when learning the relation matrix from the odometric readings, we interpret the entry $R_{ij}$ in the relation matrix $R$ as encoding the information with respect to a coordinate system whose origin is anchored at the state $s_i$; the $y$ axis is aligned with the robot's heading in state $s_i$ and the $x$ axis is perpendicular to it. This is depicted in Figure 6. The robot is in state $s_i$ facing in the direction pointed to by the $y$ axis. Its relationship to the state $s_j$ is described in terms of the coordinate system shown in the figure. Its heading in each state is denoted by the bold arrow.

To support this interpretation of the relation matrix we need to revisit the formulation of the geometrical-consistency constraints stated in Section 2, as well as the update formulae used when learning the model.

The consistency constraints have to reflect the coordinate system with respect to which the odometry is represented. Since the heading measurement is independent of any specific coordinate system, only the constraints over the $x$ and $y$ components of the odometric relation need to be redefined. We denote by $\mu^{\langle x,y \rangle}(a,b)$ the vector $\langle \mu(R_{ab}[x]), \mu(R_{ab}[y]) \rangle$. Let us define $T_{ab}$ to be the transformation which maps an $\langle x_a, y_a \rangle$ pair represented with respect to the coordinate system of state $a$, to the same pair represented with respect to the coordinate system of state $b$, $\langle x_b, y_b \rangle$ (note that $T_{ab} = T_{ba}^{-1}$).

More explicitly, as before, let $\mu^{\theta}(a,b)$ be the mean change in heading from state $a$ to state $b$ (recall that $\mu^{\theta}(a,b) = -\mu^{\theta}(b,a)$). The transformation $T_{ab}$ is defined as follows:

\[
\begin{pmatrix} x_b \\ y_b \end{pmatrix}
= T_{ab} \begin{pmatrix} x_a \\ y_a \end{pmatrix}
= \begin{pmatrix}
x_a \cos(\mu^{\theta}(a,b)) - y_a \sin(\mu^{\theta}(a,b)) \\
x_a \sin(\mu^{\theta}(a,b)) + y_a \cos(\mu^{\theta}(a,b))
\end{pmatrix}
\]

We can now redefine the consistency constraints for the $x$ and $y$ components of the odometric relation:
Figure 7: Model of a prescribed path through a true hallway

Figure 8: Learned topological model.
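The transformation $T$ defined above amounts to a plane rotation by the mean heading change. The following is a minimal Python sketch, not the authors' implementation; the function name `transform` and the use of radians are assumptions made for illustration. It also checks numerically the stated property that the transformation for the reverse direction is the inverse, which follows from the reverse heading change being the negative of the forward one.

```python
import math


def transform(xy, mean_heading_change):
    """Rotate an (x, y) pair by the mean heading change (in radians).

    Mirrors the definition in the text:
        x' = x*cos(theta) - y*sin(theta)
        y' = x*sin(theta) + y*cos(theta)
    """
    x, y = xy
    c = math.cos(mean_heading_change)
    s = math.sin(mean_heading_change)
    return (x * c - y * s, x * s + y * c)


if __name__ == "__main__":
    theta_forward = math.radians(30.0)  # hypothetical mean heading change
    theta_reverse = -theta_forward      # heading changes are anti-symmetric
    v = (2.0, 1.0)
    there = transform(v, theta_forward)     # apply the forward transformation
    back = transform(there, theta_reverse)  # the reverse one should undo it
    print(there, back)
```

Applying the forward and then the reverse transformation returns the original pair up to floating-point rounding.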

⋄ $\mu^{\langle x,y \rangle}(a,a) = \langle 0, 0 \rangle$;
⋄ $\mu^{\langle x,y \rangle}(a,b) = -T_{ba}\left[\mu^{\langle x,y \rangle}(b,a)\right]$ (anti-symmetry);
⋄ $\mu^{\langle x,y \rangle}(a,c) = \mu^{\langle x,y \rangle}(a,b) + T_{ba}\left[\mu^{\langle x,y \rangle}(b,c)\right]$ (additivity).

The reestimation formulae for all the parameters except for the $x$ and $y$ components of the relation matrix $R$ remain as before. However, the reestimation formulae for the $x$ and $y$ parameters are changed to reflect the relative coordinate systems used. The means $\mu(R_{ij}[x])$ and $\mu(R_{ij}[y])$ are reestimated as follows:

\[
\begin{pmatrix} \bar{\mu}(R_{ij}[x]) \\ \bar{\mu}(R_{ij}[y]) \end{pmatrix}
= \frac{\sum_{t=0}^{T-2} \xi_t(i,j) \begin{pmatrix} r_t[x] \\ r_t[y] \end{pmatrix}
- \sum_{t=0}^{T-2} \xi_t(j,i)\, T_{ji} \begin{pmatrix} r_t[x] \\ r_t[y] \end{pmatrix}}
{\sum_{t=0}^{T-2} \left( \xi_t(i,j) + \xi_t(j,i) \right)}
\]

These reestimation rules are guaranteed to satisfy the first two geometrical constraints, but not the additivity constraint. Their correctness can be proved along the lines of the correctness proofs for all other formulae [16].

5 EXPERIMENTS AND RESULTS

The goal of this work is to use odometry to improve the learning of topological models, while using fewer iterations and less data. We tested our algorithm in a simple robot-navigation world. In earlier stages of this work, a strong assumption underlay our experiments: the corridors in the environment are all perpendicular to each other, and the agent was using this perpendicularity to reset its position while accumulating the odometric readings. Here we have updated the algorithm and dropped the assumption. The experiments demonstrate that the use of odometry, even with accumulated rotational error and without using the perpendicularity assumption, is still very beneficial.

5.1 EXPERIMENTAL SETTING

Our experiments use both real robot data and simulated data. We ran our robot Ramona, a modified RWI B21, along a prescribed⁴ directed path in our department corridors. Low-level routines let Ramona move forward through hallways from intersection to intersection and turn ninety degrees to the left or right. Ultrasonic data interpretation lets her perceive, in three directions (front, left and right), whether there is an open space, a door, a wall, or something unknown. Doors and intersections constitute states. When they are detected by Ramona, she stops and records her observations, as well as her odometric change between the previous and the current state. All recorded measures as well as the actions are, of course, subject to error.

⁴ Hence, no decisions are executed by the robot, and the model is an HMM and not a complete POMDP.

The path Ramona followed consists of 4 connected corridors, which include 17 states, as shown in Figure 7. Black dots represent the physical locations of states. Multiple states (depicted as numbers in the plot) associated with a single location correspond to different orientations of the robot at that location. The larger black circle, at the bottom left corner, represents the starting position. The observations associated with each state are omitted for clarity. A projection of the odometric readings that Ramona recorded along the $x$ and $y$ dimensions is shown in Figure 5.

To statistically evaluate our algorithm, we use a simulated office environment in which the robot follows a prescribed path. It is represented as an HMM consisting of 44 states, and the associated transition, observation, and odometric distributions. Figure 9 depicts this HMM. Arrows represent transitions that have probability 0.2 or higher. Solid arrows represent the most likely transitions between the states. We generated 5 data sequences from the model, each of length 800, using Monte Carlo sampling. One of these sequences is depicted in Figure 10. Again, observations are omitted, and this is a projection of the odometry readings onto a global 2-dimensional coordinate system. For each sequence we ran our algorithm 10 times. We also ran the standard Baum-Welch algorithm, not using odometric information, 10 times on each sequence. For both algorithms we started each run from a randomly picked initial model.

Figure 9: Model of a prescribed path through the simulated hallway environment.

Figure 10: A data sequence generated by our simulator.

5.2 RESULTS

We used our algorithm to learn a topological model of the environment from the data gathered by Ramona. Figure 8 shows the topology of one typical learned HMM. The bold circle represents the initial state. The semantics of the arrows is as stated before. It is clear that the learned topology corresponds well to the topology of the true environment. The observation distributions learned are omitted from the figure, but they too correspond well to the walls, doors and openings encountered along the path, while incorporating the identification error resulting from noisy sensors.
Traditionally, in simulation experiments, learned models are quantitatively compared to the actual model that generated the data. Each of the models induces a probability distribution on strings of observations; the asymmetric Kullback-Leibler divergence [11] between the two distributions is a measure of how far the learned model is from the true model. We report our simulation results in terms of a sampled version of the KL divergence, as described by Juang and Rabiner [9]. It is based on generating sequences of sufficient length according to the distribution induced by the true model, and comparing their likelihoods according to the learned model with the true model likelihoods. We ignore the odometry information when applying the KL measure, thus allowing comparison between purely topological models that are learned with and without odometry.

Table 1 lists the KL divergence between the true and learned model, as well as the number of iterations until convergence was reached, for each of the 5 simulation sequences under the two learning settings, averaged over 10 runs per sequence. The table demonstrates that the KL divergence with respect to the true model for models learned using odometry is about 4-5 times smaller than for models learned without odometric data. To check the significance of our results we used the simple two-sample t-test. The models learned using odometric information have a highly statistically significant ($p$ […]) lower average KL divergence than the others.

In addition, the number of iterations required for convergence when learning using odometric information is smaller than required when ignoring such information. Again, the t-test verifies the significance ($p$ […]) of this result.

Figure 11: Average KL-divergence as a function of length. (Axes: KL against Seq. Length; curves: No Odometry, Odometry Used.)

To examine the influence of the amount of data on the quality of the learned models, we took one of the 5 sequences (Seq. #1) and used its prefixes of length 100 to 800 (the complete sequence), in increments of 100, as individual sequences. We ran the two algorithmic settings over each of the 8 prefix sequences, 5 times each. We then used the KL-divergence as described above to evaluate each of the resulting models with respect to the true model. For each prefix length we averaged the KL-divergence over the 5 runs. Table 2 summarizes the results of this experiment.
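The sampled KL measure used above can be sketched as follows. This is an illustrative Python sketch for discrete-observation HMMs, not the authors' code and not necessarily the exact procedure of [9]: it estimates the divergence as the per-symbol difference between the log-likelihoods that the true model and the learned model assign to one long sequence sampled from the true model. The `(init, trans, emit)` tuple layout, the function names, and the toy two-state models are all assumptions made for illustration.

```python
import math
import random


def sample_observations(hmm, length, rng):
    """Draw an observation sequence of the given length from a discrete HMM."""
    init, trans, emit = hmm
    state = rng.choices(range(len(init)), weights=init)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.choices(range(len(emit[state])), weights=emit[state])[0])
        state = rng.choices(range(len(trans[state])), weights=trans[state])[0]
    return obs


def log_likelihood(hmm, obs):
    """Return log P(obs | hmm), computed with the scaled forward algorithm."""
    init, trans, emit = hmm
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        total += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return total


def sampled_kl(true_hmm, learned_hmm, length, rng):
    """Per-symbol log-likelihood gap on a sequence sampled from true_hmm."""
    obs = sample_observations(true_hmm, length, rng)
    gap = log_likelihood(true_hmm, obs) - log_likelihood(learned_hmm, obs)
    return gap / length


# Hypothetical toy models: (initial, transition, emission) probabilities.
TRUE_HMM = ([0.5, 0.5],
            [[0.95, 0.05], [0.05, 0.95]],
            [[0.9, 0.1], [0.1, 0.9]])
UNIFORM_HMM = ([0.5, 0.5],
               [[0.5, 0.5], [0.5, 0.5]],
               [[0.5, 0.5], [0.5, 0.5]])

if __name__ == "__main__":
    print(sampled_kl(TRUE_HMM, TRUE_HMM, 2000, random.Random(0)))
    print(sampled_kl(TRUE_HMM, UNIFORM_HMM, 2000, random.Random(0)))
```

A model compared with itself gives an estimate of exactly zero, and a poorer model gives a larger value. Because the odometric part of the model plays no role in `log_likelihood`, the same estimate can be applied to models learned with and without odometry.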
Table 2 lists the mean KL-divergence over the 5 runs for each of the prefixes, as well as the standard deviation around this mean. The plot in Figure 11 depicts the KL-divergence as a function of the sequence length for each of the settings. Both the table and the plot demonstrate that, in terms of the KL-divergence, our algorithm, which uses odometric information, is robust in the face of data reduction. In contrast, learning without the use of odometry is much more sensitive to reduction in the amount of data. Again, we applied the two-sample t-test, which verified the statistical significance of these results.

Table 1: Average results of 2 learning settings with 5 training sequences.

Seq. #               1        2        3        4        5
With Odo   KL        1.115    1.100    1.095    1.139    1.129
           Iter #    69.7     81.8     84.3     52.4     112.9
No Odo     KL        5.575    4.499    4.997    4.491    5.791
           Iter #    120.4    107.5    116.2    113.3    120.6

Table 2: Average results with 8 incrementally longer sequences.

Seq. Length            800      700      600      500      400      300      200      100
With Odo   Mean KL     1.136    1.201    1.191    1.241    1.216    1.272    1.771    15.076
           Std. Dev.   0.091    0.083    0.131    0.082    0.036    0.085    0.510    12.884
No Odo     Mean KL     5.790    6.249    8.354    10.390   11.490   14.772   20.044   26.619
           Std. Dev.   0.554    0.937    0.179    0.460    0.422    1.280    0.904    0.460

6 CONCLUSIONS

Directional information, which comes up in various applications of computer science in general and machine learning in particular, requires special treatment. Currently most statistical models and applications are based on distributions that are either discrete or continuous along the real line, rather than circular. It is important to be aware of the need for circular distributions as well as of their existence. Moreover, it would be useful to have widely used applications such as Autoclass [2] support such distributions.

A problematic aspect of directional data which manifests itself when learning maps and models for robot navigation is that of cumulative rotational errors. In the context of our work we have demonstrated that the use of relative coordinate systems rather than global ones supports learning relationships between states. The main point shown by this paper is that through correct treatment of directional data, odometric information which is weak and very noisy still provides significant leverage when learning a purely topological map.

Acknowledgments

We thank Sebastian Thrun for his insightful comments, and Dimitris Michailidis for his editorial help. This work was supported by DARPA/Rome Labs Planning Initiative grant F30602-95-1-0020, by NSF grants IRI-9453383 and IRI-9312395, and by the Brown University Graduate Research Fellowship.

References

[1] A. R. Cassandra, L. P. Kaelbling, J. A. Kurien. Acting under uncertainty: Discrete Bayesian models for mobile-robot navigation. In Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems. 1996.

[2] P. Cheeseman, et al. Autoclass: A Bayesian classification system. In J. W. Shavlik, T. G. Dietterich, eds., Readings in Machine Learning. Morgan-Kaufmann, 1990.

[3] A. P. Dempster, N. M. Laird, D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1), 1–38, 1977.

[4] F. C. Dyer. Bees acquire route-based memories but not cognitive maps in a familiar landscape. Animal Behaviour, 41, 239–246, 1991.

[5] Z. Ghahramani, M. I. Jordan. Factorial hidden Markov models. In 14th Int. Conf. on Machine Learning. 1997.

[6] E. G. Gumbel, J. A. Greenwood, D. Durand. The circular normal distribution: Theory and tables. American Statistical Society Journal, 48, 131–152, March 1953.

[7] D. Heckerman, D. Geiger. Learning Bayesian networks: A unification for discrete and Gaussian domains. In Proc. of the 11th Int. Conf. on Uncertainty in AI. 1995.

[8] B. H. Juang. Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains. AT&T Technical Journal, 64(6), July-August 1985.

[9] B. H. Juang, L. R. Rabiner. A probabilistic distance measure for hidden Markov models. AT&T Technical Journal, 64(2), 391–408, February 1985.

[10] S. Kotz, N. L. Johnson, eds. Encyclopedia of Statistical Sciences, vol. 2, pp. 381–386. John Wiley and Sons, 1982.

[11] S. Kullback, R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22(1), 79–86, 1951.

[12] F. Lu, E. E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4, 333–349, 1997.

[13] K. V. Mardia. Statistics of Directional Data. Academic Press, 1972.

[14] I. Nourbakhsh, R. Powers, S. Birchfield. Dervish: An office-navigating robot. AI Magazine, 16(1), 53–60, 1995.

[15] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. of the IEEE, 77(2), 257–285, February 1989.

[16] H. Shatkay, L. P. Kaelbling. Learning hidden Markov models with geometric information. Tech. Rep. CS-97-04, Dept. of Computer Science, Brown University, 1997.

[17] H. Shatkay, L. P. Kaelbling. Learning topological maps with weak local odometric information. In Proc. of the 15th Int. Joint Conf. on AI. 1997.

[18] R. G. Simmons, S. Koenig. Probabilistic navigation in partially observable environments. In Proc. of the Int. Joint Conf. on AI. 1995.

[19] R. Smith, M. Self, P. Cheeseman. A stochastic map for uncertain spatial relationships. In S. S. Iyengar, A. Elfes, eds., Autonomous Mobile Robots. IEEE Press, 1991.

[20] S. Thrun, W. Burgard, D. Fox. A probabilistic approach to concurrent map acquisition and localization for mobile robots. Machine Learning, 31, 29–53, 1998.
