Heading in the Right Direction

Hagit Shatkay, Leslie P. Kaelbling
Department of Computer Science
Brown University
Providence, RI 02912
{hs,lpk}@cs.brown.edu

Abstract

Stochastic topological models, and hidden Markov models in particular, are a useful tool for robotic navigation and planning. In previous work we have shown how weak odometric data can be used to improve learning topological models, overcoming the common problems of the standard Baum-Welch algorithm. Odometric data typically contain directional information, which imposes two difficulties: First, the cyclicity of the data requires the use of special circular distributions. Second, small errors in the heading of the robot result in large displacements in the odometric readings it maintains. The cumulative rotational error leads to unreliable odometric readings. In the paper we present solutions to these problems by using a circular distribution and relative coordinate systems. We validate their effectiveness through experimental results from a model-learning application.

1 INTRODUCTION

Directional data is information consisting of magnitude and direction. Such data is an integral part of important applications in various areas of computer science in general and artificial intelligence in particular. In computer graphics, automatic production of pen-and-ink drawings and the production of animation based on magnetic tracker data require statistical manipulation of directional data. In cognitive science, modeling routes chosen by animals [4] requires a similar kind of statistical manipulation. In the area of machine learning we often use probabilistic models for robot movement. Most aspects of robot movement (arm movement as well as whole-body movement) can be described in terms of location and heading change, requiring the use and manipulation of directional data.

Probabilistic models are widely used within the AI community. Such models may allow continuous probabilities, as demonstrated in work on Bayesian networks [7], hidden Markov models [5, 8], probabilistic clusters [2] and stochastic maps [19], to name a few. However, the assumption underlying all the above work is that continuous distributions are linear, that is, distributions that assign density to each point on the real line so that the area under the density curve, integrated over the whole real line, is 1 (see footnote 1). Such models do not take into account directional data, which is inherently cyclic. Under circular distributions the density of any point $x$ on the real line is the same as that of $x + kc$, where $k$ is any integer and $c$ is some real number. The need for circular distributions has long been realized by statisticians [6], but the practice of using them has not found its way into the computer science community and the machine learning community in particular. One of the goals of this paper is to point out the usefulness of one specific circular distribution in the context of robotics, and provide a short tutorial on circular distributions.

Footnote 1: Most often the distribution is Gaussian.

Another special aspect of directional data is its sensitivity to errors. As most navigators, pilots and skippers have experienced, a small angular deviation from the original course causes a big displacement at the final location. This problem is very prominent in mobile robots, where drifts and drags of the wheels and misalignment of both engines and floors can cause a robot to face in the wrong heading with respect to its own odometric readings. Odometric information is recorded by the robot along three dimensions; it consists of the changes along the $x$ and the $y$ axis as well as a change in the heading of the robot within a global coordinate system. In our previous work on learning topological models [17] we made several assumptions about the odometric data:

• All odometric measures are normally distributed.
• All corridors are perpendicular to each other.
• The robot, when collecting the data, is using the perpendicularity assumption, and is collecting the data with respect to one global coordinate system.

This paper demonstrates the problematic aspects of these assumptions and introduces our solution to the problems, together with preliminary results that demonstrate the effectiveness of our solution. The rest of the paper is organized as follows: Section 2 describes our application and motivates the need for circular distributions in the context of machine learning; Section 3 presents the von Mises distribution, which is a circular version of the normal distribution; Section 4 discusses the problems faced due to heading deviations and presents our solution to the problem; Section 5 presents experiments and results to demonstrate the usefulness of our approach; Section 6 concludes the paper.

2 LEARNING TOPOLOGICAL MODELS

Hidden Markov models (HMMs), as well as their generalization to models for partially observable Markov decision processes (POMDP models), are a useful tool for representing environments such as road networks and office buildings, which are typical for robot navigation and planning [1, 14, 18]. Previous work on planning with such models typically assumed that the model is manually provided. Manual acquisition of these models can be very tedious and hard. It is desirable to learn such models automatically, both for robustness and in order to cope with new and changing environments. Since POMDP models are a simple extension of HMMs, they can, theoretically, be learned with a simple extension to the Baum-Welch algorithm [15] for learning HMMs. However, without a strong prior constraint on the structure of the model, the Baum-Welch algorithm does not perform very well: it is slow to converge, requires a great deal of data, and often becomes stuck in local maxima.

In previous work [16, 17] we demonstrated how the simple Baum-Welch algorithm can be enhanced with weak local odometric information to learn better models faster, under the assumptions listed above. For the sake of completeness, we briefly review the essentials of this work here.

A robot moves through the corridors in an office environment. Low-level software provides a level of abstraction that allows the robot to move through hallways from intersection to intersection and turn ninety degrees to the left or right. At each intersection, ultrasonic data interpretation lets the robot observe, in each of the four cardinal directions, whether there is an open space, a door, a wall, or something unknown. The robot also has encoders on its wheels that allow it to estimate its current pose (position and orientation) with respect to its pose at the previous intersection. Of course, the action and perception routines and the odometric measures are all subject to error. The learning task is to deduce a model from the recorded observations and odometric information.

Our learning algorithm gets as an input an experience sequence $E$ of observations and odometric readings, and produces as output an HMM (see footnote 2), $\lambda$, of the environment, such that the likelihood, $\Pr(E \mid \lambda)$, is locally maximized. Formally, the standard HMM is defined as a tuple $\lambda = \langle S, O, A, B, \pi \rangle$, where:

• $S = \{s_1, \ldots, s_N\}$ is a finite set of $N$ states;
• $O = \prod_{i=1}^{l} O_i$ is a finite set of observation vectors of length $l$; the $i$-th element of an observation vector is chosen from the finite set $O_i$;
• $A$ is a stochastic transition matrix, with $A_{i,j} = \Pr(q_{t+1} = s_j \mid q_t = s_i)$; $1 \le i, j \le N$; $q_t$ is the state at time $t$;
• $B$ is an array of $l$ stochastic observation matrices, with $B_{i,j,o} = \Pr(V_t[i] = o \mid q_t = s_j)$; $1 \le i \le l$, $1 \le j \le N$, $o \in O_i$; $V_t$ is the observation vector at time $t$;
• $\pi$ is a stochastic initial probability vector describing the distribution of the initial state.

Footnote 2: We discuss here HMMs rather than POMDP models. Extension to POMDPs is straightforward, but notationally more cumbersome.

Odometric information gathered by the robot is not an inherent part of the topological model, but is used by the learning algorithm to better identify and distinguish states. To facilitate the use of this information we augment the standard model with the odometric relation matrix:

• $R$ is a relation matrix, specifying for each pair of states, $s_i$ and $s_j$, the mean and variance of the $D$-dimensional metric relation between them; $\mu(R_{i,j}[m])$ is the mean of the $m$-th component of the relation between $s_i$ and $s_j$, and $\sigma^2(R_{i,j}[m])$ the variance, where $1 \le m \le D$. Furthermore, $R$ is geometrically consistent: for each component $m$, the relation $\mu^{m}(a,b) \stackrel{\mathrm{def}}{=} \mu(R_{a,b}[m])$ must satisfy the following properties for all states $a$, $b$, and $c$:
  - $\mu^{m}(a,a) = 0$;
  - $\mu^{m}(a,b) = -\mu^{m}(b,a)$ (anti-symmetry); and
  - $\mu^{m}(a,c) = \mu^{m}(a,b) + \mu^{m}(b,c)$ (additivity).

The odometric information recorded by the robot at time $t$, $r_t$, consists of the change in the $x$ and $y$ coordinates of the odometric readings when moving from state $q_{t-1}$ to state $q_t$, as well as the change of the robot's heading, $\theta$, between these states.

An arbitrary initial model $\lambda_0$ is assumed. Then an expectation maximization algorithm [3] is executed as follows:

• E-step: computes the state-occupation and transition probabilities, $\gamma_t(i) = \Pr(q_t = s_i \mid E, \lambda)$ and $\xi_t(i,j) = \Pr(q_t = s_i, q_{t+1} = s_j \mid E, \lambda)$, respectively, at each time $t$ in the sequence, given $E$ and the current model $\lambda$, and
• M-step: finds a new model $\bar{\lambda}$ that maximizes $\Pr(E \mid \bar{\lambda}, \gamma, \xi)$.

Introducing odometric information requires iterative updates of the odometric relations between pairs of states, in the relation matrix, $R$. The updates need to maintain the properties listed above, although currently the update procedure only satisfies the first two.

The learning task is further complicated by the special nature of the heading reading and the rotational errors accrued. The following section goes in more detail into the special issues of handling the heading information. The rest of the paper deals with resolving the problems caused by rotational errors.

3 DIRECTIONAL DATA AND DISTRIBUTIONS

Suppose a robot is in state $a$, which is in location $\langle x, y \rangle$ facing in direction $\theta$, as shown in Figure 1. By turning backwards, it transitions to state $b$, and a respective change of heading of approximately $\pm 180^\circ$ is recorded. Thus the new recorded configuration of the robot is $\langle x + \epsilon_1,\; y + \epsilon_2,\; \theta \pm 180^\circ + \epsilon_3 \rangle$, where $\epsilon_i$ is the error due to inaccuracy in both measurement and movement. In earlier work [17], we treated all errors, in both location ($x$, $y$) and heading ($\theta$), as if they were normally distributed. However, the change in heading is different from changes in $x$ and $y$, since angular measurements are cyclic. That is, a change in heading of $\theta^\circ$ is the same as that of $(\theta \pm 360k)^\circ$, for any integer $k$.

Figure 1: Robot changes heading from state a, $\langle x, y, \theta \rangle$, to state b, $\langle x, y, \theta + 180 \rangle$.

If we knew in advance, for every pair of states, the approximate change in heading ($\Delta\theta$) between them, we could have modeled it as normal with mean $\Delta\theta$ and small variance $\sigma^2$. We could have adopted a convention, normalizing all angles to be within a cyclic range, e.g. $[-180^\circ, 180^\circ]$ (similarly we may use radians), and always chosen to take as the angular change between two points $\min(\Delta\theta,\; 360^\circ - \Delta\theta)$, and assigned it the correct sign. Such an approach of using a non-circular distribution is justified when the estimation of a position is based only on readings known a priori to be taken near this position (see for example work by Thrun et al. [20] and Lu et al. [12]).

However, we do not know in advance the angles between states. The data is a sequence of measurements recorded at all the states. We estimate the probabilities of the states in which they were recorded, and take a weighted mean of the measurements in order to estimate the angular change between every two states. Thus, we are facing the following problem: What is the interpretation of a "mean angle"?

As an example, suppose we want to estimate the heading change from state $a$ to state $b$ of Figure 1. We adopt the convention of angles being expressed between $-180^\circ$ and $180^\circ$. Also, suppose that the robot recorded two measurements of the angular distance from state $a$ to state $b$, one positive and one negative, each close to $180^\circ$ in magnitude. The simple average of these measurements yields an estimate of the mean heading change that lies close to $0^\circ$. Obviously this value does not even approximate the change of heading between the two states. The same problem arises if we use any other convention for expressing angles (e.g. $0^\circ$ to $360^\circ$). The problem lies in the fact that angles that are about $180^\circ$ away from the mean angle indeed greatly deviate from this mean, while angles that deviate by about $360^\circ$ are actually very close to it. To capture this idea, the concept of circular distribution is required. We provide a brief introduction to the concepts and techniques used for handling directional data. In particular we concentrate on the von Mises distribution, a circular version of the normal distribution. Further discussion can be found in the statistical literature [6, 10, 13]. Section 3.3 returns to show how the theory is applied in our model and learning algorithm.

3.1 STATISTICS OF DIRECTIONAL DATA

Directional data in the 2-dimensional space can be represented as a collection of 2-dimensional vectors, $\langle x_1, y_1 \rangle, \ldots, \langle x_n, y_n \rangle$, on the unit circle, as shown in Figure 2. The points can also be represented as the corresponding angles between the radii from the center of the unit circle and the $x$ axis, $\theta_1, \ldots, \theta_n$, respectively. The relationship between the two representations is:

$$x_i = \cos(\theta_i), \qquad y_i = \sin(\theta_i) \qquad (1 \le i \le n).$$

Figure 2: Directional data represented as angles and as vectors on the unit circle.

The vector mean of the $n$ points, $\langle \bar{x}, \bar{y} \rangle$, is calculated as:

$$\bar{x} = \frac{\sum_{i=1}^{n} \cos(\theta_i)}{n}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} \sin(\theta_i)}{n}. \tag{1}$$

Using polar coordinates, we can express the mean vector in terms of angle, $\bar{\theta}$, and length, $\bar{a}$, where (except for the case $\bar{x} = \bar{y} = 0$):

$$\bar{\theta} = \arctan\!\left(\frac{\bar{y}}{\bar{x}}\right), \qquad \bar{a} = \left(\bar{x}^2 + \bar{y}^2\right)^{1/2}. \tag{2}$$

The angle $\bar{\theta}$ is the mean angle, while the length $\bar{a}$ is a measure (between 0 and 1) of how concentrated the sample angles are around $\bar{\theta}$. The closer $\bar{a}$ is to 1, the more concentrated the sample is around the mean, which corresponds to a smaller sample variance.

A function $f$ is a density function of a continuous circular distribution if and only if:

$$f(x) \ge 0 \quad \text{and} \quad \int_{0}^{2\pi} f(x)\,dx = 1.$$

A simple example of a circular distribution is the uniform circular distribution, whose density function is $f(\theta) = \frac{1}{2\pi}$ (where $\theta$ is measured in radians).

One way of deriving a circular version of an unlimited linear distribution is through "wrapping" it around the circumference of the unit circle. If $x$ is a random variable on the line with probability density function $f(x)$, the wrapped random variable $x_w = x \;[\mathrm{mod}\; 2\pi]$ is distributed according to a wrapped distribution with the probability density function:

$$f_w(\theta) = \sum_{k=-\infty}^{\infty} f(\theta + 2\pi k).$$

Applying this derivation to the normal distribution results in a circular version of the normal distribution, but estimating its parameters from sample data can be hard [6, 13]. An easier-to-estimate circular version of the normal distribution was derived by von Mises [6, 13]. We use this distribution to model the robot heading in this work, and it is described below.

3.2 THE VON MISES DISTRIBUTION

A circular random variable, $\theta$, $0 \le \theta \le 2\pi$, is said to have the von Mises distribution with parameters $\mu$ and $\kappa$, where $0 \le \mu \le 2\pi$ and $\kappa > 0$, if its probability density function is:

$$f_{\mu,\kappa}(\theta) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos(\theta - \mu)},$$

where $I_0$ is the modified Bessel function of the first kind and order 0:

$$I_0(\kappa) = \sum_{r=0}^{\infty} \frac{1}{(r!)^2} \left(\frac{\kappa}{2}\right)^{2r}.$$

Similar to the linear normal distribution, this is a unimodal distribution, symmetrical around $\mu$. The mode is at $\theta = \mu$ while the antimode is at $\theta = \mu + \pi$. We observe that the ratio of the density at the mode to the density at the antimode is $e^{2\kappa}$, which indicates that the larger $\kappa$ is, the more concentrated the density is about the mode. Figure 3 shows an "unwrapped" plot of the von Mises distribution for various values of $\kappa$ where $\mu = 0$.

Figure 3: The von Mises distribution with mode 0 and various $\kappa$ values ($\kappa$ = 0.5, 1, 2, 4; $\theta$ in radians).

We now describe how to estimate the parameters $\mu$ and $\kappa$ given a set of heading samples (angles $\theta_1, \ldots, \theta_n$) from a von Mises distribution [13]. We are looking for maximum likelihood estimates for $\mu$ and $\kappa$. The likelihood function for the data generated by a von Mises distribution with parameters $\mu$ and $\kappa$ is:

$$L = \frac{e^{\kappa \sum_{i=1}^{n} \cos(\theta_i - \mu)}}{(2\pi)^n\, I_0(\kappa)^n}.$$

The maximum likelihood estimate for $\mu$, $\hat{\mu}$, is $\hat{\mu} = \arctan\!\left(\frac{\bar{y}}{\bar{x}}\right)$, where $\bar{y}$, $\bar{x}$ are as defined in Equation 1. The maximum likelihood estimate for $\kappa$ is the $\hat{\kappa}$ that solves the equation:

$$\frac{I_1(\hat{\kappa})}{I_0(\hat{\kappa})} = \frac{1}{n} \sum_{i=1}^{n} \cos(\theta_i - \hat{\mu}). \tag{3}$$

If we don't know $\mu$ and are only interested in estimating $\kappa$ with respect to the estimate $\hat{\mu}$, by using trigonometric manipulation and the definition of $\bar{a}$ (Equation 2), we can substitute the right hand side of Equation 3 by $\bar{a}$ and obtain that the maximum likelihood estimate for $\kappa$ is the $\hat{\kappa}$ that satisfies:

$$\frac{I_1(\hat{\kappa})}{I_0(\hat{\kappa})} = \bar{a}.$$

However, if we do have a given $\mu$ and want to find a maximum likelihood estimate for the concentration of the sample data around that specified $\mu$, we need to use as a maximum likelihood estimate for $\kappa$ the $\tilde{\kappa}$ that satisfies:

$$\frac{I_1(\tilde{\kappa})}{I_0(\tilde{\kappa})} = \frac{1}{n} \sqrt{\left(\sum_{i=1}^{n} \cos(\theta_i)\right)^{2} + \left(\sum_{i=1}^{n} \sin(\theta_i)\right)^{2} - \left(\sum_{i=1}^{n} \sin(\theta_i - \mu)\right)^{2}}.$$

The above estimation formulae agree with the intuition that the sample is more concentrated ($\kappa$ is larger) about the sample mean ($\hat{\mu}$) than about the true distribution mean ($\mu$).

The rest of the section explains how the von Mises parameters are incorporated into the hidden Markov model, and how the learning algorithm is adapted to learn these parameters.

3.3 HANDLING ANGULAR ODOMETRIC READINGS

To model the heading difference between each pair of states, the relation matrix $R$, described in Section 2, is 3-dimensional, consisting of the components $\langle x, y, \theta \rangle$. The component $R_{i,j}[\theta]$ represents the heading change of moving from state $s_i$ to $s_j$, and is assumed to be distributed according to the von Mises distribution. The notation $\mu^{\theta}_{i,j} \stackrel{\mathrm{def}}{=} \mu(R_{i,j}[\theta])$ represents the mean of the distribution for this heading change, while $\kappa^{\theta}_{i,j} \stackrel{\mathrm{def}}{=} \kappa(R_{i,j}[\theta])$ represents the concentration parameter around the mean (see footnote 3). The three constraints described before for the components of $R$ (ideally) hold for the $\theta$ component as well.

Footnote 3: In contrast, $x$ and $y$ are normally distributed and have their variance rather than concentration stored in $R$.

Similarly, every observed relation item, $r_t$, in the experience sequence $E$, has a heading-change component, $r_t[\theta]$, which records the robot's estimated change in heading between the state at time $t$, $q_t$, and the state $q_{t+1}$.

The reestimation formula for the von Mises mean parameter of the heading change between states $s_i$ and $s_j$ is:

$$\bar{\mu}^{\theta}_{i,j} = \arctan\!\left( \frac{\sum_{t=0}^{T-2} \left[ \sin(r_t[\theta])\,\xi_t(i,j) - \sin(r_t[\theta])\,\xi_t(j,i) \right]}{\sum_{t=0}^{T-2} \left[ \cos(r_t[\theta])\,\xi_t(i,j) + \cos(r_t[\theta])\,\xi_t(j,i) \right]} \right).$$

The fraction denotes the ratio between the expected sine and the expected cosine of the heading change from state $i$ to state $j$. Since the heading change from $i$ to $j$ is identical in magnitude but opposite in direction to the heading change from $j$ to $i$, the transitions from $j$ to $i$ are also accumulated, with reversed signs. By taking the arctan of this ratio we get an estimate for the mean heading change itself.

To reestimate the concentration parameter, we need to find $\bar{\kappa}^{\theta}_{i,j}$ such that:

$$\frac{I_1[\bar{\kappa}^{\theta}_{i,j}]}{I_0[\bar{\kappa}^{\theta}_{i,j}]} = \frac{\sum_{t=0}^{T-2} \left[ \xi_t(i,j)\, \cos\!\left(r_t[\theta] - \bar{\mu}^{\theta}_{i,j}\right) \right]}{\sum_{t=0}^{T-2} \xi_t(i,j)}.$$

Finding the $\bar{\kappa}^{\theta}_{i,j}$ that satisfies this equation is done through the use of a lookup table listing values of the quotient $I_1[x] / I_0[x]$.

The above reestimation formulae agree with the maximum likelihood estimator formulae given in Section 3.2. Their correctness can be proved along the lines of the proof provided in our previous document [16].

4 STATE-RELATIVE COORDINATE SYSTEMS

In our previous work we assumed that there is a single global coordinate system within which the robot operates. Moreover, we assumed that the robot collects its data within a perpendicular corridor framework and that it takes advantage of this single perpendicular framework while recording odometric information. This assumption may be troublesome in practice. The rest of the paper discusses the potential problems, presents a method for relaxing the assumptions and addressing the problems, and demonstrates the effectiveness of the solutions through experiments and results.

4.1 MOTIVATION

We tend to think about an environment as consisting of landmarks fixed in a global coordinate system and corridors or transitions connecting these landmarks. However, this view may be problematic when robots are involved.

Conceptually, a robot has two levels in which it operates: the abstract level, in which it centers itself through corridors, follows walls and avoids obstacles, and the physical level, in which motors turn the wheels as the robot moves. In the physical level many inaccuracies can occur: unaligned wheels or unsynchronized motors can cause sideways drift, an obstacle under a wheel can cause the robot to slightly rotate around itself, or uneven floors may cause the robot to slip in a certain direction. In addition, the odometric measuring instrumentation may be inaccurate in and of itself. In the abstract level, corrective actions are constantly executed to overcome the physical drift and drag. For example, if the left wheel is misaligned and drags the robot leftwards, a corrective action of moving to the right is constantly taken in the higher level to keep the robot centered in the corridor.

Such phenomena greatly affect the odometry recorded by the robot, if it is interpreted with respect to one global framework. For example, consider the robot depicted in Figure 4. It drifts to the left by $\phi^\circ$ when moving from one state to the next, and corrects for it by moving $\phi^\circ$ to the right to maintain itself centered in the corridor, moving along the solid arrow. Let us assume that states are located along the center of the corridor, which is aligned with the $y$ axis of the global coordinate system. The robot steps back and forth in the corridor. Whenever it reaches a state, its odometry reading changes by $\langle x, y, \theta \rangle$ along the $\langle x, y, \text{heading} \rangle$ dimensions, respectively. As the robot proceeds, the deviation with respect to the $x$ axis becomes more and more severe. Thus, after going through several transitions, the odometric changes recorded between every pair of states, with respect to a global coordinate system, become larger and larger (especially in the $x$ dimension).

Figure 4: The robot moves in a corridor along the solid arrow, correcting for drift in the direction of the dashed arrow.

Similar problems of inconsistent odometric changes recorded between pairs of states can arise along any of the odometric dimensions. It is especially severe when such inconsistencies arise with respect to the heading, since this can lead to confusion between the $x$ and the $y$ axes, as well as confusion between forwards and backwards movement (when the deviation in the heading is around $90^\circ$ or $180^\circ$ respectively). An example of our robot's view of a perfectly perpendicular office environment, based on its odometric readings within a global coordinate system, is shown in Figure 5. The data was collected by our robot Ramona, while moving along the corridors in an area of our department, depicted in Figure 7.

Figure 5: A path in a perpendicular environment, plotted based on odometric readings taken by the robot Ramona.

A solution to such a situation is to model the odometric relations of moving from state $s_i$ to state $s_j$ using a changing coordinate system which is relative to state $s_i$, as opposed to a global coordinate system anchored at the initial state. We formalize this idea and provide the update rules for the odometric information based on this approach in the rest of this section. We have implemented our solution, and demonstrate its effectiveness throughout Section 5.

4.2 LEARNING ODOMETRIC RELATIONS WITH CHANGING COORDINATES

As before, our experience sequence $E$ consists of $T$ pairs $\langle r_t, V_t \rangle$ of recorded odometric relations and observation vectors. The odometric relations are still recorded with respect to the robot's global coordinate system. However, when learning the relation matrix from the odometric readings, we interpret the entry $R_{i,j}$ in the relation matrix $R$ as encoding the information with respect to a coordinate system whose origin is anchored at the state $s_i$; the $y$ axis is aligned with the robot's heading in state $s_i$ and the $x$ axis is perpendicular to it. This is depicted in Figure 6. The robot is in state $s_i$ facing in the direction pointed to by the $y$ axis. Its relationship to the state $s_j$ is described in terms of the coordinate system shown in the figure. Its heading in each state is denoted by the bold arrow.

Figure 6: Robot in state $s_i$, facing in the direction of the $y$ axis.

To support this interpretation of the relation matrix we need to revisit the formulation of the geometrical-consistency constraints stated in Section 2, as well as the update formulae used when learning the model.

The consistency constraints have to reflect the coordinate system with respect to which the odometry is represented. Since the heading measurement is independent of any specific coordinate system, only the constraints over the $x$ and $y$ components of the odometric relation need to be redefined. We denote by $\mu^{\langle x,y \rangle}(a,b)$ the vector $\langle \mu(R_{a,b}[x]),\; \mu(R_{a,b}[y]) \rangle$. Let us define $T_{ab}$ to be the transformation which maps an $\langle x, y \rangle$ pair represented with respect to the coordinate system of state $a$ to the same pair represented with respect to the coordinate system of state $b$, $\langle x', y' \rangle$ (note that $T_{ab} = T_{ba}^{-1}$).

More explicitly, as before, let $\mu^{\theta}(a,b)$ be the mean change in heading from state $a$ to state $b$ (recall that $\mu^{\theta}(a,b) = -\mu^{\theta}(b,a)$). The transformation $T_{ab}$ is defined as follows:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = T_{ab} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \cos(\mu^{\theta}(a,b)) - y \sin(\mu^{\theta}(a,b)) \\ x \sin(\mu^{\theta}(a,b)) + y \cos(\mu^{\theta}(a,b)) \end{pmatrix}.$$

We can now redefine the consistency constraints for the $x$ and $y$ components of the odometric relation:

  - $\mu^{\langle x,y \rangle}(a,a) = \langle 0, 0 \rangle$;
  - $\mu^{\langle x,y \rangle}(a,b) = -T_{ba}\!\left[\mu^{\langle x,y \rangle}(b,a)\right]$ (anti-symmetry);
  - $\mu^{\langle x,y \rangle}(a,c) = \mu^{\langle x,y \rangle}(a,b) + T_{ba}\!\left[\mu^{\langle x,y \rangle}(b,c)\right]$ (additivity).

The reestimation formulae for all the parameters except for the $x$ and $y$ components of the relation matrix $R$ remain as before. However, the reestimation formulae for the $x$ and $y$ parameters are changed to reflect the relative coordinate systems used. $\bar{\mu}^{x}_{i,j}$ and $\bar{\mu}^{y}_{i,j}$ are reestimated as follows:

$$\begin{pmatrix} \bar{\mu}^{x}_{i,j} \\ \bar{\mu}^{y}_{i,j} \end{pmatrix} = \frac{\displaystyle \sum_{t=0}^{T-2} \xi_t(i,j) \begin{pmatrix} r_t[x] \\ r_t[y] \end{pmatrix} - T_{ji}\!\left[ \sum_{t=0}^{T-2} \xi_t(j,i) \begin{pmatrix} r_t[x] \\ r_t[y] \end{pmatrix} \right]}{\displaystyle \sum_{t=0}^{T-2} \left( \xi_t(i,j) + \xi_t(j,i) \right)}.$$

These reestimation rules are guaranteed to satisfy the first two geometrical constraints, but not the additivity constraint. Their correctness can be proved along the lines of the correctness proofs for all other formulae [16].

5 EXPERIMENTS AND RESULTS

The goal of this work is to use odometry to improve the learning of topological models, while using fewer iterations and less data. We tested our algorithm in a simple robot-navigation world. In earlier stages of this work, a strong assumption underlay our experiments: the corridors in the environment are all perpendicular to each other, and the agent was using this perpendicularity to reset its position while accumulating the odometric readings. Here we have updated the algorithm and dropped the assumption. The experiments demonstrate that the use of odometry, even with accumulated rotational error and without using the perpendicularity assumption, is still very beneficial.

5.1 EXPERIMENTAL SETTING

Our experiments use both real robot data and simulated data. We ran our robot Ramona, a modified RWI B21, along a prescribed (see footnote 4) directed path in our department corridors. Low-level routines let Ramona move forward through hallways from intersection to intersection and turn ninety degrees to the left or right. Ultrasonic data interpretation lets her perceive, in three directions (front, left and right), whether there is an open space, a door, a wall, or something unknown. Doors and intersections constitute states. When they are detected by Ramona, she stops and records her observations, as well as her odometric change between the previous and the current state. All recorded measures as well as the actions are, of course, subject to error.

Footnote 4: Hence, no decisions are executed by the robot, and the model is an HMM and not a complete POMDP.

The path Ramona followed consists of 4 connected corridors, which include 17 states, as shown in Figure 7. Black dots represent the physical locations of states. Multiple states (depicted as numbers in the plot) associated with a single location correspond to different orientations of the robot at that location. The larger black circle, at the bottom left corner, represents the starting position. The observations associated with each state are omitted for clarity. A projection of the odometric readings that Ramona recorded along the $x$ and $y$ dimensions is shown in Figure 5.

Figure 7: Model of a prescribed path through a true hallway environment.

To statistically evaluate our algorithm, we use a simulated office environment in which the robot follows a prescribed path. It is represented as an HMM consisting of 44 states, and the associated transition, observation, and odometric distributions. Figure 9 depicts this HMM. Arrows represent transitions that have probability 0.2 or higher. Solid arrows represent the most likely transitions between the states. We generated 5 data sequences from the model, each of length 800, using Monte Carlo sampling. One of these sequences is depicted in Figure 10. Again, observations are omitted, and this is a projection of the odometry readings onto a global 2-dimensional coordinate system. For each sequence we ran our algorithm 10 times. We also ran the standard Baum-Welch algorithm, not using odometric information, 10 times on each sequence. For both algorithms we started each run from a randomly picked initial model.

Figure 9: Model of a prescribed path through the simulated hallway environment.

Figure 10: A data sequence generated by our simulator.

5.2 RESULTS

We used our algorithm to learn a topological model of the environment from the data gathered by Ramona. Figure 8 shows the topology of one typical learned HMM. The bold circle represents the initial state. The semantics of the arrows is as stated before. It is clear that the learned topology corresponds well to the topology of the true environment. The observation distributions learned are omitted from the figure, but they too correspond well to the walls, doors and openings encountered along the path, while incorporating the identification error resulting from noisy sensors.

Figure 8: Learned topological model.

Traditionally, in simulation experiments, learned models are quantitatively compared to the actual model that generated the data. Each of the models induces a probability distribution on strings of observations; the asymmetric Kullback-Leibler divergence [11] between the two distributions is a measure of how far the learned model is from the true model. We report our simulation results in terms of a sampled version of the KL divergence, as described by Juang and Rabiner [9]. It is based on generating sequences of sufficient length according to the distribution induced by the true model, and comparing their likelihoods according to the learned model with the true model likelihoods. We ignore the odometry information when applying the KL measure, thus allowing comparison between purely topological models that are learned with and without odometry.

Table 1 lists the KL divergence between the true and learned model, as well as the number of iterations until convergence was reached, for each of the 5 simulation sequences under the two learning settings, averaged over 10 runs per sequence.

Table 1: Average results of 2 learning settings with 5 training sequences.

Seq. #                 1        2        3        4        5
With Odo   KL          1.115    1.100    1.095    1.139    1.129
           Iter #      69.7     81.8     84.3     52.4     112.9
No Odo     KL          5.575    4.499    4.997    4.491    5.791
           Iter #      120.4    107.5    116.2    113.3    120.6

The table demonstrates that the KL divergence with respect to the true model for models learned using odometry is about 4-5 times smaller than for models learned without odometric data. To check the significance of our results we used the simple two-sample t-test. The models learned using odometric information have a lower average KL divergence than the others, and the difference is highly statistically significant (p-value illegible in the source).

In addition, the number of iterations required for convergence when learning using odometric information is smaller than required when ignoring such information. Again, the t-test verifies the significance of this result (p-value illegible in the source).

To examine the influence of the amount of data on the quality of the learned models, we took one of the 5 sequences (Seq. #1) and used its prefixes of length 100 to 800 (the complete sequence), in increments of 100, as individual sequences. We ran the two algorithmic settings over each of the 8 prefix sequences, 5 times repeatedly. We then used the KL divergence as described above to evaluate each of the resulting models with respect to the true model. For each prefix length we averaged the KL divergence over the 5 runs. Table 2 summarizes the results of this experiment. It lists the mean KL divergence over the 5 runs for each of the prefixes, as well as the standard deviation around this mean. The plot in Figure 11 depicts the KL divergence as a function of the sequence length for each of the settings.

Table 2: Average results with 8 incrementally longer sequences.

Seq. Length            800      700      600      500      400      300      200      100
With Odo   Mean KL     1.136    1.201    1.191    1.241    1.216    1.272    1.771    15.076
           Std. Dev.   0.091    0.083    0.131    0.082    0.036    0.085    0.510    12.884
No Odo     Mean KL     5.790    6.249    8.354    10.390   11.490   14.772   20.044   26.619
           Std. Dev.   0.554    0.937    0.179    0.460    0.422    1.280    0.904    0.460

Figure 11: Average KL-divergence as a function of length (curves: "No Odometry" and "Odometry Used"; horizontal axis: sequence length; vertical axis: KL).

Both the table and the plot demonstrate that, in terms of the KL divergence, our algorithm, which uses odometric information, is robust in the face of data reduction. In contrast, learning without the use of odometry is much more sensitive to reduction in the amount of data. Again, we applied the two-sample t-test, which verified the statistical significance of these results.

6 CONCLUSIONS

Directional information, which comes up in various applications of computer science in general and machine learning in particular, requires special treatment. Currently most statistical models and applications are based on distributions that are either discrete or continuous along the real line, rather than circular. It is important to be aware of the need for circular distributions as well as of their existence. Moreover, it would be useful to have widely used applications such as Autoclass [2] support such distributions.

A problematic aspect of directional data which manifests itself when learning maps and models for robot navigation is that of cumulative rotational errors. In the context of our work we have demonstrated that the use of relative coordinate systems rather than global ones supports learning the relationships between states. The main point shown by this paper is that through correct treatment of directional data, odometric information which is weak and very noisy still provides significant leverage when learning a purely topological map.

Acknowledgments

We thank Sebastian Thrun for his insightful comments, and Dimitris Michailidis for his editorial help. This work was supported by DARPA/Rome Labs Planning Initiative grant F30602-95-1-0020, by NSF grants IRI-9453383 and IRI-9312395, and by the Brown University Graduate Research Fellowship.

References

[1] A. R. Cassandra, L. P. Kaelbling, J. A. Kurien. Acting under uncertainty: Discrete Bayesian models for mobile-robot navigation. In Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems. 1996.
[2] P. Cheeseman, et al. Autoclass: A Bayesian classification system. In J. W. Shavlik, T. G.
[6] E. G. Gumbel, J. A. Greenwood, D. Durand. The circular normal distribution: Theory and tables. American Statistical Society Journal, 48, 131–152, March 1953.
[7] D. Heckerman, D. Geiger. Learning Bayesian networks: A unification for discrete and Gaussian domains. In Proc. of the 11th Int. Conf. on Uncertainty in AI. 1995.
[8] B. H. Juang. Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains. AT&T Technical Journal, 64(6), July-August 1985.
[9] B. H. Juang, L. R. Rabiner. A probabilistic distance measure for hidden Markov models. AT&T Technical Journal, 64(2), 391–408, February 1985.
[10] S. Kotz, N. L. Johnson, eds. Encyclopedia of Statistical Sciences, vol. 2, pp. 381–386. John Wiley and Sons, 1982.
[11] S. Kullback, R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22(1), 79–86, 1951.
[12] F. Lu, E. E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4, 333–349, 1997.
[13] K. V. Mardia. Statistics of Directional Data. Academic Press, 1972.
[14] I. Nourbakhsh, R. Powers, S. Birchfield. Dervish: An office-navigating robot. AI Magazine, 16(1), 53–60, 1995.
[15] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. of the IEEE, 77(2), 257–285, February 1989.
[16] H. Shatkay, L. P. Kaelbling. Learning hidden Markov models with geometric information. Tech. Rep. CS-97-04, Dept. of Computer Science, Brown University, 1997.
[17] H. Shatkay, L. P. Kaelbling. Learning topological maps with weak local odometric information. In Proc. of the 15th Int. Joint Conf. on AI. 1997.
[18] R. G. Simmons, S. Koenig. Probabilistic navigation in partially observable environments. In Proc. of the Int. Joint Conf. on AI. 1995.
Dietterich, eds., Readings in [19] R. Smith, M. Self, P. Cheeseman. A stochastic map for un- Machine Learning. Morgan-Kaufmann, 1990. certain spatial relationships. In S. S. Iyengar, A. Elfes, eds., Autonomous Mobile Robots. IEEE Press, 1991. [3] A. P. Dempster, N. M. Laird, D. B. Rubin. Maximum like- lihood from incomplete data via the EM algorithm. Journal [20] S. Thrun, W. Burgard, D. Fox. A probabilistic approach of the Royal Statistical Society, 39(1), 1–38, 1977. to concurrent map acquisition and localization for mobile robots. Machine Learning, 31, 29–53, 1998. [4] F. C. Dyer. Bees acquire route-based memories but not cog- nitive maps in a familiar landscape. Animal Behaviour, 41, 239–246, 1991. [5] Z. Ghahramani, M. I. Jordan. Factorial hidden Markov mod- els. In ½ Ø Int. Conf. on Machine Learning. 1997.
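
Note on the evaluation measure. The sampled KL divergence used in the experiments (after Juang and Rabiner [9]) can be illustrated with a minimal sketch for discrete-observation hidden Markov models. This is not the authors' code: the function names, the two toy models, and the normalization of the log-likelihood difference by the sequence length are assumptions made for illustration. A model is taken to be a triple (pi, A, B) of initial, transition and observation probabilities, and, as in the experiments, only observations (no odometry) enter the measure.

```python
import math
import random


def sample_sequence(pi, A, B, length, rng):
    """Draw an observation sequence of the given length from a discrete HMM."""
    def draw(probs):
        return rng.choices(range(len(probs)), weights=probs)[0]

    state = draw(pi)
    observations = []
    for _ in range(length):
        observations.append(draw(B[state]))
        state = draw(A[state])
    return observations


def log_likelihood(pi, A, B, observations):
    """Scaled forward algorithm; returns log P(observations | model).

    Assumes a non-empty sequence to which the model assigns non-zero probability.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][observations[t]]
                for j in range(n)
            ]
        scale = sum(alpha)          # P(o_t | o_1..o_{t-1})
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


def sampled_kl(true_model, learned_model, length, seed=0):
    """Estimate the divergence of learned_model from true_model on one
    sequence sampled from true_model (per-symbol normalization is assumed)."""
    rng = random.Random(seed)
    observations = sample_sequence(*true_model, length, rng)
    diff = log_likelihood(*true_model, observations) - log_likelihood(
        *learned_model, observations
    )
    return diff / length


# Hypothetical toy models: a "sticky" two-state HMM and a uniform one.
true_model = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.1, 0.9]])
uniform_model = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]])
estimate = sampled_kl(true_model, uniform_model, length=2000)
```

The estimate is zero when a model is compared with itself and grows as the learned model explains sequences from the true model less well, which is the sense in which smaller values in Tables 1 and 2 indicate better models.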