Stereo Vision for Mobile Robotics

Marti Gaëtan, Micro-Engineering Laboratory, Swiss Federal Institute of Technology of Lausanne.

Abstract

The Virtual Reality and Advanced Interfaces (VRAI) group [1] is currently investigating stereo vision for mobile robots. This paper provides an overview of both computational and biological approaches to stereo vision, stereo image processing and robot navigation.

1 Introduction

Two eyes or cameras looking at the same scene from different perspectives provide a means for determining three-dimensional shape and position. Scientific investigation of this effect (called variously stereo vision, stereopsis or single vision) has a rich history in psychology, biology and, more recently, in the computational modeling of perception. Stereo is an important method for machine perception because it leads to direct depth measurements. Additionally, unlike monocular techniques, stereo does not infer depth from weak or unverifiable photometric and statistical assumptions, nor does it require specific detailed object models. Once stereo images have been brought into point-to-point correspondence, recovering depth by triangulation is straightforward.

Current range imagers have achieved real-time or near real-time performance on images of modest size. For example, stereo algorithms on standard hardware are capable of returning dense 128 x 128 range images at 10 Hz, while scanning laser range-finders can operate at 2 Hz on 256 x 256 images. To take advantage of these devices, researchers have proposed numerous methods for extracting 3-D information from range images. These methods operate either in 3-D Cartesian space (volumetric representations) or in a 2.5-D range image space (contour map method). Contour map methods are particularly attractive for computation-bound applications such as mobile robots.

We begin with a discussion of the geometric basis and a computational model for stereo vision. Next, we briefly describe biological aspects of depth perception and a contour map method for depth processing. Finally, we present an obstacle avoidance technique for mobile robots using real-time stereo vision.

2 Stereo vision

The key geometric problem in stereo vision is to find corresponding points in stereo images. Corresponding points are the projections of a single 3D point in the different image spaces. The difference in the position of corresponding points in their respective images is called disparity (see Figure 1). Disparity is a function of both the position of the 3D scene point and of the position, orientation, and physical characteristics of the stereo devices (e.g. cameras).

Figure 1 (a) A system with two cameras is shown. The focal points are Fl and Fr, the image planes are Il and Ir. A point P in the 3D scene is projected onto Pl in the left image and onto Pr in the right image, (b) cyclopean view: the disparity is the difference in the position of the projection of the point P onto the two stereo image planes.

In addition to providing the function that maps pairs of corresponding image points onto scene points, a camera model can be used to constrain the search for a corresponding image point to one dimension. Any point in the 3D world space, together with the centers of projection of the two camera systems, defines an epipolar plane. The intersection of such a plane with an image plane is called an epipolar line (see Figure 2). Every point of a given epipolar line must correspond to a single point on the corresponding epipolar line. The search for a match of a point in the first image may therefore be reduced to a one-dimensional neighborhood in the second image plane (as opposed to a 2D neighborhood).

Figure 2 Epipolar lines and epipolar planes.

When the stereo cameras are oriented such that there is a known horizontal displacement between them, disparity can only occur in the horizontal direction and the stereo images are said to be in correspondence. When a stereo pair is in correspondence, the epipolar lines are coincident with the horizontal scan lines of the digitized pictures [2].

Ideally, one would like to find the correspondence of every individual pixel in both images of a stereo pair. However, the information content in the intensity value of a single pixel is too low for unambiguous matching. In practice, continuous areas of image intensity are the basic units that are matched. This approach (called area matching) usually involves some form of cross-correlation to establish correspondences.

2.1 Matching

The main problem in matching is to find an effective definition of what we call a valid correlation. Correlation scores are computed by comparing a fixed window in the first image against a shifting window in the second. The second window is moved in the second image by integer increments along the corresponding epipolar line, and a correlation score curve is generated for integer disparity values. The measured disparity can then be taken to be the one that provides the strongest peak of this curve (the largest value for a similarity score, the smallest value for a difference measure such as C1 below).

To quantify the similarity between two correlation windows, we must choose among many different criteria one that produces reliable results in a minimum computation time. We denote by I1(x,y) and I2(x,y) the intensity value at pixel (x,y). The correlation window has dimensions (2n+1) x (2m+1). Therefore, the indexes which appear in the formula below vary between -n and +n for the i-index and between -m and +m for the j-index, and d is the candidate disparity along the epipolar line:

$$C_1(x, y, d) = \frac{\sum_{i,j} \left[ I_1(x+i,\, y+j) - I_2(x+d+i,\, y+j) \right]^2}{\sqrt{\sum_{i,j} I_1^2(x+i,\, y+j)} \; \sqrt{\sum_{i,j} I_2^2(x+d+i,\, y+j)}}$$

It is important to know whether a match is reliable or not. The form of the correlation curve (for example C1) can be used to decide whether the probability that the match is an error is high or not. Indeed, errors occur when a wrong peak slightly higher than the right one is chosen. Thus, if in the correlation curve we find several peaks with approximately the same height, the risk of choosing the wrong one increases, especially if the image is noisy. A confidence coefficient, proportional to the difference of height between the most important peaks, may therefore be defined. Other important information may also be extracted from the correlation curve, for instance the presence of bland areas.

3 Human depth perception

For human beings, correlation (as described in the previous section) is only a local mechanism of stereoscopic vision [Bruce and Green 1990]. To see this, imagine the following experiment.

Figure 3 Stereoscopic fusion false target problem.

A stereoscopic system displays the set of points D1, D2, D3 for the right eye (see Figure 3) and G1, G2, G3 for the left one. An observer should be able to see any interlace of points (grey points in the light grey area) but, instead, all observers settle on the dark ones. This experiment shows that a global mechanism, based on criteria other than local correlation, is used. Among these criteria, the following are taken into account:

- A principle of correlation based on the contours of the image (see Figure 4a);
- A mechanism of cognitive interpretation which has, in some cases, priority over the local mechanism of stereo vision [Marr 1982];
- A mechanism of pictorial clues of depth (relative size, relative height, perspective, shade, "fog" effect and interposition (see Figure 4b));
- A principle of dynamic clues, such as motion parallax;
- Other mechanisms such as correlation of frequency filtered images [Poggio and Poggio 1984].

Figure 4 (a) "Illusory" contours defining a square giving the impression that this shape is placed in front of 4 circles, (b) interposition principle (cognitive interpretation).

This list is not exhaustive but presents the most significant criteria belonging to the global system of human stereoscopic vision [3].

In summary, the human stereo system uses a number of interesting methods, which work together to recover depth. On one hand, this system is very powerful because even using one eye, it is possible to perceive depth. On the other hand, it is also very subjective because a trompe-l'œil can fool our perceptive system.

In the next section, we will focus on an example of a computational model for range (disparity) image parsing.

4 Contour map method

For many applications, working in 3D Euclidean space turns out to be unnecessary and difficult to manage. To reduce the amount of data which has to be processed, we introduce a method of quantifying volumes that allows us to manipulate range images (see Figure 5a) directly, without having to first transform to 3-D space. The method is similar to the use of contour maps to represent elevations; hence, we call it the contour method [Chauvin, Marti and Konolige 1997]. A contour represents the elevation at a particular height (see Figure 5b); all terrain between one contour line and the next is at an elevation between those represented by the two contour lines. Contour creation can be visualized as a set of planes parallel to the ground at specified heights, intersecting the terrain. These cutting planes induce a quantization of the 3-D space based on elevation. Our approach uses this basic idea, and consists of the following steps:

1. Constructing a set of volumes in 3-D space using a set of cutting surfaces (not necessarily planar);
2. Projecting the cutting surfaces back to the range image to induce a quantization of the range data (see Figures 5c and 5d);
3. Using the quantized range image to construct terrain models or other abstractions.

In cases where the desired segmentation is relative to the sensor viewpoint, the first two steps can be achieved off-line, leading to significant computational savings, especially when the cutting surfaces are complicated. In addition, step 3 can often be performed in the range image space, which is much more efficient than working in the volumetric space. Also, in contrast to the grid-based approaches, the cutting surfaces need not be regular, and can be sized to take into account the precision and error characteristics of the range data.

Figure 5 (a) Range image where light (green) pixels represent points closer than dark ones, (b) Elevation map composed of the superposing of 8 contours (a single contour is represented with a special pattern), (c) and (d) projection of the cutting surfaces back in the range image for the special contour of image b.

5 Mobile robot obstacle avoidance

The contour method is well suited for use with the vector field histogram (VFH) algorithm for mobile robot obstacle avoidance [Borenstein and Koren 1991]. Originally developed with sonar sensors, the method used three steps:

1. A regular 2-D histogram grid in plan view, holding the results of sonar sensor readings around the robot. The value of each grid point represents the number of sonar readings that indicated an object within the point (see Figure 7a);
2. A polar histogram is computed from the histogram grid, with k regular angular sectors instead of a rectilinear grid. The value hk of each sector in the polar histogram represents the obstacle density in that direction (see Figure 8);
3. Steering and velocity values are extracted from the polar histogram.

Figure 6 Representation of an obstacle (a) in the Cartesian space, (b) in the contour image space.

In range images, each column of the image represents a polar sector whose angular width is determined by the camera parameters (see Figure 7b). We let the k sectors correspond to the columns of the range image. Thus, we can construct the polar histogram directly from the contour representation, without having to convert to Cartesian space.

Figure 7 Two obstacles (a) in the polar grid, (b) in the contour image grid.

The key step is calculating the histogram value hk for each sector. Roughly speaking, this value represents the probability of finding an obstacle close to the robot in the direction of sector k. The simplest idea is to use a single cutting surface at an elevation over the ground plane sufficient to constitute an obstacle for the robot. Any points in the resulting contour (see Figure 6b) are obstacles, and we can use the number of such points in a column and their distance to determine a histogram value.

Figure 8 Polar histogram corresponding to the obstacles of Figure 7 (sectors span -45 to +45 degrees).

The details of the weighting scheme we use are not critical; we expect almost any reasonable method that combines distance and number of points will work reasonably well. In our implementation we used a stereo system with disparity as the range metric, and let each contour cell m contribute ln[r(m)] to its sector value. This measure compensates for the fact that the disparity increases hyperbolically as an object gets closer.

Figure 9 Experimental results. (a) image of the scene, (b) corresponding disparity image, (c) from left to right: contour, object detection and polar histogram (the vertical line corresponds to the direction followed by the robot and the horizontal line, its speed).

The VFH method was implemented using a small stereo system [4] for range images and a PC for processing the VFH algorithm. The stereo system returned images at a 5 Hz rate, and the VFH processing took less than 10 ms per image to format the polar histogram and extract the desired direction and speed of travel. Data were then sent to a robot navigation program [5] in order to steer a Koala [6] or Pioneer [7] robot.

Figure 9 shows a calculation of the polar histogram from a typical range image and a single-contour segmentation. In this case, the sensor covers about a 70 degree angle, and each sector is about 0.5 degrees. Image (a) is an intensity image of the scene, and (b) shows the disparity map computed by a stereo system. Brighter green values are higher disparities, hence closer to the camera. The final set of images (c) shows the contour (left side), a segmentation of some obstacles (middle), and finally the polar histogram. The middle of the histogram is straight along the camera optical axis, and the vertical line indicates the direction of travel that the VFH algorithm has found. From the picture, this is the direction through the open door.

Because of the small sector size, there is considerable variation in adjacent histogram values, and the result could benefit from low-pass filtering.

Additional enhancements in the construction of the polar histogram from contours are under study, among them:

- Ground plane detection. A contour representing the ground plane would give an indication that there actually was a reasonable path in front of the robot. Here, the value ln[r(m)] for each cell in a sector would be subtracted from some initial constant for the sector;
- Holes. A contour underneath the ground plane could be used to check for holes near the robot. The addition to the histogram would be the same as for positive-elevation obstacles;
- Small-height obstacles. Instead of a single contour at an appropriate height for obstacles, several could be positioned starting from just over the ground plane. The contribution of elements from the lower contours would be weighted by a fraction q depending on their height. Thus, the robot would prefer smooth terrain to bumpy, even though it could negotiate the latter.

6 Conclusion

Stereoscopic systems for robot navigation are currently possible using a new technology of low-resolution real-time devices. Although these devices do not match the performance of the human depth perception system, they seem efficient for simple applications such as obstacle avoidance using vector field histograms. Further research in stereo vision should investigate the implementation of biological models, for example multiple frequency filtering, in order to increase the quality of the range image.

7 Acknowledgments

I would first like to thank the Experimental Psychology Department of the University of Geneva, directed by Prof. M. Flueckiger, which gave me the opportunity to write this article.

I would also like to thank the VRAI group of the Swiss Federal Institute of Technology of Lausanne, especially Mr Terry Fong, Mr Didier Guzonni, Dr Charles Baur and Mr Nicolas Chauvin for their help in the process of writing this paper.

On the American side, I would like to thank Mr Kurt Konolige and SRI International.

8 References

[Borenstein and Koren 1991] J. Borenstein and Y. Koren. The Vector Field Histogram - Fast Obstacle Avoidance for Mobile Robots. IEEE Journal of Robotics and Automation 7(3), June 1991.

[Bruce and Green 1990] V. Bruce and P. Green. Visual Perception, Physiology and Ecology. Lawrence Erlbaum Associates Ltd. Publishers, 1990.

[Chauvin, Marti and Konolige 1997] N. Chauvin, G. Marti and K. Konolige. Contour Maps for Real-Time Range Image Parsing. Not yet published, January 1997.

[Marr 1982] D. Marr. Vision: A computational investigation into the human representation and processing of visual information. W.H. Freeman and Co., 1982.

[Marti 1997] G. Marti. Diploma Work: Stereoscopic camera real time processing and robot navigation, March 1997.

[Poggio and Poggio 1984] G. Poggio and T. Poggio. The analysis of stereopsis. Annual Review of Neuroscience, 7, 379-412.

[1] VRAI Group, IMT-DMT / EPFL, Dr. C. Baur.
[2] This is quite difficult to obtain in practice.
[3] See [Bruce and Green 1990] for more details.
[4] The Small Vision Module™ developed by SRI International.
[5] Saphira™ developed by SRI International and EPFL.
[6] The Koala™ robot is developed by the K-team (EPFL).
[7] The Pioneer™ robot is developed by SRI International.
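
The area matching of Section 2.1 can be illustrated with a minimal sketch. It assumes that C1 is a normalized sum of squared differences (so the best match is the lowest score), that images are plain lists of rows, and that the disparity d is the horizontal shift applied in the second image. The function names c1_score and best_disparity, and the way the confidence value is computed, are illustrative choices and not taken from the paper.

```python
import math

def c1_score(left, right, x, y, d, n=1, m=1):
    """Normalized sum of squared differences between two windows.

    left, right: images as lists of rows (row index y, column index x).
    The window spans i in [-n, n] horizontally and j in [-m, m] vertically.
    Lower is better; 0.0 means the two windows are identical.
    The caller must keep both windows inside the images.
    """
    num = e1 = e2 = 0.0
    for j in range(-m, m + 1):
        for i in range(-n, n + 1):
            a = left[y + j][x + i]
            b = right[y + j][x + d + i]
            num += (a - b) ** 2
            e1 += a * a
            e2 += b * b
    denom = math.sqrt(e1) * math.sqrt(e2)
    return num / denom if denom > 0 else float("inf")

def best_disparity(left, right, x, y, d_range, n=1, m=1):
    """Scan integer disparities along the scan line and keep the best score.

    Returns (disparity, confidence). The confidence is the gap between the
    two lowest scores, a crude stand-in for the peak-height difference
    described in the text (assumption, not the paper's definition).
    """
    scores = {d: c1_score(left, right, x, y, d, n, m) for d in d_range}
    ranked = sorted(scores, key=scores.get)
    best = ranked[0]
    confidence = scores[ranked[1]] - scores[best] if len(ranked) > 1 else 0.0
    return best, confidence

# Toy example: the right image is the left image shifted by two pixels.
row = [1, 2, 3, 4, 9, 5, 7, 3, 2, 1, 1, 2]
left_image = [row[:] for _ in range(3)]
right_image = [row[2:] + [0, 0] for _ in range(3)]
print(best_disparity(left_image, right_image, 5, 1, range(-3, 1))[0])  # -> -2
```

A small confidence value signals that two candidate disparities score almost equally, which is the ambiguous situation the text warns about for noisy images or bland areas.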
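
The triangulation mentioned in the Introduction, and the hyperbolic relation between disparity and distance used in Section 5, can be written out for the simplest case. The relation below is the standard one for two identical cameras with parallel optical axes and a horizontal baseline (a stereo pair in correspondence); the symbols f (focal length in pixels), b (baseline), x_l and x_r (horizontal image coordinates of the corresponding points) and Z (depth) are assumptions introduced here and do not appear in the paper.

```latex
d = x_l - x_r, \qquad Z = \frac{f\, b}{d}, \qquad \frac{\partial Z}{\partial d} = -\frac{f\, b}{d^{2}}
```

As an illustration with made-up values, f = 500 pixels and b = 0.1 m give Z = 5 m for d = 10 pixels and Z = 2.5 m for d = 20 pixels: halving the distance doubles the disparity, so disparity resolution is much finer for near objects than for far ones.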
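
Steps 1 and 2 of the contour method in Section 4 can be sketched as a per-pixel lookup. The sketch assumes that the off-line stage has already produced, for every pixel, the ascending list of ranges at which that pixel's viewing ray crosses the successive cutting surfaces; this table (called crossings here) and the function name quantize_range_image are hypothetical, and ranges are taken as distances rather than disparities. At run time, quantizing a range image then reduces to one binary search per pixel.

```python
from bisect import bisect_right

def quantize_range_image(range_image, crossings):
    """Label every pixel with the index of the volume its 3-D point falls in.

    range_image[row][col]: measured range for the pixel.
    crossings[row][col]:   ascending ranges at which the pixel's viewing ray
                           crosses the cutting surfaces (precomputed off-line;
                           hypothetical data structure).
    The label is the number of cutting surfaces crossed before the ray
    reaches the measured point.
    """
    quantized = []
    for row_ranges, row_crossings in zip(range_image, crossings):
        quantized.append(
            [bisect_right(c, r) for r, c in zip(row_ranges, row_crossings)]
        )
    return quantized

# Toy example: one image row of three pixels sharing the same crossing ranges.
ray = [1.0, 2.0, 4.0]
print(quantize_range_image([[0.5, 2.5, 5.0]], [[ray, ray, ray]]))  # -> [[0, 2, 3]]
```

Because the table depends only on the sensor geometry and the chosen cutting surfaces, it can be arbitrarily expensive to build without affecting the per-frame cost, which matches the off-line argument made in the text.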
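
The construction of the polar histogram from a single obstacle contour (Section 5) can be sketched as follows. Each image column is treated as one sector, and each contour cell m adds ln[r(m)] to its sector, with r(m) the disparity. The boolean contour mask, the moving-average filter and the rule of steering toward the emptiest sector are simplifying assumptions for illustration; in particular, the steering rule is not the full VFH procedure of [Borenstein and Koren 1991].

```python
import math

def polar_histogram(disparity, contour_mask):
    """One histogram value per image column (one column = one polar sector).

    disparity[row][col]:    disparity of the pixel (larger means closer).
    contour_mask[row][col]: True where the pixel belongs to the obstacle
                            contour (hypothetical input format).
    Cells with disparity below 1 are skipped so that every contribution
    ln(r) is non-negative (assumption).
    """
    width = len(disparity[0])
    hist = [0.0] * width
    for d_row, m_row in zip(disparity, contour_mask):
        for col in range(width):
            if m_row[col] and d_row[col] >= 1:
                hist[col] += math.log(d_row[col])
    return hist

def smooth(hist, radius=1):
    """Moving-average low-pass filter over neighbouring sectors."""
    out = []
    for k in range(len(hist)):
        lo, hi = max(0, k - radius), min(len(hist), k + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out

def freest_sector(hist):
    """Index of the sector with the lowest obstacle value (simplified rule)."""
    return min(range(len(hist)), key=hist.__getitem__)

# Toy example: 2 rows x 4 columns, obstacles in columns 0, 1 and 3.
disp = [[4, 4, 1, 8], [4, 2, 1, 8]]
mask = [[True, False, False, True], [True, True, False, True]]
h = polar_histogram(disp, mask)
print(freest_sector(h))  # -> 2
```

Filtering before choosing a direction trades angular precision for stability, which is the remedy the text suggests for the variation between adjacent 0.5 degree sectors.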
