License Plate Recognition

(1) What is LPR?
(2) Application areas
(3) Top-level description of the prototype system
(4) General methods
    Extraction of plate region
        I. Edge extraction
        II. Hough transform
        III. Histograms
        IV. Morphology
    Recognition
        I. Syntactic
            (a) Intersection method
            (b) Pixel method
            (c) Boundary-constraint-based OCR
            (d) Grid based OCR
            (e) Region matching based OCR
        II. Structural
        III. Neural networks
            General overview
(5) Noise Removal
    Averaging
(6) What to look for?
(7) Case Study: Speed Detection

(1) What is LPR?

LPR (License Plate Recognition) is an image-processing technology used to identify vehicles by their license plates. It is used in various security and traffic applications, such as access-control systems. In a typical access-control setup, as the vehicle approaches the gate, the LPR unit automatically "reads" the license plate registration number, compares it to a predefined list, and opens the gate if there is a match.

(2) Application areas of LPR

1. A "doorman" at the entrance to restricted areas.
2. Traffic law enforcement.
3. Toll highways.
4. Parking.
5. Access control.
6. Traffic surveillance.

(3) Top-level description of the prototype system

First, a digital camera records images of a passing car. A computer that receives this recording performs the further analysis. The software analyzes the input video stream and cuts out the pictures suspected to contain an LP (License Plate). Each such picture is then passed to an OCR module, which extracts the plate number from it.

(4) General methods

Commercial systems use the following methodologies.

Extraction of Plate Region

To extract the plate region, techniques such as edge extraction, the Hough transform, histogram analysis, and morphological operators have been applied.

Edge extraction

An edge-based approach is normally simple and fast.
However, it is too sensitive to unwanted edges, which may appear on the front of a car. Therefore, this method cannot be used on its own.

Hough Transform

The Hough transform for line detection works well on images with a large plate region, where the shape of the license plate can be assumed to be defined by lines. However, it needs a large amount of memory and considerable computing time.

Histogram analysis

In histogram analysis we use thresholding to map each pixel value to either a foreground or a background value. The process of assigning one of two values to each pixel is sometimes called binarization. A region is a connected portion of an image in which the pixels are considered uniform.

When thresholding anything, there is a fundamental problem: how should we select the threshold value?

[Figure: the effect of various threshold values, T = 124, T = 137, T = 159, T = 213]

One approach to selecting a threshold is:

1. Compute a histogram of the gray-scale image.
2. Analyze the histogram to identify significant concentrations of intensity values.
3. Select a threshold that separates the two concentrations.

Histogram

A histogram is a representation of the statistical distribution of observed values. For an image, a histogram h(i) of image intensity values indicates the number of pixels having value i.

[Figure: example histogram]

A histogram is often used to estimate the probability distribution of image intensities. In some cases, it is useful to think of a histogram as the sum of several Gaussian distributions. Histograms are often used to determine thresholds for images. A histogram does not contain information about the position of image contents. We can think of a histogram as a vector. Histograms are not limited to image intensities; we could compute the histogram of any set of numbers.

The histogram-based approach does not work properly on noisy images or on images in which the plate is tilted.
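
The three threshold-selection steps above can be sketched in a few lines of Python. This is only an illustration, not the method used by any particular LPR product: it assumes 8-bit gray values given as a flat list, and it picks the separating threshold with Otsu's between-class-variance criterion, which is one common way of splitting two concentrations. The function names `histogram`, `otsu_threshold` and `binarize` are hypothetical.

```python
def histogram(pixels, levels=256):
    """h[i] = number of pixels whose intensity equals i."""
    h = [0] * levels
    for p in pixels:
        h[p] += 1
    return h


def otsu_threshold(hist):
    """Return the threshold T that maximizes the between-class variance.

    Pixels with value <= T form one class, pixels > T the other.
    (Otsu's criterion is an assumption here; the text only asks for
    a threshold that separates the two concentrations.)
    """
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in the class <= t
    sum0 = 0    # sum of intensities in that class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to background (0) or foreground (1)."""
    return [1 if p > threshold else 0 for p in pixels]


# Toy data: four dark pixels and four bright pixels.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(histogram(pixels))
binary = binarize(pixels, t)
```

On this toy input the chosen threshold falls between the dark and the bright group, so the dark pixels map to 0 and the bright ones to 1. On a real plate image the histogram is rarely this clean, which is the weakness noted above for noisy images.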
Morphology

The morphological gradient is a composition of three basic operations: a dilation and an erosion of the input image by the structuring element, and a subtraction of these two results.

Morphological operators often take a binary image and a structuring element as input and combine them using a set operator (intersection, union, inclusion, complement). They process objects in the input image based on characteristics of their shape, which are encoded in the structuring element. Usually, the structuring element is sized 3×3 and has its origin at the center pixel. It is shifted over the image, and at each pixel of the image its elements are compared with the set of the underlying pixels. If the two sets of elements match the condition defined by the set operator (e.g. if the set of pixels in the structuring element is a subset of the underlying image pixels), the pixel underneath the origin of the structuring element is set to a pre-defined value (0 or 1 for binary images). A morphological operator is therefore defined by its structuring element and the applied set operator.

For the basic morphological operators, the structuring element contains only foreground pixels (i.e. ones) and "don't cares". These operators, which are all combinations of erosion and dilation, are often used to select or suppress features of a certain shape, e.g. removing noise from images or selecting objects with a particular direction.

The more sophisticated operators take zeros as well as ones and "don't cares" in the structuring element. The most general operator is the hit-and-miss transform; in fact, all the other morphological operators can be derived from it. Its variations are often used to simplify the representation of objects in a (binary) image while preserving their structure, e.g. producing a skeleton of an object using skeletonization and tidying up the result using thinning.

One important application of the morphological gradient in binary images is to find their boundaries.
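
As a rough sketch of how the gradient finds boundaries, the following Python code applies a 3×3 all-ones structuring element to a small binary image stored as a list of lists. The helper names are hypothetical, and treating pixels outside the image as background (0) is an assumption made here to keep the example short.

```python
def _window(img, r, c):
    """Yield the 3x3 neighbourhood of (r, c); outside the image counts as 0."""
    rows, cols = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def dilate(img):
    """A pixel becomes 1 if any pixel under the structuring element is 1."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    """A pixel stays 1 only if every pixel under the structuring element is 1."""
    return [[min(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def morphological_gradient(img):
    """Dilation minus erosion: marks the pixels around object boundaries."""
    d, e = dilate(img), erode(img)
    return [[d[r][c] - e[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]


# A 3x3 filled square in the middle of a 5x5 image.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
gradient = morphological_gradient(square)
```

For this input only the center pixel survives the erosion, so the gradient is 1 everywhere except at the center: the interior of the object is removed and a band around its boundary remains.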
[Figure: an input image and the negation of its morphological gradient]

Morphology is known to be robust to noise, but it is rarely used in real-time systems because it is slow. Morphological operators are also applied to gray-level images, for example to reduce noise or to brighten the image. However, for many applications other methods, such as a more general spatial filter, produce better results.

Recognition

The recognition process can be divided into:

1) a segmentation process
2) a recognition process

Most conventional segmentation methods are rule-based, relying on the known placement of the characters, labeling, histograms, and so on. In these methods, binarization is the most important step, because it has to cope with uneven color depth caused by lighting conditions and dirt. Nevertheless, if noise such as dust or a stain lies on or near a character, the character may be broken up, or segmented into too large or too small an area. Such missegmentation lowers the recognition rate.

Approaches for Recognition

A) Syntactic approach: the features of the character's shape are approximated by a string of primitives, also called a template.
B) Structural approach: features of the character's shape are extracted and then represented using a graph data structure. Matching is done using the topology of the features.
C) Neural networks approach: a feature vector is constructed and classified by a neural network classifier.

A) The correlation, or template-matching, approach to character recognition is straightforward and can be reliable, provided the target is "cooperative" and the application remains invariant. As the name implies, once each character is isolated, the recognition engine attempts to match it against a set of predefined standards.
Any condition (lighting, viewing angle, obscuration, plate size, font) that causes a character to vary from the standard is likely to confuse the engine and return a questionable result. Some approaches are:

(a) Intersection Method

The intersection method of character recognition works by placing a number of horizontal and vertical lines across the image of the character. Each line intersects the character a number of times. The number of intersections for each line forms a pattern, which is used to recognize the character. This method is very tolerant of the position of the character within the grid formed by the lines. If the character is placed in different locations within the grid, the pattern will be shifted, but will still be intact.

(b) Pixel Method

The pixel method of recognition uses a library that contains an ideal version of each character that can be recognized. To recognize a character, the image of the character is compared pixel by pixel to each image in the library. For example, the image is compared to the library version of the letter A, B, C, and so on. The comparison that produces the highest percentage of matching pixels is the software's guess.

Comparing equally spaced pixels instead of every pixel of the image speeds up the process, at the cost of some accuracy. A variation of this method checks a subset of pixels instead of comparing them all: instead of comparing all of the pixels of the letter A, only the pixels in the general shape of an A are checked. This cuts down on the number of comparisons needed, but will generally be less accurate than comparing all of the pixels.

The pixel method has two disadvantages. The first is that many comparisons are performed, which leads to long processing times. The second is that, for the method to work, both the image of the character and the library version must be scaled and aligned very well.
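
A minimal sketch of the pixel method in Python follows. It assumes that the character image has already been scaled and aligned to the library size (here a tiny 3×5 grid), and the three-character `LIBRARY` and the function names are hypothetical, made up purely for illustration.

```python
# Hypothetical library of ideal characters: '#' is ink, '.' is background.
LIBRARY = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def similarity(a, b):
    """Fraction of pixel positions at which two equally sized bitmaps agree."""
    total = 0
    matches = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if pa == pb:
                matches += 1
    return matches / total


def recognize(char_img, library):
    """Return the label whose library image has the most matching pixels."""
    return max(library, key=lambda label: similarity(char_img, library[label]))


# A 'T' with one noisy pixel in the fourth row.
noisy_t = ["###", ".#.", ".#.", "##.", ".#."]
guess = recognize(noisy_t, LIBRARY)
```

Every library entry is compared in full, which shows the first disadvantage directly: the cost grows with both the image size and the number of characters in the library.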
(c) Boundary-constraint-based OCR

Boundary-constraint-based OCR recognizes characters by examining the boundaries of a character at specific points. First, two sets of three points are positioned above and below the character. Each set of three points forms a line, with one point at the left edge of the character, one at the right edge, and one at the center. These two sets of points are then moved vertically towards the center of the character until they hit parts of the character.

[Figure: Boundary-Based OCR]

These six points provide enough information to form a unique match with many English characters. Where ambiguities exist, or to improve the accuracy of the recognition, the process can be repeated horizontally, which gives grid based OCR.

[Figure: Grid Based OCR]

Even greater accuracy can be achieved by adding more points to the boundary lines, as in region-matching-based OCR.

[Figure: Region-Matching-Based OCR]

Depending on the computational power available and the accuracy desired, it is not uncommon to use five or more boundary points on each side. While it is not needed for English capital letters and numbers, some character sets also require an "inner boundary" to be detected. This procedure is similar to that used to determine the outer boundaries, but these constraints expand from the center instead of contracting from the outside.

B) Structural analysis uses a decision tree to assess the geometric features of each character's contour. The technique can be somewhat tolerant of variations in size, tilt, and perspective. As a simple example, consider the characters B, D, 6, and 9. Features that might be used to distinguish them are the number of loops in each character (one or two) and the vertical position of the loop (top, central, or bottom). Two loops point to a B, and one loop leads to the next branch of the tree.
A loop at the top indicates a 9; if the loop is central, it's a D; and a loop at the bottom means a 6. Characters without loops (E, M, N, hyphen, etc.) require additional, more complex and time-consuming analysis.

C) Neural networks are trained by example instead of being programmed in a conventional sense. While learning to recognize a recurring pattern, the network constructs statistical models that adapt to individual characters' distinctive features. Therefore, neural networks tend to be resilient to noise, and performance usually is not compromised under changing operational conditions. However, each modification (e.g. a new font) that is presented to the neural network may require a significant investment in retraining. We now describe this popular method in a little more detail.

A neural network consists of at least three layers of units: an input layer, at least one intermediate hidden layer, and an output layer. The connection weights in a neural network are one-way. Typically, units are connected in a feed-forward fashion, with input units fully connected to units in the hidden layer and hidden units fully connected to units in the output layer. When a neural network is cycled, an input pattern is propagated forward to the output units through the intervening input-to-hidden and hidden-to-output weights.

With neural networks, learning occurs during a training phase in which each input pattern in a training set is applied to the input units and then propagated forward. The pattern of activation arriving at the output layer is then compared with the correct (associated) output pattern to calculate an error signal. The error signal for each such target output pattern is then backpropagated from the outputs to the inputs in order to adjust the weights in each layer of the network appropriately. After a neural network has learned the correct classification for a set of inputs, it can be tested on a second set of inputs to see how well it classifies untrained patterns.
Thus, an important consideration in applying neural learning is how well the network generalizes.

[Figure: typical architecture of a backpropagation network]

(5) Noise Removal

Averaging

Many kinds of noise exist in a surveillance system. If the noise is introduced by playback and digitization, it might be worthwhile to play the recording several times and average the results. One danger with this method, however, is that the tape can be damaged by repeated playing.

Averaging over multiple frames can also reduce the noise. This can be valuable for a night recording in which neither the camera nor the scene moves. One problem arises when moving people are seen in certain frames (figure 2). Since these moving people are not present in most of the frames being averaged, they will not be visible in the final averaged frame. The value of this method therefore depends on the part of the image that has to be visualized.

(6) What to Look For?

Evaluating a system's capabilities can require looking beyond the recognition engine's proficiency with a standard character set under nearly ideal test conditions. It is prudent to determine all factors that can influence operations and to learn the effects of those variables that cannot be held constant. They include:

Vehicle speed
Volume of vehicle flow
Ambient illumination (day, night, sun, shadow)
Weather
Vehicle type (passenger car, truck, tractor-trailer, etc.)
Plate mounting (rear only or front and rear)
Plate variety
Plate jurisdiction (and attendant fonts)
Camera-to-plate distance
Plate tilt, rotation, skew
Intervehicular spacing
Presence of a trailer hitch, decorative frame, or other obscuring factors

(7) Case Study: Speed Detection

An off-line system comprises two parts: the image capture hardware and the image processing system. A video camera is used to capture several series of video frames in different locations.
A database is built up from these real-life captured video frames. The system comprises PC hardware to digitize the video information and software to process the digitized images. The video is converted into a suitable format (such as AVI) in the laboratory, and a sequence of frames is then extracted from the AVI file.

With the captured frames, image-processing techniques are used to extract the multiple objects from each individual video frame. After that, the location of each moving object is identified and the corresponding speed is extracted. Foreground and background separation is performed first, followed by filtering out the noise in the image using morphology and other image filters. Each moving object is then located by clustering and connectivity analysis, followed by estimation of the distance traveled by each moving object over a succession of images. Each object is distinguished by its license plate using LPR software.

A background image of the road is used as a reference for the subsequent image processing. A typical video frame with multiple moving objects is analyzed, and the moving objects are extracted from the background. A particular vehicle is identified by LPR. With consecutive video frames, a time sequence of frames with the extracted objects is obtained. Knowing the time between consecutive frames, the speed of individual objects can be evaluated accordingly.

If we know the distance between the camera and the LP at one instant and again at another instant, calculating the speed is straightforward. The reference value can be some benchmark on the road, given that all LPs have fixed dimensions. Such a situation is quite rare. Taking three frames as sample values bypasses all of the constraints above: there is no need for a fixed-size LP, a benchmark, or prior knowledge of any distance.
To include the case of curvilinear motion, we suggest tracking the change in the height of the LP with respect to time (assuming flat ground) instead of its width.

The method just described uses only one camera. A simpler technique is to use two cameras a fixed distance apart and detect the LP of a vehicle as it passes a benchmark. Knowing the time elapsed, we can calculate the speed.

The main advantage of such image-based systems is that they are immune to electronic warfare.

We have thus learnt what LPR is, along with various implementation techniques and the applications of such systems in practical life. Such systems make our lives safer, more secure, and more convenient.
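
As a small worked illustration of the two-camera technique from the case study, the speed is simply the known distance between the cameras divided by the time that elapses between the two sightings of the same plate. The function name, the units and the sample numbers below are assumptions chosen for the example, not values from a real installation.

```python
def speed_kmh(distance_m, t_first_s, t_second_s):
    """Average speed between two cameras, in km/h.

    distance_m  -- known distance between the two cameras, in metres
    t_first_s   -- time at which the plate was read by the first camera, in seconds
    t_second_s  -- time at which the same plate was read by the second camera
    """
    elapsed = t_second_s - t_first_s
    if elapsed <= 0:
        raise ValueError("the second sighting must come after the first")
    return distance_m / elapsed * 3.6  # m/s -> km/h


# Cameras 50 m apart, same plate seen 2 s apart: about 90 km/h.
example_speed = speed_kmh(50.0, 10.0, 12.0)
```

Matching the two sightings relies on the LPR step returning the same plate string at both cameras, so a misread plate yields no speed estimate and not a wrong one.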