Vol. 8 No. 7 October 2010 International Journal of Computer Science and Information Security
http://sites.google.com/site/ijcsis/ ISSN 1947-5500

Vectorization Algorithm for Line Drawing and Gap Filling of Maps

Ms. Neeti Daryal, Lecturer, Department of Computer Science, M L N College, Yamuna Nagar
Dr. Vinod Kumar, Reader, Department of Mathematics, J.V. Jain College, Saharanpur

Abstract

Vectorization, i.e. raster-to-vector conversion, is at the heart of graphics recognition problems, as it deals with converting a scanned image to a vector form suitable for further analysis. Many vectorization methods have been designed. This paper deals with a raster-to-vector conversion method proposed for capturing line drawing images. In the earliest works on vectorization, only one kind of method was used. The proposed algorithm combines the features of the thinning method and the medial line extraction method so as to produce a better line fitting algorithm. The process has several steps. The first step is preprocessing, in which the lines are found in the original raster image. The second is an algorithm for filling the gaps between adjacent lines to produce the vectorization of a scanned map. Results and literature on the above-mentioned methods are also included in this paper.

Key Words: Vectorization, Gap filling, Line drawing, Thinning algorithm, Medial algorithm

1. INTRODUCTION

Graphics recognition is concerned with the analysis of graphics-intensive documents, such as technical drawings, maps or schemas. Vectorization, i.e. raster-to-vector conversion, is a central part of graphics recognition problems, as it deals with converting the scanned map to a vector form suitable for further analysis. Line drawing management systems store visual objects as graphic entities. Many techniques have already been proposed for the extraction and recognition of graphic entities from scanned binary maps. In particular, various raster-to-vector conversion methods have been developed which convert image lines into vector lines automatically. In this paper, a new raster-to-vector conversion method is proposed for capturing high-quality vectors in a line drawing.

Figure 1[1]: Raster Graphics (Bitmap Image)
Figure 1[2]: Vector Graphics (Vector Graphic)

There are two kinds of computer graphics: raster (composed of pixels) and vector (composed of paths) [1]. Raster images are more commonly called bitmap images. Vector graphics are called object-oriented graphics, as shown in Figure 1[2].

2. NEED OF VECTORIZATION

In general, a vector data structure produces a smaller file size than a raster image, because a raster image needs space for all pixels while only point coordinates are stored in a vector representation [3]. This holds even more strongly when the graphics or images have large homogeneous regions and the boundaries and shapes are of primary interest.

3. RELATED WORK

Vectorization techniques have been developed in various domains, and a number of methods have been proposed and implemented. These methods are roughly divided into two classes: thinning based methods and non-thinning based methods [4].

Thinning based methods are applied in most of the earlier vectorization schemes [4]. These methods usually employ an iterative boundary erosion process to remove outer pixels until only a one-pixel-wide skeleton remains, like "peeling an onion" [5]. A polygonal approximation procedure is then applied to convert the skeleton to a vector, which may be a line segment or a polyline. The thinning method tends to create noisy junctions at corners, intersections, and branches, as shown in Figure 2 [6].

Figure 2: Defects of the thinning method

Among the non-thinning based methods, medial line extraction methods were also popular in the early days of vectorization [7]. Methods of this class extract image contours (the edges of the shape) before the medial axis between the two side edges is found. The midpoint of two parallel lines is given by the midpoint of a perpendicular line projected from one side to the other, and these midpoints are the coordinates which represent vectors [5]. The medial line extraction method often misses pairs of contour lines at branches, as shown in Figure 3 [6]; consequently, it fails to find the midpoint of parallel lines [8].

Figure 3: Defects of the medial line extraction method

Other classes of non-thinning based methods that also preserve line width have been developed recently [5]. These include run graph based methods, mesh pattern based methods and the Orthogonal Zig-Zag (OZZ) method. These methods are not covered in this paper; we work with the two methods described above only.

The disadvantages of thinning based methods and medial line extraction methods lead to a failure in fitting a line properly. The thinning method is able to maintain connectivity but loses shape information. Interestingly, the medial line extraction method has the complementary features; that is, it maintains shape information but tends to lose line connectivity. If the two are combined, good-quality extracted lines can be obtained.

4. PROPOSED VECTORIZATION PROCESS

The following is an implementation of the line fitting concept. The method has been carefully designed to offer practical performance with both acceptable processing speed and good vector quality. Figure 4 shows a flowchart for the whole procedure [5].

Figure 4: Flowchart of line fitting method based on contours and skeletons

Basic vectorization requires the following tasks:

(1) Linking short line segments into longer integrated ones.
(2) Correcting the defects at junctions.
(3) Modifying vector attributes such as endpoints, intermediate points and line width.

Linking short line segments into longer ones may yield the correct line width and overcome some junction problems. Other defects at junctions, such as corners and branches, are subject to special processing [9]. The precise intersection points, i.e. the endpoints of the vectors, are calculated.

The combination has several steps. The first step is preprocessing, in which the lines are found in the original raster image. The second is gap filling between adjacent lines.

4.1 PREPROCESSING

• A scanned line drawing is converted from binary raster image data to run length code data.
• The data is processed into skeletons and tracked for contours.
• Each skeleton fragment is linked to neighboring contour fragments.
• The result is processed into skeleton fragments and contour fragments respectively.

4.2 GAP FILLING

In a contour image, the contour lines are split and the different contour levels are written in the gap. This causes problems in the automatic vectorization of images. Since the text is erased and not taken into account while vectorizing, the final output has gaps between lines. Gaps are also produced by noise. Gap filling should therefore be given prime importance once the image has been processed into skeleton and contour fragments. A poor-quality line drawing often has gaps which prevent correct vector extraction [10]. The following algorithm shows the steps for gap filling.

ALGORITHM

Step 1: Read the input and get the x and y coordinates of the line.

Step 2: Get the x and y coordinates of the endpoints.

Step 3: Find the distance between endpoints. After finding the endpoints, we find the distance between them using the Euclidean distance formula, which can be written as

$$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

where $D$ is the Euclidean distance and $(x_1, y_1)$ and $(x_2, y_2)$ are the endpoints.

Step 4: If the distance is less than the threshold, go on to set the threshold; otherwise stop.

Step 5: Set the threshold.

Step 6: Get the x and y coordinates of the endpoints and five adjacent points corresponding to the line; this gives the x and y coordinates of the endpoints whose distance is less than the set threshold.

Step 7: Check whether any of the distances are equal. If so, go to Step 8 (slope function); otherwise go to Step 9 (Least Square Parabola).

Step 8: Slope function.

Step 9: Least Square Parabola fitting.

Figure 5: Example of Gapfilling

Using these coordinates, we perform least square parabola fitting to get the values of the coefficients $a$, $b$ and $c$. Using the values of $a$, $b$ and $c$ and the x coordinates of the two lines, we can get an approximate value of y. There are other cases where we can directly extend the line and do not have to approximate the curve. The x and y coordinates are chosen based on four cases. Let $(x_1, y_1)$ and $(x_2, y_2)$ be the endpoints of two lines whose distance is less than the threshold value.

Consider Figure 6, in which the endpoints are highlighted in red. Here we can see that $x_1 \neq x_2$, $y_1 \neq y_2$, $x_1 \neq y_1$ and $x_2 \neq y_2$. In this case, since $x \neq y$, we cannot connect the endpoints using a straight line, so we use the Least Square Parabola to interpolate the points between them. Using the x and y coordinates of the two lines, we get the values of $a$, $b$ and $c$ by the steps explained above. After we get the values of $a$, $b$ and $c$, we increment the minimum value of x by 1 until it reaches the maximum value of x, and substitute each value of x into the fitted parabola equation to get the corresponding y value.
Figure 6: Case 4: Gap Filling

The least square parabola has the form

$$f(x) = a + bx + cx^2$$

where $f(x) = y$. The least squares method uses this equation to obtain the parabola. After getting the value of y, we round it to a natural number. The rule is that if the fractional part is greater than or equal to 0.5, the value is rounded up to the next integer, and if it is less than 0.5, it is rounded down. For example, if the value of y is 4.75 it is rounded to 5, and if y = 4.30 it is rounded to 4. An example of gap filling for this case is shown in Figure 7.

Figure 7: Example of Gapfilling

Rounding is only done for raster images and not for vector data, since there is no need to rasterize the curve. LSP is used only for case four, because in the other cases we get the exact coordinates by just extending the line and do not have to interpolate over the x coordinates to get the corresponding y coordinates.

5. RESULTS

The results obtained using the methods discussed above are presented as follows. Figure 8 is the scanned image and Figure 9 the corresponding gap-filled image. Since this is an iterative process, all the gaps that are within the threshold are filled.

Figure 8: Contour Image with Gap

Figure 9: Gap Filled Contour Image

6. CONCLUSION

In this paper, we have discussed line formation, which is done through the combination of line fragment and contour fragment algorithms to build a vectorization method that leads to filling the gaps between lines. More specifically, the gaps between the lines have been filled by the Least Square Parabola fitting algorithm. The result of this method has been applied to the correction of a scanned map, shown in Figures 8 and 9.

7. REFERENCES

[1] J. Jimenez and J.L. Navalon, "Some experiments in image vectorization," IBM J. Res. Develop. 26, pp. 724-734 (1982).
[4] R.O. Duda, P.E. Hart, "Use of Hough transformation to detect lines and curves in pictures," Commun. ACM, 15, 1, pp. 11-15 (1972).
[5] J. Jimenez and J.L. Navalon, "Some Experiments in Image Vectorization", IBM J. Res. Develop. 26, pp. 724-734, 1982.
[2] Smith R.W. (1978). Computer processing of line images: A survey. Pattern Recognition; 20(1):7-15.
[3] R. Kasturi, S. Siva, and L. O'Gorman, "Techniques for Line Drawing Interpretation: An Overview," Proc. IAPR Workshop on Machine Vision Applications, pp. 151-160 (1990).
[4] H. Tamura, "A Comparison of line thinning algorithms from digital geometry viewpoint," Proc. 4th Int. Jt. Conf. on Pattern Recognition, Kyoto, Japan, pp. 715-719, IEEE (1978).
[5] F. Chang, Y.-C. Lu, and T. Pavlidis. Feature Analysis Using Line Sweep Thinning Algorithm. IEEE Transactions on PAMI, 21(2):145-158, Feb. 1999.
[6] Tamura, "A Comparison of Line Thinning Algorithms from Digital Geometry Viewpoint", Proc. of 4th Int. Jt. Conf. on Pattern Recognition, Kyoto, Japan, pp. 715-719, 1978.
[7] Kasturi, S. T. Bow, W. El-Masri, J. Shah, J. R. Gattiker, and U. B. Mokate, "A System for Interpretation of Line Drawings", IEEE Trans. on PAMI, 12(10), pp. 978-992, 1990.
[8] Borgefors. Distance Transforms in Digital Images. Computer Vision, Graphics and Image Processing, 34:344-371, 1986.
[9] J. Canny. A Computational Approach to Edge Detection. IEEE Transactions on PAMI, 8(6):679-698, 1986.
[10] R.W. Smith, "Computer Processing of Line Images: A Survey", Pattern Recognition, 20(1), pp. 7-15, 1987.