ocr by xuyuzhu


Optical Character Recognition
    Introduction
    Document Scanning
    Intensity Histogram
    Image Segmentation
    Noise Removal
    Blob Coloring
    Blob Matching
    Spatial Analysis
    Textural Analysis

Introduction
    OCR is one of the oldest and most successful computer vision (CV) applications
    OCR once required custom, expensive hardware (e.g. Kurzweil)
    OCR software is now bundled with inexpensive flatbed scanners
    OCR on printed text is 95-99% accurate
    OCR on handwritten text is much harder and is used only in limited domains

Document Scanning
   Typical OCR documents are digitized with a flatbed scanner
   Other options are digital video cameras or cell phones
   Images are often 8 bits/pixel (grayscale) or 1 bit/pixel (black and white)
   Resolution varies from 100dpi to 600dpi
   Scanning an 8.5x11 inch page at 600dpi produces a 5100x6600 pixel image (about 34 MB at 8 bits/pixel)
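
The size arithmetic can be sketched as follows. The function name scan_size and its parameters are made up for illustration, and the result is the raw uncompressed size only (real file formats add headers, row padding, or compression).

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel):
    """Return (width_px, height_px, size_bytes) for an uncompressed scan.

    Hypothetical helper: ignores file headers and per-row padding.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    size_bytes = (total_bits + 7) // 8  # round up to whole bytes
    return width_px, height_px, size_bytes


if __name__ == "__main__":
    for dpi in (100, 300, 600):
        for bpp in (8, 1):
            w, h, size = scan_size(8.5, 11, dpi, bpp)
            print(f"{dpi}dpi, {bpp} bit(s)/pixel: {w}x{h}, {size} bytes")
```

For a letter-size page, scan_size(8.5, 11, 600, 8) gives (5100, 6600, 33660000), while the same page at 1 bit/pixel needs 4207500 bytes.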

Intensity Histogram
    The first step in OCR is to find the characters on the page
    We need to convert the 8-bit image to a 1-bit image (0=background, 1=character)
    The intensity histogram of an image shows the distribution of pixel values
    Dark pixels 0..50 typically correspond to characters
    Bright pixels 200..255 typically correspond to background
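
A minimal sketch of computing the histogram, assuming the image is stored as a list of rows of integer pixel values in 0..255. The name intensity_histogram and the list-of-lists layout are assumptions made for this example.

```python
def intensity_histogram(image, levels=256):
    """Count how many pixels have each intensity value.

    image is assumed to be a list of rows, each a list of ints in 0..levels-1.
    """
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


if __name__ == "__main__":
    # Tiny made-up image: two dark "character" pixels on a bright page.
    page = [
        [240, 235, 240],
        [240,  20,  25],
        [245, 240, 240],
    ]
    hist = intensity_histogram(page)
    print("dark pixels (0..50):", sum(hist[0:51]))
    print("bright pixels (200..255):", sum(hist[200:256]))
```

On a clean scan the histogram usually has two peaks, one for the characters and one for the background, which is what makes thresholding workable.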

Image Segmentation
   The easiest form of image segmentation is intensity thresholding
   We select a threshold value T to separate character and background
   If pixel[y][x] < T we have character
   If pixel[y][x] >= T we have background
   Threshold T can be chosen manually or automatically
   Optimal thresholding techniques were explored at length in the 1970s
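
A sketch of thresholding with an automatically chosen T. The notes do not say which automatic method is meant; Otsu's method (1979) is used here as one well-known example, and the function names and the list-of-rows image layout are assumptions.

```python
def binarize(image, t):
    """Return a 1-bit image: 1 = character (pixel < t), 0 = background."""
    return [[1 if p < t else 0 for p in row] for row in image]


def otsu_threshold(image, levels=256):
    """Pick T by maximizing the between-class variance (Otsu's method).

    Class 0 is every pixel with value < T, class 1 is every pixel >= T.
    """
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(v * n for v, n in enumerate(hist))

    best_t, best_score = 0, -1.0
    count0 = 0  # number of pixels with value < t
    sum0 = 0    # sum of their intensities
    for t in range(1, levels):
        count0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, no split to evaluate
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


if __name__ == "__main__":
    page = [[240, 20, 235], [25, 240, 30]]
    t = otsu_threshold(page)
    print(t, binarize(page, t))
```

A manual threshold is just binarize(image, 128) or similar; the automatic version matters when scans differ in brightness or contrast.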

Noise Removal
    In some cases scanner noise or dirt will result in small holes/specks
    These can be removed by smoothing the original image before thresholding
    Typical methods include the binomial filter, Gaussian filter, or median filter
    We can also use mathematical morphology (erode/dilate/open/close) to
      remove holes/specks after thresholding the image
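
As an illustration of smoothing before thresholding, here is a sketch of a 3x3 median filter. Only the median filter is shown, the edge handling (repeating the nearest border pixel) is one choice among several, and the function name is made up for this example.

```python
def median_filter_3x3(image):
    """Replace each pixel by the median of its 3x3 neighborhood.

    image is assumed to be a non-empty list of equal-length rows of ints.
    Pixels outside the image are taken from the nearest border pixel.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image rows
                    xx = min(max(x + dx, 0), w - 1)  # clamp to image columns
                    window.append(image[yy][xx])
            window.sort()
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


if __name__ == "__main__":
    page = [[200] * 5 for _ in range(5)]
    page[2][2] = 0  # a single dark speck
    print(median_filter_3x3(page)[2][2])
```

An isolated speck never makes up the majority of any 3x3 window, so the median removes it without blurring larger strokes as much as an averaging filter would.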

Blob Coloring
    We need to identify the connected components in the binary image
    Most of the time, these will correspond to individual characters
    Some characters are made up of two or more components (i, j)
    Multiple characters may be connected (e.g. underlined or handwritten)
    4-connected pixels are adjacent in either the X or Y direction
    8-connected pixels are adjacent in the X, Y, or diagonal directions
    A 4-connected blob is made up of 4-connected pixels with the same value
    An 8-connected blob is made up of 8-connected pixels with the same value
    The process of locating all pixels in each blob in an image and assigning each
      a unique number is called blob coloring
    Recursive blob coloring is easy to implement but uses a lot of stack memory
      (the recursion can go as deep as the number of pixels in a blob)
         o Go over algorithm
         o Go over code
    Scan line blob coloring is tricky to implement but very efficient
         o Go over algorithm
         o Go over code
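
The two approaches can be sketched as below; this is not the code covered in lecture. Two assumptions: the flood-fill version uses an explicit stack instead of true recursion (Python's recursion limit is too low for large blobs), and the scan line version is written as the common two-pass method with a union-find table. All names are made up for illustration.

```python
def _in_bounds(y, x, h, w):
    return 0 <= y < h and 0 <= x < w


def color_blobs_fill(binary, connectivity=4):
    """Flood-fill blob coloring; returns (labels, blob_count)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1  # found an uncolored blob: start a new color
            labels[y][x] = count
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (_in_bounds(ny, nx, h, w) and binary[ny][nx] == 1
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


def color_blobs_scanline(binary, connectivity=4):
    """Two-pass scan line blob coloring; returns (labels, blob_count)."""
    # Neighbors that were already visited in raster order.
    if connectivity == 4:
        prev = [(-1, 0), (0, -1)]
    else:
        prev = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # parent[i] = union-find parent of provisional label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Pass 1: assign provisional labels and record which ones touch.
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            roots = set()
            for dy, dx in prev:
                ny, nx = y + dy, x + dx
                if _in_bounds(ny, nx, h, w) and labels[ny][nx]:
                    roots.add(find(labels[ny][nx]))
            if not roots:
                parent.append(len(parent))  # new label is its own root
                labels[y][x] = len(parent) - 1
            else:
                root = min(roots)
                labels[y][x] = root
                for r in roots:
                    parent[r] = root  # merge touching labels
    # Pass 2: replace provisional labels by consecutive final numbers.
    final = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

Both functions take a binary image as a list of rows (1=character, 0=background) and return a label image plus the number of blobs. Three pixels on a diagonal form three blobs with connectivity=4 and a single blob with connectivity=8.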

Blob Matching

Spatial Analysis

Textural Analysis
