
39 Paper 31031092 IJCSIS Camera Ready pp. 254-259


Volume 8 No. 1, International Journal of Computer Science and Information Security

									                                              (IJCSIS) International Journal of Computer Science and Information Security,
                                              Vol. 8, No. 1, April 2010

    Recognition of Printed Bangla Document from Textual Image Using Multi-Layer
                          Perceptron (MLP) Neural Network
    Md. Musfique Anwar, Nasrin Sultana Shume, P. K. M. Moniruzzaman and Md. Al-Amin Bhuiyan
             Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh

Abstract
This paper focuses on the segmentation of printed Bangla characters for efficient
character recognition. Segmentation is an important step in character recognition
because it allows the system to classify the characters more accurately and quickly.
The system takes the scanned image file of a printed document as its input. A
structural feature extraction method is used, in which each individual Bangla
character is converted to an M × N feature matrix. A Multi-Layer Perceptron (MLP)
neural network trained with the back-propagation algorithm takes the feature matrix
as input, learns from the set of input patterns, and classifies the characters. The
system has been tested with several printed documents, and the success rate in all
cases is over 90%.

Keywords:
Character segmentation, Character recognition, Feature extraction, Multi-Layer
Perceptron (MLP)

1. Introduction

Optical character recognition [1] is one of the attractive fields of image
processing [2]. A character recognition technique associates a symbolic identity
with the image of a character. A lot of research on Bangla character recognition has
been done over the last few years. In the modern approach, adaptive tools are
applied to pattern recognition systems. The Artificial Neural Network (ANN) is the
most popular adaptive tool used for character recognition [3]. Most applications use
a feed-forward ANN with one of the numerous variants of the classical
back-propagation algorithm or with other training algorithms. This research is not
limited to individual character recognition; it attempts to retrieve a complete
paragraph from its optical image created by a scanner. In this paper we propose a
way to recognize a printed Bangla document from a textual image, using a multilayer
perceptron with the back-propagation algorithm for individual character recognition.

2. Bangla Character Set

The character is the fundamental unit for writing and reading a language. Character
recognition is the process of classifying an input character according to a
predefined character class. Each language has its own character set; the Bangla
character set has 49 characters, 10 digits, punctuation marks and other symbols.
Bangla letters are formed in two-dimensional space, mostly from horizontal, vertical
and arc strokes [4].
The Bangla characters are classified into two categories as follows:
i) Shorborno: ‘Shorborno’ are like the vowels of the English language. There are
   eleven ‘Shorborno’ characters. The first six have a full matra, the 7th has a
   half matra and the last four have no matra.
ii) Banjonborno: ‘Banjonborno’ are like the consonants. There are 39 ‘Banjonborno’
   in the Bangla alphabet. Here we are concerned only with the characters.

Bangla script consists of moderately complex patterns. Each word in Bangla script is
composed of several characters joined by a horizontal line (called ‘Matra’ or
head-line) at the top. The concept of upper-case and lower-case characters (as in
English) is absent. There are many composite characters, called “Jukto barna”, as
shown in Fig. 1. There are about 253 compound characters composed of 2, 3, or 4
consonants (i.e. Banjonborno) [5]. Some other types of characters are also used in
Bangla, called suffix-prefix characters, as shown in Fig. 2.

Fig. 1 Some Bangla mainstream characters used for image recognition: (a) Shorborno,
(b) Banjonborno, (c) Bangla numerals, (d) a few Bangla composite characters.

Fig. 2 Suffix-prefix determiner characters

                                                                                          ISSN 1947-5500

3. The System Overview

The main phase of the character recognition system is the segmentation of text into
characters, so that the computer can classify the characters within a paragraph as a
human would identify them. The overall method of the implemented system is
illustrated in Fig. 3.

Fig. 3 Overall diagram of the recognition system

3.1 Data Acquisition

The input images are acquired from documents containing printed Bangla text, using a
scanner as the input device. Scanned images are then stored as JPEG image files.
Pre-processing is required to convert the raw image data into a usable format [6],
because the scanned image is not always in a suitable form. The image is then passed
on for boundary detection.

3.2 Boundary Detection

We scan from the upper left and from the bottom right of the image to find the
processing area of the printed text document. Each scan halts when it meets a single
black pixel.

3.3 Segmentation

In this phase the text is partitioned into its elementary entities, i.e. characters.
First the system detects the region of each text line of the paragraph. Then the
text lines are segmented into words, and the words are divided into characters.

3.3.1 Text Line Detection

Text line detection is performed by scanning the image row by row and recording the
number of black pixels in each row. The line boundaries are then detected from this
array of per-row counts. In our experiments, the number of black pixels in a blank
row of the image varied from 0 to 10, whereas rows containing text have a much
larger count. As a general rule, more than two blank rows are present between two
text lines. In this way we detect the boundary of a text line.

The upper boundary of a line is the first row in which more black pixels are found
than in a blank row. After finding the upper boundary, the system continues scanning
until it reaches a row whose next row has no black pixels; this row is the lower
boundary of the text line. There are about 8 to 10 blank rows between two text
lines.

3.3.2 Word Segmentation

Normally, there is no spacing between the characters of a Bangla word because of the
Matra (⎯⎯). We therefore first detect the Matra of a text line. The Matra is the row
in which the number of black pixels is the maximum [1, 7]. After detecting a line,
the system scans the image vertically from the upper boundary of the line and counts
the number of black pixels in each column. The start position of a word is the first
column in which black pixels are found. The system continues scanning until it
reaches a column whose next column has no black pixels; this column is the end
position of the word. There are about 4 to 6 blank columns between two words.

3.3.3 Character Segmentation

To separate the characters in a word, the system scans vertically from the start
position of the word, which is also the start position of the first character of the
word. After finding the start position of a character, it continues scanning until
it reaches a column whose next column has no black pixels; this column is the end
position of the character. Consecutive characters in a word are separated by 2 to 3
blank columns, as shown in Fig. 4.

Fig. 4 Character separation from below the Matra

3.4 Feature Extraction

Feature extraction is essential to effective character recognition and eases the
classification task. For the SutonnyMJ font at font size 10, the maximum height and
width of a Bangla character is 6 × 7 without compound characters and 12 × 9 for
compound characters. After determining the start and end position of a character,
the region of that character is converted to a 6 × 7 matrix or
12 × 9 matrix (for compound characters) containing 0s and 1s, where 1 represents the
presence of a character component and 0 represents its absence.

The boundaries of the characters are not all of equal size, i.e., the extracted
matrices are not of equal size. If a matrix is smaller or larger than our standard
size in height and width, we scale it; if its height is equal to the standard height
but its width is smaller, we add 0s to fill the matrix up to the standard size. The
character matrix acts as the input to the recognition stage and is fed to the neural
network.

3.5 Recognition Engine and Classifier

In a back-propagation neural network, the learning algorithm has two phases. First,
a training input pattern (a Bangla character) is presented to the network input
layer. The network then propagates the input pattern from layer to layer until the
output pattern is generated by the output layer. Second, if this pattern is
different from the desired output, an error is calculated and then propagated
backwards through the network from the output layer to the input layer. The weights
are modified as the error is propagated.

A back-propagation neural network is determined by the connections between neurons,
the activation function used by the neurons, and the learning algorithm that
specifies the procedure for adjusting weights. The network architecture for the
back-propagation neural network is shown in Fig. 5.

                                  Fig. 5 Back-propagation neural network topology
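
The character regions that are fed to this network come from the projection-based
segmentation of Section 3.3. The paper does not list its implementation, so the
following is only a minimal Java sketch under stated assumptions: the class
ProjectionSegmenter, its method names, and the parameters noise and minGap are
hypothetical, and the image is assumed to be a non-empty rectangular array in which
true means a black pixel.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for projection-profile segmentation (names are illustrative). */
final class ProjectionSegmenter {

    /** Counts the black pixels (true) in each row of the image. */
    static int[] rowProfile(boolean[][] img) {
        int[] counts = new int[img.length];
        for (int r = 0; r < img.length; r++)
            for (boolean px : img[r]) if (px) counts[r]++;
        return counts;
    }

    /** Counts the black pixels in each column, restricted to rows top..bottom (inclusive). */
    static int[] columnProfile(boolean[][] img, int top, int bottom) {
        int[] counts = new int[img[0].length];
        for (int r = top; r <= bottom; r++)
            for (int c = 0; c < counts.length; c++) if (img[r][c]) counts[c]++;
        return counts;
    }

    /**
     * Splits a profile into runs of ink. A run starts at the first index whose count
     * exceeds noise and is closed once minGap consecutive indices at or below noise
     * have been seen. Each result is a {start, end} pair, both inclusive.
     */
    static List<int[]> runs(int[] profile, int noise, int minGap) {
        List<int[]> out = new ArrayList<>();
        int start = -1, lastInk = -1;
        for (int i = 0; i < profile.length; i++) {
            if (profile[i] > noise) {
                if (start < 0) start = i;
                lastInk = i;
            } else if (start >= 0 && i - lastInk >= minGap) {
                out.add(new int[] {start, lastInk});
                start = -1;
            }
        }
        if (start >= 0) out.add(new int[] {start, lastInk});  // run reaching the edge
        return out;
    }

    /** Row with the most black pixels within top..bottom, taken to be the Matra. */
    static int matraRow(int[] rowProfile, int top, int bottom) {
        int best = top;
        for (int r = top + 1; r <= bottom; r++)
            if (rowProfile[r] > rowProfile[best]) best = r;
        return best;
    }
}
```

With these helpers, text lines would be runs(rowProfile(img), noise, minGap) with a
noise level near the 0 to 10 black pixels reported for blank rows, words would be
runs over columnProfile of the whole line with minGap = 4, and characters would be
runs over the rows below matraRow with minGap = 2, following the gap widths given in
Sections 3.3.2 and 3.3.3. Those parameter choices are illustrative readings of the
reported figures, not values taken from the authors' code.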
A neuron determines its output by computing the net weighted input:

$$X = \sum_{i=1}^{n} x_i w_i - \theta \tag{1}$$

where $n$ is the number of inputs and $\theta$ is the threshold applied to the
neuron. Next, this input value is passed through the sigmoid activation function:

$$Y^{\text{sigmoid}} = \frac{1}{1 + e^{-X}} \tag{2}$$

To derive the back-propagation learning law, let us consider the three-layer network
shown in Fig. 5. The indices $i$, $j$ and $k$ refer to neurons in the input, hidden
and output layers, respectively. The symbol $w_{ij}$ denotes the weight of the
connection between neuron $i$ in the input layer and neuron $j$ in the hidden layer,
and $w_{jk}$ the weight between neuron $j$ in the hidden layer and neuron $k$ in the
output layer.

To propagate error signals, we start at the output layer and work backward to the
hidden layer. The error signal at the output of neuron $k$ at iteration $t$ is
defined by:

$$e_k(t) = Y_{d,k}(t) - Y_{a,k}(t) \tag{3}$$

where $t = 1, 2, 3, \ldots$, $Y_{d,k}(t)$ is the desired output of neuron $k$ at
iteration $t$, and $Y_{a,k}(t)$ is its actual output.

Neuron $k$, which is located in the output layer, is supplied with a desired output
of its own. Hence we may use a straightforward procedure to update the weight
$w_{jk}$:

$$w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}(t) \tag{4}$$

where $\Delta w_{jk}(t)$ is the weight correction, given by:

$$\Delta w_{jk}(t) = \alpha \times y_j(t) \times \delta_k(t) \tag{5}$$

where $\alpha$ is the learning rate, $y_j(t)$ is the output of hidden neuron $j$, and
$\delta_k(t)$ is the error gradient at neuron $k$ in the output layer at iteration
$t$.

In order to calculate the weight correction for the hidden layer, we can apply the
same equation as for the output layer:

$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \tag{6}$$

where $\Delta w_{ij}(t)$ is the weight correction, given by:

$$\Delta w_{ij}(t) = \alpha \times x_i(t) \times \delta_j(t) \tag{7}$$

where $\delta_j(t)$ represents the error gradient at neuron $j$ in the hidden layer:
$$\delta_j(t) = y_j(t) \times [1 - y_j(t)] \times \sum_{k=1}^{l} \delta_k(t)\, w_{jk}(t) \tag{8}$$

where $l$ is the number of neurons in the output layer and

$$y_j(t) = \frac{1}{1 + e^{-X_j(t)}} \tag{9}$$

$$X_j(t) = \sum_{i=1}^{n} x_i(t) \times w_{ij}(t) - \theta_j \tag{10}$$

where $n$ is the number of neurons in the input layer.

In our work, we use a back-propagation neural network consisting of 42 neurons in
the input layer, 30 neurons in the hidden layer and one neuron in the output layer
for a character matrix of size 6 × 7. For a character matrix of size 12 × 9, the
network consists of 108 input neurons, 80 neurons in the hidden layer and one neuron
in the output layer. The system recognizes a character if the output of the network
is very close to the output of one of the characters, within a certain acceptable
tolerance. If the output is far from all the possible outputs, the system cannot
identify the character. This process continues until the end of the text document.
The entire operation of the system is summarized in the flow-chart shown in Fig. 6.

Fig. 6 Flow-chart of the recognition system. The steps shown in the flow-chart are:
i) Start; input the image of the paragraph to be recognized.
ii) Set index = 0 and maximum_epochs = 1000000000.
iii) Detect the boundary of the printed text document to perform the segmentation
   of characters.
iv) Select the character matrix of size 6 × 7, or 12 × 9 for a compound character.
v) Input the matrix to the ANN and calculate the output vector and the error.
vi) If error ≤ 0.001, add the character to the output list; otherwise print “the
   character is unrecognized”.
vii) If the whole document has been processed, stop; otherwise set
   index = index + 1 and repeat for the next character.

4. Experimental Result

We used the bswing1_0_beta package for Bangla text output and the neuralj-0.0.4
package to implement the back-propagation neural network in Java. The number of
neurons in the hidden layer is set to roughly three quarters of the number of
neurons in the input layer (30 of 42, and 80 of 108). We use the ‘PatternSet’ class,
which represents a set of patterns. The function ‘addPattern(Pattern pattern)’ is
used to add the required patterns for all Bangla characters. The pattern for a
Bangla character looks like:

    pattern_set.addPattern(new Pattern("0;0;1;0;1;0;0;0;0; 0;0;1;0;0;1;0;0;0;
                                        0;0;1;0;0;0;1;0;0; 0;0;1;0;0;0;0;0;0;
                                        0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0;
                                        0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0;
                                        0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0;
                                        0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0;",

where 0;0;1;0;1 ... 0;0;0; is the input vector and ‘matrix_output_Str’ is the output
vector. We set the values of the following fields of the ‘BackPropagation’ class:

    Field                 Value
    ‘desired_error’       0.001
    ‘maximum_epochs’      1000000000

Then the training of the back-propagation neural network starts. After training, the
system scans the Bangla paragraph image, tries to recognize the characters, and
displays the correctly recognized characters as output. Fig. 7 shows a snapshot of
the implemented method. Results for different types of sentences are given in
Table 1.
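
Eqs. (1) to (10) and the stopping values of this section can be combined into a
small training routine. The sketch below is a hand-written illustration and is not
the neuralj API: the class TinyMlp, its method names, the initial weight range, the
learning rate passed to it, and the threshold update are all assumptions. The
output-layer gradient used here, δ_k = y_k (1 − y_k) e_k, is the usual form for a
sigmoid output and is likewise not stated in the text.

```java
import java.util.Random;

/** Illustrative three-layer perceptron; not the neuralj API. */
final class TinyMlp {
    private final double[][] wIH;   // w_ij: input i -> hidden j
    private final double[][] wHO;   // w_jk: hidden j -> output k
    private final double[] thH;     // theta_j, hidden-layer thresholds
    private final double[] thO;     // theta_k, output-layer thresholds
    private final double alpha;     // learning rate

    TinyMlp(int nIn, int nHid, int nOut, double alpha, long seed) {
        Random rnd = new Random(seed);
        this.alpha = alpha;
        wIH = new double[nIn][nHid];
        wHO = new double[nHid][nOut];
        thH = new double[nHid];
        thO = new double[nOut];
        // Assumption: small random initial weights in [-0.5, 0.5).
        for (double[] row : wIH) for (int j = 0; j < nHid; j++) row[j] = rnd.nextDouble() - 0.5;
        for (double[] row : wHO) for (int k = 0; k < nOut; k++) row[k] = rnd.nextDouble() - 0.5;
    }

    static double sigmoid(double x) {                      // Eq. (2)
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Layer outputs: sigmoid(sum_i in[i] * w[i][j] - th[j]), Eqs. (1), (9), (10). */
    private static double[] layer(double[] in, double[][] w, double[] th) {
        double[] out = new double[th.length];
        for (int j = 0; j < th.length; j++) {
            double x = -th[j];
            for (int i = 0; i < in.length; i++) x += in[i] * w[i][j];
            out[j] = sigmoid(x);
        }
        return out;
    }

    /** Forward pass only. */
    double[] predict(double[] x) {
        return layer(layer(x, wIH, thH), wHO, thO);
    }

    /** One back-propagation iteration; returns the squared error before the update. */
    double train(double[] x, double[] desired) {
        double[] yH = layer(x, wIH, thH);
        double[] yO = layer(yH, wHO, thO);
        double sse = 0.0;
        double[] dO = new double[yO.length];
        for (int k = 0; k < yO.length; k++) {
            double e = desired[k] - yO[k];                 // Eq. (3)
            dO[k] = yO[k] * (1.0 - yO[k]) * e;             // assumed sigmoid gradient
            sse += e * e;
        }
        double[] dH = new double[yH.length];
        for (int j = 0; j < yH.length; j++) {              // Eq. (8), using the old w_jk
            double sum = 0.0;
            for (int k = 0; k < dO.length; k++) sum += dO[k] * wHO[j][k];
            dH[j] = yH[j] * (1.0 - yH[j]) * sum;
        }
        for (int j = 0; j < yH.length; j++)                // Eqs. (4), (5)
            for (int k = 0; k < dO.length; k++) wHO[j][k] += alpha * yH[j] * dO[k];
        for (int i = 0; i < x.length; i++)                 // Eqs. (6), (7)
            for (int j = 0; j < dH.length; j++) wIH[i][j] += alpha * x[i] * dH[j];
        // Assumption: thresholds are trained like weights on a fixed input of -1.
        for (int k = 0; k < dO.length; k++) thO[k] -= alpha * dO[k];
        for (int j = 0; j < dH.length; j++) thH[j] -= alpha * dH[j];
        return sse;
    }
}
```

A network for the 6 × 7 matrices would be created as new TinyMlp(42, 30, 1, alpha,
seed), and training would call train repeatedly over the pattern set until the
returned error falls to the desired_error of 0.001 or the epoch limit is reached.
How the single output value is mapped to a character class is not specified in the
text, so the sketch leaves that step out.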


                                   Fig. 7 Sample output of the proposed system

 Table 1: Success rate for experimental results

   Total no. of      Correctly recognized      Success
    characters            characters           rate (%)
       165                   162                98.18
       288                   275                95.49
       356                   337                94.66

 Fig. 8 Success rate graph of experimental results (plotted series: total no. of
 characters, correctly recognized characters, success rate (%))
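
The success rates in Table 1 follow from dividing the correctly recognized
characters by the total number of characters. A short Java check is given below; the
class and method names are hypothetical, and rounding to two decimals is an
assumption chosen to match the table.

```java
/** Hypothetical helper that recomputes the success rates of Table 1. */
final class SuccessRate {

    /** Percentage of correctly recognized characters, rounded to two decimals. */
    static double percent(int correct, int total) {
        return Math.round(10000.0 * correct / total) / 100.0;
    }

    public static void main(String[] args) {
        int[][] rows = { {165, 162}, {288, 275}, {356, 337} };  // {total, correct}
        for (int[] row : rows)
            System.out.println(row[0] + " -> " + percent(row[1], row[0]) + " %");
    }
}
```

Applied to the three rows this reproduces 98.18, 95.49 and 94.66. Pooling the three
documents (774 of 809 characters) gives 95.67 %, a figure that is derived here and
not reported in the paper.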

5. Conclusion

In this paper, we proposed a recognition system that emphasizes the segmentation
phase. The proposed system is capable of successfully separating Bangla letters and
digits from a printed document. It recognizes the segmented characters using a
back-propagation neural network. The system sometimes fails to recognize composite
characters, so the segmentation process can be improved to deal with them. In
future, the proposed recognition system may be further improved using a
spell-checker.


References

[1]   M. E. Hoque, M. J. H. Siddiqi, S. M. Kamruzzaman and M. S. Chowdhury,
      “Efficient Method of Size Independent Printed Bangla Paragraph Recognition
      Using ANN and Efficient Heuristics”, Proceedings of International Conference
      on Computer and Information Technology (ICCIT), Dhaka, Bangladesh,
      pp. 755-758 (2003).
[2]   Rafael C. Gonzalez, Digital Image Processing, 2nd Edition, Pearson Education,
      New York, 2002.
[3]   S. M. M. Rahman, S. M. Rahman and M. A. Rashid, “Kohonen Neural Network in
      Character Recognition Applications”, Proceedings of NCCIS, pp. 106-110 (1997).
[4]   M. R. Bashar, M. A. F. M. R. Hasan and M. F. Khan, “Bangla Off-Line
      Handwritten Size Independent Character Recognition Using Artificial Neural
      Networks Based on Windowing Technique”, Proceedings of International
      Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh,
      pp. 351-354 (2003).
[5]   M. F. Zibran, A. Tanvir, R. Shammi and Md. Abdus Sattar, “Computer
      Representation of Bangla Characters and Sorting of Bangla Words”, Proceedings
      of International Conference on Computer and Information Technology (ICCIT),
      Dhaka, Bangladesh, pp. 191-195 (2002).
[6]   T. M. Ha and H. Bunke, “Off-line Handwritten Numeral Recognition by
      Perturbation Method”, IEEE Transactions on Pattern Analysis and Machine
      Intelligence, vol. 19, no. 5, pp. 535-539 (May 1997).
[7]   M. A. Sattar, K. Mahmud, H. Arafat and A. F. M. Noor-Uz-Zaman, “Segmenting
      Bangla Text for Optical Recognition”, Proceedings of International Conference
      on Computer and Information Technology (ICCIT), Dhaka, Bangladesh,
      pp. 283-286 (2003).

Md. Musfique Anwar completed his B.Sc (Engg.) in Computer Science and Engineering
from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. He is now a
Lecturer in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh.
His research interests include Artificial Intelligence, Neural Networks, Image
Processing, Pattern Recognition, Software Engineering and so on.

Nasrin Sultana Shume completed her B.Sc (Engg.) in Computer Science and Engineering
from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. She is now a
Lecturer in the Dept. of CSE, Green University of Bangladesh, Mirpur, Dhaka,
Bangladesh. Her research interests include Artificial Intelligence, Neural Networks,
Image Processing, Pattern Recognition, Database and so on.

P. K. M. Moniruzzaman received his B.Sc (Hons) in Electronics and Computer Science
and M.S. in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar
University, Bangladesh. He successfully completed his post-graduate project on Image
Processing under the supervision of Dr. Md. Al-Amin Bhuiyan. He is now working as a
Database Administrator in a renowned commercial bank in Dhaka, Bangladesh. His main
research interests include Natural Language Processing, Artificial Intelligence,
Data Mining and so on.

Md. Al-Amin Bhuiyan received his B.Sc (Hons) and M.Sc. in Applied Physics and
Electronics from the University of Dhaka, Dhaka, Bangladesh in 1987 and 1988,
respectively. He received the Dr. Eng. degree in Electrical Engineering from Osaka
City University, Japan, in 2001. He completed his postdoctoral research in
Intelligent Systems at the National Informatics Institute, Japan. He is now a
Professor in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh.
His main research interests include Image Face Recognition, Cognitive Science, Image
Processing, Computer Graphics, Pattern Recognition, Neural Networks, Human-machine
Interface, Artificial Intelligence, Robotics and so on.
