					                                                             (IJCSIS) International Journal of Computer Science and Information Security,
                                                             Vol. 8, No. 7, October 2010




      RACHSU Algorithm based Handwritten Tamil
                Script Recognition
                        C.Sureshkumar                                                              Dr.T.Ravichandran
           Department of Information Technology,                                     Department of Computer Science & Engineering,
           J.K.K.Nataraja College of Engineering,                                          Hindustan Institute of Technology,
                Namakkal, Tamilnadu, India.                                                  Coimbatore, Tamilnadu, India
            Email: ck_sureshkumar@yahoo.co.in                                             Email: dr.t.ravichandran@gmail.com

                                                                              describing the language of the classical period. There are
Abstract- Handwritten character recognition is a difficult problem            several other famous works in Tamil like Kambar Ramayana
due to the great variations of writing styles, different size and             and Silapathigaram but few supports in Tamil which speaks
orientation angle of the characters. The scanned image is segmented           about the greatness of the language. For example, Thirukural
into paragraphs using spatial space detection technique, paragraphs           is translated into other languages due to its richness in content.
into lines using vertical histogram, lines into words using horizontal
histogram, and words into character image glyphs using horizontal
                                                                              It is a collection of two sentence poems efficiently conveying
histogram. The extracted features considered for recognition are              things in a hidden language called Slaydai in Tamil. Tamil has
given to Support Vector Machine, Self Organizing Map, RCS, Fuzzy              12 vowels and 18 consonants. These are combined with each
Neural Network and Radial Basis Network, where the characters are             other to yield 216 composite characters and 1 special character
classified using a supervised learning algorithm. These classes are           (aayutha ezhuthu), giving a total of 12+18+216+1 = 247
mapped onto Unicode for recognition. Then the text is reconstructed           characters. Tamil vowels are called uyireluttu (uyir – life,
using Unicode fonts. This character recognition finds applications in         eluttu – letter). The vowels are classified into short (kuril) and
document analysis where the handwritten document can be converted             long (five of each type) and two diphthongs, /ai/ and /au/, and
to an editable printed document. Structure analysis suggested that the        three "shortened" (kuril) vowels. The long (nedil) vowels are
proposed system of RCS with back propagation network gives a
higher recognition rate.
                                                                              about twice as long as the short vowels. Tamil consonants are
                                                                              known as meyyeluttu (mey - body, eluttu - letters). The
Keywords - Support Vector, Fuzzy, RCS, Self organizing map,                   consonants are classified into three categories with six in each
Radial basis function, BPN                                                    category: vallinam - hard, mellinam - soft or Nasal, and
                                                                              itayinam - medium. Unlike most Indian languages, Tamil does
                         I. INTRODUCTION                                      not distinguish aspirated and unaspirated consonants. In
                                                                              addition, the voicing of plosives is governed by strict rules in
Handwritten Tamil character recognition refers to the process                 centamil. As is commonplace in languages of India, Tamil is
of conversion of handwritten Tamil character into Unicode                     characterised by its use of more than one type of coronal
Tamil character. Among different branches of handwritten                      consonants. The Unicode Standard is the Universal Character
character recognition it is easier to recognize English                       encoding scheme for written characters and text. The Tamil
alphabets and numerals than Tamil characters. Many                            Unicode range is U+0B80 to U+0BFF. The Unicode characters
researchers have also applied the excellent generalization                    in this range each occupy 2 bytes in UTF-16.
capabilities offered by ANNs to the recognition of characters.
Many studies have used Fourier descriptors and Back                                         II. TAMIL CHARACTER RECOGNITION
Propagation Networks for classification tasks. Fourier
descriptors were used to recognize handwritten numerals.                      The schematic block diagram of the handwritten Tamil Character
Neural Network approaches were used to classify tools. There                  Recognition system consists of various stages as shown in
have been only a few attempts in the past to address the                      Figure 1. They are Scanning phase, Preprocessing,
recognition of printed or handwritten Tamil Characters.                       Segmentation, Feature Extraction, Classification, Unicode
However, less attention has been given to Indian language                     mapping and recognition and output verification.
recognition. Some efforts have been reported in the literature
for Tamil scripts. In this work, we propose a                                 A. Scanning
recognition system for handwritten Tamil characters. Tamil is a               A properly printed document is chosen for scanning. It is placed
South Indian language spoken widely in Tamil Nadu in India.                   over the scanner. Scanner software is invoked, which scans the
Tamil has the longest unbroken literary tradition amongst the                 document. The document is sent to a program that saves it in
Dravidian languages. The Tamil script is derived from the Brahmi script.      preferably TIF, JPG or GIF format, so that the image of the
The earliest available text is the Tolkaappiyam, a work                       document can be obtained when needed.




                                                                         56                              http://sites.google.com/site/ijcsis/
                                                                                                         ISSN 1947-5500
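As a small illustration of the character inventory and the Unicode range quoted in the introduction, the following Python sketch checks whether a character lies in the Tamil block and re-derives the 247-character total. It is a minimal sketch, not the authors' code: the function name `is_tamil` and the constant names are assumptions introduced here for illustration.

```python
# Illustrative sketch only: Tamil Unicode block test and letter count.
TAMIL_BLOCK_START = 0x0B80  # first code point of the Tamil block
TAMIL_BLOCK_END = 0x0BFF    # last code point of the Tamil block


def is_tamil(ch: str) -> bool:
    """Return True if the single character ch lies in the Tamil block."""
    return TAMIL_BLOCK_START <= ord(ch) <= TAMIL_BLOCK_END


VOWELS = 12
CONSONANTS = 18
COMPOSITES = VOWELS * CONSONANTS  # every vowel combined with every consonant
SPECIAL = 1                       # the special character (aayutha ezhuthu)
TOTAL = VOWELS + CONSONANTS + COMPOSITES + SPECIAL

if __name__ == "__main__":
    letter_a = "\u0b85"  # TAMIL LETTER A
    print("total characters:", TOTAL)
    print("is Tamil:", is_tamil(letter_a))
    # Size of one Tamil code point when stored as UTF-16 (big endian, no BOM).
    print("UTF-16 bytes:", len(letter_a.encode("utf-16-be")))
```

A recognizer that maps class labels to code points could use a check like `is_tamil` to validate its output before reconstructing the text.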




B. Preprocessing                                                          strength of such a line varies with changes in language and
This is the first step in the processing of scanned image. The            script type. Scholkopf, Simard expand on this method,
scanned image is preprocessed for noise removal. The                      breaking the document image into a number of small blocks,
resultant image is checked for skewing. There are possibilities           and calculating the dominant direction of each such block by
of image getting skewed with either left or right orientation.            finding the Fourier spectrum maxima. These maximum values
Here the image is first brightened and binarized. The function            are then combined over all such blocks and a histogram
for skew detection checks for an angle of orientation between             formed. After smoothing, the maximum value of this
±15 degrees and if detected then a simple image rotation is               histogram is chosen as the approximate skew angle. The exact
carried out till the lines match with the true horizontal axis,           skew angle is then calculated by taking the average of all
which produces a skew corrected image.                                    values within a specified range of this approximate. There is
                                                                          some evidence that this technique is invariant to document
                       Scan the Document                                  layout and will still function even in the presence of images
                                                                          and other noise. The task of smoothing is to remove
                                                                          unnecessary noise present in the image. Spatial filters could be
                          Preprocessing                                   used. To reduce the effect of noise, the image is smoothed
                                                                          using a Gaussian filter. A Gaussian is an ideal filter in the
                                                                          sense that it reduces the magnitude of high spatial frequencies
                          Segmentation                                    in an image proportional to their frequencies. That is, it
                                                                          reduces magnitude of higher frequencies more. Thresholding
                                                                          is a nonlinear operation that converts a gray scale image into a
                                                                          binary image where the two levels are assigned to pixels that
                       Classification (RCS)
                                                                          are below or above the specified threshold value. The task of
                                                                          thresholding is to extract the foreground from the background.
                                                                          Global methods apply one threshold to the entire image while
                        Feature Extraction                                local thresholding methods apply different threshold values to
                                                                          different regions of the image. Skeletonization is the process
                                                                          of peeling off as many pixels of a pattern as possible without
                        Unicode Mapping                                   affecting the general shape of the pattern. In other words, after
                                                                          pixels have been peeled off, the pattern should still be
                                                                          recognizable. The skeleton hence obtained must be as thin as
                       Recognize the Script                               possible, connected and centered. When these are satisfied the
                                                                          algorithm must stop. A number of thinning algorithms have
                                                                          been proposed and are being used. Here Hilditch’s algorithm
Figure 1. Schematic block diagram of handwritten Tamil Character
Recognition system                                                        is used for skeletonization.
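The global thresholding step described under Preprocessing, and the histogram-based splitting into lines and glyphs mentioned in the abstract, can be sketched as follows. This is a minimal sketch under stated assumptions: the image is a grayscale grid held as nested Python lists, ink is darker than the background, and the threshold of 128 is arbitrary. The function names (`binarize`, `row_profile`, `column_profile`, `runs`, `segment`) are hypothetical and do not come from the paper.

```python
# Illustrative sketch only: global thresholding and projection-profile splitting.

def binarize(gray, threshold=128):
    """Global thresholding: pixels darker than threshold become ink (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def row_profile(binary):
    """Number of ink pixels in each row."""
    return [sum(row) for row in binary]


def column_profile(binary):
    """Number of ink pixels in each column."""
    return [sum(col) for col in zip(*binary)]


def runs(profile):
    """Return (start, end) pairs of consecutive non-empty bins; end is exclusive."""
    segments, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


def segment(gray, threshold=128):
    """Split an image into text lines, then each line into glyph boxes."""
    binary = binarize(gray, threshold)
    result = []
    for top, bottom in runs(row_profile(binary)):
        line = binary[top:bottom]
        boxes = [(top, bottom, left, right)
                 for left, right in runs(column_profile(line))]
        result.append(boxes)
    return result


# A 5 x 6 toy image: two ink blobs on the first text line, one on the second.
SAMPLE = [
    [255, 255, 255, 255, 255, 255],
    [255,  10,  20, 255,  30, 255],
    [255,  10, 255, 255,  30, 255],
    [255, 255, 255, 255, 255, 255],
    [255,  40,  40,  40, 255, 255],
]

if __name__ == "__main__":
    for line in segment(SAMPLE):
        print(line)
```

Each box is `(top, bottom, left, right)` with exclusive end indices. A real page would also need the skew correction and noise removal discussed in this section before the profiles become reliable.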

Knowing the skew of a document is necessary for many                      C. Segmentation
document analysis tasks. Calculating projection profiles, for             After preprocessing, the noise free image is passed to the
example, requires knowledge of the skew angle of the image                segmentation phase, where the image is decomposed [2] into
to a high precision in order to obtain an accurate result. In             individual characters. Figure 2 shows the image and various
practical situations, the exact skew angle of a document is               steps in segmentation.
rarely known, as scanning errors, different page layouts, or
even deliberate skewing of text can result in misalignment. In            D. Feature Extraction
order to correct this, it is necessary to accurately determine the        The phase after segmentation is feature extraction, where each
skew angle of a document image or of a specific region of the             individual image glyph is considered and its features are
image, and, for this purpose, a number of techniques have                 extracted. Each character glyph is defined by the following
been presented in the literature. Figure 1 shows the histograms           attributes: (1) Height of the character. (2) Width of the
for skewed and skew corrected images and original character.              character. (3) Number of short and long horizontal lines
Postal found that the maximum valued position in the Fourier              present. (4) Number of short and long vertical lines present. (5)
spectrum of a document image corresponds to the angle of                  Number of circles present. (6) Number of horizontally
skew. However, this finding was limited to those documents                oriented arcs. (7) Number of vertically oriented arcs. (8)
that contained only a single line spacing, thus the peak was              Centroid of the image. (9) Position of the various features.
strongly localized around a single point. When variant line               (10) Pixels in the various regions.
spacings are introduced, a series of Fourier spectrum maxima
are created in a line that extends from the origin. Also evident                       III. NEURAL NETWORK APPROACHES
is a subdominant line that lies at 90 degrees to the dominant             The architecture chosen for classification is Support Vector
line. This is due to character and word spacings and the                  Machines, which in turn involves training and testing the use







of Support Vector Machine (SVM) classifiers [1]. SVMs have                   weight vector of the same dimension as the input data vectors
achieved excellent recognition results in various pattern                    and a position in the map space. The usual arrangement of
recognition applications. Also in handwritten character                      nodes is a regular spacing in a hexagonal or rectangular grid.
recognition they have been shown to be comparable or even                    The self organizing map describes a mapping from a higher
superior to the standard techniques like Bayesian classifiers or             dimensional input space to a lower dimensional map space.
multilayer perceptrons. SVMs are discriminative classifiers
based on Vapnik's structural risk minimization principle.                    C. Algorithm for Kohonen's SOM
Support Vector Machine (SVM) is a classifier which performs                  (1) Assume output nodes are connected in an array. (2) Assume
classification tasks by constructing hyperplanes in a                        that the network is fully connected: all nodes in the input layer are
multidimensional space.                                                      connected to all nodes in the output layer. (3) Use the competitive
                                                                             learning algorithm.
A. Classification SVM Type-1                                                 Randomly choose an input vector x. Determine the "winning"
For this type of SVM, training involves the minimization of                  output node i, where $w_i$ is the weight vector connecting the
the error function:                                                          inputs to output node i, such that
$\frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i$ (1)                             $\| w_i - x \| \le \| w_k - x \| \;\; \forall k$ (5)
subject to the constraints:                                                  and update the weights according to
$y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i$ and                                  $w_k(\mathrm{new}) = w_k(\mathrm{old}) + \mu \chi(i, k)(x - w_k)$ (6)
$\xi_i \ge 0,\ i = 1, \ldots, N$ (2)                                         A new neural classification algorithm and Radial-Basis-
where C is the capacity constant, w is the vector of                         Function Networks are known to be capable of universal
coefficients, b is a constant and $\xi_i$ are parameters for handling        approximation and the output of an RBF network can be related
non-separable data (inputs). The index i labels the N training               to Bayesian properties. One of the most interesting properties
cases [6, 9]. Note that $y_i \in \{\pm 1\}$ represents the class labels      of RBF networks is that they provide intrinsically a very
and $x_i$ represents the independent variables. The kernel φ is              reliable rejection of "completely unknown" patterns at
used to transform data from the input (independent) to the                   variance from MLP. Furthermore, as the synaptic vectors of
feature space. It should be noted that the larger the C, the                 the input layer store locations in the problem space, it is
more the error is penalized.                                                 possible to provide incremental training by creating a new
                                                                             hidden unit whose input synaptic weight vector will store the
                                                                             new training pattern. The specifics of RBF are firstly that a
B. Classification SVM Type-2                                                 search tree is associated to a hierarchy of hidden units in order
In contrast to Classification SVM Type 1, the Classification                 to increase the evaluation speed and secondly we developed
SVM Type 2 model minimizes the error function:                               several constructive algorithms for building the network and
$\frac{1}{2} w^T w - \nu\rho + \frac{1}{N} \sum_{i=1}^{N} \xi_i$ (3)         tree.
subject to the constraints:
$y_i (w^T \phi(x_i) + b) \ge \rho - \xi_i$ and                               D. RBF Character Recognition
$\xi_i \ge 0,\ i = 1, \ldots, N;\ \rho \ge 0$ (4)                            In our handwritten recognition system the input signal is the
                                                                             pen tip position and 1-bit quantized pressure on the writing
                                                                             surface. Segmentation is performed by building a string of
                                                                             "candidate characters" from the acquired string of strokes [16].
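To make the Type-1 error function (1) and its constraints (2) concrete, the sketch below evaluates the objective for a given weight vector. It is a minimal sketch under two assumptions not taken from the paper: the feature map φ is the identity (a linear SVM), and each slack variable takes its smallest value allowed by (2), that is, max(0, 1 − y_i(w·x_i + b)). The function names and the toy data are hypothetical.

```python
# Illustrative sketch only: evaluating the C-SVM (Type-1) objective.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def slack(w, b, x, y):
    """Smallest slack satisfying y * (w.x + b) >= 1 - slack and slack >= 0."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


def c_svm_objective(w, b, xs, ys, C):
    """Return 0.5 * w.w + C * (sum of slacks), with phi(x) = x assumed."""
    total_slack = sum(slack(w, b, x, y) for x, y in zip(xs, ys))
    return 0.5 * dot(w, w) + C * total_slack


# Toy data: two points outside the margin, one inside it.
XS = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
YS = [1, -1, 1]

if __name__ == "__main__":
    for C in (1.0, 10.0):
        print(C, c_svm_objective([1.0, 0.0], 0.0, XS, YS, C))
```

Raising `C` leaves the margin term unchanged but scales the slack term, which is the sense in which a larger C penalizes errors more heavily.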
                                                                             For each stroke of the original data we determine if this stroke
A self organizing map (SOM) is a type of artificial neural                   does belong to an existing candidate character regarding
network that is trained using unsupervised learning to produce               several criteria such as: overlap, distance and diacriticity.
a low dimensional (typically two dimensional), discretized                   Finally the regularity of the character spacing can also be used
representation of the input space of the training samples,                   in a second pass. In case of text recognition, we found that
called a map. Self organizing maps are different from other                  punctuation needs a dedicated processing due to the fact that
artificial neural networks in the sense that they use a                      the shape of a punctuation mark is usually much less
neighborhood function to preserve the topological properties                 important than its position. It may be decided that the
of the input space.                                                          segmentation was wrong and that back tracking on the
          This makes SOM useful for visualizing low                          segmentation with changed decision thresholds is needed.
dimensional views of high dimensional data, akin to                          Here, we tested two encoding and two classification methods. As
multidimensional scaling. SOMs operate in two modes:                         the aim of the writer is the written shape and not the writing
training and mapping. Training builds the map using input                    gesture it is very natural to build an image of what was written
examples. It is a competitive process, also called vector                    and use this image as the input of a classifier.
quantization [7]. Mapping automatically classifies a new input                         Both the neural networks and fuzzy systems have
vector.                                                                      some things in common. They can be used for solving a
          The self organizing map consists of components                     problem (e.g. pattern recognition, regression or density
called nodes or neurons. Associated with each node is a                      estimation) if there does not exist any mathematical model of







the given problem. Each alone has certain advantages and                  The Fourier coefficients a(n), b(n) and the invariant
disadvantages, and the disadvantages almost completely disappear          descriptors s(n), n = 1, 2, ..., (L-1) were derived for all of the
when both concepts are combined. Neural networks can only come into       character specimens [5].
play if the problem is expressed by a sufficient amount of
observed examples [12]. These observations are used to train              G. RACHSU Algorithm
the black box. On the one hand no prior knowledge about the               The major steps of the algorithm are as follows:
problem needs to be given. However, it is not straightforward             1. Initialize all Wij to small random values, with Wij being
to extract comprehensible rules from the neural network's                 the value of the connection weight between unit j and unit i in
structure. On the contrary, a fuzzy system demands linguistic             the layer below.
rules instead of learning examples as prior knowledge.                    2. Present the 16-dimensional input vector y0; the input vector
Furthermore the input and output variables have to be                     consists of eight Fourier descriptors and eight border transition
described linguistically. If the knowledge is incomplete,                 values. Specify the desired outputs. If the net is used as a
wrong or contradictory, then the fuzzy system must be tuned.              classifier then all desired outputs are typically set to zero
Since there is not any formal approach for it, the tuning is              except for that corresponding to the class the input is from.
performed in a heuristic way. This is usually very time                   3. Calculate the outputs yj of all the nodes using the present
consuming and error prone.                                                value of W, where Wij is the value of connection weight
                                                                          between unit j and unit i in the layer below:
E. Hybrid Fuzzy Neural Network                                            $y_j = \frac{1}{1 + \exp\left(-\sum_i y_i W_{ij}\right)}$ (11)
Hybrid Neuro fuzzy systems are homogeneous and usually                    This particular nonlinear function is called a sigmoid function.
resemble neural networks. Here, the fuzzy system is                       4. Adjust weights by:
interpreted as a special kind of neural network. The advantage            $W_{ij}(n+1) = W_{ij}(n) + \alpha \delta_j y_i + \xi \left( W_{ij}(n) - W_{ij}(n-1) \right)$,
of such hybrid NFS is its architecture since both fuzzy system            where $0 < \xi < 1$ (12)
and neural network do not have to communicate any more
with each other. They are one fully fused entity [14]. These
systems can learn online and offline. The rule base of a fuzzy
system is interpreted as a neural network. Thus the
optimization of these functions in terms of generalizing the
data is very important for fuzzy systems. Neural networks can             where (n+1), (n) and (n-1) index next, present and previous,
be used to solve this problem.                                            respectively. The parameter α is a learning rate similar to step
                                                                          size in gradient search algorithms, and ξ is a constant
F. RACHSU Script Recognition                                              between 0 and 1 which determines the effect of past weight
Once a boundary image is obtained then Fourier descriptors                changes on the current direction of movement in weight space.
are found. This involves finding the discrete Fourier                     δj is an error term for node j. If node j is an output node, and
coefficients a[k] and b[k] for 0 ≤ k ≤ L-1, where L                       dj and yj stand for, respectively, the desired and actual value
is the total number of boundary points found, by applying                 of the node, then
equations (7) and (8)                                                     $\delta_j = (d_j - y_j) y_j (1 - y_j)$ (13)
$a[k] = \frac{1}{L} \sum_{m=1}^{L} x[m] e^{-jk(2\pi/L)m}$ (7)             If node j is an internal hidden node, then:
$b[k] = \frac{1}{L} \sum_{m=1}^{L} y[m] e^{-jk(2\pi/L)m}$ (8)             $\delta_j = y_j (1 - y_j) \sum_k \delta_k W_{jk}$ (14)
                                                                          where k is over all nodes in the layer above node j.
                                                                          5. Present another input and go back to step (2). All the
                                                                          training inputs are presented cyclically until weights stabilize
   where x[m] and y[m] are the x and y co-ordinates                       (converge).
respectively of the mth boundary point. In order to derive a set
of Fourier descriptors that have the invariant property with              H. Structure Analysis of RCS
respect to rotation and shift, the following operations are               The recognition performance of the RCS will highly depend
defined [3,4]. For each n compute a set of invariant descriptors          on the structure of the network and training algorithm. In the
r (n).                                                                    proposed system, RCS has been selected to train the network
$r(n) = \left[ |a(n)|^2 + |b(n)|^2 \right]^{1/2}$ (9)                     [8]. It has been shown that the algorithm has a much better
It is easy to show that r (n) is invariant to rotation or shift. A        learning rate. Table 1 shows the comparison of the various
further refinement in the derivation of the descriptors is                classification approaches. The number of nodes in the input, hidden
realized if dependence of r (n) on the size of the character is           and output layers will determine the network structure.
eliminated by computing a new set of descriptors s (n) as
$s(n) = r(n) / r(1)$ (10)

TABLE 1 COMPARISON OF CLASSIFIERS

Type of classifier      Error      Efficiency
SVM                     0.001      91%
SOM                     0.02       88%
FNN                     0.06       90%
RBF                     0.04       88%
RCS                     0          97%

97%. Understandably, the training set produced much higher
recognition rate than the test set. Structure analysis suggested
that RCS with 5 hidden nodes has lower number of epochs as
well as higher recognition rate.
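Steps 1 to 5 of the RACHSU algorithm, with the sigmoid output (11), the momentum update (12) and the error terms (13) and (14), can be sketched as a small training loop. This is a minimal sketch, not the authors' implementation: it assumes one hidden layer, no bias terms (none appear in the equations), 16 inputs, 5 hidden nodes and 2 output classes, and the values chosen for α, ξ and the weight range are arbitrary. The class name `TinyBackpropNet` and the toy input are hypothetical.

```python
# Illustrative sketch only: back-propagation with a momentum term.
import math
import random


def sigmoid(a):
    """Sigmoid nonlinearity used in equation (11)."""
    return 1.0 / (1.0 + math.exp(-a))


class TinyBackpropNet:
    def __init__(self, n_in, n_hidden, n_out, alpha=0.5, xi=0.5, seed=0):
        rng = random.Random(seed)
        # Step 1: small random weights; w[i][j] links unit i below to unit j above.
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)]
                   for _ in range(n_hidden)]
        # Previous weight changes, W(n) - W(n-1), needed by the momentum term.
        self.prev1 = [[0.0] * n_hidden for _ in range(n_in)]
        self.prev2 = [[0.0] * n_out for _ in range(n_hidden)]
        self.alpha, self.xi = alpha, xi

    def forward(self, x):
        """Step 3: outputs of the hidden and output layers, equation (11)."""
        h = [sigmoid(sum(x[i] * self.w1[i][j] for i in range(len(x))))
             for j in range(len(self.w1[0]))]
        y = [sigmoid(sum(h[j] * self.w2[j][k] for j in range(len(h))))
             for k in range(len(self.w2[0]))]
        return h, y

    def _update(self, w, prev, below, delta):
        """Step 4: equation (12), learning-rate term plus momentum term."""
        for i in range(len(below)):
            for j in range(len(delta)):
                change = self.alpha * delta[j] * below[i] + self.xi * prev[i][j]
                w[i][j] += change
                prev[i][j] = change

    def train_step(self, x, d):
        """Present one pattern; return its squared error before the update."""
        h, y = self.forward(x)
        # Equation (13): error terms of the output nodes.
        d_out = [(d[k] - y[k]) * y[k] * (1 - y[k]) for k in range(len(y))]
        # Equation (14): error terms of the hidden nodes (pre-update weights).
        d_hid = [h[j] * (1 - h[j]) *
                 sum(d_out[k] * self.w2[j][k] for k in range(len(y)))
                 for j in range(len(h))]
        self._update(self.w2, self.prev2, h, d_out)
        self._update(self.w1, self.prev1, x, d_hid)
        return sum((d[k] - y[k]) ** 2 for k in range(len(y)))


if __name__ == "__main__":
    net = TinyBackpropNet(n_in=16, n_hidden=5, n_out=2)
    x = [0.1 * (i % 5) for i in range(16)]  # stand-in for a 16-value feature vector
    d = [1.0, 0.0]                          # desired output: class 0
    for step in range(201):                 # Step 5: present inputs repeatedly
        err = net.train_step(x, d)
        if step % 50 == 0:
            print(step, err)
```

In a full system the input vector would hold the eight Fourier descriptors and eight border transition values named in step 2, and training would cycle over all patterns rather than a single one.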
[Figure: bar chart of ERROR and EFFICIENCY for the SVM, SOM, FNN, RBF and RCS classifiers; efficiency values 91, 88, 90, 88 and 97 on a 0 to 100 scale]

                         IV. CONCLUSION

Character Recognition is aimed at recognizing handwritten
Tamil document. The input document is read, preprocessed,
feature extracted and recognized and the recognized text is
displayed in a picture box. The Tamil Character Recognition
is implemented using a Java Neural Network. A complete tool
bar is also provided for training, recognizing and editing
options. Tamil is an ancient language. Maintaining and getting
the contents from and to the books is very difficult. In a way
Character Recognition provides a paperless environment.
Character Recognition provides knowledge exchange by
easier means. If a knowledge base of rich Tamil contents is
                                                                                          ed,
                                                                                     create it can be a  accessed by pe              ing
                                                                                                                       eople of varyi categories
                                                                                          ease and comfo
                                                                                     with e             ort.

          Figure 2 Character Recognitio Efficiency and E
                                      on               Error report                                              ACKNOWLEDG
                                                                                                                          GEMENT

  Number of Hidd Layer Node
I.N              den              es                                                      esearchers wou like to than S. Yasodha and Avantika
                                                                                     The re               uld       nk
    e
The number of hidden node will heavil influence t
                                 es              ly             the                                                             nd
                                                                                     for his assistance in the data collection an manuscript
nettwork perform mance. Insuffic cient hidden n  nodes will cauuse                        ration of this ar
                                                                                     prepar               rticle.
   der           ere             k
und fitting whe the network cannot recog         gnize the numeeral
beccause there are not enough ad djustable param meter to model or                                                     REFERENC
                                                                                                                             NCES
                  ut                             ure
to map the inpu output relationship. Figu 2 shows t             the
chaaracter recogn nition efficien ncy and err    ror report. TThe                    [1]     B.
                                                                                             B Heisele, P. Ho, and T. Poggio, “C  Character Recogni  ition with Support
                                                                                             Vector Machines: Global Versus Component Base Approach,” in
                                                                                             V                                                       ed
min                r             ken             ze
   nimum number of epochs tak to recogniz a character a        and                           ICCV, 2006, vol. 0 no. 1. pp. 688–
                                                                                             I                 02,                –694.
recognition effici                ng              test
                  iency of trainin as well as t character set                       [2]Julie DDelon, Agnès Des  solneux, “A Nonp                     ach
                                                                                                                                  parametric Approa for Histogram
                 of              des
as the number o hidden nod is varied. In the propos            sed                           S
                                                                                             Segmentation,” IE EEE Trans. On im                      vol.
                                                                                                                                 mage processing., v 16, no. 1, pp.
  stem the trainin set recognit
sys               ng                             hieved and in t
                                  tion rate is ach              the                          235-241. 2007
                                                                                             2
                                                                                     [3]     B.
                                                                                             B Sachine, P. M                      a,
                                                                                                                Manoj, M.Ramya “Character Se         egmentations,” in
    t            gnized speed fo each charac is 0.1sec a
test set the recog                or             cter          and                           Advances in Neural Inf. Proc. Systems, vol. 10. M Press, 2005,
                                                                                             A                                                       MIT
  curacy is 97% The trainin set produce much high
acc              %.              ng               ed            her                          v                 610–616.
                                                                                             vol.01, no 02 pp. 6
recognition rate t               et.
                  than the test se Structure an  nalysis suggestted                  [4]     O Chapelle, P. Haffner, and V. Va
                                                                                             O.                                   apnik, “SVMs for Histogram-based
  at              en
tha RCS is give higher reco      ognition rate.HHence Unicode is                             I
                                                                                             Image Classificatioon,” IEEE Transactions on Neural N    Networks, special
                                                                                             i
                                                                                             issue on Support VVectors, vol 05, no 01, pp. 245-252, 2 2007.
choosen as the en                me
                  ncoding schem for the cu       urrent work. T
                                                              The                    [5] Sim                    rco                ial                rks
                                                                                            mone Marinai, Mar Gori, “Artifici Neural Networ for Document
  anned image is passed throug various blo
sca               s               gh                           ons
                                                 ocks of functio                             A
                                                                                             Analysis and      Recognition “IEEE Transactions on pattern analysis
                                                                                                               R                   E                 n
   d
and finally comp                  e
                  pared with the recognition details from t     the                          a machine intell
                                                                                             and                                  o.1,
                                                                                                                ligence, vol.27, no Jan 2005, pp. 6  652-659.
maapping table from which corresponding unicodes ag             are                  [6]     M.
                                                                                             M Anu, N. Viji, and M. Suresh, “Segmentatio Using Neuralon
                                                                                             Network,” IEEE T
                                                                                             N                 Trans. Patt. Anal. MMach. Intell., vol. 23, pp. 349–361,
  cessed and prin
acc               nted using stanndard Unicode fonts so that t  the                          2006.
                                                                                             2
Cha aracter Recogn               ved.
                  nition is achiev                                                   [7]     B.
                                                                                             B Scholkopf, P. S Simard, A. Smola, and V. Vapnik, “    “Prior Knowledge
                                                                                             i Support Vector Kernels,” in Adva
                                                                                             in                                                      nf.
                                                                                                                                   ances in Neural In Proc. Systems,
                         III. EXPERIMEN
                                      NTAL RESULTS                                           vol.              s,
                                                                                             v 10. MIT Press 2007, pp. 640–64      46.
                                                                                     [8]     Olivier Chapelle, Patrick Haffner, “
                                                                                             O                                     “SOM for Histog  gram-based Image
                                                                                             Classification,” IE
                                                                                             C                                     on                 rks,
                                                                                                               EEE Transactions o Neural Networ 2005. Vol 14
    e
The invariant Fo  ourier descript               s
                                 tors feature is independent of                              no
                                                                                             n 02, pp. 214-230 0.
pos               nd
   sition, size, an orientation. With the com  mbination of RCCS                     [9]     S Belongie, C. Fo
                                                                                             S.                owlkes, F. Chung, and J. Malik, “Spe   ectral Partitioning
   d
and back propag   gation network a high accu
                                k,              uracy recogniti
                                                              ion                            w Indefinite Ke
                                                                                             with              ernels Using the N Nystrom Extention in ECCV, part
                                                                                                                                                     n,”
                                                                                             III,
                                                                                             I Copenhagen, D                       ol
                                                                                                               Denmark, 2006, vo 12 no 03, pp. 12    23-132
  stem is realize The trainin set consist of the writi
sys               ed.            ng             ts            ing                    [10] T. Evgeniou, M. P     Pontil, and T. Pog                    ion
                                                                                                                                  ggio, “Regularizati Networks and
  mples of 25 us
sam                               t            m
                  sers selected at random from the 40, and t  the                            S
                                                                                             Support Vector M                     ces
                                                                                                              Machines,” Advanc in Computatio        onal Mathematics,
    t             emaining 15 users. A portion of the traini
test set, of the re             u                             ing                            vol.
                                                                                             v 13, pp. 1–11, 2  2005.
   ta             sed
dat was also us to test the system. In th training set, a
                                                he                                   [11]P.BBartlettand, J.Shaw Taylor, “Gene
                                                                                                               we                  eralization performmance            of
                                                                                                                                                                        f
                                                                                             s
                                                                                             support vector ma achines and other p                    ,”
                                                                                                                                   pattern classifiers, in Advances in
                   of            a              in
recognition rate o 100% was achieved and i the test set t     the
                  d
recognized speed for each char                  c            y
                                 racter is 0.1sec and accuracy is







       Kernel Methods Support Vector Learning. 2008, MIT Press
       Cambridge, USA, 2002, vol. 11, no. 02, pp. 245-252.
[12] E. Osuna, R. Freund, and F. Girosi, “Training Support Vector Machines: an
       application to face detection,” in IEEE CVPR’07, Puerto Rico, vol. 05,
       no. 01, pp. 354-360, 2007.
[13] V. Johari and M. Razavi, “Fuzzy Recognition of Persian Handwritten
       Digits,” in Proc. 1st Iranian Conf. on Machine Vision and Image
       Processing, Birjand, vol. 05, no. 03, 2006, pp. 144-151.
[14] P. K. Simpson, “Fuzzy Min-Max Neural Networks - Part 1: Classification,”
       IEEE Trans. Neural Network., vol. 3, no. 5, pp. 776-786, 2002.
[15] H. R. Boveiri, “Scanned Persian Printed Text Characters Recognition
       Using Fuzzy-Neural Networks,” IEEE Transaction on Image
       Processing, vol. 14, no. 06, pp. 541-552, 2009.
[16] D. Deng, K. P. Chan, and Y. Yu, “Handwritten Chinese character
       recognition using spatial Gabor filters and self-organizing feature
       maps,” Proc. IEEE Inter. Confer. on Image Processing, vol. 3, pp.
       940-944, 2004.
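
As a supplementary illustration of the claim in the experimental results that the Fourier descriptor feature is independent of position, size, and orientation, the following minimal Python sketch shows one common way such invariance can be obtained. It is not the authors' implementation. The names `fourier_descriptors`, `transform` and `count`, the sample contour, and the normalisation used (skip the zero-frequency coefficient, keep magnitudes only, divide by the magnitude of the first coefficient) are all assumptions made for illustration, and the glyph boundary is assumed to be a closed contour sampled as (x, y) points.

```python
import cmath
import math


def fourier_descriptors(points, count=4):
    """Return `count` descriptors of a closed contour given as (x, y) points.

    Illustrative normalisation (an assumption, not taken from the paper):
    - the k = 0 coefficient is skipped, which removes position;
    - only magnitudes are kept, which removes orientation;
    - magnitudes are divided by |F(1)|, which removes size.
    """
    z = [complex(x, y) for x, y in points]
    n = len(z)

    def coefficient(k):
        # Plain discrete Fourier transform of the complex boundary signal.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n

    scale = abs(coefficient(1))
    if scale == 0:
        raise ValueError("degenerate contour: first Fourier coefficient is zero")
    return [abs(coefficient(k)) / scale for k in range(2, 2 + count)]


def transform(points, angle, factor, dx, dy):
    """Rotate by `angle` radians, scale by `factor`, then shift by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(factor * (c * x - s * y) + dx, factor * (s * x + c * y) + dy)
            for x, y in points]


# A made-up closed contour sampled at 64 points (not data from the paper).
contour = [((3 + math.cos(3 * t)) * math.cos(t), (2 + math.sin(2 * t)) * math.sin(t))
           for t in (2 * math.pi * i / 64 for i in range(64))]

original = fourier_descriptors(contour)
moved = fourier_descriptors(transform(contour, 0.7, 2.5, 10.0, -4.0))
```

Under these assumptions `original` and `moved` agree to within floating-point error, so a classifier fed these values would receive the same input for a glyph wherever it is written, at whatever size and angle. Keeping magnitudes only also discards phase information, which is a trade-off of this particular normalisation.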

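The mapping-table step described under Number of Hidden Layer Nodes, in which a recognized glyph is looked up and its Unicode value is printed, might be sketched as follows. The table contents, the integer class labels, and the names `MAPPING_TABLE` and `labels_to_text` are hypothetical, since the paper does not list its table. Only the Tamil code points themselves are taken from the Unicode standard.

```python
import unicodedata

# Hypothetical mapping table: classifier output label -> Tamil code point.
# The labels 0..4 are invented for illustration; the code points are the
# standard Unicode values for these Tamil letters.
MAPPING_TABLE = {
    0: 0x0B85,  # TAMIL LETTER A
    1: 0x0B86,  # TAMIL LETTER AA
    2: 0x0B87,  # TAMIL LETTER I
    3: 0x0B95,  # TAMIL LETTER KA
    4: 0x0BAE,  # TAMIL LETTER MA
}


def labels_to_text(labels):
    """Translate a sequence of class labels into a Unicode string."""
    return "".join(chr(MAPPING_TABLE[label]) for label in labels)


# Two recognized glyphs, given here as made-up classifier outputs.
text = labels_to_text([0, 3])
names = [unicodedata.name(ch) for ch in text]
```

Because the output is ordinary Unicode text, any standard Unicode font with Tamil coverage can display it, which matches the paper's stated reason for choosing Unicode as the encoding scheme.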


                           AUTHORS PROFILE

C. Sureshkumar received the M.E. degree in Computer Science and
Engineering from K.S.R College of Technology, Thiruchengode, Tamilnadu,
India in 2006. He is pursuing a Ph.D. degree at Anna University, Coimbatore,
and is about to submit his thesis on handwritten Tamil character recognition
using neural networks. He is currently working as HOD and Professor in the
Department of Information Technology at JKKN College of Engineering and
Technology, Tamil Nadu, India. His current research interests include
document analysis, optical character recognition, pattern recognition and
network security. He is a life member of ISTE.

Dr. T. Ravichandran received a Ph.D. in Computer Science and Engineering in
2007 from the University of Periyar, Tamilnadu, India. He is working as
Principal at Hindustan Institute of Technology, Coimbatore, Tamilnadu, India,
and specialises in the field of Computer Science. He has published many papers
on computer vision applied to automation, motion analysis, image matching,
image classification and view-based object recognition, as well as
management-oriented empirical and conceptual papers, in leading journals and
magazines. His present research focuses on statistical learning and its
application to computer vision, image understanding and problem recognition.



