# On-Line Handwriting Recognition
• Transducer device (digitizer)
• Input: sequence of point coordinates with pen-down/up signals from the digitizer
• Stroke: sequence of points from pen-down to pen-up signals
• Word: sequence of one or more strokes.
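As a minimal sketch of the input representation described above (the function and event names are my own, not from the original system), the digitizer's event stream can be cut into strokes at the pen-down/pen-up signals:

```python
# Sketch: a digitizer emits points plus pen-down/pen-up signals;
# a stroke is the run of points from pen-down to pen-up.

def split_into_strokes(events):
    """events: list of ('down'|'move'|'up', x, y) tuples from the digitizer.
    Returns a list of strokes, each a list of (x, y) points."""
    strokes, current = [], None
    for kind, x, y in events:
        if kind == 'down':
            current = [(x, y)]
        elif kind == 'move' and current is not None:
            current.append((x, y))
        elif kind == 'up' and current is not None:
            current.append((x, y))
            strokes.append(current)
            current = None
    return strokes

# A word is then simply a sequence of one or more strokes.
events = [('down', 0, 0), ('move', 1, 1), ('up', 2, 1),
          ('down', 3, 0), ('up', 3, 2)]
word = split_into_strokes(events)   # two strokes
```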

March 15-17, 2002   Work with student Jong Oh   Davi Geiger, Courant Institute, NYU
## System Overview

(Block diagram: Input → Pre-processing (high-curvature points) → Segmentation → Recognition Engine → Word Candidates; the Dictionary, the Character Recognizer, and the Context Models all feed the Recognition Engine.)
## Segmentation Hypotheses

• High-curvature points along the pen trajectory serve as candidate segmentation points.
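As a hedged sketch of one common way to find such high-curvature candidates (not necessarily the authors' exact method), one can threshold the turning angle at each interior point of a stroke:

```python
import math

# Sketch: mark interior points of a stroke whose turning angle
# exceeds a threshold as candidate segmentation points.

def high_curvature_points(stroke, angle_thresh_deg=60.0):
    """stroke: list of (x, y) points; returns indices of interior
    points whose turning angle exceeds the threshold."""
    hits = []
    for i in range(1, len(stroke) - 1):
        (x0, y0), (x1, y1), (x2, y2) = stroke[i - 1], stroke[i], stroke[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # wrap the angle difference into (-pi, pi] before taking its magnitude
        turn = abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi)
        if math.degrees(turn) > angle_thresh_deg:
            hits.append(i)
    return hits

corner = [(0, 0), (1, 1), (2, 0)]   # sharp (~90 degree) turn at index 1
```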
## Character Recognition I

• Fisher Discriminant Analysis (FDA): improves over PCA (Principal Component Analysis).
• A linear projection $p = W^T x$ maps each data vector from the original space to the projection space.
• Training set: 1040 lowercase letters; test set: 520 lowercase letters.
• Test results: 91.5% correct.
## Fisher Discriminant Analysis

• Between-class scatter matrix:

$$\Phi_B = \sum_{i=1}^{C} N_i\,(\mu_i - \mu)(\mu_i - \mu)^T$$

  – C: number of classes
  – $N_i$: number of data vectors in class i
  – $\mu_i$: mean vector of class i; $\mu$: overall mean vector

• Within-class scatter matrix:

$$\Phi_W = \sum_{i=1}^{C} \sum_{j=1}^{N_i} (v_{ij} - \mu_i)(v_{ij} - \mu_i)^T$$

  – $v_{ij}$: j-th data vector of class i.
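A minimal NumPy sketch of the two scatter matrices defined above, assuming the data for each class is given as an (N_i, n) array of row vectors:

```python
import numpy as np

# Sketch: compute the between-class scatter Phi_B and the
# within-class scatter Phi_W from per-class data arrays.

def scatter_matrices(classes):
    """classes: list of (N_i, n) arrays of row vectors. Returns (Phi_B, Phi_W)."""
    mu = np.mean(np.vstack(classes), axis=0)      # overall mean vector
    n = mu.size
    Phi_B = np.zeros((n, n))
    Phi_W = np.zeros((n, n))
    for X in classes:
        mu_i = X.mean(axis=0)                     # class mean
        d = (mu_i - mu)[:, None]
        Phi_B += len(X) * (d @ d.T)               # N_i (mu_i - mu)(mu_i - mu)^T
        centered = X - mu_i
        Phi_W += centered.T @ centered            # sum_j (v_ij - mu_i)(v_ij - mu_i)^T
    return Phi_B, Phi_W
```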
Given a projection matrix W (of size n by m) and its linear transformation $p = W^T v$, the between-class scatter in the projection space is

$$
\begin{aligned}
\Psi_B &= \sum_{i=1}^{C} N_i\,(\mu_i' - \mu')(\mu_i' - \mu')^T \\
&= \sum_{i=1}^{C} N_i\,(W^T\mu_i - W^T\mu)(W^T\mu_i - W^T\mu)^T \\
&= \sum_{i=1}^{C} N_i\,(W^T\mu_i - W^T\mu)(\mu_i^T W - \mu^T W) \\
&= \sum_{i=1}^{C} W^T N_i\,(\mu_i - \mu)(\mu_i^T - \mu^T)\,W \\
&= W^T \left[\sum_{i=1}^{C} N_i\,(\mu_i - \mu)(\mu_i - \mu)^T\right] W
 = W^T \Phi_B W.
\end{aligned}
$$

Similarly, $\Psi_W = W^T \Phi_W W$.
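The identity derived above is easy to confirm numerically; the sketch below builds toy class data, projects it with an arbitrary W, and checks that the projected between-class scatter equals $W^T \Phi_B W$:

```python
import numpy as np

# Numeric check of the identity Psi_B = W^T Phi_B W on toy data.
rng = np.random.default_rng(0)
classes = [rng.normal(i, 1.0, size=(20, 4)) for i in range(3)]   # 3 toy classes
W = rng.normal(size=(4, 2))                                      # arbitrary projection

# Between-class scatter in the original space.
mu = np.mean(np.vstack(classes), axis=0)
Phi_B = sum(len(X) * np.outer(X.mean(0) - mu, X.mean(0) - mu) for X in classes)

# Project (p = W^T v, applied row-wise) and recompute the scatter there.
proj = [X @ W for X in classes]
mu_p = np.mean(np.vstack(proj), axis=0)
Psi_B = sum(len(P) * np.outer(P.mean(0) - mu_p, P.mean(0) - mu_p) for P in proj)

assert np.allclose(Psi_B, W.T @ Phi_B @ W)   # matches the derivation
```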
## Fisher Discriminant Analysis (cont.)

• Optimization formulation of the Fisher projection solution ($\Psi_B$, $\Psi_W$ are the scatter matrices in the projection space):

$$W_{\mathrm{opt}} = \arg\max_W \frac{\left|\Psi_B\right|}{\left|\Psi_W\right|} = \arg\max_W \frac{\left|W^T \Phi_B W\right|}{\left|W^T \Phi_W W\right|}$$
## FDA (continued)

• Construction of the Fisher projection matrix:
  – Compute the n eigenvalues and eigenvectors of the generalized eigenvalue problem $\Phi_B\,y = \lambda\,\Phi_W\,y$.
  – Retain the m eigenvectors having the largest eigenvalues; they form the columns of the target projection matrix W.
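A sketch of this construction in plain NumPy: the generalized problem $\Phi_B y = \lambda \Phi_W y$ is solved here via the eigenvectors of $\Phi_W^{-1}\Phi_B$, which is adequate for a sketch when $\Phi_W$ is well-conditioned (a dedicated generalized symmetric solver is preferable in practice):

```python
import numpy as np

# Sketch: build the Fisher projection matrix by solving
# Phi_B y = lambda * Phi_W y via eigenvectors of inv(Phi_W) @ Phi_B.

def fisher_projection(Phi_B, Phi_W, m):
    """Returns an (n, m) matrix whose columns are the eigenvectors
    with the m largest eigenvalues."""
    evals, evecs = np.linalg.eig(np.linalg.inv(Phi_W) @ Phi_B)
    order = np.argsort(evals.real)[::-1]     # largest eigenvalues first
    # Eigenvalues are real here (the product is similar to a symmetric
    # matrix), so dropping any zero imaginary part is safe.
    return evecs[:, order[:m]].real
```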
## Character Recognition Results

• Training set: 1040 lowercase letters
• Test set: 520 lowercase letters
• Test results:

|                         | FCM   | ECV  |
|-------------------------|-------|------|
| Recognition rate        | 91.5% |      |
| Avg. candidate set size |       | 13.6 |
## Challenge I

• The problem with the previous approach is that non-characters are classified as characters. Applied to cursive words, it creates too many nonsensical word hypotheses by extracting characters where none seem to exist.
• More generally, one wants to be able to generate shapes and their deformations.
## Challenge II

• How can reliable local geometric features of images (corners, contour tangents, contour curvature, …) be extracted?
• How should those features be grouped?
• Matching one input against a large database: how can this be done fast?
• Hierarchical clustering of the database, possibly over a tree structure or some general graph: how should it be done, by which clustering criteria, and with which methods?
## Recognition Engine

• Integrates all available information, and generates and grows the word-level hypotheses.
• Most general form: a graph and its search.
• Hypothesis Propagation Network.
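The network's table H(t, m) invites a dynamic-programming reading. As a heavily hedged sketch (a simplification of mine, not the authors' engine), word hypotheses can be grown by maximizing, for each character ending at segmentation point t, over hypotheses ending at predecessor points inside a look-back window:

```python
# Sketch: hypothesis propagation as dynamic programming. H[t] holds the
# best-scoring hypothesis ending at segmentation point t, grown from
# predecessor points within a look-back window. Scores are illustrative.

def propagate(char_scores, T, lookback=3):
    """char_scores[(s, t, m)]: score of character m spanning points s..t.
    Returns the best (score, word) covering points 0..T, or None."""
    H = {0: (0.0, "")}
    for t in range(1, T + 1):
        best = None
        for s in range(max(0, t - lookback), t):   # look-back window
            if s not in H:
                continue
            for (s2, t2, m), sc in char_scores.items():
                if s2 == s and t2 == t:
                    cand = (H[s][0] + sc, H[s][1] + m)
                    if best is None or cand[0] > best[0]:
                        best = cand
        if best is not None:
            H[t] = best
    return H.get(T)

# Toy scores for the ambiguous pair "go" vs. "90" from the next slides.
scores = {(0, 1, "g"): 1.0, (0, 1, "9"): 0.8,
          (1, 2, "o"): 1.0, (1, 2, "0"): 0.5}
```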
## Hypothesis Propagation Network

• Recognition rate of 85% on 100 words (not good).

(Figure: the network's table H(t, m), indexed by character class m ("a" through "z", horizontal axis) and time t (up to list length T, vertical axis); each entry is scored against class m's legal predecessors within a look-back window.)
## Challenge III

• How can one search more efficiently in this network, and more generally in Bayesian networks?
## Visual Bigram Models (VBM)

• Some characters can be very ambiguous when isolated: "9" and "g"; "e" and "l"; "o" and "0"; etc. They become much less ambiguous in context.

(Figure: "go" vs. "90", distinguished by character heights and by the characters' relative height ratio and positioning.)
## VBM: Parameters

For two adjacent characters with tops top1, top2, bottoms bot1, bot2, heights h1, h2, and overall bigram height h:

• Height Difference Ratio: HDR = (h1 - h2) / h
• Top Difference Ratio: TDR = (top1 - top2) / h
• Bottom Difference Ratio: BDR = (bot1 - bot2) / h
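The three ratios above can be sketched directly from bounding boxes. This assumes a y-grows-downward coordinate system and reads h as the overall height of the bigram's bounding box (my interpretation of the figure):

```python
# Sketch of the three VBM parameters, given each character's bounding box
# as (top, bottom) with y increasing downward. h is taken as the overall
# height of the pair (an assumption; the slide's figure is ambiguous).

def vbm_parameters(top1, bot1, top2, bot2):
    h1, h2 = bot1 - top1, bot2 - top2
    h = max(bot1, bot2) - min(top1, top2)    # overall bigram height
    HDR = (h1 - h2) / h                      # Height Difference Ratio
    TDR = (top1 - top2) / h                  # Top Difference Ratio
    BDR = (bot1 - bot2) / h                  # Bottom Difference Ratio
    return HDR, TDR, BDR

# "go": the "g" is taller (descender), so HDR and BDR are positive
# while the tops line up (TDR = 0).
params = vbm_parameters(2, 10, 2, 6)
```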
## VBM: Ascendancy Categories

| Category      | Members                               |
|---------------|---------------------------------------|
| Ascender (A)  | b, d, f, h, k, l, t                   |
| Descender (D) | f, g, j, p, q, y, z                   |
| None (N)      | a, c, e, i, m, n, o, r, s, u, v, w, x |

• Total of 9 visual bigram categories (3 × 3, instead of 26 × 26 = 676).
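The table above reduces bigrams to 3 × 3 = 9 categories, which a few lines can sketch. Note the slide lists "f" under both Ascender and Descender; the sketch resolves it to Ascender, an arbitrary choice of mine:

```python
# Sketch: map each letter to Ascender (A), Descender (D), or None (N),
# then form the 9 bigram categories. "f" appears in both rows of the
# slide's table; it is resolved to "A" here (an arbitrary choice).

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqyz")

def category(ch):
    if ch in ASCENDERS:
        return "A"
    if ch in DESCENDERS:
        return "D"
    return "N"

def bigram_category(a, b):
    return category(a) + category(b)   # one of 3 x 3 = 9 categories

cats = {bigram_category(a, b)
        for a in "abcdefghijklmnopqrstuvwxyz"
        for b in "abcdefghijklmnopqrstuvwxyz"}
```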
VBM: Test Results
Using VBM                            Yes                   No
Recognition rate                           93%                 85%
Rank-1                            4                      8
Rank-2                            1                      2
Rank-3                            1                      2
Rank-4                            1                      3
March 15-17, 2002   Work with student Jong Oh    Davi Geiger, Courant Institute, NYU
