Smooth Morphing of Handwritten Text


                                                       Conrad Pomm, Sven Werlen
                                                  Institute for Theoretical Computer Science
                                                                Zurich, Switzerland

ABSTRACT
There are several approaches for pen-based systems to improve legibility of handwritten text, e.g. smoothing the strokes composing the characters and words. A very challenging solution is the smooth morphing approach: handwritten strokes are transformed gradually into perfectly legible characters provided by a previously executed handwriting recognition process. In this paper we present our approach to a smooth real-time metamorphosis of handwritten characters into clean typography. Our main contributions are a new hybrid algorithm for character mapping, heuristics for splitting and joining strokes, and a stroke mapping based on radial distances. We implemented our methods in a whiteboard application that provides intuitive editing operations through floating menus and stroke gestures. The implementation is based on the Microsoft Tablet PC SDK and the handwriting recognizer provided with the Tablet PC operating system.

Categories and Subject Descriptors
H.5.2 [Information Interfaces and Presentation (e.g., HCI)]: User Interfaces; I.5.4 [Pattern Recognition]: Applications.

General Terms
Algorithms, Human Factors.

Keywords
Stroke morphing, Tablet PC, online handwriting recognition, animated interfaces.

© ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in: Proceedings of the working conference on Advanced visual interfaces, 2004.
AVI '04, May 25-28, 2004, Gallipoli (LE), Italy

1.    INTRODUCTION
A good field of applications for pen-based devices such as touch screens [14], digital whiteboards [4, 10, 13, 7] or Tablet PCs [8, 15] is traditional face-to-face teaching where the blackboard and chalk metaphor is still present, but augmented and enhanced by the new technological capabilities. In such a scenario, legibility of the handwritten symbols is a basic requirement. Smoothing and anti-aliasing (cf. the pen tool of Adobe Acrobat), or sub-pixel addressing (cf. [2]) are possible approaches to improve legibility of pen input. As more and more pen-based devices come with embedded handwriting recognition, it is possible to head in another direction. In general, typed text is more readable than handwritten characters. Hence, the idea is to use a recognizer and replace the handwritten characters with font symbols. To avoid confusing the audience with sudden changes caused by the replacement of pen input with font symbols, the transformation should occur gradually.

In this paper we present our approach for transforming user-drawn input strokes step-by-step into clean typography, using the information accessible through a handwriting recognizer. The result is a smooth real-time metamorphosis of handwritten characters into highly readable typefaces without abrupt changes in appearance or positioning of the text (see Figure 1 for a morphing example).

Figure 1: Word morphing example.

We implemented our algorithms in a simple, intuitive application that allows editing operations like deleting a word or changing its font size, rearranging sentences, or correcting wrongly recognized words. As our special interest was the Tablet PC, we based the implementation on the Microsoft Tablet PC SDK [9] and used the handwriting recognizer provided with the Tablet PC operating system.

Problem Analysis. There are many morphing techniques, varying in complexity and efficiency, and they are applied to all kinds of objects like 2D-pictures or 3D-models. In our setting the morphing procedure has to transform the strokes representing handwritten text into a target font representing typed text. By a stroke we mean an object that represents the input from a single pen down-move-up sequence (cf. [3]; in Figure 1 the letter "P" in the upper left corner was written with two strokes: a nearly vertical line and a bow). The approach is only reasonable if we meet the following constraints: (A) The morphing can be executed in real-time. (B) The process works automatically without user interaction for all kinds of users' handwriting styles. (C) The transformation maintains important features such that legibility is guaranteed for all intermediate stages.
To guarantee (A) we consider stroke morphing as the most appropriate technique. With a stroke-based target font, the metamorphosis turns out to be an interpolation of strokes that can be executed fast. Hence, the most challenging task is the pre-processing of the user-drawn strokes rather than the interpolation process itself. The goal is to find a mapping of corresponding strokes (footnote 1). (B) has to be guaranteed by the algorithms used in the pre-processing.

In general, aspect (C) is a very challenging demand on shape morphing (cf. the energy-based approaches [12], [11]). Preserving important features, which also means knowing what the important features are, and simultaneously having a universal automatic morphing process seem to exclude each other (cf. [1], previous work) (footnote 2). But smooth morphing is only reasonable if the transformation works automatically. Hence, we relax constraint (C) for the sake of (B). Of course, our goal is to design the stroke morphing process in a way that the sources of unnatural-looking transformations are minimized. But we have no effective measure for that.

Stroke Font. Apart from good performance, the main advantages of using stroke-based characters in the target font are:
   • Recognized and non-recognized characters are both represented by strokes; a mix of outline fonts (recognized characters) and strokes (not recognized characters) may result in an unbalanced perception.

   • The morphing process can be interrupted without difficulty.

   • Touching up the morphing result is easy to implement.

   • Because we build our own font, we may also include special characters that cannot be represented with a normal font, such as mathematical symbols. Thus, the morphing procedure is no longer limited to 255 standard characters, but depends only upon the recognition. If a recognizer is able to distinguish handwritten mathematical symbols and our stroke font contains their corresponding representations, a morphing may occur.

   • Every user may create his own font to give the characters a personal touch.

We developed an alphabet editor such that every user may design his own stroke font, drawing each character separately. For the symbol selected from a list, the actual TrueType character is always displayed in the background to aid the process (with font "Comic Sans MS", which looks very much like handwritten symbols). Hence, the layout of morphed characters will depend upon their appearance in the alphabet.

We find a serif font style appropriate for presentation purposes in a teaching scenario. The process becomes more complex if we demand serif fonts (cf. [1], stage-two metamorphosis).

The Morphing Process. The whole morphing phase can be separated into four main stages: word extraction, word decomposition, interpolation and post-processing (see Figure 3 for an overview).

[Diagram with four boxes: Recognition & extraction of first word; Word decomposition (Preprocessing); Interpolation of single strokes (Morphing); Word transformation (Postprocessing). Further labels: Character, Stroke.]
Figure 3: The process of recognition and morphing.

We start the word extraction phase with the assistance of the recognizer. We are restricted to the unit "word" because on a character level the recognition results are not sufficient. Our criterion to determine when the writing of a word is finished uses the recognizer and a timer starting a background process. At each tick the process asks the recognizer to retrieve the meaning of the strokes given so far. The strokes to be considered are all those which were not previously recognized. If the result consists of a single word, this indicates that the user is currently writing that word. In this case, the process waits until the next tick. But if the result consists of two separate words, this indicates that the user has at least finished writing the first word. Thus, we get an entire word, determine the corresponding strokes (footnote 3), and remove these strokes from the input stream.
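The word-finished criterion described above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation (which queries the Tablet PC recognizer): `extract_finished_word`, the `recognize` callback and `toy_recognizer` are hypothetical names, and the recognizer is assumed to return the recognized words together with the strokes belonging to each.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Stroke = str                          # placeholder; a real stroke would hold pen points
Segment = Tuple[str, List[Stroke]]    # (recognized word, strokes belonging to it)


def extract_finished_word(
    pending: List[Stroke],
    recognize: Callable[[Sequence[Stroke]], List[Segment]],
) -> Optional[Segment]:
    """Run one timer tick over the strokes that were not yet recognized.

    Returns (word, strokes) for the first word once the recognizer reports
    at least two words; otherwise returns None so the caller waits for the
    next tick. The returned strokes are removed from `pending`.
    """
    if not pending:
        return None
    segments = recognize(pending)
    if len(segments) < 2:
        return None                   # user is presumably still writing this word
    word, strokes = segments[0]
    for s in strokes:
        pending.remove(s)             # take the finished word out of the input stream
    return word, strokes


def toy_recognizer(strokes: Sequence[Stroke]) -> List[Segment]:
    """Hypothetical stand-in recognizer: each stroke is labelled 'word:index'."""
    groups: dict = {}
    for s in strokes:
        groups.setdefault(s.split(":")[0], []).append(s)
    return list(groups.items())
```

In a real application the function would be called from the timer's tick handler, and the returned word and strokes would be handed to the word decomposition stage.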

We gather a maximum of information from the strokes during the pre-processing in order to reduce execution time in the forthcoming interpolation phase, which must be realized in real-time. As input we have a word and a set of strokes that represents this word. The character mapping phase partitions the given strokes into subsets corresponding to the characters of the given word. This is hard work, because the recognizer from the Tablet PC does not provide dedicated methods for this task. Based on querying the recognizer, we combine several concepts into a hybrid algorithm for establishing the character-strokes correspondence. The details are described in section 2.

The input for stroke mapping is a character c and a list of strokes representing c. We are also given the stroke representation of c in the target stroke font. Finding a one-to-one correspondence of input strokes and target strokes forces the cardinality of both stroke sets to be equal. But as writing styles differ, the cardinality of these sets may not be equal. Therefore, we use heuristics for splitting and joining input strokes in order to obtain equally sized stroke sets. Finally, our measure for stroke correspondence is based on radial distances. Section 3 discusses the entire phase.

Figure 2: The alphabet editor.

Footnote 1: It is basically a combinatorial problem where the recognizer serves as an oracle, but "noise" like bound characters or handwritten characters that are quite different from their corresponding font characters makes things hard.
Footnote 2: Building an expert system that recognizes predefined characteristic features of handwriting (such that these features could be conserved) would probably imply building a handwriting recognition system. Our focus was just using a given recognizer.
Footnote 3: The Tablet PC SDK provides the method IInkRecognitionAlternate.GetStrokesFromTextRange() to extract the strokes.
[Figure: four handwriting samples, panels (a) to (d), referred to in section 2; panel images not reproduced.]

  (a) The characters 'o' and 'u' as well as the characters 'n' and 'd' are bound. It is difficult to distinguish where one letter finishes and where the other begins. In the same way, more than two characters may be bound and thus increase the difficulty of the decomposition.
      Remark on the MS recognizer: many more recognition errors occur in a bound writing style than in a single-character one.

At this stage, we have a one-to-one mapping of two stroke sets as a result of the previous steps. This mapping from source strokes of the pen input to target strokes of the alphabet is assumed to be correct. In the final interpolation step, the real-time morphing phase, each source stroke is smoothly transformed into the corresponding target stroke. Let (S, T) be such a pair of strokes. We use linear interpolation to transform a point X_S from S to a corresponding point X_T from T:

                X_θ    =    X_S ∗ (1 − θ) + X_T ∗ θ.

X_θ is the intermediate point corresponding to θ ∈ [0, 1]. We obtain pairs (X_S, X_T) by choosing points on the strokes S and T with proportional arclength.
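A minimal Python sketch of this interpolation step, under stated assumptions: strokes are polylines given as lists of (x, y) points, and "proportional arclength" is realized by resampling both strokes to the same number of points, evenly spaced along their length. The function names and the sample count `n` are illustrative and do not come from the paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_by_arclength(stroke: List[Point], n: int) -> List[Point]:
    """Return n points spaced evenly in arclength along a polyline (n >= 2)."""
    assert n >= 2 and stroke
    cum = [0.0]                                   # cumulative arclength per vertex
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:                              # degenerate stroke (a dot)
        return [stroke[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # advance to the segment that contains the target arclength
        while seg < len(stroke) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else min(1.0, (target - cum[seg]) / span)
        (x0, y0), (x1, y1) = stroke[seg], stroke[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def morph(source: List[Point], target: List[Point],
          theta: float, n: int = 32) -> List[Point]:
    """Intermediate stroke X_theta = X_S * (1 - theta) + X_T * theta."""
    s = resample_by_arclength(source, n)
    t = resample_by_arclength(target, n)
    return [((1 - theta) * xs + theta * xt, (1 - theta) * ys + theta * yt)
            for (xs, ys), (xt, yt) in zip(s, t)]
```

Calling `morph` with θ stepping from 0 to 1 yields the animation frames; since the resampling does not depend on θ, it can be done once during pre-processing so that only the weighted sum is computed per frame.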
The post-processing phase is only relevant if there are additional tasks, e.g. after correcting a wrongly recognized word. We designed a correction procedure that shows the recognition alternates in a select box when the user right-clicks on the word. Usually, alternates are very similar, such as the words "could", "Could" and "cold". Based on the minimal edit distance we add, remove, or change the fewest possible characters, such that the morphing will mostly look like error correction.

Previous Work. With the SmartText concept [1], J. Arvo and K. Novins present the first system that combines recognition and morphing of handwritten symbols. Their system recognizes handwriting character by character, followed by a stage-one and a stage-two metamorphosis. Stage one is the transformation from handwritten character strokes to stroke font characters, stage two is the transformation from the stroke font characters to an outline font. The stroke morphing in stage one consists of splitting or joining operations to find an equal partitioning of input and target strokes, and then establishes a one-to-one correspondence of strokes by testing all combinations and choosing the one with the lowest "energy".

Our contribution differs from the approach of J. Arvo and K. Novins in that we start with recognized words rather than with recognized characters. Therefore we have to determine the character mapping between the characters in the recognized word and the corresponding strokes. Further, for the stroke mapping we present new heuristics for splitting and joining, and we use radial distances as a measure for stroke correspondence. From our experience, this measure gives much better results than energy-based measures, in particular for characters like "o", where the energy approach failed in most cases. Finally, our whiteboard application implements a real-time morphing process with intuitive editing operations through floating menus and gestures.

2.    CHARACTER MAPPING
From the "word extracting" phase, we have a word and a list of strokes representing it. The next step consists of separating the different characters composing the word into stroke lists. The alphabet contains a list of target strokes for each existing character. Unfortunately, the MS Tablet PC SDK does not provide any information about the characters or strokes inside a word which would assist the process. The following examples illustrate some problems.

  (b) For the word "exiting", several characters are represented by more than a single stroke. The dots on the characters 'i' and the bar on the character 't' must be correctly associated with their corresponding letter. Moreover, some people insert the dots and the bars after writing the word, in which case the input order cannot be used to guide the process.
      Remark on the MS recognizer: The order of the stroke sequence influences the success of the recognition, especially for characters like "P", "B" or "D" that are composed of more than one stroke.

  (c) Given such an example, although the recognition returns a correct answer, some letters are illegible and cannot be correctly interpreted alone.

  (d) Although the last example seems obvious, we also noticed during our experiments that some special characters like '?' or '!' cannot be correctly recognized alone.

In order to solve these problems, one could use specific information about the structure of each character, but it is best to avoid this method. An attempt to extract specific features of a letter inside strokes is essentially the implementation of a recognition algorithm.

We want to remain as general as possible and to rely on the embedded recognizer for the recognition parts. Therefore, we developed four different methods which use the recognizer again for retrieving each character inside the word (the first three methods are recursive): left-to-right parsing, right-to-left parsing, divide-and-conquer parsing, and force decomposition. Each method can solve some of the problems discussed above. Finally, we will explain how to use them together in order to produce a satisfying general decomposition of a given word.

1. Method: Left-To-Right parsing
The first method retrieves the strokes corresponding to a character, processing from left to right. The algorithm may be represented using the following pseudo-code, where the recognition set is the set of strokes which will be parsed by the recognizer, and the word is an array of characters.
  Given the strokes list S, the recognition set rS
  and the word w:

  Procedure LeftToRight(S,w): Boolean
  0. If length(w) = 0 then return true
  1.    Empty rS; reco := false
  2.    Move the first stroke of S to rS
  3.    Repeat
       3a.   Recognize rS as R
       3b.   If R = w[0] then reco := true
       3c.   If R ≠ w[0] and reco = true then goto 4.
       3d.   If S is empty then
                if reco = true then goto 5 else return false
       3e.   Move the leftmost stroke of S to rS
  4.    Move the last inserted stroke of rS back to S
  5.    Store rS as the strokes list of the character w[0]
  6.    w := w - w[0]
  7.    return LeftToRight(S,w)

The procedure LeftToRight tries to extract the strokes representing the first character in the given word. If they can be found, the procedure continues recursively with the remaining strokes and the rest of the word. The following example presents the algorithm at steps 1 and 3 for the word "Pen":

  [Stroke images not reproduced; the annotations of the figure are:]
  Row 1 (w[0]='P'):  step 1  |  step 3: R='I'  |  step 3: R='P', reco=true  |  step 3: R='Pe', reco=false
  Row 2 (w[0]='e'):  step 1  |  step 3: R='e', reco=true  |  step 3: R='en', reco=false
  Row 3 (w[0]='n'):  step 1  |  step 3

In the above example, each row represents one recursive call of the procedure LeftToRight. The strokes list S contains all black strokes, while the grey strokes represent the recognition set rS. Why do we choose to move the first stroke of S into rS rather than the leftmost one? The reason comes from the MS recognizer itself: as mentioned above, the recognition result depends on the order of the stroke sequence.

Although the LeftToRight algorithm succeeds in most common cases, it may happen that a particular stroke list cannot be decomposed. Two such cases may be distinguished: (1) A word composed of one or more bound strokes cannot be decomposed. A bound stroke in this case is a stroke which represents more than a single character. Because the algorithm does not split such a stroke, it will not be able to decompose the word correctly. (2) A character in the word is illegible. The recognizer is unable to retrieve the strokes corresponding to that character and thus cannot continue.

However, even if the algorithm fails during the process, a part of the word is nevertheless decomposed and may be used. The effectiveness depends upon the position of the first exception encountered in the word.

2. Method: Right-To-Left parsing
The second method is equivalent to the previous method, but proceeds from right to left, rather than from left to right. The algorithm can then decompose the characters at the end of the given word until it encounters an unrecognized character. This occurs in the same cases as in the LeftToRight method.

3. Method: Divide-And-Conquer parsing
This method tries to separate a word in the middle into two parts, to recognize each part separately and then compare them to the given word. The following pseudo-code may be used to understand the decomposition process:

  Given the strokes list S, the left set lS,
  the right set rS, and the word w:

  Procedure DivAndConquer(S,w): Boolean
  0.   If length(w) = 1 then store and return true
  1.   If S={s0} then split(s0)
  2.   Separate S into two parts lS and rS
  3.   Recognize lS as R
  4.   If R=Left(w) then call DivAndConquer(lS,Left(w))
  5.   Recognize rS as R
  6.   If R=Right(w) then call DivAndConquer(rS,Right(w))
  7.   Return true if both recognitions succeeded.

Considering the same example as for the method LeftToRight, the algorithm separates the word "Pen" into two sets (lS and rS) of two strokes. The recognizer identifies lS as "P" and rS as "en". Thus, the second level of recursion will consider the words "P" and "en". Word "P" is composed of a single character, so the corresponding strokes have been found. In parallel, the algorithm needs another recursion step to decompose the word "en" into "e" and "n", in order to retrieve the stroke set of each character. The example can also be examined in the following figure:

  [Figure: divide-and-conquer decomposition of the word "Pen"; image not reproduced.]

The main advantage of this method is the fact that it can handle and recognize bound strokes, because it does not decompose the word
character by character. On the other hand, compared to the methods        Here, we consider that the characters ’o’ and ’m’ are bound and
LeftToRight and RightToLeft, the DivideAndConquer                         that the character ’t’ is illegible and cannot be recognized. Apply-
method is not able to decompose parts of a word if it contains illeg-     ing DivideAndConquer on the word ”computer”, the left part
ible characters.                                                          of the word will be decomposed without any problems. The right
                                                                          part of the word, however, cannot be decomposed, because of the il-
4. Method: Force                                                          legible character. At this point, the LeftToRight method will be
This method is the most reliable, because it always succeeds. The         able to extract the character ’u’, while the RightToLeft method
process checks that there are enough strokes in S to distribute to        will extract the characters ’e’ and ’r’. Finally, the Force method
each character of the word. If this is not the case, the biggest stroke   will be used for the illegible character.
will be split until a sufficient number of strokes is reached. The
process then iteratively distributes the strokes among the charac-        3.     STROKE MAPPING
ters, such that each character will be represented by roughly the         For each stroke subset computed during the previous phase, the
same number of strokes.                                                   association between the source strokes and the target strokes must
                                                                          be established. If both the source set and the target set contain only
The decomposition always succeeds, but is rarely correct, because         one stroke, then the mapping is immediate. In the case of more than
no recognition was used during the process. It may happen that a          one stroke, however, it is first necessary to ensure that the number
stroke was inserted in a wrong set, for example the stroke set of         of strokes in a given source set matches those in the target set. In
the subsequent or of the previous character in the word. Thus, the        order to achieve this, extra strokes in the source set may be joined.
morphing, based on this word decomposition will look somewhat             To the same end, some of the target strokes may be split if they are
strange, even if the result is correct.                                   too few. The process then maps each stroke of the source set to the
                                                                          appropriate stroke in the target set.
Method combination
In this subsection, we explain how we combined the four meth-             Splitting
ods to benefit from their advantages and to be able to decompose a word's stroke list into sets representing each character of that word. As before, the decomposition always succeeds but its accuracy depends on the recognizer.

We decided to apply the DivideAndConquer method first, because it is the fastest method (in execution time) and because it can manage bound strokes, which often occur in normal writing. If the process fails to decompose one part of a word, we then apply the LeftToRight method in order to extract some recognizable characters on the left. If that process fails as well, we use the RightToLeft method. Finally, if unrecognized parts remain, the Force method is applied, which decomposes them arbitrarily. The example in Figure 4 illustrates the use of the four methods.

Assuming that the source set contains fewer strokes than the target set, the challenge consists of identifying which stroke to split and at which point. The SDK provides useful information about a stroke: points of intersection and cusps (sharper bends) can be extracted. However, we found that the results obtained with this information varied widely.

Hence, another, even simpler method was implemented: the largest stroke of a source set is split into two parts, each with the same number of points. The operation is repeated until the target number of strokes is reached. This method does not rely on an analysis of the strokes, but its results are still acceptable. Most of the letters in the alphabet are composed of a single stroke, so that splitting is not necessary. In other cases, we considered that most writing styles are casual or speedy and often contain a single stroke per character, or two strokes for characters with accents, points or crosses, 'i', 'j' and 't' being examples. It is then reasonable to choose the largest stroke for splitting each time. Finally, separating the stroke into two equal parts produces acceptable results in the majority of instances. Accuracy is more consistently obtained with this method than with a method which is vulnerable to the incorrect analysis of intersections and cusps of a stroke.
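The splitting heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the application: the function name split_largest is hypothetical, a stroke is assumed to be a plain list of (x, y) points, and "largest" is assumed to mean the stroke with the most points.

```python
def split_largest(strokes, target_count):
    """Split the largest stroke in half until target_count strokes exist.

    Assumption: 'largest' means 'most points'; a stroke is a list of (x, y).
    """
    strokes = [list(s) for s in strokes]  # work on a copy
    while len(strokes) < target_count:
        # index of the stroke with the most points (first one on ties)
        idx = max(range(len(strokes)), key=lambda i: len(strokes[i]))
        stroke = strokes[idx]
        if len(stroke) < 2:
            raise ValueError("cannot split a stroke with fewer than two points")
        mid = len(stroke) // 2
        # replace the stroke by its two halves, keeping the drawing order
        strokes[idx:idx + 1] = [stroke[:mid], stroke[mid:]]
    return strokes
```

With an odd number of points the two halves differ by one point; whether the halves should share the cut point is a design choice that the text leaves open.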

Figure 4: Example of the combination of the four methods during the decomposition: DivideAndConquer, LeftToRight, RightToLeft, and Force. The small digits indicate the number of strokes.

The joining step only occurs if the source set contains more strokes than the target set. In this case, the process recursively joins the two most suitable strokes of the source set until the correct number of strokes is reached. The challenge consists in finding two suitable strokes. We distinguished three cases:
     • The source set contains only two strokes. This case is common and obvious. The two strokes are joined at the nearest extremities. Figure 5(a) presents such an example.
     • The source set contains several strokes, only two of which possess close extremities. As represented in Figure 5(b), the choice is obvious.
     • Figure 5(c) presents the final and most difficult case: more than two strokes have close extremities. A criterion other than proximity must be applied.
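The first case, joining two strokes at their nearest extremities, can be sketched as follows. The function name join_nearest and the representation of a stroke as a list of (x, y) points are assumptions made for this illustration; the sketch covers only the two-stroke case and does not select among several candidate strokes.

```python
import math

def join_nearest(a, b):
    """Join two strokes (lists of (x, y) points) at their nearest extremities.

    The four ways of connecting an end of `a` to an end of `b` are compared
    and the one with the shortest gap is returned as a single point list.
    """
    candidates = [
        (math.dist(a[-1], b[0]), a + b),         # end of a -> start of b
        (math.dist(a[-1], b[-1]), a + b[::-1]),  # end of a -> end of b
        (math.dist(a[0], b[0]), a[::-1] + b),    # start of a -> start of b
        (math.dist(a[0], b[-1]), b + a),         # end of b -> start of a
    ]
    return min(candidates, key=lambda c: c[0])[1]
```

Reversing one of the strokes where necessary keeps the joined stroke a single continuous path through the connected extremities.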
Figure 5: Examples of joining, panels (a) to (c): source and target strokes

The case for which the source set contains only two strokes has an obvious progression, and will not be further expanded upon in this discussion. In the second case, close extremities must be detected. Considering absolute distances is unreliable since the strokes may be scaled. A close distance is defined here as a distance which is n times shorter than the other distances, where n > 1 and may be adapted. As shown in Figure 5(b), the distance between the extremities of the top of the letter 'A' is much shorter than the other distances.

For the final case, in which more than two strokes have close extremities, we decided to join the strokes which continue one another most smoothly. Indeed, sharp bends in a stroke do not suggest natural writing styles. To evaluate the continuity of two strokes, the process computes the angle between the vectors indicating the direction at the nearest extremities. Thus, a small angle implies that the two strokes are nearly continuous whereas a greater angle indicates a sharper bend. Figure 6 represents the decision process for the example of Figure 5(c).

Figure 6: Computation of continuity: (a) the angle between the two vectors is approximately 90°, (b) the angle between the two vectors is approximately 0°. Case (b) is therefore more suitable.

Mapping
Once the source and the target set contain the same number of strokes, we can determine an association between the strokes of each set. Since the number n of strokes in the sets is always small, with n = 1 and n = 2 being most typical, it is reasonable to perform a search through all the n! associations, where each stroke-to-stroke correspondence is also optimized according to direction. An example of the two possible stroke combinations for two sets representing the letter 'D' is presented in Figure 7.

Figure 7: (a) two characters composed of two strokes. (b) and (c) the two possible mapping combinations.

To evaluate a correspondence, we decided to use a strategy which is often used for transforming star-shaped polygons: the radial projection [5]. Figure 8 illustrates an example of such a morphing. Radial projection gives us information about the localisation of a stroke for our character shapes. If we consider the points describing each stroke, their radial vector indicates their relative position to the centre. By radial vector, we mean the vector OX where O is the centre of the bounding box and X is the point itself. Figure 9 shows 10 radial vectors for the two strokes of the first representation of the letter 'D'. As we can see, the radial vectors of each stroke are pointing in very different directions. Therefore, we based our mapping algorithm on these radial vectors.

Figure 8: Morphing of a star-shaped polygon using radial projection.

Figure 9: (a) Letter 'D' composed of two strokes. (b) Bounding box and its centre. (c) and (d) Representations of the radial vectors.

The following pseudo-code describes the method used to calculate the radial distance between two strokes S1 and S2, given the centres C1 and C2 of the corresponding bounding boxes:

  For each point P1i ∈ S1:
      Find the corresponding point P2i ∈ S2
      Compute the radial vectors C1P1i and C2P2i
      Compute the angle αi between vectors C1P1i and C2P2i
  Distance = $\sum_i \alpha_i$

The radial distance of two strokes may be defined as the sum of the angles between corresponding radial vectors. Additionally, the radial distance of two sets may be defined as the sum of the radial distances between each stroke and its mapped stroke. If we apply this definition to the two possible mapping combinations in Figure 7 and compute the radial distance for both, considering only three points in each stroke, the two results will be 78 and 370. The difference would have been even larger if more than three points in each stroke had been considered. The complete computing process can be found in Figure 10.

Considering these two results, the strokes in the source set will be correctly mapped using the first combination. The final step consists of finding the direction for each mapping, i.e. determining whether the first point of the source stroke must be mapped to the first point or to the last point of the target stroke. Again, the process uses radial distance to evaluate both possibilities, comparing the source stroke with the target stroke and the inverted target stroke. But this time, inaccurate mappings (i.e. stroke reversion) were noticed for "symmetrical" characters like 'o' or 'c'. The direction of the rotation
was then also taken into account for evaluating a combination. For that, the sum of the absolute differences between each angle and the previous one is computed, where a small sum means the preservation of the rotation's direction used to draw the stroke. Considering the example in Figure 11, the radial distance of the second combination is smaller than the radial distance of the first combination, even if the results are close. The rotation's direction, however, is reversed in the second combination.

Figure 10: Computation of radial distance of two sets for the first (top) and the second (bottom) mapping combinations, where αi is the angle between the two vectors represented.
   Top: α0 = 15°, α1 = 5°, α2 = 15°, α3 = 0°, α4 = 20°, α5 = 23°; $\sum_{i=0}^{5} \alpha_i = 78$.
   Bottom: α0 = 5°, α1 = 170°, α2 = 10°, α3 = 5°, α4 = 175°, α5 = 5°; $\sum_{i=0}^{5} \alpha_i = 370$.

4.    THE APPLICATION AROUND
Standard software provided with pen input devices like the Tablet PC [8, 15] or electronic whiteboard systems [4, 7, 10, 13] does not offer smooth morphing of handwritten text. Therefore we had to develop our own application. We found that the Tablet PC offered the best infrastructure for concentrating our work on the stroke morphing problem. In particular, we did not want to build our own handwriting recognizer. We wanted to see how far we could get when relying only on a "black-box recognizer". The Tablet PC SDK developed by Microsoft [9] provides a complete framework to handle pen input and to interact with the embedded handwriting recognizers.

Intuitive User Interface. Having the teaching scenario in mind, we wanted to build a simple user interface with some editing capabilities, as closely related as possible to the classical blackboard and chalk metaphor. A common user interface provides buttons, toolbars, comboboxes and menus to interact with the user. In our case, we decided to avoid such an interface as much as possible in order not to disturb a fluid interaction. In the end, we chose to use the strokes themselves to connect an editing toolbar with the words or with the whole document structure. We designed a small toolbar realized as a floating menu that appears two lines below the most recently drawn word so that it is always near the cursor. The toolbar is populated with a set of icons standing for several editing options like copy, paste, delete, resize, etc. An editing operation starts when the endpoints of a stroke are located in the bounding box of a recognized word or on an icon of the toolbar. After the operation is executed, the stroke is removed.

Figure 12: Interface: Examples of simple editing, panels (a) and (b).

Figure 11: Comparison of stroke mapping combinations.
   Top: α0 = α1 = α2 = α3 = α4 = α5 = α6 = 90°; $\sum_{i=0}^{6} \alpha_i = 630$, $\sum_{i=1}^{6} |\alpha_i - \alpha_{i-1}| = 0$.
   Bottom: α0 = 50°, α1 = 45°, α2 = 150°, α3 = 90°, α4 = 30°, α5 = 150°, α6 = 70°; $\sum_{i=0}^{6} \alpha_i = 585$, $\sum_{i=1}^{6} |\alpha_i - \alpha_{i-1}| = 430$.
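The quantities listed for Figure 11, and the exhaustive search over stroke associations from the Mapping step, can be illustrated with a short Python sketch. All function names are hypothetical. The sketch assumes that strokes are lists of (x, y) points, that corresponding points are paired by index (i.e. both strokes have been resampled to the same number of points), and that angles are unsigned values between 0° and 180°. It ignores the choice of stroke direction.

```python
import math
from itertools import permutations

def bbox_centre(strokes):
    """Centre of the bounding box of all points in a set of strokes."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

def angle_deg(u, v):
    """Unsigned angle in degrees (0..180) between 2-D vectors u and v."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(math.degrees(math.atan2(cross, dot)))

def radial_angles(s1, c1, s2, c2):
    """Angles between the radial vectors of index-paired points."""
    return [angle_deg((p[0] - c1[0], p[1] - c1[1]),
                      (q[0] - c2[0], q[1] - c2[1]))
            for p, q in zip(s1, s2)]

def radial_distance(angles):
    """Sum of the angles between corresponding radial vectors."""
    return sum(angles)

def rotation_change(angles):
    """Sum of absolute differences between consecutive angles."""
    return sum(abs(b - a) for a, b in zip(angles, angles[1:]))

def best_mapping(source, target):
    """Try all n! associations; perm[i] is the target index for source[i]."""
    c1, c2 = bbox_centre(source), bbox_centre(target)

    def cost(perm):
        return sum(radial_distance(radial_angles(s, c1, target[j], c2))
                   for s, j in zip(source, perm))

    return min(permutations(range(len(target))), key=cost)
```

Applied to the angles of Figure 11, radial_distance and rotation_change give 630 and 0 for the top combination and 585 and 430 for the bottom one, which is why the bottom combination is rejected despite its smaller radial distance.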

Thus, the locations of the two ends of a stroke specify both the object concerned and the action. This makes it possible to execute an action in one single step. For example, in order to increase the font size of a word, the user only has to draw a stroke from the desired word to the correct option, as shown in Figure 12(a). The same method can be used to move a word: it is enough to draw a stroke from the word to the desired position (Figure 12(b)).

The direction of a stroke gesture is used to minimize the number of options. If the stroke starts on a word, the option is applied only to that word. If it starts on a menu icon, the option is applied to the whole paragraph.

5.    CONCLUSION
We presented our approach to smooth morphing of handwritten text. Our main contributions are a new hybrid algorithm for character mapping and finding a stroke mapping with the use of radial distances. We implemented an application that makes use of these algorithms and processes the transformation in real-time.

Quality of Recognition. Recognition errors are a serious problem for an interface like ours that deals with handwriting recognition. The more handwritten input is misinterpreted, the more energy the user has to spend on correcting these misinterpretations. Pretty soon such effort will obscure the benefits of improving legibility with smooth morphing. Therefore, the quality of the recognizer⁴ and a style of handwriting that the recognizer can process will notably contribute to the overall quality of the approach.

Quality of Morphing. It could be of particular use if the recognizer provided a weight for the assumed quality of the recognition result. Then critical words could simply be left as they are in order to preserve legibility. If we assume correct recognition results, in the majority of cases we obtain natural-looking transformations. The main criterion is the difference between the user's writing style and the stroke alphabet.⁵ In the ideal case where the user of the interface and the creator of the stroke alphabet are the same, unnatural-looking morphs occur rarely.

Application. The use of floating menus and pen gestures simplifies and speeds up editing enormously. On the other hand, the application's performance depends highly on the amount of stroke input. Large pages with lots of strokes slow down processes like scrolling. Thus, we could improve the performance by finally morphing our stroke font to a similar system font like "Comic Sans MS".⁶

Future Work. We consider the following aspects to be most important and challenging. (A) The morphing results are the best when the handwritten words are similar to the stroke alphabet. Could we extract information automatically out of the user's handwriting such that we create an entire set of prototypes belonging to one symbol? Or should we give up generality and define such prototypes by hand, based on a "feature definition language" similar to the approach of [6] in sketch recognition? (B) Touching up wrongly recognized or badly written words with additional strokes could further improve interaction. (C) There are mainly two ways of applying the morphing process: either the transformation occurs very fast, so that inaccuracies do not matter so much, or the transformation occurs so slowly that the audience hardly notices the change. Which of these ways do people accept? Does the smooth morphing approach provide a benefit at all? Or is it a distraction rather than an aid? We need an evaluation of the system in front of classes.

⁴ We found that the recognizer of the MS Windows Tablet PC Edition generally performs quite well, but shows weaknesses on bound writing styles.
⁵ E.g., the morphing of the letter 'l' appears unnatural if the user writes it with a loop while 'l' is represented as a vertical line in the alphabet.
⁶ Cf. the stage-two metamorphosis in [1]; in the case of a sans serif font this step is much easier than in the case of serif fonts. Alternatively, another solution could be an application that converts the stroke font automatically into a TrueType or PS font.

6.   REFERENCES
 [1] James Arvo and Kevin Novins. Smart text, a synthesis of recognition and morphing. AAAI Spring Symposium on Smart Graphics, March 2000.
 [2] Timothy S. Butler. Clearpen: improving the legibility of handwriting. In CHI '03 extended abstracts on Human factors in computer systems, pages 678–679, 2003.
 [3] Definitions from the Tablet PC SDK Documentation, 10.08.2003.
 [4] eBeam Whiteboard Capturing System, 10.08.2003.
 [5] Jonas Gomes, Lucia Darsa, Bruno Costa, and Luiz Velho. Warping and Morphing of Graphical Objects. Morgan Kaufmann, 1999.
 [6] Tracy Hammond and Randall Davis. Ladder: A language to describe drawing, display, and editing in sketch recognition. In Proceedings of IJCAI (International Joint Conference on Artificial Intelligence), 2003.
 [7] Interactive whiteboard, 10.08.2003.
 [8] Microsoft Tablet PC Developer, 26.11.2003.
 [9] Microsoft Tablet PC SDK, 25.06.2003.
[10] Mimio whiteboard capturing system, 10.08.2003.
[11] Thomas W. Sederberg and Eugene Greenwood. A physically based approach to 2-d shape blending. In Proceedings of the 19th annual conference on Computer graphics and interactive techniques, pages 25–34, 1992.
[12] Steven M. Seitz and Charles R. Dyer. View morphing. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pages 21–30, 1996.
[13] Smartboard Interactive Whiteboard, 26.11.2003.
[14] Wacom Cintiq Interactive Pen Display, 26.11.2003.
[15] Windows XP Tablet PC Home Page, 26.11.2003.
