							MVA'94 IAPR Workshop on Machine Vision Applications Dec. 13-15, 1994, Kawasaki



A METHOD OF UNDERSTANDING CONCEPTUAL DIAGRAMS *

Noriyoshi Yoneda, Koichi Kise, Shinobu Takamatsu, Kunio Fukunaga

Department of Computer and Systems Sciences
University of Osaka Prefecture
Email: (noriyosi | kise)@ss.cs.osakafu-u.ac.jp

Abstract

A conceptual diagram is a line drawing which represents the semantic structure of concepts using simple geometric entities. This paper presents a method of understanding conceptual diagrams. The objective of our method is to interpret the semantic roles of geometric entities in conceptual diagrams. In conceptual diagrams, however, a single geometric entity plays various semantic roles for representing concepts, because there are no strict rules for writing conceptual diagrams. To cope with this problem, we introduce the strategy of hypothesis generation and verification; hypothesized interpretations are verified by relaxation which takes account of the semantic relations to other entities. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

1 Introduction

Understanding of line drawings is indispensable to realize document image understanding. A number of studies have been made on various types of line drawings (e.g., technical drawings, maps, flow-charts and circuit diagrams). In the interpretation of line drawings, most of the existing methods focus on extracting a precise description of geometric entities and their spatial relations (or physical structure). This would be sufficient for understanding flow-charts and circuit diagrams, because, by the rules of writing these diagrams, the physical structure clearly corresponds to what these diagrams semantically represent; once the physical structure is reconstructed, it is trivial to extract the information represented in these diagrams.

In recent years, however, the need to extract semantic entities and their relations (or logical structure) has been emphasized [1, 2]. It seems essential for some kinds of line drawings like conceptual diagrams. As shown in Fig. 1, a conceptual diagram is a line drawing which illustrates the structure of some concepts using simple geometric entities (loops, lines and character strings). Conceptual diagrams are similar to flow-charts in physical structure. However, the logical structure should be interpreted from the physical structure, because there are no rules which make the interpretation trivial.

[Figure 1: An example of conceptual diagrams]

In this paper, we propose a method of understanding conceptual diagrams. A major problem in understanding conceptual diagrams is that a single geometric entity plays various semantic roles depending on surrounding entities. For example, a line in Fig. 1 can be interpreted as a relation between concepts (solid lines) or a division of concepts (dashed lines, see footnote 1); geometric entities cannot be unambiguously interpreted from local viewpoints. To cope with this problem, we introduce the strategy of hypothesis generation and verification. Hypothesized interpretations of physical structure are verified by relaxation which takes account of the global consistency of logical structure. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

* This work was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, and the Telecommunications Advancement Foundation.
1 Dashed lines are used just for explanation. If they were written in solid lines, they would still represent division of concepts.

2 Conceptual Diagrams

Let us start with considering the physical and logical structure of conceptual diagrams. The physical structure can be represented as physical relations between physical objects as follows:

  physical objects: loops (rectangles, ovals, etc.), lines (solid, broken and dotted lines with or without arrowheads) and character strings (simply called strings hereafter).
  physical relations: spatial relations between physical objects such as contact, overlap, proximity and alignment.

The logical structure is likewise represented as logical relations among logical objects.

  logical objects: concepts represented in a diagram. Concepts often have their labels represented as strings.
  logical relations: relations among concepts. Although there would be many kinds of relations among concepts, we focus here on the relations explicitly represented in a diagram. Labels are often attached to logical relations.

It can be generally said that a person who writes a line drawing determines its physical structure aiming at easy understanding of its logical structure. In other words, the physical structure of a line drawing is closely related to its logical structure. For flow-charts and circuit diagrams, such relation is strictly determined as standards. However, there are no standards or definite rules for conceptual diagrams; we only have some customary rules of writing conceptual diagrams.
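
As a rough illustration, the physical and logical structure described in this section could be held in a small data model such as the Python sketch below. All class and field names (PhysicalObject, PhysicalRelation, LogicalObject, LogicalRelation, Diagram) and the toy diagram itself are hypothetical choices made for this example; the paper does not prescribe any data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhysicalObject:
    ident: str
    kind: str                    # "loop", "line" or "string"
    text: Optional[str] = None   # characters, only meaningful for strings


@dataclass
class PhysicalRelation:
    kind: str                    # e.g. "contact", "overlap", "proximity", "alignment"
    a: str                       # ident of the first physical object
    b: str                       # ident of the second physical object


@dataclass
class LogicalObject:
    ident: str
    label: Optional[str] = None  # a concept may or may not carry a label


@dataclass
class LogicalRelation:
    source: str                  # ident of a logical object
    target: str                  # ident of a logical object
    label: Optional[str] = None  # relations may be unlabelled as well


@dataclass
class Diagram:
    physical_objects: List[PhysicalObject] = field(default_factory=list)
    physical_relations: List[PhysicalRelation] = field(default_factory=list)
    logical_objects: List[LogicalObject] = field(default_factory=list)
    logical_relations: List[LogicalRelation] = field(default_factory=list)


def example_diagram() -> Diagram:
    """Two labelled loops joined by one line (a made-up toy diagram)."""
    d = Diagram()
    d.physical_objects = [
        PhysicalObject("loop1", "loop"),
        PhysicalObject("loop2", "loop"),
        PhysicalObject("line1", "line"),
        PhysicalObject("str1", "string", "animal"),
        PhysicalObject("str2", "string", "dog"),
    ]
    d.physical_relations = [
        PhysicalRelation("contact", "line1", "loop1"),
        PhysicalRelation("contact", "line1", "loop2"),
        PhysicalRelation("proximity", "str1", "loop1"),
        PhysicalRelation("proximity", "str2", "loop2"),
    ]
    # One possible reading of the physical structure above.
    d.logical_objects = [LogicalObject("c1", "animal"), LogicalObject("c2", "dog")]
    d.logical_relations = [LogicalRelation("c1", "c2")]
    return d
```

The sketch keeps the two levels in separate lists on purpose: the physical level is what can be measured in the drawing, while the logical level is one interpretation of it, and several interpretations may compete for the same physical objects.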
The difficulty of understanding conceptual diagrams is attributable to this lack of definite rules. To be concrete, we face various local ambiguities in the interpretation of physical structure. Some of them are listed below:

  • Concepts are often represented as loops. However, there exist concepts represented in different ways. For example, a string can solely correspond to a concept. Similarly, a compound concept is often represented as a loop which encloses some loops. However, aligned loops sometimes (but not always) indicate the existence of a compound concept including concepts represented as the loops.
  • Lines often correspond to logical relations among concepts. However, lines also represent the division and the grouping of concepts.
  • Strings are often interpreted as labels of concepts/relations which are represented as loops/lines. However, it is not easy to find which loop/line a string is associated with; a string is not always the label of the loop/line closest to the string.

3 Overview of Processing

Our method of understanding conceptual diagrams is twofold: extraction of physical structure and extraction of logical structure.

The process of extraction of physical structure takes as input the data of line segments and strings, and generates the description of physical structure. In the input data, a line segment is represented as coordinates of two end points, a type of a line (solid, dotted or broken) and a type of each end (with or without an arrowhead). A string is represented as coordinates of its bounding rectangle and characters in it.

The description of physical structure is interpreted by the process of extraction of logical structure. To cope with the local ambiguities, we employ the strategy of hypothesis generation and verification. First, from local viewpoints, possible interpretations of the description are enumerated as hypotheses of concepts, relations and labels. Then, these hypotheses are verified to reject implausible interpretations.

Note that we do not deal with the linguistic meaning of concepts; we aim to extract the logical structure explicitly represented in a diagram. Thus logical objects and relations which have no labels are accepted as the output, and no further processing such as identification of the hidden meaning of logical objects or relations [2] is considered.

4 Extraction of Physical Structure

This process consists of the extraction of physical objects, physical relations and implicit loops indicated by physical structure.

4.1 Physical objects

As described in Section 2, loops are important physical objects in conceptual diagrams. Thus we attempt to extract loops from line segments. By extracting all loops from line segments, we can also obtain lines from the rest of line segments. Our procedure resembles the one described in [3], except that no explicit models of loops are utilized.

Ends of line segments are classified into terminals, paths and branches. A terminal end is an end which belongs to only one line segment. A path end is an end at which exactly two line segments contact. A branch end is an end at which more than two line segments contact. A chain is a sequence of line segments concatenated at all path ends on condition that: (1) a chain includes line segments of the same type (solid, dotted or broken), (2) a chain does not include an arrowhead in the middle.

In the first step, all chains are extracted from the input data of line segments. The next step is to find apparent loops and lines. A chain whose two end points coincide is identified as a loop and removed from the input data. If a chain has at least one terminal end, or has at least one arrowhead, it is identified as a line and removed. This step of processing is repeated until no more chains are removed. In the third step, we focus on chains connected at a branch end. If two of such chains form a straight segment at the branch, they are concatenated. Then loops are extracted again from chains. After all loops are extracted, the chains which remain in the input data are regarded as lines.

4.2 Physical relations

As the physical relations, we consider the relations shown in Fig. 2. In the following, the bounding rectangle of a string is considered, in the case that a physical object is a string.

[Figure 2: Physical relations. Panels: enclosure, overlap, contact, proximity, alignment (top, v-center, bottom, left, h-center, right), parallel, next-to-end]

The relation enclosure is defined between a loop and a physical object. If a loop includes a physical object and no other loop includes both the physical object and the loop, it is said that the loop encloses the physical object, or the loop has the enclosure relation to the physical object. In Fig. 2, the physical object A is enclosed by the loop B, but not by the loop C. Two physical objects overlap if one of the physical objects lies inside the region bounded by the other physical object. Contact is the relation between two physical objects if their boundaries share some points and they do not overlap. For the relation proximity, we focus on the distance between physical objects A and B. The
distance is defined as the minimum distance d_min (> 0) between points a and b which are on the boundaries of A and B, respectively. The points forming the minimum distance are described as a' and b'. If d_min is less than a certain threshold and no physical objects overlap with the segment between a' and b', A has the proximity relation to B. For the alignment relation, we utilize six types shown in Fig. 2.

The relations parallel and next-to-end are somewhat special. The relation parallel is defined between a line and a string. If a line segment in a line is parallel with the longer side of the bounding rectangle of a string, they have the parallel relation. The relation next-to-end is the special case of the proximity and the contact. If (1) a line has the proximity relation to a physical object, and (2) the extension of the line from an end contacts with the physical object, the line has the relation of next-to-end with the physical object. In addition, if an end of a line contacts with a physical object, they also have the next-to-end relation.

4.3 Implicit loops

The role of this step is to identify a group of loops represented by grouping and division lines, and alignment of the loops. Examples of grouping and division lines are illustrated in Fig. 3, where the parent loop indicates either a loop or a bounding rectangle of a diagram.

[Figure 3: Grouping and division lines. (a) grouping line, with its grouping rectangle; (b) division line, with its parent loop]

As shown in Fig. 3(a), a grouping line is the line which satisfies the following conditions:

  1. Both of the two ends of the line have arrowheads, or both of them have no arrowheads.
  2. The shape of the line is straight, or like a brace.
  3. The line must not overlap with the loops in the parent loop.
  4. The grouping rectangle shown in Fig. 3(a) encloses some but not all loops in the parent loop, and none of the loops in the parent loop overlaps with the boundary of the grouping rectangle.

On the other hand, the conditions of a division line are as follows (see Fig. 3(b)):

  1. The line does not have an arrowhead.
  2. The line does not overlap with the loops in the parent loop. If the line does not contact with the parent loop, we also consider the extension shown as the broken line in Fig. 3(b).
  3. The line divides the loops in the parent loop into at least two groups. We consider the extension of the line similar to the above condition. A bounding rectangle of a group of loops is also called a grouping rectangle.

In the case that these two types of lines or the loops having the alignment relation are identified, a group of loops is extracted as an implicit loop which is represented as a grouping rectangle. When an implicit loop is identified, physical relations about the implicit loop are also calculated. In the following, we use the term explicit loops to refer to loops except implicit loops.

5 Extraction of Logical Structure

5.1 Hypothesis Generation

In this step, all possible interpretations of physical objects and relations are enumerated as hypotheses from local viewpoints. Hypotheses generated at this step are classified into three types: hypotheses of logical objects (concepts), logical relations (relations among concepts) and labels (names of concepts or relations).

Hypotheses of logical objects are generated from the following physical objects:

  • loops (explicit and implicit),
  • strings having the relation next-to-end to lines,
  • dotted or broken straight lines. (These lines indicate the omission of logical objects.)

Hypotheses of logical relations are generated between physical objects as follows:

  • a line having the relations of next-to-end to physical objects (a logical relation between the physical objects).
  • the enclosure relation between loops (a logical relation "part-of" between the loops).

[Figure 4: Logical relations. Panel (a) shows an omission]

Note that a line which has the next-to-end relation at only one end is also accepted as a hypothesis of a logical relation, because a physical object is sometimes omitted as shown in Fig. 4(a). In such cases, we also generate a hypothesis of an omitted logical object. In addition, a line is interpreted as a logical relation with other lines. In Fig. 4(b), the line 1 is hypothesized as the logical relation between the lines 2 and 3. This enables us to interpret a set of lines as an n-ary relation.

Hypotheses of labels are generated for each string accompanied with a physical object with which the label is associated. A simple way to do this is to associate a string with physical objects each of which has the proximity relation to the string. However, this may cause many incorrect hypotheses or miss many correct hypotheses depending severely on the threshold of the proximity relation. Thus, we utilize some heuristics to improve the accuracy of hypothesis generation. Hypotheses of labels are generated as follows:

  • A string is hypothesized as the label of a loop which encloses the string.
  • (heuristic 1) If a loop has the relation of alignment to a string in addition to proximity, the string is hypothesized as the label of the loop.
  • (heuristic 2) If a line has the relation of parallel to a string in addition to the proximity, the string is hypothesized as the label of the line.
  • If a string satisfies neither of the above two heuristics, the string is hypothesized as the labels of loops and lines which have at least one of the proximity, contact and overlap relations with the string.

5.2 Hypothesis Verification

The following constraints are employed to verify hypotheses.

  C1 A physical object except implicit loops must be interpreted as at least one of a logical object, a logical relation and a label.
  C2 A logical object must have a logical relation.
  C3 A logical relation must have two logical objects to be related.
  C4 A label must have a logical object or a relation to be associated with.
  C5 An implicit loop except ones generated from division lines must have a label.
  C6 A dotted or broken line which represents the omission of logical objects must not have a label.

The procedure of hypothesis verification behaves like relaxation. Rejection of invalid hypotheses found by testing C2-C6 is repeated until no more hypotheses are rejected. The verification fails if C1 is violated by the rejection.

Let us consider a simple example shown in Fig. 5. Hypotheses for physical objects P1-P4 are listed in parentheses. The physical object P3 has two interpretations (a logical object and a label of P4), while other physical objects have only one interpretation. We can select the interpretation "P3 corresponds to a logical object", since C3 is violated if P3 is a label of P4.

[Figure 5: An example of hypothesis verification. Recoverable annotations: P1 (logical object), P2 (logical relation), P3 (logical object, label)]

5.3 Selection of plausible hypotheses

The constraints utilized in the verification are not strong enough to select the most plausible hypotheses. In particular, incorrect hypotheses of labels remain after the verification. In order to select the hypotheses of labels, we utilize the following rules: (1) If a string overlaps or contacts with a physical object, a hypothesis stating that the string is attached to the physical object is selected. (2) Otherwise, a hypothesis stating that a string is attached to the nearest (d_min is smallest) physical object is selected.

6 Experimental Results

Our method was applied to 50 samples of conceptual diagrams obtained from various technical papers and textbooks written in Japanese and English. In these samples, 471 logical objects, 517 logical relations and 491 labels were included.

Results were evaluated at each step of extraction of logical structure (i.e., hypothesis generation, verification and selection) using the following criteria:

  N: the average number of hypotheses for one correct logical entity (i.e., an object, a relation or a label),
  C: cover rate: the rate of the number of correct hypotheses for the number of correct logical entities.

Table 1: Experimental results

  Step            No. of hypotheses (N)    Cover rate (C)
  generation      1.53                     99.7%
  verification    1.16                     99.7%
  selection       1.00                     99.2%

Table 1 shows the experimental results. At the step of hypothesis generation, two correct logical objects could not be hypothesized since labels were too far apart from their physical objects. In addition, lines crossing perpendicularly as in Fig. 1 were misinterpreted as they were not connected. At the step of hypothesis verification, 69.7% of incorrect hypotheses were rejected, while all correct hypotheses were preserved. At the step of selection, nine correct hypotheses were erroneously rejected because: (1) an incorrect physical object was closer to a string which represented a label of another physical object, (2) although a single string represented labels of two physical objects, only one physical object was selected. We consider that these errors indicate the limitations of our method which interprets the physical structure. In order to recover these errors, it is necessary to introduce the analysis of linguistic meaning of strings instead of the selection step.

7 Conclusion

We have presented a method of understanding conceptual diagrams. To cope with the local ambiguities in interpretation of physical structure, we utilize the technique of hypothesis generation and verification. From the experimental results for 50 samples of conceptual diagrams, we have confirmed that our method is effective but has some limitations of interpretation. The remaining work is the interpretation of conceptual diagrams from their images, and incorporation of natural language processing to improve the accuracy.

References

[1] K. Tombre. Technical Drawing Recognition and Understanding: From Pixels to Semantics. Proc. of IAPR Workshop on MVA '92, pp.393-402, 1992.
[2] Y. Nakamura, R. Furukawa and M. Nagao. Diagram Understanding Utilizing Natural Language Text. Proc. of the 2nd Int'l Conf. on Document Analysis and Recognition, pp.614-618, 1993.
[3] R. Kasturi, S. T. Bow, W. El-Masri, J. Shah, J. R. Gattiker and U. B. Mokate. A System for Interpretation of Line Drawings. IEEE Trans. PAMI, Vol.12, No.10, pp.978-992, 1990.
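
As an illustration of the relaxation-like verification of Section 5.2, the following Python sketch repeatedly rejects hypotheses that lose their support and then checks C1. It is a simplified reading under stated assumptions, not the authors' implementation: only C2, C3 and C4 are modelled (C5, C6, omitted objects and relations between lines are left out), and the Hypothesis record, the function names and the toy data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Hypothesis:
    phys: str                    # physical object being interpreted
    kind: str                    # "object", "relation" or "label"
    refs: Tuple[str, ...] = ()   # relation: the two related physical objects
                                 # label: the single labelled physical object


def is_supported(h: Hypothesis, alive: Set[Hypothesis]) -> bool:
    """Test C2, C3 or C4 for one hypothesis against the surviving set."""
    if h.kind == "object":       # C2: some relation must refer to this object
        return any(o.kind == "relation" and h.phys in o.refs for o in alive)
    if h.kind == "relation":     # C3: both ends must still be logical objects
        objects = {o.phys for o in alive if o.kind == "object"}
        return len(h.refs) == 2 and all(r in objects for r in h.refs)
    if h.kind == "label":        # C4: the target must be an object or a relation
        targets = {o.phys for o in alive if o.kind in ("object", "relation")}
        return len(h.refs) == 1 and h.refs[0] in targets
    raise ValueError(f"unknown hypothesis kind: {h.kind}")


def verify(hyps: Iterable[Hypothesis], physical_objects: Iterable[str]):
    """Reject unsupported hypotheses until nothing changes, then check C1."""
    alive = set(hyps)
    while True:
        rejected = {h for h in alive if not is_supported(h, alive)}
        if not rejected:
            break
        alive -= rejected
    interpreted = {h.phys for h in alive}
    # C1: every physical object must keep at least one interpretation.
    ok = all(p in interpreted for p in physical_objects)
    return alive, ok


# Toy input (not Fig. 5): loops A and B joined by line L, strings S and T.
toy = [
    Hypothesis("A", "object"),
    Hypothesis("B", "object"),
    Hypothesis("L", "relation", ("A", "B")),
    Hypothesis("S", "label", ("L",)),
    Hypothesis("S", "object"),          # no relation touches S, so C2 fails
    Hypothesis("T", "label", ("S",)),   # loses its target once S-as-object goes
    Hypothesis("T", "label", ("A",)),
]
survivors, ok = verify(toy, ["A", "B", "L", "S", "T"])
```

In the toy input, the rejection of "S is a logical object" in the first pass removes the support for "T is a label of S" in the second pass, which is the kind of propagation the term relaxation suggests. Every physical object still keeps one interpretation, so the C1 check passes.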

						