MVA'94 IAPR Workshop on Machine Vision Applications Dec. 13-15, 1994, Kawasaki
A METHOD OF UNDERSTANDING CONCEPTUAL DIAGRAMS *
Noriyoshi Yoneda Koichi Kise Shinobu Takamatsu Kunio Fukunaga
Department of Computer and Systems Sciences
University of Osaka Prefecture
Email: {noriyosi | kise}@ss.cs.osakafu-u.ac.jp
Abstract
A conceptual diagram is a line drawing which represents semantic structure of concepts using simple geometric entities. This paper presents a method of understanding conceptual diagrams. The objective of our method is to interpret semantic roles of geometric entities in conceptual diagrams. In conceptual diagrams, however, a single geometric entity plays various semantic roles for representing concepts, because there are no strict rules for writing conceptual diagrams. To cope with this problem, we introduce the strategy of hypothesis generation and verification; hypothesized interpretations are verified by relaxation which takes account of the semantic relation to other entities. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

1 Introduction
Understanding of line drawings is indispensable to realize document image understanding. A number of studies have been made on various types of line drawings (e.g., technical drawings, maps, flow-charts and circuit diagrams). In the interpretation of line drawings, most of the existing methods focus on extracting precise description of geometric entities and their spatial relations (or physical structure). It would be sufficient for understanding flow-charts and circuit diagrams, because, by the rules of writing these diagrams, the physical structure clearly corresponds to what these diagrams semantically represent; once the physical structure is reconstructed, it is trivial to extract the information represented in these diagrams.
In recent years, however, the need to extract semantic entities and their relations (or logical structure) has been emphasized [1, 2]. It seems essential for some kinds of line drawings like conceptual diagrams. As shown in Fig. 1, a conceptual diagram is a line drawing which illustrates the structure of some concepts using simple geometric entities (loops, lines and character strings). Conceptual diagrams are similar to flow-charts in physical structure. However, the logical structure should be interpreted from the physical structure, because there are no rules which make the interpretation trivial.

Figure 1: An example of conceptual diagrams

In this paper, we propose a method of understanding conceptual diagrams. A major problem in understanding conceptual diagrams is that a single geometric entity plays various semantic roles depending on surrounding entities. For example, a line in Fig. 1 can be interpreted as a relation between concepts (solid lines), division of concepts (dashed lines¹); geometric entities cannot be unambiguously interpreted from local viewpoints. To cope with this problem, we introduce the strategy of hypothesis generation and verification. Hypothesized interpretations of physical structure are verified by relaxation which takes account of the global consistency of logical structure. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

* This work was supported in part by Grant-in-Aid for Scientific Research from the Ministry of Education, and the Telecommunications Advancement Foundation.
¹ Dashed lines are used just for explanation. If they were written in solid lines, they would still represent division of concepts.

2 Conceptual Diagrams
Let us start with considering the physical and logical structure of conceptual diagrams. The physical structure can be represented as physical relations between physical objects as follows:

physical objects: loops (rectangles, ovals, etc.), lines (solid, broken and dotted lines with or without arrowheads) and character strings (simply called strings hereafter).
physical relations: spatial relations between physical objects such as contact, overlap, proximity and alignment.

The logical structure is likewise represented as logical relations among logical objects.

logical objects: concepts represented in a diagram. Concepts often have their labels represented as strings.
logical relations: relations among concepts. Although there would be many kinds of relations among concepts, we focus here on the relations explicitly represented in a diagram. Labels are often attached to logical relations.

It can be generally said that a person who writes a line drawing determines its physical structure aiming at easy understanding of its logical structure. In other words, the physical structure of a line drawing is closely related to its logical structure. For flow-charts and circuit diagrams, such relation is strictly determined as standards. However, there are no standards or definite rules for conceptual diagrams; we only have some customary rules of writing conceptual diagrams.
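The physical and logical structure defined in this section can be pictured as a small data model. The following Python sketch is only an illustration under stated assumptions: every class, field and function name (ObjectKind, PhysicalRelation, Diagram, example_diagram and so on) is hypothetical and does not come from the method description, and the toy diagram it builds is invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ObjectKind(Enum):
    """Kinds of physical objects named in Section 2."""
    LOOP = "loop"
    LINE = "line"
    STRING = "string"


class RelationKind(Enum):
    """Physical relations named in Section 2 (not an exhaustive list)."""
    CONTACT = "contact"
    OVERLAP = "overlap"
    PROXIMITY = "proximity"
    ALIGNMENT = "alignment"


@dataclass(frozen=True)
class PhysicalObject:
    ident: str
    kind: ObjectKind
    text: Optional[str] = None  # characters, meaningful only for strings


@dataclass(frozen=True)
class PhysicalRelation:
    kind: RelationKind
    first: str   # ident of a physical object
    second: str  # ident of another physical object


@dataclass
class LogicalObject:
    ident: str
    label: Optional[str] = None  # concepts often, but not always, carry a label


@dataclass
class LogicalRelation:
    members: tuple               # idents of the related logical objects
    label: Optional[str] = None  # relations may carry a label as well


@dataclass
class Diagram:
    physical_objects: list = field(default_factory=list)
    physical_relations: list = field(default_factory=list)
    logical_objects: list = field(default_factory=list)
    logical_relations: list = field(default_factory=list)


def example_diagram() -> Diagram:
    """Build a hypothetical toy diagram: two labelled loops joined by a line."""
    d = Diagram()
    d.physical_objects += [
        PhysicalObject("loop1", ObjectKind.LOOP),
        PhysicalObject("loop2", ObjectKind.LOOP),
        PhysicalObject("line1", ObjectKind.LINE),
        PhysicalObject("str1", ObjectKind.STRING, text="sensor"),
        PhysicalObject("str2", ObjectKind.STRING, text="controller"),
    ]
    d.physical_relations += [
        PhysicalRelation(RelationKind.CONTACT, "line1", "loop1"),
        PhysicalRelation(RelationKind.CONTACT, "line1", "loop2"),
        PhysicalRelation(RelationKind.OVERLAP, "str1", "loop1"),
        PhysicalRelation(RelationKind.OVERLAP, "str2", "loop2"),
    ]
    # One possible logical reading of the physical structure above.
    d.logical_objects += [
        LogicalObject("c1", "sensor"),
        LogicalObject("c2", "controller"),
    ]
    d.logical_relations.append(LogicalRelation(("c1", "c2")))
    return d
```

The sketch keeps the two layers separate on purpose: the physical layer records only what is drawn, while the logical layer records one interpretation of it, so several competing interpretations of the same physical layer can coexist.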
The difficulty of understanding conceptual diagrams is attributable to this point. To be concrete, we face various local ambiguities in interpretation of physical structure. Some of them are listed below:

• Concepts are often represented as loops. However, there exist concepts represented in different ways. For example, a string can solely correspond to a concept. Similarly, a compound concept is often represented as a loop which encloses some loops. However, aligned loops sometimes (but not always) indicate the existence of a compound concept including concepts represented as the loops.
• Lines often correspond to logical relations among concepts. However, lines also represent the division and the grouping of concepts.
• Strings are often interpreted as labels of concepts/relations which are represented as loops/lines. However, it is not easy to find which loop/line a string is associated with; a string is not always the label of the loop/line closest to the string.

3 Overview of Processing
Our method of understanding conceptual diagrams is twofold: extraction of physical structure and extraction of logical structure.
The process of extraction of physical structure takes as input the data of line segments and strings, and generates the description of physical structure. In the input data, a line segment is represented as coordinates of two end points, a type of a line (solid, dotted or broken) and a type of each end (with or without an arrowhead). A string is represented as coordinates of its bounding rectangle and characters in it.
The description of physical structure is interpreted by the process of extraction of logical structure. To cope with the local ambiguities, we employ the strategy of hypothesis generation and verification. First, from local viewpoints, possible interpretations of the description are enumerated as hypotheses of concepts, relations and labels. Then, these hypotheses are verified to reject implausible interpretations.
Note that we do not deal with the linguistic meaning of concepts; we aim to extract the logical structure explicitly represented in a diagram. Thus logical objects and relations which have no labels are accepted as the output, and no further processing such as identification of the hidden meaning of logical objects or relations [2] is considered.

4 Extraction of Physical Structure
This process consists of the extraction of physical objects, physical relations and implicit loops indicated by physical structure.

4.1 Physical objects
As described in 2, loops are important physical objects in conceptual diagrams. Thus we attempt to extract loops from line segments. By extracting all loops from line segments, we can also obtain lines from the rest of line segments. Our procedure resembles the one described in [3], except that no explicit models of loops are utilized.
Ends of line segments are classified into terminals, paths and branches. A terminal end is the end which belongs to only one line segment. A path end is the end at which exactly two line segments contact. A branch end is the end at which more than two line segments contact. A chain is a sequence of line segments concatenated at all path ends on condition that: (1) a chain includes line segments of the same type (solid, dotted or broken), (2) a chain does not include an arrowhead in the middle.
In the first step, all chains are extracted from the input data of line segments. The next step is to find apparent loops and lines. A chain whose two end points coincide is identified as a loop and removed from the input data. If a chain has at least one terminal end, or has at least one arrowhead, it is identified as a line and removed. This step of processing is repeated until no more chains are removed. In the third step, we focus on chains connected at a branch end. If two of such chains form a straight segment at the branch, they are concatenated. Then loops are extracted again from chains. After all loops are extracted, the chains which remain in the input data are regarded as lines.

Figure 2: Physical relations (panels: enclosure, overlap, contact, proximity, alignment with the types top, v-center, bottom, left, h-center and right, parallel, next-to-end)

4.2 Physical relations
As the physical relations, we consider the relations shown in Fig. 2. In the following, the bounding rectangle of a string is considered, in the case that a physical object is a string.
The relation enclosure is defined between a loop and a physical object. If a loop includes a physical object and no other loop lies between the two (that is, no other loop includes the physical object while being included in the loop), it is said that the loop encloses the physical object, or the loop has the enclosure relation to the physical object. In Fig. 2, the physical object A is enclosed by the loop B, but not by the loop C. Two physical objects overlap if one of the physical objects lies inside the region bounded by the other physical object. Contact is the relation between two physical objects if their boundaries share some points and they do not overlap. For the relation proximity, we focus on the distance between physical objects A and B. The
distance is defined as the minimum distance d_min (> 0) between points a and b which are on the boundaries of A and B, respectively. The points forming the minimum distance are described as a' and b'. If d_min is less than a certain threshold and no physical objects overlap with the segment between a' and b', A has the proximity relation to B. For the alignment relation, we utilize six types shown in Fig. 2.
The relations parallel and next-to-end are somewhat special. The relation parallel is defined between a line and a string. If a line segment in a line is parallel with the longer side of the bounding rectangle of a string, they have the parallel relation. The relation next-to-end is the special case of the proximity and the contact. If (1) a line has the proximity relation to a physical object, and (2) the extension of the line from an end contacts with the physical object, the line has the relation of next-to-end with the physical object. In addition, if an end of a line contacts with a physical object, they also have the next-to-end relation.

4.3 Implicit loops
The role of this step is to identify a group of loops represented by grouping and division lines, and alignment of the loops. Examples of grouping and division lines are illustrated in Fig. 3, where the parent loop indicates either a loop or a bounding rectangle of a diagram.

Figure 3: Grouping and division lines ((a) grouping line, with its grouping rectangle; (b) division line, with its parent loop)

As shown in Fig. 3(a), a grouping line is the line which satisfies the following conditions:
1. Both of the two ends of the line have arrowheads, or both of them have no arrowheads.
2. The shape of the line is straight, or like a brace.
3. The line must not overlap with the loops in the parent loop.
4. The grouping rectangle shown in Fig. 3(a) encloses some but not all loops in the parent loop, and none of the loops in the parent loop overlaps with the boundary of the grouping rectangle.
On the other hand, the conditions of a division line are as follows (see Fig. 3(b)):
1. The line does not have an arrowhead.
2. The line does not overlap with the loops in the parent loop. If the line does not contact with the parent loop, we also consider the extension shown as the broken line in Fig. 3(b).
3. The line divides the loops in the parent loop into at least two groups. We consider the extension of the line similar to the above condition. A bounding rectangle of a group of loops is also called a grouping rectangle.
In the case that these two types of lines or the loops having the alignment relation are identified, a group of loops is extracted as an implicit loop which is represented as a grouping rectangle. When an implicit loop is identified, physical relations about the implicit loop are also calculated. In the following, we use the term explicit loops to refer to loops except implicit loops.

5 Extraction of Logical Structure
5.1 Hypothesis Generation
In this step, all possible interpretations of physical objects and relations are enumerated as hypotheses from local viewpoints. Hypotheses generated at this step are classified into three types: hypotheses of logical objects (concepts), logical relations (relations among concepts) and labels (names of concepts or relations).
Hypotheses of logical objects are generated from the following physical objects:
• loops (explicit and implicit),
• strings having the relation next-to-end to lines,
• dotted or broken straight lines. (These lines indicate the omission of logical objects.)
Hypotheses of logical relations are generated between physical objects as follows:
• a line having the relations of next-to-end to physical objects (a logical relation between the physical objects).
• the enclosure relation between loops (a logical relation "part-of" between the loops).
Note that a line which has the next-to-end relation at only one end is also accepted as a hypothesis of a logical relation, because a physical object is sometimes omitted as shown in Fig. 4(a). In such cases, we also generate a hypothesis of an omitted logical object. In addition, a line is interpreted as a logical relation with other lines. In Fig. 4(b), the line 1 is hypothesized as the logical relation between the lines 2 and 3. This enables us to interpret a set of lines as an n-ary relation.

Figure 4: Logical relations (figure label: omission)

Hypotheses of labels are generated for each string accompanied with a physical object with which the label is associated. A simple way to do this is to associate a string with physical objects each of which has the proximity relation to the string. However, this may cause many incorrect hypotheses or miss many correct hypotheses depending severely on the threshold of the proximity relation. Thus, we utilize some heuristics to improve the accuracy of hypothesis generation. Hypotheses of labels are generated as follows:
• a string is hypothesized as the label of a loop which encloses the string.
• (heuristic 1) If a loop has the relation of alignment to a string in addition to proximity, the string is hypothesized as the label of the loop.
• (heuristic 2) If a line has the relation of parallel to a string in addition to the proximity, the string is hypothesized as the label of the line.
• If a string does not satisfy both of the above two heuristics, the string is hypothesized as the labels of loops and lines which have at least one of the proximity, contact and overlap relations with the string.

5.2 Hypothesis Verification
The following constraints are employed to verify hypotheses.

C1 A physical object except implicit loops must be interpreted as at least one of a logical object, a logical relation and a label.
C2 A logical object must have a logical relation.
C3 A logical relation must have two logical objects to be related.
C4 A label must have a logical object or a relation to be associated with.
C5 An implicit loop except ones generated from division lines must have a label.
C6 A dotted or broken line which represents the omission of logical objects must not have a label.

The procedure of hypothesis verification behaves like relaxation. Rejection of invalid hypotheses found by testing C2-C6 is repeated until no more hypotheses are rejected. The verification fails if C1 is violated by the rejection.
Let us consider a simple example shown in Fig. 5. Hypotheses for physical objects P1-P4 are listed in parentheses. The physical object P3 has two interpretations (a logical object and a label of P4), while other physical objects have only one interpretation. We can select the interpretation "P3 corresponds to a logical object", since C3 is violated if P3 is a label of P4.

Figure 5: An example of hypothesis verification (hypotheses are shown in parentheses beside each physical object, e.g. P3 (logical object, label))

5.3 Selection of plausible hypotheses
The constraints utilized in the verification are not strong enough to select the most plausible hypotheses. In particular, incorrect hypotheses of labels remain after the verification. In order to select the hypotheses of labels, we utilize the following rules: (1) If a string overlaps or contacts with a physical object, a hypothesis stating that the string is attached to the physical object is selected. (2) Otherwise, a hypothesis stating that a string is attached to the nearest (d_min is smallest) physical object is selected.

6 Experimental Results
Our method was applied to 50 samples of conceptual diagrams obtained from various technical papers and textbooks written in Japanese and English. In these samples, 471 logical objects, 517 logical relations and 491 labels were included.
Results were evaluated at each step of extraction of logical structure (i.e., hypothesis generation, verification and selection) using the following criteria:

N: the average number of hypotheses for one correct logical entity (i.e., an object, a relation or a label),
C: cover rate: the rate of the number of correct hypotheses for the number of correct logical entities.

Table 1: Experimental results
step           No. of hypotheses (N)   Cover rate (C)
generation     1.53                    99.7%
verification   1.16                    99.7%
selection      1.00                    99.2%

Table 1 shows the experimental results. At the step of hypothesis generation, two correct logical objects could not be hypothesized since labels were too apart from their physical objects. In addition, lines crossing perpendicularly as in Fig. 1 were misinterpreted as they were not connected. At the step of hypothesis verification, 69.7% of incorrect hypotheses were rejected, while all correct hypotheses were preserved. At the step of selection, nine correct hypotheses were erroneously rejected because: (1) an incorrect physical object was closer to a string which represented a label of other physical object, (2) although a single string represented labels of two physical objects, only one physical object was selected. We consider that these errors indicate the limitations of our method which interprets the physical structure. In order to recover these errors, it is necessary to introduce the analysis of linguistic meaning of strings instead of the selection step.

7 Conclusion
We have presented a method of understanding conceptual diagrams. To cope with the local ambiguities in interpretation of physical structure, we utilize the technique of hypothesis generation and verification. From the experimental results for 50 samples of conceptual diagrams, we have confirmed that our method is effective but has some limitations of interpretation. The remaining work is the interpretation of conceptual diagrams from their images, and incorporation of natural language processing to improve the accuracy.

References
[1] K. Tombre. Technical Drawing Recognition and Understanding: From Pixels to Semantics. Proc. of IAPR Workshop on MVA '92, pp.393-402, 1992.
[2] Y. Nakamura, R. Furukawa and M. Nagao. Diagram Understanding Utilizing Natural Language Text. Proc. of the 2nd Int'l Conf. on Document Analysis and Recognition, pp.614-618, 1993.
[3] R. Kasturi, S. T. Bow, W. El-Masri, J. Shah, J. R. Gattiker and U. B. Mokate. A System for Interpretation of Line Drawings. IEEE Trans. PAMI, Vol.12, No.10, pp.978-992, 1990.
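As a supplementary illustration, the two selection rules of Section 5.3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the names LabelHypothesis, touches and select_label_hypotheses are hypothetical, the function handles the label hypotheses of one string at a time, and keeping every overlapping or contacting target under rule (1) is an assumption, since the text does not say how several such targets are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelHypothesis:
    """Hypothesis that a string is the label of one physical object."""
    string_id: str
    target_id: str
    touches: bool  # True if the string overlaps or contacts the target
    d_min: float   # minimum distance between the two boundaries


def select_label_hypotheses(hypotheses):
    """Apply rules (1) and (2) to the label hypotheses of a single string."""
    if not hypotheses:
        return []
    # Rule (1): an overlapping or contacting target wins outright.
    # Assumption: if several targets touch the string, all are kept.
    touching = [h for h in hypotheses if h.touches]
    if touching:
        return touching
    # Rule (2): otherwise keep the hypothesis with the smallest d_min.
    # On a tie, min() keeps the first hypothesis in input order.
    return [min(hypotheses, key=lambda h: h.d_min)]
```

With two non-touching candidates at distances 4.0 and 2.5, the sketch keeps only the nearer one; adding a third candidate that touches the string makes rule (1) override the distance comparison. Because rule (2) always returns a single target, a string that labels two physical objects loses one of its labels, which corresponds to the second failure case reported in Section 6.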