MVA '94 IAPR Workshop on Machine Vision Applications, Dec. 13-15, 1994, Kawasaki

A METHOD OF UNDERSTANDING CONCEPTUAL DIAGRAMS *

Noriyoshi Yoneda, Koichi Kise, Shinobu Takamatsu, Kunio Fukunaga
Department of Computer and Systems Sciences
University of Osaka Prefecture
Email: (noriyosi | kise)@ss.cs.osakafu-u.ac.jp

* This work was supported in part by Grant-in-Aid for Scientific Research from the Ministry of Education, and the Telecommunications Advancement Foundation.

Abstract

A conceptual diagram is a line drawing which represents the semantic structure of concepts using simple geometric entities. This paper presents a method of understanding conceptual diagrams. The objective of our method is to interpret the semantic roles of geometric entities in conceptual diagrams. In conceptual diagrams, however, a single geometric entity plays various semantic roles for representing concepts, because there are no strict rules for writing conceptual diagrams. To cope with this problem, we introduce the strategy of hypothesis generation and verification; hypothesized interpretations are verified by relaxation which takes account of the semantic relation to other entities. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

1 Introduction

Understanding of line drawings is indispensable to realize document image understanding. A number of studies have been made on various types of line drawings (e.g., technical drawings, maps, flow-charts and circuit diagrams). In the interpretation of line drawings, most of the existing methods focus on extracting a precise description of geometric entities and their spatial relations (or physical structure). This is sufficient for understanding flow-charts and circuit diagrams because, by the rules of writing these diagrams, the physical structure clearly corresponds to what these diagrams semantically represent; once the physical structure is reconstructed, it is trivial to extract the information represented in these diagrams.

In recent years, however, the need to extract semantic entities and their relations (or logical structure) has been emphasized [1, 2]. It seems essential for some kinds of line drawings like conceptual diagrams. As shown in Fig. 1, a conceptual diagram is a line drawing which illustrates the structure of some concepts using simple geometric entities (loops, lines and character strings). Conceptual diagrams are similar to flow-charts in physical structure. However, the logical structure must be interpreted from the physical structure, because there are no rules which make the interpretation trivial.

Figure 1: An example of conceptual diagrams

In this paper, we propose a method of understanding conceptual diagrams. A major problem in understanding conceptual diagrams is that a single geometric entity plays various semantic roles depending on surrounding entities. For example, a line in Fig. 1 can be interpreted as a relation between concepts (solid lines) or as a division of concepts (dashed lines; see footnote 1); geometric entities cannot be unambiguously interpreted from local viewpoints. To cope with this problem, we introduce the strategy of hypothesis generation and verification. Hypothesized interpretations of physical structure are verified by relaxation which takes account of the global consistency of logical structure. From the experimental results using 50 conceptual diagrams, we discuss the effectiveness and the limitations of our method.

Footnote 1: Dashed lines are used just for explanation. If they were written in solid lines, they would still represent division of concepts.

2 Conceptual Diagrams

Let us start by considering the physical and logical structure of conceptual diagrams. The physical structure can be represented as physical relations between physical objects as follows:

physical objects: loops (rectangles, ovals, etc.), lines (solid, broken and dotted lines with or without arrowheads) and character strings (simply called strings hereafter).

physical relations: spatial relations between physical objects such as contact, overlap, proximity and alignment.

The logical structure is likewise represented as logical relations among logical objects.

logical objects: concepts represented in a diagram. Concepts often have their labels represented as strings.

logical relations: relations among concepts. Although there would be many kinds of relations among concepts, we focus here on the relations explicitly represented in a diagram. Labels are often attached to logical relations.

It can generally be said that a person who writes a line drawing determines its physical structure aiming at easy understanding of its logical structure. In other words, the physical structure of a line drawing is closely related to its logical structure. For flow-charts and circuit diagrams, this relation is strictly determined by standards. However, there are no standards or definite rules for conceptual diagrams; we only have some customary rules of writing conceptual diagrams.

The difficulty of understanding conceptual diagrams is attributable to this point. To be concrete, we face various local ambiguities in the interpretation of physical structure. Some of them are listed below:

• Concepts are often represented as loops. However, there exist concepts represented in different ways. For example, a string can solely correspond to a concept.
• Similarly, a compound concept is often represented as a loop which encloses some loops. However, aligned loops sometimes (but not always) indicate the existence of a compound concept including the concepts represented as the loops.

• Lines often correspond to logical relations among concepts. However, lines also represent the division and the grouping of concepts.

• Strings are often interpreted as labels of concepts/relations which are represented as loops/lines. However, it is not easy to find which loop/line a string is associated with; a string is not always the label of the loop/line closest to the string.

3 Overview of Processing

Our method of understanding conceptual diagrams is twofold: extraction of physical structure and extraction of logical structure.

The process of extraction of physical structure takes as input the data of line segments and strings, and generates the description of physical structure. In the input data, a line segment is represented as the coordinates of its two end points, the type of the line (solid, dotted or broken) and the type of each end (with or without an arrowhead). A string is represented as the coordinates of its bounding rectangle and the characters in it.

The description of physical structure is interpreted by the process of extraction of logical structure. To cope with the local ambiguities, we employ the strategy of hypothesis generation and verification. First, from local viewpoints, possible interpretations of the description are enumerated as hypotheses of concepts, relations and labels. Then, these hypotheses are verified to reject implausible interpretations.

Note that we do not deal with the linguistic meaning of concepts; we aim to extract the logical structure explicitly represented in a diagram. Thus logical objects and relations which have no labels are accepted as the output, and no further processing such as identification of the hidden meaning of logical objects or relations is considered.

4 Extraction of Physical Structure

This process consists of the extraction of physical objects, physical relations and implicit loops indicated by physical structure.

4.1 Physical objects

As described in Section 2, loops are important physical objects in conceptual diagrams. Thus we attempt to extract loops from line segments. By extracting all loops from line segments, we can also obtain lines from the rest of the line segments. Our procedure resembles the one described in [3], except that no explicit models of loops are utilized.

Ends of line segments are classified into terminals, paths and branches. A terminal end is an end which belongs to only one line segment. A path end is an end at which exactly two line segments contact. A branch end is an end at which more than two line segments contact. A chain is a sequence of line segments concatenated at all path ends on condition that: (1) a chain includes line segments of the same type (solid, dotted or broken), (2) a chain does not include an arrowhead in the middle.

In the first step, all chains are extracted from the input data of line segments. The next step is to find apparent loops and lines. A chain whose two end points coincide is identified as a loop and removed from the input data. If a chain has at least one terminal end, or has at least one arrowhead, it is identified as a line and removed. This step of processing is repeated until no more chains are removed. In the third step, we focus on chains connected at a branch end. If two of such chains form a straight segment at the branch, they are concatenated. Then loops are extracted again from chains. After all loops are extracted, the chains which remain in the input data are regarded as lines.

4.2 Physical relations

As the physical relations, we consider the relations shown in Fig. 2. In the following, when a physical object is a string, its bounding rectangle is considered.

Figure 2: Physical relations (enclosure, overlap, contact, proximity, alignment [top, v-center, bottom, left, h-center, right], parallel, next-to-end)

The relation enclosure is defined between a loop and a physical object. If a loop includes a physical object and every other loop that includes the physical object also includes the loop, it is said that the loop encloses the physical object, or the loop has the enclosure relation to the physical object. In Fig. 2, the physical object A is enclosed by the loop B, but not by the loop C.

Two physical objects overlap if one of the physical objects lies inside the region bounded by the other physical object. Contact is the relation between two physical objects if their boundaries share some points and they do not overlap.

For the relation proximity, we focus on the distance between physical objects A and B. The distance is defined as the minimum distance d_min (> 0) between points a and b which are on the boundaries of A and B, respectively. The points forming the minimum distance are described as a' and b'. If d_min is less than a certain threshold and no physical objects overlap with the segment between a' and b', A has the proximity relation to B. For the alignment relation, we utilize the six types shown in Fig. 2.

The relations parallel and next-to-end are somewhat special. The relation parallel is defined between a line and a string. If a line segment in a line is parallel with the longer side of the bounding rectangle of a string, they have the parallel relation. The relation next-to-end is a special case of proximity and contact. If (1) a line has the proximity relation to a physical object, and (2) the extension of the line from an end contacts the physical object, the line has the relation of next-to-end with the physical object. In addition, if an end of a line contacts a physical object, they also have the next-to-end relation.

4.3 Implicit loops

The role of this step is to identify a group of loops represented by grouping and division lines, and by alignment of the loops. Examples of grouping and division lines are illustrated in Fig. 3, where the parent loop indicates either a loop or the bounding rectangle of a diagram.

Figure 3: Grouping and division lines. (a) grouping line; (b) division line

As shown in Fig. 3(a), a grouping line is a line which satisfies the following conditions:

1. Both of the two ends of the line have arrowheads, or both of them have no arrowheads.
2. The shape of the line is straight, or like a brace.
3. The line must not overlap with the loops in the parent loop.
4. The grouping rectangle shown in Fig. 3(a) encloses some but not all loops in the parent loop, and none of the loops in the parent loop overlaps with the boundary of the grouping rectangle.

On the other hand, the conditions of a division line are as follows (see Fig. 3(b)):

1. The line does not have an arrowhead.
2. The line does not overlap with the loops in the parent loop. If the line does not contact the parent loop, we also consider the extension shown as the broken line in Fig. 3(b).
3. The line divides the loops in the parent loop into at least two groups. The extension of the line is considered in the same way as in condition 2. A bounding rectangle of a group of loops is also called a grouping rectangle.

In the case that these two types of lines or loops having the alignment relation are identified, a group of loops is extracted as an implicit loop, which is represented as a grouping rectangle. When an implicit loop is identified, physical relations about the implicit loop are also calculated. In the following, we use the term explicit loops to refer to loops other than implicit loops.

5 Extraction of Logical Structure

5.1 Hypothesis Generation

In this step, all possible interpretations of physical objects and relations are enumerated as hypotheses from local viewpoints. Hypotheses generated at this step are classified into three types: hypotheses of logical objects (concepts), logical relations (relations among concepts) and labels (names of concepts or relations).

Hypotheses of logical objects are generated from the following physical objects:

• loops (explicit and implicit),
• strings having the relation next-to-end to lines,
• dotted or broken straight lines. (These lines indicate the omission of logical objects.)

Hypotheses of logical relations are generated between physical objects as follows:

• a line having the relations of next-to-end to physical objects (a logical relation between the physical objects).
• the enclosure relation between loops (a logical relation "part-of" between the loops).

Figure 4: Logical relations

Note that a line which has the next-to-end relation at only one end is also accepted as a hypothesis of a logical relation, because a physical object is sometimes omitted as shown in Fig. 4(a). In such cases, we also generate a hypothesis of an omitted logical object. In addition, a line can be interpreted as a logical relation involving other lines. In Fig. 4(b), the line 1 is hypothesized as the logical relation between the lines 2 and 3. This enables us to interpret a set of lines as an n-ary relation.

A hypothesis of a label is generated for each string, together with the physical object with which the label is associated. A simple way to do this is to associate a string with each physical object that has the proximity relation to the string. However, this may cause many incorrect hypotheses or miss many correct hypotheses, depending severely on the threshold of the proximity relation.
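The threshold sensitivity of the simple proximity-based association can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that every physical object is approximated by an axis-aligned bounding rectangle (x0, y0, x1, y1), and it omits the condition that no other physical object overlaps the segment between a' and b'. The names rect_distance and naive_label_hypotheses and the sample coordinates are hypothetical.

```python
import math


def rect_distance(a, b):
    """Gap between two axis-aligned rectangles given as (x0, y0, x1, y1).

    Returns 0.0 when the rectangles touch or overlap.
    """
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return math.hypot(dx, dy)


def naive_label_hypotheses(string_rect, objects, threshold):
    """Names of objects whose distance to the string is positive and below the threshold."""
    return [
        name
        for name, rect in objects.items()
        if 0.0 < rect_distance(string_rect, rect) < threshold
    ]


# Hypothetical layout: one string and three candidate loops.
string_rect = (0, 0, 10, 4)
objects = {
    "loop1": (12, 0, 30, 10),   # 2 units away along x
    "loop2": (0, 9, 10, 20),    # 5 units away along y
    "loop3": (40, 40, 50, 50),  # far away
}

for threshold in (1, 3, 6):
    print(threshold, naive_label_hypotheses(string_rect, objects, threshold))
```

With these sample values, a threshold of 1 yields no hypothesis, 3 yields only loop1, and 6 yields both loop1 and loop2, so the set of label hypotheses changes entirely with the choice of threshold.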
Thus, we utilize some heuristics to improve the accuracy of hypothesis generation. Hypotheses of labels are generated as follows:

• A string is hypothesized as the label of a loop which encloses the string.
• (heuristic 1) If a loop has the relation of alignment to a string in addition to proximity, the string is hypothesized as the label of the loop.
• (heuristic 2) If a line has the relation of parallel to a string in addition to proximity, the string is hypothesized as the label of the line.
• If a string satisfies neither of the above two heuristics, the string is hypothesized as the label of each loop and line which has at least one of the proximity, contact and overlap relations with the string.

5.2 Hypothesis Verification

The following constraints are employed to verify hypotheses.

C1 A physical object except implicit loops must be interpreted as at least one of a logical object, a logical relation and a label.
C2 A logical object must have a logical relation.
C3 A logical relation must have two logical objects to be related.
C4 A label must have a logical object or a relation to be associated with.
C5 An implicit loop except ones generated from division lines must have a label.
C6 A dotted or broken line which represents the omission of logical objects must not have a label.

The procedure of hypothesis verification behaves like relaxation. Rejection of invalid hypotheses found by testing C2-C6 is repeated until no more hypotheses are rejected. The verification fails if C1 is violated by the rejection.

Figure 5: An example of hypothesis verification (hypotheses shown in the figure: P1 (logical object), P2 (logical relation), P3 (logical object, label))

Let us consider a simple example shown in Fig. 5. Hypotheses for physical objects P1-P4 are listed in parentheses. The physical object P3 has two interpretations (a logical object and a label of P4), while the other physical objects have only one interpretation. We can select the interpretation "P3 corresponds to a logical object", since C3 is violated if P3 is a label of P4.

5.3 Selection of plausible hypotheses

The constraints utilized in the verification are not strong enough to select the most plausible hypotheses. In particular, incorrect hypotheses of labels remain after the verification. In order to select the hypotheses of labels, we utilize the following rules: (1) If a string overlaps or contacts a physical object, the hypothesis stating that the string is attached to the physical object is selected. (2) Otherwise, the hypothesis stating that the string is attached to the nearest physical object (the one for which d_min is smallest) is selected.

6 Experimental Results

Our method was applied to 50 samples of conceptual diagrams obtained from various technical papers and textbooks written in Japanese and English. These samples included 471 logical objects, 517 logical relations and 491 labels.

Results were evaluated at each step of extraction of logical structure (i.e., hypothesis generation, verification and selection) using the following criteria:

N: the average number of hypotheses for one correct logical entity (i.e., an object, a relation or a label),
C: cover rate: the ratio of the number of correct hypotheses to the number of correct logical entities.

Table 1: Experimental results

Step           No. of hypotheses (N)   Cover rate (C)
generation     1.53                    99.7%
verification   1.16                    99.7%
selection      1.00                    99.2%

Table 1 shows the experimental results. At the step of hypothesis generation, two correct logical objects could not be hypothesized since labels were too far apart from their physical objects. In addition, lines crossing perpendicularly as in Fig. 1 were misinterpreted as they were not connected. At the step of hypothesis verification, 69.7% of incorrect hypotheses were rejected, while all correct hypotheses were preserved. At the step of selection, nine correct hypotheses were erroneously rejected because: (1) an incorrect physical object was closer to a string which represented a label of another physical object, (2) although a single string represented labels of two physical objects, only one physical object was selected. We consider that these errors indicate the limitations of our method, which interprets only the physical structure. To recover from these errors, it is necessary to introduce the analysis of the linguistic meaning of strings instead of the selection step.

7 Conclusion

We have presented a method of understanding conceptual diagrams. To cope with the local ambiguities in interpretation of physical structure, we utilize the technique of hypothesis generation and verification. From the experimental results for 50 samples of conceptual diagrams, we have confirmed that our method is effective but has some limitations of interpretation. The remaining work is the interpretation of conceptual diagrams from their images, and the incorporation of natural language processing to improve the accuracy.

References

[1] K. Tombre. Technical Drawing Recognition and Understanding: From Pixels to Semantics. Proc. of IAPR Workshop on MVA '92, pp.393-402, 1992.
[2] Y. Nakamura, R. Furukawa and M. Nagao. Diagram Understanding Utilizing Natural Language Text. Proc. of the 2nd Int'l Conf. on Document Analysis and Recognition, pp.614-618, 1993.
[3] R. Kasturi, S. T. Bow, W. El-Masri, J. Shah, J. R. Gattiker and U. B. Mokate. A System for Interpretation of Line Drawings. IEEE Trans. PAMI, Vol.12, No.10, pp.978-992, 1990.
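As an illustration of the relaxation-like verification of Section 5.2, the following is a minimal sketch and not the authors' implementation. It covers only simplified forms of constraints C1-C4 and ignores C5 and C6. The Hypothesis class, its fields, the functions is_supported and verify, and the example diagram (two loops A and B, a line N joining them, and a string S) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    kind: str            # "object", "relation" or "label"
    source: str          # physical object that this hypothesis interprets
    targets: tuple = ()  # relation: the two related objects; label: the labeled entity


def is_supported(h, alive):
    """Test simplified forms of C2-C4 against the hypotheses still alive."""
    objects = {x.source for x in alive if x.kind == "object"}
    if h.kind == "object":    # C2: some surviving relation must involve the object
        return any(x.kind == "relation" and h.source in x.targets for x in alive)
    if h.kind == "relation":  # C3: both related logical objects must survive
        return len(h.targets) == 2 and all(t in objects for t in h.targets)
    if h.kind == "label":     # C4: the labeled object or relation must survive
        return any(
            x.kind in ("object", "relation") and x.source == h.targets[0]
            for x in alive
        )
    return False


def verify(hypotheses, physical_objects):
    """Reject unsupported hypotheses until none is rejected, then check C1."""
    alive = set(hypotheses)
    while True:
        rejected = {h for h in alive if not is_supported(h, alive)}
        if not rejected:
            break
        alive -= rejected
    # C1: every physical object must keep at least one interpretation.
    interpreted = {h.source for h in alive}
    ok = all(p in interpreted for p in physical_objects)
    return alive, ok


# Hypothetical diagram: loops A and B joined by line N, with string S near N.
hypotheses = [
    Hypothesis("object", "A"),
    Hypothesis("object", "B"),
    Hypothesis("relation", "N", ("A", "B")),
    Hypothesis("label", "S", ("N",)),
    Hypothesis("object", "S"),  # S as a concept of its own; no relation involves it
]
alive, ok = verify(hypotheses, ["A", "B", "N", "S"])
for h in sorted(alive, key=lambda x: (x.source, x.kind)):
    print(h)
print("C1 satisfied:", ok)
```

In this example the hypothesis "S is a logical object" is rejected by the simplified C2, because no relation involves S, while "S is the label of N" survives, so C1 still holds for S. Rejecting a hypothesis can remove the support of another one, which is why the test is repeated until nothing more is rejected.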