					Technical Report of the Development Process of the Machine of a Process (MOP) Ontology.


                                   Luis E. Ramos G. &
                                     Richard Gil H.

                                   December, 2011
                     Methodology Strategy Selection according to SMOL approach

   The methodology strategy selection (Phase I of SMOL) suggested a Middle-Out
   strategy. The criteria applied for the strategy selection are described as follows:

   On the one hand, the main Domain Attributes related to the complexity of the
   domain were assessed by the Expert Users through the technique suggested in SMOL and
   shown in Figure 1 below. The results are: Established domain (0), Conventional domain (0),
   Technology-Dependent domain (1) and Interdisciplinary domain (1). This
   evaluation corresponds to the optional state e13 (N = 2^2 - 1 = 3 → Middle-Out).

   On the other hand, applying the heuristic in Algorithm 1 according to the
   availability of Knowledge Sources (KSOs), we found that option b (Middle-Out) is the
   best one for this manufacturing case. In fact, we found at least two previously developed
   ontologies (upper and domain) and we built a representative corpus of texts retrieved from
   the Internet.

   Specifically, the Middle-Out strategy can be split into two parts. First, a Top-Down SMOL approach
   starts from the previous ontologies and uses Methodological Resources (MRs) for ontology matching
   such as FOAM and Protégé-Prompt. Then, a Bottom-Up SMOL approach starts from the texts and uses
   other MRs, such as an automatic agent that identifies the main keywords of the corpus (TF-IDF)
   associated with the domain, and the GATE tools (NLP) for validating the host ontology (Classes
   and Instances).
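
   The keyword identification step of the Bottom-Up part can be illustrated with a minimal
   TF-IDF sketch. It is not the agent used in this work: the three toy documents, the tokenizer
   and the function names below are assumptions made only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the real corpus of this report is not reproduced here.
corpus = {
    "doc_press": "the hydraulic press applies force with a hydraulic cylinder",
    "doc_lathe": "the lathe rotates the workpiece against a cutting tool",
    "doc_mill": "the milling machine moves a rotating cutting tool across the workpiece",
}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return {document: {term: score}} with score = tf * log(N / df)."""
    tokens = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for words in tokens.values():
        df.update(set(words))
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores[name] = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


def top_keywords(scores, k=3):
    """Pick the k best-scoring terms per document (ties broken alphabetically)."""
    return {
        name: sorted(terms, key=lambda t: (-terms[t], t))[:k]
        for name, terms in scores.items()
    }


scores = tf_idf(corpus)
keywords = top_keywords(scores)
for name, terms in keywords.items():
    print(name, terms)
```

   Terms that occur in every document (such as "the") receive a score of zero, so only terms
   that distinguish one document from the others are proposed as candidate keywords.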

                    Figure 1: Methodology Strategy Selection According to Domain Complexity Assessment

   Algorithm 1: Heuristic suggested for setting up the domain attributes (KSOs as texts are possible)

IF (Domain-Ontology or Upper is/are found) ≥ 1 THEN
    a. IF (RDB Public or RDB private is/are found ≥ 1) AND (At least 3 domain attributes are 0’s) THEN Top-Down
    b. IF (RDB-scheme or Domain-Ontology is found = 1) AND (At least 2 domain attributes are 0’s) THEN Middle-Out
    c. IF (RDB not found                          = 0) AND (At least 1 domain attribute is 0)     THEN Bottom-Up
ELSE (% Not any Ontology)
    d. IF (RDB-scheme or RDB is/are found         = 1) AND (At least 2 domain attributes are 0’s) THEN Middle-Out
    e. IF (RDB not found                          = 0) AND (At least 1 domain attribute is 0)     THEN Bottom-Up
IF (RDB Public or RDB private is/are found) ≥ 1 THEN
    f. IF (Domain-Ontology or Upper is/are found  ≥ 1) AND (At least 3 domain attributes are 0’s) THEN Top-Down
    g. IF (Upper-Ontology (not domain) is found   = 1) AND (At least 2 domain attributes are 0’s) THEN Middle-Out
    h. IF (Ontology is NOT found                  = 0) AND (At least 2 domain attributes are 0’s) THEN Bottom-Up
ELSE (% Not any RDB)
    i. IF (Domain-Ontology or Upper is/are found  ≥ 1) AND (At least 2 domain attributes are 0’s) THEN Middle-Out
    j. IF (Upper-Ontology (not domain) is found   ≤ 1) AND (At least 1 domain attribute is 0)     THEN Bottom-Up
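
 The ontology-driven half of Algorithm 1 (rules a to e) can be sketched in code as follows. This
 is one possible reading, not the authors' implementation: it assumes that the rules are tried in
 the listed order, that each rule combines its two conditions with a logical AND, and that no
 strategy is returned when no rule fires. The function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def select_strategy(
    has_upper_ontology: bool,
    has_domain_ontology: bool,
    has_rdb: bool,
    has_rdb_schema: bool,
    domain_attributes: Sequence[int],
) -> Optional[str]:
    """Apply rules a-e of Algorithm 1 in order (assumed reading)."""
    # Number of domain attributes assessed as 0 by the expert users.
    zeros = sum(1 for value in domain_attributes if value == 0)

    if has_upper_ontology or has_domain_ontology:
        if has_rdb and zeros >= 3:                                   # rule a
            return "Top-Down"
        if (has_rdb_schema or has_domain_ontology) and zeros >= 2:   # rule b
            return "Middle-Out"
        if not has_rdb and zeros >= 1:                               # rule c
            return "Bottom-Up"
    else:
        if (has_rdb_schema or has_rdb) and zeros >= 2:               # rule d
            return "Middle-Out"
        if not has_rdb and zeros >= 1:                               # rule e
            return "Bottom-Up"
    return None  # assumption: the heuristic gives no answer if no rule fires


# Manufacturing case: upper and domain ontologies found, no database,
# attributes = (Established, Conventional, Technology-Dependent, Interdisciplinary).
manufacturing_case = select_strategy(True, True, False, False, (0, 0, 1, 1))
print(manufacturing_case)
```

 With the inputs of the manufacturing case, rule b fires, which agrees with the Middle-Out
 strategy selected in this report.
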
 Regarding the strategy selection for the manufacturing case described above (Phase I of SMOL),
 Figure 2 represents the integrated methodological process for Ontology Learning/Validation from
 diverse but complementary Knowledge Sources (KSOs) that was followed in this work to enrich and
 populate the MOP-ontology’s objects (classes, relations, instances and rules).

 [Figure 2 diagram. Phases: I- Methodology strategy selection (1) Top-Down, 2) Bottom-Up);
  II- Knowledge Discovery (1) Protégé, 2) Scholar-Google); III- Query requirements
  ((Non-)taxonomic relations); IV- Knowledge selection (1) Upper & Domain Ontologies,
  2) A Developed Corpus); V- Knowledge structure construction (1) Protégé-PROMPT,
  2) GATE-Gazetteers); VI- Knowledge exploring and searching; VII- Knowledge structure
  reorganization (Protégé).
  Knowledge Sources: 1) Some Published Ontologies (MASON, MTM & MO), 2) A Corpus (620 Text-samples).
  Other labels: Expert Users, User’s Profiles, Versioning, Updated & Knowledge structures.
  Flows: a) Selected strategy, b) Satisfied, c) Knowledge structure updating,
  d) Query and requirement reformulation, e) Parameter configuration set-up.]

           Figure 2: SMOL is applied to the Manufacturing case from two Knowledge Sources (Ontologies and Texts)

 Some partial results obtained by following this methodological proposal (based on SMOL) are shown in
 Table 1.
               Table 1: Manufacturing case study summary: some evidence of SMOL application from two KSOs

  Knowledge Source     Structured Knowledge   SMOL tools applied   Enriched & Populated      Data Pre-Processing
                                                                   MOP-Ontology object
  Ontologies           - Ont. Enrichment      - Swoogle            +Classes: (Example)       - WordNet/Synset
  (MASON, MTM & MO)    - Ont. Comparison      - Protégé-Prompt     *Classes: (Examples)      - Attribute meaning
                       - Ont. Validation      - Racer-Pro          +Instances: (Examples)      analysis
  Documents            - Ont. Population      - Rapid-I            +Classes: (Examples)      - Google-Scholar
  (620 texts of        - Knowledge agent      - GATE (Ont.)        *Classes: (Examples)      - WordNet/Synset
  journals)            - Ont. Validation      - Racer-Pro          +Instances: (Examples)    - GATE-Gazetteers
  Databases
  (not applied yet)

  Ont. = Ontology; Ontology object: + Added, * Reviewed, # Changed

 The previous MOP-ontology taxonomy representation and the new one after the SMOL application are
 shown in Figure 2 and Figure 3, respectively.
                      Figure 3: MOP-ontology representation after the SMOL application

                                   Ontology Learning: Tools Applied
   GATE is an open-source solution for full-lifecycle text analytics. It supports
    manual annotation, performance evaluation, information extraction, [semi-]automatic semantic
    annotation, and many other tasks [1].
   Protégé is a free, open-source platform that provides a growing user community with a suite of
    tools to construct domain models and knowledge-based applications with ontologies. At its core,
    Protégé implements a rich set of knowledge-modelling structures and actions that support the
    creation, visualization, and manipulation of ontologies in various representation formats [2].
   Protégé-PROMPT is a Protégé plug-in. It allows users to manage multiple ontologies in
    Protégé, mainly to: a) compare versions of the same ontology, b) map one ontology to another,
    c) move frames between included and including projects, d) merge two ontologies into one, and
    e) extract a part of an ontology. Furthermore, Prompt has a CogZ plug-in, a complementary
    graphical user interface (GUI) that simplifies the user's visual, interactive integration
    process [3].
   Rapid-I (RapidMiner) is an open-source system for knowledge discovery and data mining. It is
    available as a stand-alone application for data analysis and as a data mining engine that can
    be integrated into other products [4].
   Swoogle is a user agent for searching Semantic Web Documents, especially Semantic Web
    Ontologies. Two relevant query types are available: a) ontology: searches a small collection
    that consists only of Semantic Web Ontologies, i.e. semantic web documents that have at
    least one class or property defined, and b) document: searches all Semantic Web Documents [5].
   WVTOOL is a Rapid-I plug-in that supports word vector processing [6].
                                  Initial Search and Results

Replicating a typical Industrial Engineering activity, we searched for industrial machinery
for a hypothetical industrial facility. To take previous ontology development related to
manufacturing into account, we used the concepts presented in the machine-tools module of
MASON [7].
   The search started in a general search engine, which returned two kinds of results. First,
we obtained the names of some companies specialized in specific machines. Second, we obtained
the names of specialized search engines. The search was therefore performed in both general
and specialized search engines. The documents returned by each search were counted together
to estimate the effort involved in selecting a given machine. These results are outlined in
Table 2. On average, an engineer or a team of engineers would need to evaluate 338 different
machines to select one model that fits their needs. Depending on the machine, the evaluation
would cover a minimum of 106 and a maximum of 940 models.

                     Table 2: Results Obtained When Searching for Machines on the Internet

                         Name of Machine         Number of results
                        Hydraulic press                  106
                        Punching press                   114
                        Rolling mill                     200
                        Injection machine                940
                        Manual molding                   139
                        Lathe                            298
                        Milling machine                  405
                        Plasma cutter                    482
                        Shot blasting                    480
                        Drop Hammer                      218
                                Average                 338.2
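
The summary figures quoted for Table 2 can be re-derived with a few lines of code. The sketch
below only re-computes the average, minimum and maximum from the values printed in the table.
The variable names are illustrative.

```python
from statistics import mean

# Number of search results per machine, copied from Table 2.
search_results = {
    "Hydraulic press": 106,
    "Punching press": 114,
    "Rolling mill": 200,
    "Injection machine": 940,
    "Manual molding": 139,
    "Lathe": 298,
    "Milling machine": 405,
    "Plasma cutter": 482,
    "Shot blasting": 480,
    "Drop Hammer": 218,
}

average = mean(search_results.values())
# Machine with the fewest and the most candidate models to evaluate.
fewest = min(search_results, key=search_results.get)
most = max(search_results, key=search_results.get)

print(f"average: {average:.1f}")
print(f"minimum: {fewest} ({search_results[fewest]})")
print(f"maximum: {most} ({search_results[most]})")
```
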
                                      Corpus Conformation

    The concept distribution of MASON was used as the seed for generating the corpus of
    documents. Fig. 4(a) presents the four main concepts in the machine module of MASON, namely
    Shearing, Heat treatment, Founding and Forming. Each of these concepts has a
    subcategory containing a set of other concepts that are kinds of machines. The percentage
    distribution per concept in the subcategory is also shown in Fig. 4(a). The documents for
    this corpus were collected according to that distribution. Additionally, other sources
    were considered, for instance documents from ASME and OSHA, ontologies and
    research papers related to this topic. Fig. 4(b) outlines the resulting distribution in the
    corpus, and Table 3 gives a detailed count of documents.

                                 Fig. 4. Distribution and Proportions

            Table 3. Distribution of Documents in Corpus

  Category             Subcategory                   No. of documents
  Research papers                                            3
  Ontologies                                                 3
  Forming              Hydraulic Press                      25
                       Cam Press                             7
                       Friction Hammer                      29
                       Knuckle Press                        22
                       Punching Press                       38
                       Rolling Mill                         34
                       Friction Screw Press                 40
                       Wedge Press                           4
                       Folding Machine                      30
  Founding             Injection                            45
                       Manual Moulding                      10
                       Moulding Machine                     13
  Heat treatment       Electroplating                       19
                       Oven                                 40
                       Plating                              19
                       Shot blasting                        18
  Shearing             Grinding                             21
                       Lathe                                30
                       Machine centre four axis             30
                       Machine centre 5 axis                24
                       Milling machine                      41
                       Milling machine 4 axis               15
                       Plasma trimming machine              40
                       Water trimming machine               27
  ASME standards                                             2
  OSHA regulations                                          10
  Total                                                    633
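
The per-category totals and shares behind Table 3 can be re-computed with the sketch below,
which uses only the row values printed in the table. The variable names are illustrative. Note
that the rows as printed add up to 639, while the table states a total of 633. The sketch
reports both figures and does not attempt to decide which one is correct.

```python
# Document counts copied from Table 3; an empty subcategory name marks
# categories that the table lists without subcategories.
corpus_counts = {
    "Research papers": {"": 3},
    "Ontologies": {"": 3},
    "Forming": {
        "Hydraulic Press": 25, "Cam Press": 7, "Friction Hammer": 29,
        "Knuckle Press": 22, "Punching Press": 38, "Rolling Mill": 34,
        "Friction Screw Press": 40, "Wedge Press": 4, "Folding Machine": 30,
    },
    "Founding": {"Injection": 45, "Manual Moulding": 10, "Moulding Machine": 13},
    "Heat treatment": {
        "Electroplating": 19, "Oven": 40, "Plating": 19, "Shot blasting": 18,
    },
    "Shearing": {
        "Grinding": 21, "Lathe": 30, "Machine centre four axis": 30,
        "Machine centre 5 axis": 24, "Milling machine": 41,
        "Milling machine 4 axis": 15, "Plasma trimming machine": 40,
        "Water trimming machine": 27,
    },
    "ASME standards": {"": 2},
    "OSHA regulations": {"": 10},
}

STATED_TOTAL = 633  # total printed in Table 3

category_totals = {
    category: sum(rows.values()) for category, rows in corpus_counts.items()
}
row_sum = sum(category_totals.values())
# Share of each category relative to the sum of the printed rows.
shares = {category: total / row_sum for category, total in category_totals.items()}

for category, total in category_totals.items():
    print(f"{category:<18}{total:>5}{shares[category]:>8.1%}")
print(f"sum of rows: {row_sum}, stated total: {STATED_TOTAL}")
```
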
                                       Matching Ontologies

MASON, MO and MTM were compared with one another by means of Protégé-Prompt. The results
are presented in Table 4. MASON and MTM had five mappings, of which three were false
positives: Prompt highlighted them as mappings, but the domain experts determined that they
were not valid. The other two were true positives, meaning that the domain experts accepted
both Prompt suggestions as valid mappings. When mapping MASON and MO, only one mapping was
found, which Table 4 records as a false positive. Finally, MTM and MO were mapped and one
true positive mapping was found.

                                    Table 4 Ontologies Mapping

   Ontologies                            Matching                                 Validation
 Source              Target              Arg 1                Arg 2
 MASON               MTM                 System               System              TP
                                         Cone                 Machine             FP
                                         Hole                 Spindle             FP
                                         Site                 Spindle             FP
                                         Machine-tool         Machine-tool        TP
 MASON               MO                  Cone                 Machine             FP
 MTM                 MO                  Machine              Machine             TP
                                                                                  TP: True Positive
                                                                                  FP: False Positive
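
The validation results in Table 4 can be summarized as the share of Prompt suggestions that the
domain experts accepted for each ontology pair. The sketch below uses the rows and TP/FP labels
exactly as printed in Table 4. Calling this share "precision" and the variable names are choices
made here for illustration, not terms used in the original evaluation.

```python
from collections import defaultdict

# (source, target, argument 1, argument 2, validation) rows copied from Table 4.
mappings = [
    ("MASON", "MTM", "System", "System", "TP"),
    ("MASON", "MTM", "Cone", "Machine", "FP"),
    ("MASON", "MTM", "Hole", "Spindle", "FP"),
    ("MASON", "MTM", "Site", "Spindle", "FP"),
    ("MASON", "MTM", "Machine-tool", "Machine-tool", "TP"),
    ("MASON", "MO", "Cone", "Machine", "FP"),
    ("MTM", "MO", "Machine", "Machine", "TP"),
]

# Count accepted (TP) and rejected (FP) suggestions per ontology pair.
counts = defaultdict(lambda: {"TP": 0, "FP": 0})
for source, target, _arg1, _arg2, validation in mappings:
    counts[(source, target)][validation] += 1

# Precision = accepted suggestions / all suggestions for that pair.
precision = {
    pair: c["TP"] / (c["TP"] + c["FP"]) for pair, c in counts.items()
}
overall = sum(c["TP"] for c in counts.values()) / len(mappings)

for (source, target), value in precision.items():
    print(f"{source} -> {target}: {value:.2f}")
print(f"overall: {overall:.2f}")
```
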

[1] General Architecture for Text Engineering. The University of Sheffield, 2011.
[2] N. F. Noy, M. Sintek, S. Decker, M. Crubezy, R. W. Fergerson, and M. A. Musen, "Creating
    Semantic Web contents with Protege-2000", Intelligent Systems, IEEE, vol. 16, no. 2, pp. 60-71,
    2001.
[3] N. F. Noy and M. A. Musen, "The PROMPT suite: interactive tools for ontology merging and
    mapping", International Journal of Human-Computer Studies, vol. 59, no. 6, pp. 983-1024,
    Dec. 2003.
[4] F. Jungermann, "Information Extraction with RapidMiner", in Proceedings of the RapidMiner
    Community Meeting And Conference, 2010, pp. 67-72.
[5] "Swoogle". [Online]. Available:
[6] M. Wurst, Word Vector Tool.
[7] S. Lemaignan, A. Siadat, J.-Y. Dantan, and A. Semenenko, "MASON: A Proposal For An
    Ontology Of Manufacturing Domain", in Distributed Intelligent Systems: Collective
    Intelligence and Its Applications, 2006. DIS 2006. IEEE Workshop on, 2006, pp. 195-200.
