Technical Report of the Development Process of the Machine of a Process (MOP) Ontology.
Luis E. Ramos G. &
Richard Gil H.
Methodology Strategy Selection according to SMOL approach
The methodology strategy selection (Phase I of SMOL) suggested a Middle-Out strategy. The criteria
applied in this selection are described as follows:
On the one hand, the main Domain Attributes related to the complexity of the domain were assessed
by the Expert Users through the technique suggested in SMOL and shown in Figure 1 below. The
results are: Established domain (0), Conventional domain (0), Technological Dependent domain (1)
and Interdisciplinary domain (1). This evaluation corresponds to the optional state e13
(N = 2^2 - 1 = 3, Middle-Out).
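The state index quoted above can be reproduced with a short sketch. It assumes, as an interpretation that the report does not spell out, that the four attribute scores are read as a binary number in the order listed, with the Established domain as the most significant bit. The function and variable names are illustrative only.

```python
# Domain-attribute scores from the expert assessment, in the order listed in the text.
scores = {
    "Established": 0,
    "Conventional": 0,
    "Technological Dependent": 1,
    "Interdisciplinary": 1,
}


def state_index(bits):
    """Read a sequence of 0/1 scores as a binary number (first score = most significant bit)."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


# Assumed encoding: (0, 0, 1, 1) read as binary 0011.
n = state_index(scores.values())
print(n)  # 3, which equals 2**2 - 1
```

Under this reading, the pattern (0, 0, 1, 1) gives N = 3, the value the text associates with the Middle-Out strategy.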
On the other hand, applying the heuristic of Algorithm 1 to the available Knowledge Sources
(KSOs), we find that option b (Middle-Out) is the best one for this manufacturing case. In fact,
we found at least two previously developed ontologies (upper and domain), and we built a
representative corpus of texts retrieved from the Internet.
Specifically, the Middle-Out strategy can be split into two stages. First, a Top-Down SMOL
approach starts from the previous ontologies and uses Methodological Resources (MRs) for ontology
matching such as FOAM and Protégé-Prompt. Then, a Bottom-Up SMOL approach starts from the texts
and uses other MRs, such as an automatic agent that identifies the main domain keywords in the
corpus (TF-IDF) and the GATE tools (NLP) for validating the host ontology (Classes and Instances).
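The keyword identification step relies on TF-IDF weighting. The following is a minimal sketch of that weighting in plain Python. It is not the agent used in this work: the toy corpus, the whitespace tokenization and the function names are all assumptions made for illustration, and the weighting is the basic variant (relative term frequency multiplied by the logarithm of the inverse document frequency).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=2):
    """Return the k highest-scoring terms of each document (ties broken alphabetically)."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tf_idf(docs)
    ]


# Hypothetical toy corpus; the real corpus is the collection of texts described in the report.
corpus = [
    "hydraulic press machine forming press".split(),
    "milling machine shearing".split(),
    "plasma cutter machine shearing".split(),
]
keywords = top_keywords(corpus)
print(keywords)
```

A term that occurs in every document, such as "machine" in the toy corpus, receives a score of zero, so the terms that distinguish one document from the others rise to the top.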
Figure 1: Methodology Strategy Selection According to the Domain Complexity Assessment
Algorithm 1: Heuristic suggested for setting up the domain attributes (KSO as texts is possible)
IF (Domain-Ontology or Upper is/are found) ≥ 1 THEN
    a. IF (RDB Public or RDB private is/are found ≥ 1) (At least 3 domain attributes are 0’s) THEN Top-Down
    b. IF (RDB-scheme or Domain-Ontology is found = 1) (At least 2 domain attributes are 0’s) THEN Middle-Out
    c. IF (RDB not found = 0) (At least 1 domain attribute is 0) THEN Bottom-Up
ELSE (% Not any Ontology)
    d. IF (RDB-scheme or RDB is/are found = 1) (At least 2 domain attributes are 0’s) THEN Middle-Out
    e. IF (RDB not found = 0) (At least 1 domain attribute is 0) THEN Bottom-Up
IF (RDB Public or RDB private is/are found) ≥ 1 THEN
    f. IF (Domain-Ontology or Upper is/are found ≥ 1) (At least 3 domain attributes are 0’s) THEN Top-Down
    g. IF (Upper-Ontology (not domain) is found = 1) (At least 2 domain attributes are 0’s) THEN Middle-Out
    h. IF (Ontology is NOT found = 0) (At least 2 domain attributes are 0’s) THEN Bottom-Up
ELSE (% Not any RDB)
    i. IF (Domain-Ontology or Upper is/are found ≥ 1) (At least 2 domain attributes are 0’s) THEN Middle-Out
    j. IF (Upper-Ontology (not domain) is found ≤ 1) (At least 1 domain attribute is 0) THEN Bottom-Up
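As an illustration, options a to c of Algorithm 1 (the branch taken when at least one ontology is found) could be coded as shown below. This is only one possible reading of the heuristic and rests on several assumptions: the two parenthesised conditions of each option are joined with a logical AND, the options are tried in order with the first match winning, "found" conditions are treated as booleans, and RDB is taken to mean a relational database. The function name and parameters are hypothetical.

```python
def select_strategy(domain_ontology, upper_ontology, rdb, rdb_scheme, zero_attributes):
    """One reading of options a-c of Algorithm 1 (at least one ontology found).

    Assumptions: the two parenthesised conditions of each option are joined
    with AND, options are tried in order, and the first match wins.
    Returns None when no ontology is found (options d-e are not sketched here).
    """
    if not (domain_ontology or upper_ontology):
        return None
    if rdb and zero_attributes >= 3:
        return "Top-Down"      # option a
    if (rdb_scheme or domain_ontology) and zero_attributes >= 2:
        return "Middle-Out"    # option b
    if not rdb and zero_attributes >= 1:
        return "Bottom-Up"     # option c
    return None                # no option applies under this reading


# Manufacturing case: upper and domain ontologies were found and two attributes
# scored 0 (Established, Conventional). No RDB is mentioned for this case,
# so rdb and rdb_scheme are assumed to be False.
strategy = select_strategy(
    domain_ontology=True, upper_ontology=True,
    rdb=False, rdb_scheme=False, zero_attributes=2,
)
print(strategy)  # Middle-Out
```

With these inputs the sketch selects option b, which agrees with the Middle-Out choice reported for the manufacturing case.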
Given the strategy selected for this manufacturing case (Phase I of SMOL, described above),
Figure 2 shows the integrated methodological process for Ontology Learning/Validation from diverse
but complementary Knowledge Sources (KSOs) that was followed in this work to enrich and populate
the objects of the MOP-ontology (classes, relations, instances and rules).
[Figure 2 (diagram). Phases: I- Methodology strategy selection; II- Knowledge Discovery;
III- Query User’s requirements; IV- Knowledge selection; V- Knowledge structure construction;
VI- and VII- Knowledge phases (surviving labels: (Non-)taxonomic relations structure, Versioning,
Profiles). Inputs: Expert Users; Knowledge Sources: 1) some published ontologies (MASON, MTM & MO),
2) a corpus (620 text samples). Tools named: Protégé, Scholar-Google searching, Protégé-PROMPT,
GATE-Gazetteers. Feedback links a to e include the selected strategy, knowledge structure updating,
query and requirement reformulation, and parameter configuration set-up.]
Figure 2: SMOL is applied to the Manufacturing case from two Knowledge Sources (Ontologies and Texts)
Some partial results obtained by following this methodological proposal (based on SMOL) are shown
in Table 1.
Table 1: Manufacturing case study summary: evidence of the SMOL application from two KSOs
Knowledge Source     Structured Knowledge   SMOL tools applied   Enriched & Populated      Data Pre-Processing
                                                                 MOP-Ontology object
Ontologies           - Ont. Enrichment      - Swoogle            +Classes: (Example)       - WordNet/Synset
(MASON, MTM & MO)    - Ont. Comparison      - Protégé-Prompt     *Classes: (Examples)      - Attribute meaning analysis
                     - Ont. Validation      - Racer-Pro          +Instances: (Examples)
Documents            - Ont. Population      - Rapid-I            +Classes: (Examples)      - Google-Scholar
(620 texts of        - Knowledge agent      - GATE (Ont.)        *Classes: (Examples)      - WordNet/Synset
journals)            - Ont. Validation      - Racer-Pro          +Instances: (Examples)    - GATE-Gazetteers
Ont. = Ontology; Ontology object: + Added, * Reviewed, # Changed
The previous MOP-ontology taxonomy representation and the new one obtained after the SMOL
application are shown in Figure 2 and Figure 3, respectively.
Figure 3: MOP-ontology representation after the SMOL application
Ontology Learning: Tools Applied
GATE is an open-source solution for full-lifecycle text analytics. It provides support for
manual annotation, performance evaluation, information extraction, [semi-]automatic semantic
annotation, and many other tasks [1].
Protégé is a free, open-source platform that provides a growing user community with a suite of
tools to construct domain models and knowledge-based applications with ontologies. At its core,
Protégé implements a rich set of knowledge-modelling structures and actions that support the
creation, visualization, and manipulation of ontologies in various representation formats [2].
Protégé-PROMPT is a Protégé plug-in. It allows users to manage multiple ontologies in
Protégé, mainly to: a) compare versions of the same ontology, b) map one ontology to another, c)
move frames between included and including projects, d) merge two ontologies into one, and e)
extract a part of an ontology. Furthermore, Prompt has a CogZ plug-in, a complementary
graphical user interface (GUI) that simplifies the user's visual, interactive integration process [3].
Rapid-I is a data analysis solution based on RapidMiner, a leading open-source system for
knowledge discovery and data mining. It is available as a stand-alone application for data
analysis and as a data mining engine that can be integrated into the user's own products [4].
Swoogle is a user agent for searching Semantic Web Documents, especially Semantic Web
Ontologies. Two relevant query types are available: a) ontology, which searches a small
collection consisting only of Semantic Web Ontologies, i.e. Semantic Web Documents that define at
least one class or property; and b) document, which searches all Semantic Web Documents [5].
WVTOOL is a Rapid-I plug-in that supports word vector processing [6].
Initial Search and Results
Replicating a typical Industrial Engineering activity, we searched for industrial machinery
for a hypothetical industrial facility. To take into account previously developed
manufacturing ontologies, we used the concepts presented in the machine-tools module of
MASON [7].
The search started at google.com. This general search engine gave us two kinds of results.
First, we obtained the names of some companies specialized in specific machines. Second, we
obtained the names of specialized search engines, namely www.industrialmachines.net,
www.wotol.com, www.alibaba.com, www.machinesindia.com and www.allmachinery.com. The search
was therefore performed in both general and specialized search engines. The documents
returned by the searches were added together to estimate the effort involved in selecting a
given machine. These results are outlined in Table 2. On average, an engineer or a team of
engineers would need to evaluate 338 different machines to select one model that fits their
needs, that is, between a minimum of 106 and a maximum of 940 machine models.
Table 2: Results Obtained When Searching for Machines on the Internet
Name of Machine Number of results
Hydraulic press 106
Punching press 114
Rolling mill 200
Injection machine 940
Manual molding 139
Milling machine 405
Plasma cutter 482
Shot blasting 480
Drop Hammer 218
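The range and average discussed in the text can be checked directly against the rows of Table 2. The sketch below only restates the table values; the variable names are illustrative. The mean of the nine listed rows comes to about 343, close to the figure of 338 quoted in the text.

```python
from statistics import mean

# Number of results per machine, copied from Table 2.
results = {
    "Hydraulic press": 106,
    "Punching press": 114,
    "Rolling mill": 200,
    "Injection machine": 940,
    "Manual molding": 139,
    "Milling machine": 405,
    "Plasma cutter": 482,
    "Shot blasting": 480,
    "Drop Hammer": 218,
}

fewest = min(results.values())
most = max(results.values())
average = mean(results.values())

print(f"min={fewest}, max={most}, mean={average:.1f}")
```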
The distribution of concepts in MASON was used as the seed for generating the corpus of
documents. Fig. 4(a) presents the four main concepts in the machine module of MASON, namely
Shearing, Heat treatment, Founding and Forming. Each of these concepts has a subcategory
containing a set of other concepts that are kinds of machines. The percentage distribution
per concept in the subcategory is also shown in Fig. 4(a). The documents for the corpus were
collected according to this distribution. Additionally, other sources were considered, for
instance documents from ASME and OSHA, ontologies and research papers related to this
topic. Fig. 4(b) outlines the resulting distribution in the corpus, and Table 3 gives a
detailed document count.
Fig. 4. Distribution and Proportions
Table 3. Distribution of Documents in Corpus
Category          Subcategory                 No. of documents
Research papers                               3
Forming           Hydraulic Press             25
                  Cam Press                   7
                  Friction Hammer             29
                  Knuckle Press               22
                  Punching Press              38
                  Rolling Mill                34
                  Friction Screw Press        40
                  Wedge Press                 4
                  Folding Machine             30
Founding          Injection                   45
                  Manual Moulding             10
                  Moulding Machine            13
Heat treatment    Electroplating              19
                  Shot blasting               18
Shearing          Grinding                    21
                  Machine centre four axis    30
                  Machine centre 5 axis       24
                  Milling machine             41
                  Milling machine 4 axis      15
                  Plasma trimming machine     40
                  Water trimming machine      27
ASME standards                                2
OSHA regulations                              10
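The per-category proportions of the corpus can be derived from the counts in Table 3. The sketch below is a simple tally, assuming that rows without an explicit category in the flattened table belong to the category named on the nearest row above them; the names used are illustrative.

```python
from collections import defaultdict

# (category, subcategory, number of documents), copied from Table 3.
rows = [
    ("Research papers", None, 3),
    ("Forming", "Hydraulic Press", 25), ("Forming", "Cam Press", 7),
    ("Forming", "Friction Hammer", 29), ("Forming", "Knuckle Press", 22),
    ("Forming", "Punching Press", 38), ("Forming", "Rolling Mill", 34),
    ("Forming", "Friction Screw Press", 40), ("Forming", "Wedge Press", 4),
    ("Forming", "Folding Machine", 30),
    ("Founding", "Injection", 45), ("Founding", "Manual Moulding", 10),
    ("Founding", "Moulding Machine", 13),
    ("Heat treatment", "Electroplating", 19), ("Heat treatment", "Shot blasting", 18),
    ("Shearing", "Grinding", 21), ("Shearing", "Machine centre four axis", 30),
    ("Shearing", "Machine centre 5 axis", 24), ("Shearing", "Milling machine", 41),
    ("Shearing", "Milling machine 4 axis", 15),
    ("Shearing", "Plasma trimming machine", 40),
    ("Shearing", "Water trimming machine", 27),
    ("ASME standards", None, 2),
    ("OSHA regulations", None, 10),
]

# Sum the document counts per category.
totals = defaultdict(int)
for category, _subcategory, count in rows:
    totals[category] += count

# Share of each category over all listed documents.
grand_total = sum(totals.values())
shares = {category: count / grand_total for category, count in totals.items()}

for category, count in totals.items():
    print(f"{category:<18}{count:>5}{shares[category]:>8.1%}")
```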
MASON, MO and MTM were compared with one another by means of Protégé-Prompt. Table 4
presents the results. Between MASON and MTM, Prompt suggested five mappings. Three were
false positives: Prompt highlighted them as mappings, but the domain experts rejected them.
The other two were true positives, that is, the domain experts accepted both suggestions
as valid mappings. Between MASON and MO, only one mapping was found (Cone to Machine),
which Table 4 records as a false positive. Finally, between MTM and MO, one mapping was
found and it was a true positive.
Table 4: Ontologies Mapping
Ontologies            Matching                        Validation
Source    Target      Arg 1           Arg 2
MASON     MTM         System          System          TP
                      Cone            Machine         FP
                      Hole            Spindle         FP
                      Site            Spindle         FP
                      Machine-tool    Machine-tool    TP
MASON     MO          Cone            Machine         FP
MTM       MO          Machine         Machine         TP
TP: True Positive
FP: False Positive
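One way to summarise Table 4 is the precision of the suggested mappings for each ontology pair, that is, TP / (TP + FP). Precision is used here only as an illustrative summary measure; it is not a figure reported in this work, and the function name is hypothetical.

```python
from collections import defaultdict

# (source, target, arg 1, arg 2, validation), copied from Table 4.
mappings = [
    ("MASON", "MTM", "System", "System", "TP"),
    ("MASON", "MTM", "Cone", "Machine", "FP"),
    ("MASON", "MTM", "Hole", "Spindle", "FP"),
    ("MASON", "MTM", "Site", "Spindle", "FP"),
    ("MASON", "MTM", "Machine-tool", "Machine-tool", "TP"),
    ("MASON", "MO", "Cone", "Machine", "FP"),
    ("MTM", "MO", "Machine", "Machine", "TP"),
]


def precision_by_pair(rows):
    """Return {(source, target): TP / (TP + FP)} for the suggested mappings."""
    counts = defaultdict(lambda: {"TP": 0, "FP": 0})
    for source, target, _arg1, _arg2, validation in rows:
        counts[(source, target)][validation] += 1
    return {
        pair: c["TP"] / (c["TP"] + c["FP"])
        for pair, c in counts.items()
    }


precision = precision_by_pair(mappings)
for (source, target), value in precision.items():
    print(f"{source} -> {target}: {value:.2f}")
```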
[1] General Architecture for Text Engineering. The University of Sheffield, 2011.
[2] N. F. Noy, M. Sintek, S. Decker, M. Crubezy, R. W. Fergerson, and M. A. Musen, "Creating
Semantic Web contents with Protege-2000", Intelligent Systems, IEEE, vol. 16, no. 2, pp. 60-
[3] N. F. Noy and M. A. Musen, "The PROMPT suite: interactive tools for ontology merging and
mapping", International Journal of Human-Computer Studies, vol. 59, no. 6, pp. 983-1024,
[4] F. Jungermann, "Information Extraction with RapidMiner", in Proceedings of the RapidMiner
Community Meeting And Conference, 2010, pp. 67-72.
[5] "Swoogle". [Online]. Available: http://swoogle.umbc.edu/.
[6] M. Wurst, Word Vector Tool.
[7] S. Lemaignan, A. Siadat, J.-Y. Dantan, and A. Semenenko, "MASON: A Proposal For An
Ontology Of Manufacturing Domain", in Distributed Intelligent Systems: Collective
Intelligence and Its Applications, 2006. DIS 2006. IEEE Workshop on, 2006, pp. 195-200.