Indian Journal of Engineering & Materials Sciences
Vol.16, August 2009, pp. 205-210
Mathematical model and rule extraction for tool wear monitoring problem using
nature inspired techniques
S N Omkar^a*, J Senthilnath^a & S Suresh^b
^a Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560 012, India
^b Department of Electrical Engineering, Indian Institute of Technology, New Delhi 110 016, India
Received 24 July 2008; accepted 16 July 2009
In this paper, the pattern classification problem in tool wear monitoring is solved using nature inspired techniques such as Genetic Programming (GP) and Ant-Miner (AM). The main advantage of GP and AM is their ability to learn the underlying data relationships and express them in the form of a mathematical equation or simple rules. The knowledge extracted from the training data set using GP and AM is in the form of a Genetic Programming Classifier Expression (GPCE) and rules, respectively. The GPCE and the AM extracted rules are then applied to the data in the testing/validation set to obtain the classification accuracy. A major attraction of GP evolved GPCE and AM based classification is the possibility of obtaining expert-system-like rules that the user can apply directly in his/her application. The performance of
the data classification using GP and AM is as good as the classification accuracy obtained in the earlier study (i.e. using
Keywords: Tool wear monitoring; Genetic Programming; Ant-Miner
* For correspondence (E-mail: email@example.com)

In manufacturing, developing a mathematical model is an important step in critically analyzing a process. Such models relate the inputs of the system to the desired outputs. In some cases, obtaining a mathematical model (i.e., the relationship between the inputs and the desired outputs) can be difficult. In such situations, the model has to be built from the given input-output data, and it should effectively identify the underlying input-output relationship. Artificial neural networks (ANNs) have been used successfully to find this relationship. Neural network (NN) applications in manufacturing can be broadly classified into pattern classification problems [1] and function approximation problems [2].

Purushothaman et al. [1] formulated the tool wear monitoring problem as a pattern classification problem. In their study, the six inputs to the NN are speed, feed, depth of cut, axial force, radial force and tangential force, and the output of the NN is the flank wear bandwidth. After the NN is trained, a given input pattern is assigned to class-1 or class-2 based on the flank wear bandwidth, which makes this a pattern classification problem. Also in their study, to reduce the complexity, the input dimensions are reduced from six to two, and an NN is trained for the pattern classification problem.

NNs have been used for various manufacturing applications [2,3]. The main limitation of NNs is that, although they capture the relationship between the input-output data effectively, the weights do not express this relationship explicitly. In contrast, nature inspired techniques such as Genetic Programming (GP) and Ant-Miner (AM) can give an explicit relationship between the inputs and the output classes. In the GP approach [4], arithmetic function sets are used to evolve Genetic Programming Classifier Expressions (GPCE). A GPCE can be expressed mathematically or as expert-system-like rules for pattern classification [5,6]. AM [7,8], on the other hand, is a class of data classification algorithms modeled on the actions of an ant colony. AM is used to extract simple rules from the given data set [8-10].

In this study, we use the GP and AM approaches for the tool wear monitoring pattern classification problem discussed by Purushothaman et al. [1]. A genetic programming classifier expression is evolved as a discriminant function between the two classes using the training set of data points. The GPCE is then applied to the data in the testing/validation set to obtain the
classification accuracy. AM, on the other hand, is used to derive knowledge in the form of simple rules from the training data. These rules are applied sequentially to the testing data set to obtain the classification accuracy. These nature inspired techniques are attractive because they classify the data efficiently and because the extracted mathematical expression and rules are simple.

In this paper, the nature inspired techniques for pattern classification and their application to tool wear monitoring are discussed, and the mathematical model and rule extraction are described.

Nature Inspired Techniques for Pattern Classification
Nature inspired techniques form a field of research that works with computational techniques inspired in part by nature and natural systems. These techniques provide a robust and efficient approach for solving complex real-world problems [11,12]. Many nature inspired techniques, such as Artificial Neural Networks [13], Ant Colony Optimization [7], Genetic Programming [4] and Particle Swarm Optimization (PSO) [14,15], have been proposed. Among these, we briefly describe two methods: GP and AM.

Genetic programming for pattern classification
Genetic programming is an evolutionary approach that applies Darwin's principle of survival of the fittest to a population of candidate solutions of a given problem. GP evolves a population of computer programs, which are possible solutions to the problem. Each program (individual) in the population is generally represented as a tree composed of functions and data/terminals appropriate to the problem domain. The set of functions F and the set of terminals/inputs T must satisfy the closure and sufficiency properties. The closure property demands that the function set be well defined and closed for any combination of arguments that it may encounter. The sufficiency property requires that the set of functions in F and the set of terminals in T be able to express a solution of the problem. The function set may contain standard arithmetic operators, mathematical functions, logical operators and domain-specific functions. The terminal set usually consists of feature variables and constants. Each individual in the population is assigned a fitness value, computed by a problem-dependent fitness function, which quantifies how good the solution is. GP uses genetic operations such as reproduction, crossover and mutation to generate the next generation. Hence, the solution is evolved through the generations.

Koza [4] applied GP to a two-class pattern classification problem. In a two-class problem, a single GP expression is evolved. While evaluating the GP expression, if the result is non-negative, the input data are assigned to one class (say class-1); otherwise they are assigned to the other class (class-2). Thus, in the training set the desired (known) output d is +1 for samples that belong to class-1 and -1 for samples that belong to class-2. Hence, the output of a GP expression is either +1 (indicating that the input sample belongs to class-1) or -1 (indicating that it does not). We call the GP expression evolved in a two-class problem the GPCE. This GPCE is the mathematical model evolved for the pattern classification problem; it divides the feature space into two regions. GP uses a function set containing operators and functions to evolve a GPCE as the discriminant function for the two classes present in the training set. Let Y be the output of the GPCE:

IF GPCE(x) ≥ 0 THEN Y = +1, x ∈ class-1
IF GPCE(x) < 0 THEN Y = -1, x ∉ class-1,

where x is the vector of input feature values. In the present study, the function set used for evolving a GPCE contains only the arithmetic operations (+, -, ÷ and ×).

Koza [4] has shown that evolution in GP is a never-ending process, and hence a termination criterion is needed. The termination criterion is generally problem-based or is a limit on the number of generations. In GP, a user-defined fitness function has to be maximized for the application at hand. Thus, at the end of a GP run, we have the current population of individuals and also the fittest individual that appeared during the run. The fittest individual evolved for the given problem is its solution, i.e., the desired mathematical model.

Ant-miner for pattern classification
Ant-Miner [8-10] is an ant colony optimization approach proposed for the discovery of classification rules. Ant-Miner follows a sequential covering approach to discover a list of classification rules covering all, or almost all, the training cases. At first, the list of discovered rules is empty and the training
set consists of all the training cases. A rule is added to the rule list when it correctly classifies a pre-defined number of training cases. A three-step process of rule construction, rule pruning and pheromone updating is repeated for each training case until one rule is extracted. This rule is added to the list of discovered rules, and the training cases that it covers correctly (i.e., cases satisfying the rule antecedent and having the class predicted by the rule consequent) are removed from the training set. This process is performed iteratively while the number of uncovered training cases is greater than a user-specified threshold.

Each classification rule has the form IF <term1 AND term2 AND …> THEN <class>. Each term is a triple <attribute, operator, value>, where value is a value belonging to the domain of attribute and operator is a relational operator. The six inputs of the tool wear monitoring problem constitute the attribute set. The relational operators greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=) and equal to (=) constitute the operator set.

Applications to Tool Wear Monitoring
Monitoring of tool wear is an important requirement for realizing automated manufacturing. Tool wear is a very complex phenomenon which can lead to machine down time and product rejects, and can also cause problems to personnel [16]. The three most important tasks in the area of tool monitoring are [17]:

1. The fast detection of collisions, i.e. any unintended contacts between the tool and the workpiece or parts of the machine (causing e.g. rapidly increasing forces);
2. The identification of tool breakage, e.g. outbreaks at brittle cutting edges; and
3. The determination of the tool's wear caused by abrasion, erosion, or other influences.

Purushothaman et al. [1] experimentally studied and simulated the classification of tool wear data as a two-class pattern classification problem using an NN. In their experimental study, data on axial force, radial force, tangential force and flank wear bandwidth were collected over ranges of speed, feed and depth of cut. The machining conditions and the resources used are explained in Purushothaman et al. [1]. Thus, in their study, there are six inputs, namely speed, feed, depth of cut, axial force, radial force and tangential force, and the flank wear bandwidth is the output. For the pattern classification problem, the output (flank wear bandwidth) is converted into classes as follows: (i) all the data points for which the flank wear bandwidth is less than or equal to 200 belong to class-1, and (ii) all the data points for which the flank wear bandwidth is greater than 200 belong to class-2.

In their study, 113 data points were collected, of which 87 data points (samples) belonged to class-1 and 26 belonged to class-2. They used 20 data points belonging to class-1 and 10 data points belonging to class-2 for training the NN. The rest of the data points, 67 (class-1) and 16 (class-2), were used for testing. The input was reduced from six dimensions to two dimensions using the optimal discriminant method, and the NN was trained. Quantifying the input-output relationship is difficult using an NN. Hence, Genetic Programming and Ant-Miner are used here to obtain a mathematical model and simple rules for this problem.

In the present study, we use the same data points as in Purushothaman et al. [1] for this pattern classification problem using GP and AM. A partial list of data
Table 1: Subset of experimental data set

Sl. No.  x1       x2      x3              x4             x5              x6                  Class
         (Speed)  (Feed)  (Depth of cut)  (Axial force)  (Radial force)  (Tangential force)
1        450      10      15              150            115             150                 1
2        450      10      50              60             50              115                 1
3        450      10      200             180            130             450                 1
4        350      10      50              60             90              125                 1
5        300      6       50              45             80              70                  1
6        450      10      150             750            650             500                 2
7        400      10      50              175            350             140                 2
8        450      10      150             240            850             620                 2
9        456      10      100             550            590             430                 2
10       450      10      200             1100           1200            840                 2
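The rows of Table 1 are enough to try out the evolved classifier by hand. The following is a minimal Python sketch, not the authors' implementation: it copies the GPCE s-expression reported in the Genetic programming subsection of this paper, evaluates it with a small recursive parser, and compares it with an algebraically simplified closed form. The names `parse`, `evaluate`, `closed_form` and `classify` are illustrative assumptions, and plain division is assumed for DIV (the paper does not say whether a protected division was used).

```python
import math
import operator

# GPCE as printed in the paper, with typographic minus signs written as ASCII '-'.
GPCE = """
(MUL (SUB (MUL (SUB (ADD 20 x4) (SUB -44 2)) x4)
          (ADD (SUB (DIV x3 21) (DIV -75 -111)) 13))
     (DIV (ADD (ADD (ADD x3 x1) (ADD 32 -29))
               (SUB (DIV (SUB (ADD 20 x4) (SUB -44 2)) -2) x5))
          x4))
"""

# Assumption: DIV is ordinary division (x4 is never zero in Table 1).
OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.truediv}


def parse(text):
    """Turn an s-expression string into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def evaluate(node, env):
    """Recursively evaluate a parsed [op, left, right] tree for one sample."""
    if isinstance(node, list):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[node] if node in env else float(node)


def closed_form(x1, x3, x4, x5):
    """Hand-simplified form of the s-expression; 13 - 75/111 is about 12.3243."""
    first = (2 * x1 + 2 * x3 - 60 - x4 - 2 * x5) / (2 * x4)
    second = x4 ** 2 + 66 * x4 - x3 / 21 - (13 - 75 / 111)
    return first * second


# Rows of Table 1: (x1, x2, x3, x4, x5, x6, class)
TABLE_1 = [
    (450, 10, 15, 150, 115, 150, 1),
    (450, 10, 50, 60, 50, 115, 1),
    (450, 10, 200, 180, 130, 450, 1),
    (350, 10, 50, 60, 90, 125, 1),
    (300, 6, 50, 45, 80, 70, 1),
    (450, 10, 150, 750, 650, 500, 2),
    (400, 10, 50, 175, 350, 140, 2),
    (450, 10, 150, 240, 850, 620, 2),
    (456, 10, 100, 550, 590, 430, 2),
    (450, 10, 200, 1100, 1200, 840, 2),
]

TREE = parse(GPCE)


def classify(row):
    """Return 1 if GPCE(x) >= 0, else 2, following the decision rule in the text."""
    x1, x2, x3, x4, x5, x6, _label = row
    env = {"x1": x1, "x2": x2, "x3": x3, "x4": x4, "x5": x5, "x6": x6}
    return 1 if evaluate(TREE, env) >= 0 else 2


predicted = [classify(row) for row in TABLE_1]
forms_agree = all(
    math.isclose(
        evaluate(TREE, {"x1": r[0], "x3": r[2], "x4": r[3], "x5": r[4]}),
        closed_form(r[0], r[2], r[3], r[4]),
        rel_tol=1e-9,
    )
    for r in TABLE_1
)
print(predicted, forms_agree)
```

On these ten rows the expression assigns every sample to the class listed in Table 1, and the tree evaluation and the closed form agree numerically. Ten rows are only a sanity check, not a measure of accuracy on the full data set.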
points, with the six input features x1 (speed), x2 (feed), x3 (depth of cut), x4 (axial force), x5 (radial force) and x6 (tangential force) and the desired output feature (i.e., class-1 or class-2), is given in Table 1.

In the GP or AM approach to pattern classification, the given data set is divided into a training set and a validation/testing set. In the case of GP, the training set data points are used for obtaining the GPCE (mathematical model) and the testing set data points are used for obtaining the classification accuracy, whereas AM extracts rules from the training data and the extracted rules are used to classify the test data. We have used 21 data points that belong to class-1 and 11 data points that belong to class-2 for obtaining the GPCE and the simple rules. The rest of the data points, 66 (belonging to class-1) and 15 (belonging to class-2), are used for testing/validation and for obtaining the classification accuracy.

Mathematical Model and Rule Extraction
Genetic programming
The genetic programming parameters, which include population size, GP generations, crossover weight, mutation weight, mutation rate and tournament size, are varied until they produce the most favorable classification result. The values of these parameters that gave the most favorable results are as follows:

Population size = 2000
GP generations = 5,00,000
Cross over weight = 70
Mutation weight = 20
Mutation rate = 60
Tournament size = 3000

With the above parameters, we made several runs to evolve a GPCE with the training set; the best GPCE obtained is listed below. The evolved GP expression is in the form of a LISP s-expression, which can easily be converted into a mathematical expression.

GPCE: (MUL (SUB (MUL (SUB (ADD 20 x4) (SUB -44 2)) x4) (ADD (SUB (DIV x3 21) (DIV -75 -111)) 13)) (DIV (ADD (ADD (ADD x3 x1) (ADD 32 -29)) (SUB (DIV (SUB (ADD 20 x4) (SUB -44 2)) -2) x5)) x4))

The equivalent mathematical model is

$$\frac{2x_1 + 2x_3 - 60 - x_4 - 2x_5}{2x_4}\left(x_4^2 + 66x_4 - \frac{x_3}{21} - 12.3243\right) \qquad \ldots (1)$$

For a given input sample, if the above expression is greater than or equal to zero, then the input sample belongs to class-1. Otherwise, the input sample belongs to class-2. From this mathematical expression, we can derive a simple rule. For positive x4 the denominator 2x4 is positive, so, provided the second factor is positive (as it is for the samples in Table 1), the sign of expression (1) is the sign of the numerator 2x1 + 2x3 - 60 - x4 - 2x5. This gives the following:

Classification rule:

If (x1 + x3) > x4/2 + x5 + 30, then this sample belongs to class-1.

This rule says that if the sum of the values of speed (x1) and depth of cut (x3) is greater than the sum of half the value of axial force (x4), the value of radial force (x5) and the constant 30, then the flank wear bandwidth will be less than or equal to 200 (class-1).

The advantage of the above classification rule is that any person without much knowledge of the physical process can easily use it for classification. The rule also represents the knowledge that is learned while obtaining the GPCE.

Ant-miner
The Ant-Miner parameters Number_of_ants, Min_cases_per_rule, Max_uncovered_cases and Number_rules_to_converge were varied to extract different sets of rules, and the overall classification efficiencies obtained were recorded. The optimum values of these parameters are as follows:

Number_of_ants = 25
Min_cases_per_rule = 6
Max_uncovered_cases = 3
Number_rules_to_converge = 5

AM extracted rules from the training data set, and the extracted rules were used to classify the test data. Following are some of the rules extracted by the algorithm:
For class-1 of the tool wear data set:

x1 <= 372

This rule says that if the value of speed (x1) is less than or equal to 372, then the sample belongs to class-1.

For class-2 of the tool wear data set:

x1 > 393 and x2 <= 17 and x5 > 307

This rule says that if the value of speed (x1) is greater than 393, the value of feed (x2) is less than or equal to 17 and the value of radial force (x5) is greater than 307, then the sample belongs to class-2.

Simulations and Results
To evaluate the performance, the data set is used to arrive at the classification matrix, which is of size n × n, where n is the number of classes. A typical entry q_ij in the classification matrix shows how many samples belonging to class i have been classified into class j. For a perfect classifier, the classification matrix is diagonal. However, due to misclassification we get off-diagonal elements. The individual efficiency of class i is defined as

$$\frac{q_{ii}}{\sum_{j} q_{ij}} \qquad \ldots (2)$$

i.e., the fraction of the samples belonging to class i (row i of the matrix) that are classified correctly. The overall efficiency is defined as

$$\frac{\sum_{i} q_{ii}}{N} \qquad \ldots (3)$$

where N is the total number of elements in the data set.

GP simulation and classification
Initially, GP learns from the training data set and evolves the GPCE. The classification matrices obtained after applying the GPCE to the training and testing data are shown in Tables 2 and 3, respectively. From the classification matrix for the training data, we can see that the samples belonging to class-1 and class-2 are classified without any misclassification, so each class has an individual efficiency of 100% and the overall efficiency is also 100%. The overall classification efficiency obtained for the training data is a measure of the relevance of the GPCE extracted.

Next, the GPCE is applied to the testing data set and the efficiencies are evaluated. As the classification matrix for the testing data (Table 3) shows, two of the samples belonging to class-2 are misclassified as class-1, but the overall efficiency is still 97.53%. Purushothaman et al. [1] applied an NN to solve this problem, and the classification accuracy obtained in their approach is 96.36%. We can observe that the classification accuracy obtained in the GP approach is comparable to that of the NN approach.

Ant-miner simulation and classification
The classification matrices obtained after applying the rules derived by AM to the training and testing data are shown in Tables 4 and 5, respectively. From the classification matrix for the training data, we can see that the samples belonging to class-1 are classified without any misclassification, giving an individual efficiency of 100%. For class-2, a single case is misclassified as class-1, so class-2 has an individual efficiency of 90.9%. The overall efficiency is 96.87%. The overall classification efficiency obtained for the training data is a measure of the relevance of the rules extracted.

Next, the extracted rules are applied to the testing data set and the efficiencies are evaluated (Table 5). As we can notice from the classification matrix generated for the testing data, there are some
Table 2: Classification matrix of tool wear monitoring training data set by GP algorithm

          Class-1  Class-2  Individual Efficiency
Class-1   21       0        100%
Class-2   0        11       100%
Overall efficiency = 100%

Table 4: Classification matrix of tool wear monitoring training data set by AM algorithm

          Class-1  Class-2  Individual Efficiency
Class-1   21       0        100%
Class-2   1        10       90.9%
Overall efficiency = 96.87%
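Each entry of a classification matrix such as Table 4 comes from running the extracted rule list over every sample and comparing the predicted class with the known one. The following is a minimal Python sketch of that step, assuming rules are stored as lists of <attribute, operator, value> triples and applied in order. It uses only the two rules quoted in the Ant-miner subsection and the ten rows of Table 1, so it does not reproduce Table 4. The function names and the `default` argument (returned when no rule fires) are illustrative assumptions, since the paper lists only some of the extracted rules.

```python
import operator

# Relational operators allowed in a rule term <attribute, operator, value>.
RELATIONS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
             "<=": operator.le, "=": operator.eq}

# Ordered rule list: (terms joined by AND, predicted class).
# These are the two rules quoted in the paper; the full rule list is not given there.
RULES = [
    ([("x1", "<=", 372)], 1),
    ([("x1", ">", 393), ("x2", "<=", 17), ("x5", ">", 307)], 2),
]

FEATURES = ("x1", "x2", "x3", "x4", "x5", "x6")

# Rows of Table 1: (x1, x2, x3, x4, x5, x6, class)
TABLE_1 = [
    (450, 10, 15, 150, 115, 150, 1),
    (450, 10, 50, 60, 50, 115, 1),
    (450, 10, 200, 180, 130, 450, 1),
    (350, 10, 50, 60, 90, 125, 1),
    (300, 6, 50, 45, 80, 70, 1),
    (450, 10, 150, 750, 650, 500, 2),
    (400, 10, 50, 175, 350, 140, 2),
    (450, 10, 150, 240, 850, 620, 2),
    (456, 10, 100, 550, 590, 430, 2),
    (450, 10, 200, 1100, 1200, 840, 2),
]


def covers(terms, sample):
    """True when the sample satisfies every term of the rule antecedent."""
    return all(RELATIONS[op](sample[attr], value) for attr, op, value in terms)


def classify(sample, rules=RULES, default=None):
    """Apply the rules sequentially; the first rule that covers the sample wins."""
    for terms, label in rules:
        if covers(terms, sample):
            return label
    return default  # hypothetical fallback for samples no listed rule covers


samples = [dict(zip(FEATURES, row[:6])) for row in TABLE_1]
predictions = [classify(sample) for sample in samples]
print(predictions)
```

With only these two rules, rows 4 and 5 are assigned to class-1 and rows 6 to 10 to class-2, while rows 1 to 3 are covered by neither rule and fall through to `default`. In a complete rule list those samples would be handled by further rules that the paper does not list.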
Table 3: Classification matrix of tool wear monitoring testing data set by GP algorithm

          Class-1  Class-2  Individual Efficiency
Class-1   66       0        100%
Class-2   2        13       86.67%
Overall efficiency = 97.53%

Table 5: Classification matrix of tool wear monitoring testing data set by AM algorithm

          Class-1  Class-2  Individual Efficiency
Class-1   66       0        100%
Class-2   4        11       73.34%
Overall efficiency = 95.06%
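The percentages in Tables 2 to 5 can be recomputed directly from the matrix entries. The following is a minimal Python sketch, assuming the individual efficiency of class i is taken over row i of the matrix (the samples that truly belong to class i), which is the reading that reproduces the tabulated values. The function and variable names are illustrative.

```python
def individual_efficiency(q, i):
    """Percentage of the samples of class i (row i) that were classified as class i."""
    return 100.0 * q[i][i] / sum(q[i])


def overall_efficiency(q):
    """Percentage of all samples that lie on the diagonal of the matrix."""
    n_total = sum(sum(row) for row in q)
    return 100.0 * sum(q[i][i] for i in range(len(q))) / n_total


# Classification matrices copied from Tables 2 to 5 (rows: true class, columns: assigned class).
MATRICES = {
    "Table 2 (GP, training)": [[21, 0], [0, 11]],
    "Table 3 (GP, testing)": [[66, 0], [2, 13]],
    "Table 4 (AM, training)": [[21, 0], [1, 10]],
    "Table 5 (AM, testing)": [[66, 0], [4, 11]],
}

results = {
    name: ([individual_efficiency(q, i) for i in range(len(q))], overall_efficiency(q))
    for name, q in MATRICES.items()
}

for name, (per_class, overall) in results.items():
    print(name, [f"{value:.3f}" for value in per_class], f"{overall:.3f}")
```

The computed values agree with the tabulated ones to within 0.01 percentage points. The small differences (for example 96.875 against the reported 96.87) come from how the last digit was rounded or truncated in the tables.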
misclassifications between the two classes, but the overall efficiency of 95.06% is almost the same as that obtained for the training data set.

Conclusions
In this paper, nature inspired techniques, namely genetic programming and ant-miner, are used to solve a pattern classification problem that arises in tool wear monitoring. These techniques evolve a mathematical model or a rule base that expresses the input-output relationship explicitly. This approach is better than other approaches such as NN in the sense that it gives an insight into the knowledge contained in the data set. Also, the GPCE and the AM extracted rules may be used in developing a rule-based expert system.

References
1 Purushothaman S & Srinivasa Y G, Int J Prod Res, 36 (1998) 635-651.
2 Anderson K, Cook G E, Kasai G & Ramaswamy K, IEEE Trans Ind Appl, 26 (1990) 824-830.
3 Cook G E, Barnett R J, Anderson K & Strauss A M, IEEE Trans Ind Appl, 31 (1995) 1484-1491.
4 Koza J R, Genetic Programming: On the Programming of Computers by Means of Natural Selection, (MIT Press, Cambridge, USA), 1992.
5 Kishore J K, Patnaik L M, Mani V & Agarwal V K, IEEE Trans Evol Comput, 4 (2000) 242-258.
6 Suresh S, Omkar S N, Mani V & Menaka C, J Aerospace Sci Technol, 56 (2004) 26-41.
7 Dorigo M & Blum C, Theor Comput Sci, 344 (2005) 243-278.
8 Parpinelli R S, Lopes H S & Freitas A A, IEEE Trans Evol Comput, 6 (2002) 321-332.
9 Omkar S N & Raghavendra K U, IEEE Int Conf Industrial Technology, (2006) 1559-1562.
10 Omkar S N & Raghavendra T R, Eng Appl Artif Intell, 21 (2008) 1381-1388.
11 Back T & Schwefel H P, Evol Comput, 1 (1993).
12 Yao X E, Evolutionary Computation: Theory and Applications, (World Scientific, Singapore), 1999.
13 Haykin S, Neural Networks – A Comprehensive Foundation, 2nd ed, (New York), 1994.
14 Eberhart R & Kennedy J, A new optimizer using particle swarm theory, in Proc Int Sym Micro Machine and Human Science, Japan, 1995.
15 Eberhart R & Kennedy J, Particle swarm optimization, in Proc IEEE Int Conf Neural Networks, 1995.
16 Dimla D E Jr, Lister P M & Leighton N J, Int J Mach Tools Manufact, 37 (1997) 1219-1241.
17 Golz H U, Schillo E, Wolf A, Kaufeld M, Sprengel P, Johannsen P & Heinek D, Bewertung von Werkzeugüberwachungssystemen aus Sicht der Anwender, in: Überwachung von Zerspan- und Umformprozessen, (VDI-Verlag, Düsseldorf), (1995) 309-317.