# Mathematical model and rule extraction for tool wear

```
Indian Journal of Engineering & Materials Sciences
Vol. 16, August 2009, pp. 205-210

Mathematical model and rule extraction for tool wear monitoring problem using
nature inspired techniques
S N Omkar^a,*, J Senthilnath^a & S Suresh^b
^a Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560 012, India
^b Department of Electrical Engineering, Indian Institute of Technology, New Delhi 110 016, India

Received 24 July 2008; accepted 16 July 2009

In this paper, the pattern classification problem in tool wear monitoring is solved using nature-inspired techniques,
namely Genetic Programming (GP) and Ant-Miner (AM). The main advantage of GP and AM is their ability to learn the
underlying data relationships and express them as a mathematical equation or as simple rules. The knowledge extracted
from the training data set takes the form of a Genetic Programming Classifier Expression (GPCE) for GP and a set of rules
for AM. The GPCE and the AM-extracted rules are then applied to the data in the testing/validation set to obtain the
classification accuracy. A major attraction of the GP-evolved GPCE and of AM-based classification is the possibility of
obtaining expert-system-like rules that the user can apply directly in his/her application. The classification accuracy
obtained using GP and AM is as good as that obtained in the earlier study (i.e., using the ANN approach).

Keywords: Tool wear monitoring; Genetic Programming; Ant-Miner

* For correspondence (E-mail: omkar@aero.iisc.ernet.in)

In manufacturing, the development of a mathematical model is an important task for critically analyzing the process.
These mathematical models relate the inputs of the system to the desired outputs. In some cases, obtaining a
mathematical model (i.e., a relationship between the inputs and the desired outputs) can be a difficult task. In such
situations, there is a need to build mathematical models from the given input-output data. These models should
effectively identify the underlying input-output relationship. Artificial neural networks (ANNs) have been successfully
used to find the input-output relationship. Neural network (NN) applications in manufacturing can be broadly classified
into pattern classification problems [1] and function approximation problems [2].

Purushothaman et al. [1] formulated the tool wear monitoring problem as a pattern classification problem. In their
study, the six inputs to the NN are speed, feed, depth of cut, axial force, radial force and tangential force. The
output of the NN is the flank wear bandwidth. After the NN is trained, a given input pattern is assigned to class-1 or
class-2 according to the flank wear bandwidth, which makes this a pattern classification problem. Also in their study,
to reduce the complexity, the input dimensions are reduced from six to two, and a NN is trained for the pattern
classification problem.

NNs have been used for various manufacturing applications [2,3]. The main limitation of NNs is that they capture the
relationship between the input-output data effectively, but the weights do not express that relationship explicitly.
In contrast, nature-inspired techniques like Genetic Programming (GP) and Ant-Miner (AM) can give an explicit
relationship between the inputs and the output classes. In the Genetic Programming approach [4], arithmetic function
sets are used to evolve Genetic Programming Classifier Expressions (GPCE). The GPCE can be expressed mathematically or
as expert-system-like rules for pattern classification [5,6]. On the other hand, AM [7,8] is a class of data
classification algorithm modeled on the actions of an ant colony. AM is used to extract simple rules from the given
data set [8-10].

In this study, we use the GP and AM approaches for the pattern classification problem of tool wear monitoring discussed
by Purushothaman et al. [1]. A genetic programming classifier expression is evolved as a discriminant function between
two classes using the training set of data points. The GPCE is then applied to the data in the testing/validation set
to obtain the
classification accuracy. On the other hand, AM is used to derive knowledge in the form of simple rules from the
training data. These rules are applied sequentially to the testing data set to obtain the classification accuracy.
These nature-inspired techniques are attractive because they classify the data efficiently and because the extracted
mathematical expression and rules are simple.

In this paper, the nature-inspired techniques for pattern classification and their application to tool wear monitoring
are discussed. The mathematical model and rule extraction are described.

Nature Inspired Techniques for Pattern Classification

Nature-inspired computing is the field of research that works with computational techniques inspired in part by nature
and natural systems. These techniques provide a more robust and efficient approach for solving complex real-world
problems [11,12]. Many nature-inspired techniques such as Artificial Neural Networks [13], Ant Colony Optimization [7],
Genetic Programming [4] and Particle Swarm Optimization (PSO) [14,15] have been proposed. Among these, we briefly
describe two methods, GP and AM.

Genetic programming for pattern classification

Genetic programming is an evolutionary approach which applies Darwin's principle of survival of the fittest to a
population of parametric solutions of a given problem. GP evolves a population of computer programs, which are possible
solutions to a given problem. Each program or individual in the population is generally represented as a tree composed
of functions and data/terminals appropriate to the problem domain. The set of functions F and the set of
terminals/inputs T must satisfy the closure and sufficiency properties. The closure property demands that the function
set is well defined and closed for any combination of arguments that it may encounter. On the other hand, the
sufficiency property requires that the set of functions in F and the set of terminals in T be able to express a
solution of the problem. The function set may contain standard arithmetic operators, mathematical functions, logical
operators, and domain-specific functions. The terminal set usually consists of feature variables and constants. Each
individual in the population is assigned a fitness value, which quantifies how good the solution is. The fitness value
is computed by a problem-dependent fitness function. GP uses genetic operations like reproduction, crossover and
mutation to generate the next generation. Hence, the solution is evolved through the generations.

Koza [4] has applied GP to a two-class pattern classification problem. In a two-class problem, a single GP expression
is evolved. While evaluating the GP expression, if the result is positive, then the input data are assigned to one
class (say class-1); else they are assigned to the other class (class-2). Thus, in the training set the desired (known)
output d is +1 for samples belonging to one class (say class-1), and -1 for samples belonging to the other class
(class-2). Hence, the output of a GP expression is either +1 (indicating that the input sample belongs to that class)
or -1 (indicating that the input sample does not belong to that class). We call the GP expression evolved in a
two-class problem the GPCE for the pattern classification problem. This GPCE is the mathematical model evolved for the
pattern classification problem; it divides the feature space into two regions. GP uses a function set that contains
operators and functions to evolve a GPCE as the discriminant function for the two classes present in the training set.
Let Y be the output of the GPCE.

  IF GPCE(x) ≥ 0 THEN Y = +1, x ∈ class-1
  IF GPCE(x) < 0 THEN Y = -1, x ∉ class-1,

where x is the vector of input feature values. In the present study, for evolving a GPCE we have used a function set
with only arithmetic operations (+, -, ÷, and ×).

Koza [4] has shown that in GP the evolution is a never-ending process, and hence a termination criterion is needed. The
termination criterion for GP is generally based on the problem or is limited by the number of generations. In GP, a
user-defined fitness function has to be maximized for the application at hand. Thus, at the end of a GP run, we have a
current population of individuals and also the fittest individual that appeared during the run. The fittest individual
that has evolved for the given problem is its solution or desired mathematical model.

Ant-miner for pattern classification

An ant colony optimization approach for the discovery of classification rules, called Ant-Miner, has been
proposed [8-10]. Ant-Miner follows a sequential covering approach to discover a list of classification rules covering
all, or almost all, the training cases. At first, the list of discovered rules is empty and the training
set consists of all the training cases. A rule is added to the rule list when it correctly classifies a pre-defined
number of training cases. A three-step process (rule construction, rule pruning and pheromone updating) is repeated
for each training case until one rule is extracted. This rule is added to the list of discovered rules, and the
training cases that are covered correctly by this rule (i.e., cases satisfying the rule antecedent and having the class
predicted by the rule consequent) are removed from the training set. This process is performed iteratively while the
number of uncovered training cases is greater than a user-specified threshold.

Each classification rule has the form IF <term1 AND term2 AND ...> THEN <class>. Each term is a triple
<attribute, operator, value>, where value is a value belonging to the domain of attribute. The operator element in the
triple is a relational operator. The six inputs of the tool wear monitoring problem constitute the attribute set. The
relational operators greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=) and
equal to (=) constitute the operator set.

Applications to Tool Wear Monitoring

Monitoring of tool wear is an important requirement for realizing automated manufacturing. Tool wear is a very complex
phenomenon which can lead to machine down time and product rejects, and can also cause problems to personnel [16]. The
three most important tasks in the area of tool monitoring are [17]:

  1. The fast detection of collisions, i.e. any unintended contacts between the tool and the workpiece or parts of
     the machine (causing e.g. rapidly increasing forces);
  2. The identification of tool breakage, e.g. outbreaks at brittle cutting edges; and
  3. The determination of the tool's wear caused by abrasion, erosion, or other influences.

Purushothaman et al. [1] have experimentally studied and simulated the challenges involved in classifying the tool wear
data as a two-class pattern classification problem using a NN. In their experimental study, over ranges of parameters
such as speed, feed and depth of cut, data are collected on axial force, radial force, tangential force, and flank wear
bandwidth. The machining conditions and the resources used are explained in Purushothaman et al. [1]. Thus, in their
study, there are six inputs, namely speed, feed, depth of cut, axial force, radial force and tangential force. The
flank wear bandwidth is the output. The output (flank wear bandwidth) is modified for the pattern classification
problem as follows: (i) all the data points for which the flank wear bandwidth is less than or equal to 200 belong to
class-1 and (ii) all the data points for which the flank wear bandwidth is greater than 200 belong to class-2.

In their study, 113 data points are collected, of which 87 data points (or samples) belong to class-1 and 26 data
points belong to class-2. They used 20 data points belonging to class-1 and 10 data points belonging to class-2 for
training the NN. The rest of the data points, 67 (class-1) and 16 (class-2), are used for testing. The input is reduced
from six dimensions to two dimensions using the optimal discriminant method and the NN is trained. Quantifying the
input-output relationship is difficult using a NN. Hence, Genetic Programming and Ant-Miner are used to obtain a
mathematical model and simple rules for this problem.

In the present study, we use the same data points as in Purushothaman et al. [1] for this pattern classification
problem using GP and AM. A partial list of data

Table 1: Subset of experimental data set (inputs x1 to x6, output class)

Sl. No.  x1       x2      x3              x4             x5              x6                  Class
         (Speed)  (Feed)  (Depth of cut)  (Axial force)  (Radial force)  (Tangential force)  (Output)
1        450      10      15              150            115             150                 1
2        450      10      50              60             50              115                 1
3        450      10      200             180            130             450                 1
4        350      10      50              60             90              125                 1
5        300      6       50              45             80              70                  1
6        450      10      150             750            650             500                 2
7        400      10      50              175            350             140                 2
8        450      10      150             240            850             620                 2
9        456      10      100             550            590             430                 2
10       450      10      200             1100           1200            840                 2

points using the six input features x1 (speed), x2 (feed), x3 (depth of cut), x4 (axial force), x5 (radial force) and
x6 (tangential force) and a desired output feature (i.e., class-1 or class-2) is given in Table 1.

In the GP or AM approach to pattern classification, the given data set is divided into a training set and a
validation/testing set. In the case of GP, the training set data points are used for obtaining the GPCE (mathematical
model) and the testing set data points are used for obtaining the classification accuracy, whereas AM extracts rules
from the training data and the extracted rules are used to classify the test data. We have used 21 data points that
belong to class-1 and 11 data points that belong to class-2 for obtaining the GPCE and simple rules. The rest of the
data points, 66 (belonging to class-1) and 15 (belonging to class-2), are used for testing/validation and for obtaining
the classification accuracy.

Mathematical Model and Rule Extraction

Genetic programming

The genetic programming parameters, which include population size, GP generations, crossover weight, mutation weight,
mutation rate and tournament size, are varied until they produce the most favorable classification result. The values
of these parameters that gave the most favorable results are as follows:

  Population size = 2000
  GP generations = 500,000
  Crossover weight = 70
  Mutation weight = 20
  Mutation rate = 60
  Tournament size = 3000

For the above parameters, we have done several runs to evolve a GPCE with the training set, and the best GPCE obtained
is listed below. The evolved GP expression is in the form of a LISP S-expression, which can be easily converted into a
mathematical expression as follows.

  GPCE: (MUL (SUB (MUL (SUB (ADD 20 x4) (SUB -44 2)) x4)
                  (ADD (SUB (DIV x3 21) (DIV -75 -111)) 13))
             (DIV (ADD (ADD (ADD x3 x1) (ADD 32 -29))
                       (SUB (DIV (SUB (ADD 20 x4) (SUB -44 2)) -2) x5))
                  x4))

The equivalent mathematical model is

  GPCE(x) = \frac{2 x_1 + 2 x_3 - 60 - x_4 - 2 x_5}{2 x_4} \left( x_4^2 + 66 x_4 - \frac{x_3}{21} - 12.3243 \right)    ... (1)

For a given input sample, if the above expression is greater than or equal to zero, then the input sample belongs to
class-1. Otherwise, the input sample belongs to class-2. From this mathematical expression, we can derive a simple
rule. For a positive axial force x4 the denominator 2 x4 is positive, and the second factor is positive over the force
and depth-of-cut values listed in Table 1, so the sign of Eq. (1) is set by the numerator. This gives the following.

Classification rule:

  If (x1 + x3) > x4/2 + x5 + 30, then this sample belongs to class-1.

This rule says that if the sum of the values of speed (x1) and depth of cut (x3) is greater than the sum of half the
value of axial force (x4), the value of radial force (x5) and the constant 30, then the flank wear bandwidth will not
exceed 200 (class-1).

The advantage of the above classification rule is that any person without much knowledge of the physical process can
easily use it for classification. The rule also represents the knowledge that is learned while obtaining the GPCE.

Ant-miner

The Ant-Miner parameters Number_of_ants, Min_cases_per_rule, Max_uncovered_cases and Number_rules_to_converge were
varied to extract different sets of rules, and the overall classification efficiencies obtained were recorded. The
optimum values for the above parameters are as follows:

  Number_of_ants = 25
  Min_cases_per_rule = 6
  Max_uncovered_cases = 3
  Number_rules_to_converge = 5

AM extracted rules from the training data set and the extracted rules were used to classify the test data. Following
are some of the rules extracted by the algorithm:

For class-1 of the tool wear data set:

  x1 <= 372

This rule says that if the value of speed (x1) is less than or equal to 372, then the sample belongs to class-1.

For class-2 of the tool wear data set:

  x1 > 393 and x2 <= 17 and x5 > 307

This rule says that if the value of speed (x1) is greater than 393, the value of feed (x2) is less than or equal to 17
and the value of radial force (x5) is greater than 307, then the sample belongs to class-2.

Simulations and Results

To evaluate the performance, the data set is used to arrive at the classification matrix, which is of size n × n, where
n is the number of classes. A typical entry qij in the classification matrix shows how many samples belonging to class
i have been classified into class j. For a perfect classifier, the classification matrix is diagonal. However, due to
misclassification we get off-diagonal elements. The individual efficiency of class i is defined as

  q_{ii} / \sum_{j} q_{ij}    ... (2)

where the sum runs over all classes j, i.e. over row i of the matrix. The overall efficiency is defined as

  \left( \sum_{i} q_{ii} \right) / N    ... (3)

where N is the total number of elements in the data set.

GP simulation and classification

Initially, GP learns from the training data set and evolves the GPCE. The classification matrices obtained after
applying the GPCE to the training and testing data are shown in Tables 2 and 3 respectively. From the classification
matrix for the training data we can see that samples belonging to class-1 and class-2 are classified without any
misclassifications, and hence each class has an individual efficiency of 100%. The overall classification efficiency is
therefore also 100%. The overall classification efficiency obtained for the training data is a measure of the relevance
of the GPCE extracted.

Next, the GPCE is applied to the testing data set and the efficiencies are evaluated. As we can see from the
classification matrix generated for the testing data (Table 3), two of the samples belonging to class-2 are
misclassified as class-1, but the overall efficiency is still 97.53%.

Purushothaman et al. [1] applied a NN to solve this problem, and the classification accuracy obtained in their approach
is 96.36%. We can observe that the classification accuracy obtained in the GP approach is comparable to that of the NN
approach.

Table 2: Classification matrix of tool wear monitoring training data set by GP algorithm

           Class-1   Class-2   Individual efficiency
Class-1    21        0         100%
Class-2    0         11        100%
Overall efficiency = 100%

Table 3: Classification matrix of tool wear monitoring testing data set by GP algorithm

           Class-1   Class-2   Individual efficiency
Class-1    66        0         100%
Class-2    2         13        86.67%
Overall efficiency = 97.53%

Ant-miner simulation and classification

The classification matrices obtained after applying the rules derived by AM to the training and testing data are shown
in Tables 4 and 5 respectively. From the classification matrix for the training data we can see that samples belonging
to class-1 are classified without any misclassifications, and hence class-1 has an individual efficiency of 100%. For
class-2, a single case is misclassified as class-1. Hence class-2 has an individual efficiency of 90.9%. However, the
overall classification efficiency is still 96.87%. The overall classification efficiency obtained for the training data
is a measure of the relevance of the rules extracted.

Table 4: Classification matrix of tool wear monitoring training data set by AM algorithm

           Class-1   Class-2   Individual efficiency
Class-1    21        0         100%
Class-2    1         10        90.9%
Overall efficiency = 96.87%

Table 5: Classification matrix of tool wear monitoring testing data set by AM algorithm

           Class-1   Class-2   Individual efficiency
Class-1    66        0         100%
Class-2    4         11        73.33%
Overall efficiency = 95.06%

Next, the rules extracted are applied to the testing data set and the efficiencies are evaluated (Table 5). As we can
see from the classification matrix generated for the testing data, there are some
misclassifications between the two classes, but the overall efficiency of 95.06% is almost the same as that obtained
for the training data set.

Conclusions

In this paper, nature-inspired techniques, namely genetic programming and ant-miner, are used to solve a pattern
classification problem that arises in tool wear monitoring. These techniques evolve a mathematical model or a rule base
that expresses the input-output relationship explicitly. This approach is better than other approaches such as NN in
the sense that it gives an insight into the knowledge contained in the data set. Also, the GPCE and the AM-extracted
rules may be used in developing a rule-based expert system.

References

1   Purushothaman S & Srinivasa Y G, Int J Prod Res, 36 (1998) 635-651.
2   Anderson K, Cook G E, Kasai G & Ramaswamy K, IEEE Trans Ind Appl, 26 (1990) 824-830.
3   Cook G E, Barnett R J, Anderson K & Strauss A M, IEEE Trans Ind Appl, 31 (1995) 1484-1491.
4   Koza J R, Genetic Programming: On the Programming of Computers by Means of Natural Selection, (M I T Press,
    Cambridge, USA), 1992.
5   Kishore J K, Patnaik L M, Mani V & Agarwal V K, IEEE Trans Evol Comput, 4 (2000) 242-258.
6   Suresh S, Omkar S N, Mani V & Menaka C, J Aerospace Sci Technol, 56 (2004) 26-41.
7   Dorigo M & Blum C, Theor Comput Sci, 344 (2005) 243-278.
8   Parpinelli R S, Lopes H S & Freitas A A, IEEE Trans Evol Comput, 6 (2002) 321-332.
9   Omkar S N & Raghavendra K U, IEEE Int Conf Industrial Technology, (2006) 1559-1562.
10  Omkar S N & Raghavendra T R, Eng Appl Artif Intell, 21 (2008) 1381-1388.
11  Back T & Schwefel H P, Evolut Comput, 1 (1993).
12  Yao X E, Evolutionary Computation: Theory and Applications, (World Scientific, Singapore), (1999).
13  Haykin S, Neural Networks: A Comprehensive Foundation, 2nd ed, (New York), 1994.
14  Eberhart R & Kennedy J, A new optimizer using particle swarm theory, in Proc Int Sym Micro Machine and Human
    Science, Japan, 1995.
15  Eberhart R & Kennedy J, Particle swarm optimization, in Proc IEEE Int Conf Neural Networks, 1995.
16  Dimla D E Jr, Lister P M & Leighton N J, Int J Mach Tools Manufact, 37 (1997) 1219-1241.
17  Golz H U, Schillo E, Wolf A, Kaufeld M, Sprengel P, Johannsen P & Heinek D, Bewertung von
    Werkzeugüberwachungssystemen aus Sicht der Anwender, in: Überwachung von Zerspan- und Umformprozessen,
    Düsseldorf, VDI-Verlag, (1995) 309-317.

```
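
The GPCE quoted in the paper can be checked mechanically against its algebraic form and against the ten rows of Table 1. The sketch below is not the authors' code: it transcribes the LISP S-expression as nested Python tuples, evaluates it recursively, and compares the result with the simplified expression obtained by expanding the tree by hand. The names `evaluate`, `closed_form`, `classify_gpce` and `classify_am` are illustrative, and plain division is assumed for `DIV` (GP systems often use a protected division, which the paper does not specify). The two Ant-Miner rules quoted in the paper are applied in the order given; since the paper lists only some of the extracted rules, a sample that matches neither rule is returned as `None`.

```python
import math
import operator

# Table 1 rows: (x1 speed, x2 feed, x3 depth of cut,
#                x4 axial force, x5 radial force, x6 tangential force, class)
TABLE_1 = [
    (450, 10, 15, 150, 115, 150, 1),
    (450, 10, 50, 60, 50, 115, 1),
    (450, 10, 200, 180, 130, 450, 1),
    (350, 10, 50, 60, 90, 125, 1),
    (300, 6, 50, 45, 80, 70, 1),
    (450, 10, 150, 750, 650, 500, 2),
    (400, 10, 50, 175, 350, 140, 2),
    (450, 10, 150, 240, 850, 620, 2),
    (456, 10, 100, 550, 590, 430, 2),
    (450, 10, 200, 1100, 1200, 840, 2),
]

# Assumption: DIV is ordinary division (no protection against x4 == 0).
OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.truediv}

# Shared subtree (SUB (ADD 20 x4) (SUB -44 2)), which simplifies to x4 + 66.
X4_PLUS_66 = ("SUB", ("ADD", 20, "x4"), ("SUB", -44, 2))

# The evolved S-expression, transcribed as (operator, left, right) tuples.
GPCE = (
    "MUL",
    ("SUB",
     ("MUL", X4_PLUS_66, "x4"),
     ("ADD", ("SUB", ("DIV", "x3", 21), ("DIV", -75, -111)), 13)),
    ("DIV",
     ("ADD",
      ("ADD", ("ADD", "x3", "x1"), ("ADD", 32, -29)),
      ("SUB", ("DIV", X4_PLUS_66, -2), "x5")),
     "x4"),
)


def evaluate(node, env):
    """Evaluate an (op, left, right) tuple, a variable name, or a constant."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]
    return node


def closed_form(x1, x3, x4, x5):
    """Hand-simplified form of the S-expression (x2 and x6 do not appear)."""
    first = (2 * x1 + 2 * x3 - 60 - x4 - 2 * x5) / (2 * x4)
    second = x4 ** 2 + 66 * x4 - x3 / 21 - (13 - 75 / 111)  # 12.3243...
    return first * second


def classify_gpce(row):
    """Class-1 if GPCE(x) >= 0, otherwise class-2."""
    env = dict(zip(("x1", "x2", "x3", "x4", "x5", "x6"), row[:6]))
    return 1 if evaluate(GPCE, env) >= 0 else 2


def classify_am(row):
    """Apply the two quoted Ant-Miner rules in order; None if neither fires."""
    x1, x2, x5 = row[0], row[1], row[4]
    if x1 <= 372:
        return 1
    if x1 > 393 and x2 <= 17 and x5 > 307:
        return 2
    return None


gpce_pred = [classify_gpce(r) for r in TABLE_1]
am_pred = [classify_am(r) for r in TABLE_1]
for row, g, a in zip(TABLE_1, gpce_pred, am_pred):
    print(row[:6], "actual:", row[6], "GPCE:", g, "AM:", a)
```

On these ten rows the sign of the GPCE reproduces every listed class. The two quoted Ant-Miner rules classify rows 4 to 10 correctly and leave rows 1 to 3 uncovered, which is consistent with the paper quoting only part of the rule list.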
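
The individual and overall efficiencies reported in Tables 2 to 5 can be recomputed from the classification matrices alone. The following is a minimal sketch, assuming that entry `matrix[i][j]` counts samples of class i+1 assigned to class j+1, and that the individual efficiency divides each diagonal entry by the number of samples that truly belong to that class (the row total), which is the normalisation that reproduces the tabulated percentages. The function name `efficiencies` and the dictionary keys are illustrative.

```python
def efficiencies(matrix):
    """Return (per-class efficiencies, overall efficiency) for a square
    classification matrix whose rows are the true classes."""
    n = len(matrix)
    # Individual efficiency: correct count for class i over all samples of class i.
    individual = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # Overall efficiency: all correct counts over the total number of samples.
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    return individual, overall


# Counts copied from Tables 2 to 5 of the paper.
TABLES = {
    "Table 2 (GP, training)": [[21, 0], [0, 11]],
    "Table 3 (GP, testing)": [[66, 0], [2, 13]],
    "Table 4 (AM, training)": [[21, 0], [1, 10]],
    "Table 5 (AM, testing)": [[66, 0], [4, 11]],
}

for name, matrix in TABLES.items():
    individual, overall = efficiencies(matrix)
    per_class = ", ".join(
        f"class-{i + 1}: {100 * e:.2f}%" for i, e in enumerate(individual)
    )
    print(f"{name}: {per_class}; overall: {100 * overall:.2f}%")
```

The recomputed values agree with the tabulated ones up to rounding in the last digit; for example, the overall training efficiency for AM is 31/32 = 96.875%, which the paper reports as 96.87%.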