USE OF NEW FEATURES FOR ENHANCING THE PERFORMANCE OF FAULT
DIAGNOSIS SYSTEMS FOR CHEMICAL PROCESSES
Isaac Monroy, Gerard Escudero, Moisès Graells.
EUETIB, UPC, Comte d'Urgell 187. Barcelona E-08036, Spain. +34 934137459
{isaac.monroy, gerard.escudero, moisés.graells}@upc.edu
Process and Product Engineering
Fault diagnosis systems (FDS) are among the most effective means of preventing industrial accidents, in
particular the data-based fault diagnosis methods applied to chemical processes. A multilabel (ML) approach is
adopted here because it gives the best representation, for each class or fault, of the training information
obtained from on-line process data. Support Vector Machines (SVM) are used as the learning algorithm because of
their good generalization bounds, their tolerance to noise and outliers, and their proven efficiency on ML
problems in other areas. An FDS has been developed following the ML&SVM approach and validated through its
application to a heat exchanger system operating batchwise, which requires hybrid-dynamic modelling. In
addition, new features were determined and used to improve the information representation and the performance
of the system.
Many experiments, simulating a base case and a series of cases with abnormal conditions, were carried out, and
the data acquired were arranged into training and test sets in order to detect the occurrence of the induced
faults during the experiments and to diagnose each of them by means of the FDS. The base case was the heat
exchange between water at medium temperature and a recirculating stream from a bath with a given water/ice
mixture. All experiments were stopped when the bath temperature rose above 1.5 °C. The remaining experiments
departed from the base case by simulating four different faults: external heating of the cold water (Fault or
Class 1, C1), failure of the cold stream inlet temperature monitoring (C2), failure of the cold stream flow
monitoring (C3), and a combination of two anomalies, namely the simultaneous heating of the bath and the
detection of the same cold temperature (C4). Seven process variables were measured: operation time, hot stream
inlet and outlet temperatures, cold stream inlet and outlet temperatures, and stream flow rates.
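One common way to set up a multilabel scheme over the base case and the four faults is to give each class its own binary target vector, so that one binary classifier can be trained per class. The sketch below illustrates that encoding only; the function name `binary_relevance_targets`, the +1/-1 convention and the sample labels are assumptions made for illustration, not details taken from the study.

```python
def binary_relevance_targets(labels, classes=(0, 1, 2, 3, 4)):
    """Return one +1/-1 target vector per class (assumed encoding).

    labels  : class index of each sample (0 = base case, 1..4 = faults)
    classes : class indices for which a binary problem is built
    """
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


# Hypothetical labels for six samples, used only to show the encoding.
sample_labels = [0, 1, 4, 2, 2, 3]
targets = binary_relevance_targets(sample_labels)

for c in sorted(targets):
    print(f"class {c}: {targets[c]}")
```

Each entry of `targets` could then serve as the training target of one binary SVM, with the per-class outputs combined at diagnosis time.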
The information was represented as five data matrices, four for the faults and one for the base case (class 0).
The columns represent the attributes or process variables (7) and the rows the measurements of these variables
over time. Each source data set has a different number of samples, so the number of samples per class used for
applying SVM is that of the smallest data set (910 samples per class). The original sample set is composed of
random samples covering the five classes. This set is divided into two sets of equal size (455 samples per
class), which correspond to the training and test sets.
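The balancing and splitting step can be sketched as follows. This is a minimal illustration assuming that samples are drawn at random without replacement within each class and that the first half of each draw becomes the training set; the function name `balanced_split`, the seed and the raw class sizes other than the minimum of 910 are hypothetical.

```python
import random


def balanced_split(class_sizes, n_per_class=910, seed=0):
    """Pick n_per_class random sample indices per class and halve them.

    class_sizes : dict mapping class label -> number of available samples
    Returns two dicts (train, test) mapping class label -> list of indices.
    """
    rng = random.Random(seed)
    half = n_per_class // 2
    train, test = {}, {}
    for label, size in class_sizes.items():
        if size < n_per_class:
            raise ValueError(f"class {label} has only {size} samples")
        picked = rng.sample(range(size), n_per_class)  # without replacement
        train[label] = picked[:half]
        test[label] = picked[half:]
    return train, test


# Hypothetical raw sizes; only the smallest value (910) comes from the text.
sizes = {0: 910, 1: 1200, 2: 1050, 3: 980, 4: 1100}
train_idx, test_idx = balanced_split(sizes)
print({c: (len(train_idx[c]), len(test_idx[c])) for c in sizes})
```

Capping every class at the size of the smallest data set keeps the five classes equally represented in both halves.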
SVM with a linear kernel and the default soft margin value were applied to the data sets. The diagnosis
performance is measured using the normalized F1 index because it combines the precision and recall concepts. In
addition, a methodology based on attribute extension was applied to improve the fault diagnosis performance.
Features that are not included in the process measurements may enhance the characterization of the dynamic
behaviour of the process. These new features are the standard deviation and the slope of the data over a time
window of 20 samples, which produce an expanded data set that could provide valuable information to the
learning algorithm and improve the fault diagnosis. Moreover, tests with different types of kernel were carried
out in order to improve the diagnosis performance. Table 1 shows the F1 index for each class and three types of
kernel (linear, 3rd-degree polynomial and radial). Class 0 corresponds to the base situation and the others to
the different simulated faults. Original attributes correspond to the measured process variables.
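The attribute extension can be illustrated with a short sketch that appends, for every measured variable, its standard deviation and slope over the last 20 samples. Several details here are assumptions rather than facts from the study: the window is taken as trailing, the standard deviation is the sample (n-1) form, the slope is a least-squares fit against the sample index, and the function names and synthetic data are invented for the example.

```python
import math


def window_std(values):
    """Sample standard deviation (n-1 denominator) of one window."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def window_slope(values):
    """Least-squares slope of the values against sample index 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def extend_features(rows, window=20):
    """Append per-variable window std and slope to each sample.

    rows : list of samples, each a list of measured variables.
    The first window-1 samples are dropped because they lack a full window.
    """
    n_vars = len(rows[0])
    extended = []
    for t in range(window - 1, len(rows)):
        block = rows[t - window + 1 : t + 1]
        columns = [[r[j] for r in block] for j in range(n_vars)]
        stds = [window_std(c) for c in columns]
        slopes = [window_slope(c) for c in columns]
        extended.append(list(rows[t]) + stds + slopes)
    return extended


# Synthetic data: 7 variables, variable j is a ramp with slope 0.5 * j.
rows = [[0.5 * j * t for j in range(7)] for t in range(30)]
ext = extend_features(rows)
print(len(ext), "samples with", len(ext[0]), "features each")
```

With 7 measured variables, this layout gives 7 original values, 7 standard deviations and 7 slopes per sample, that is, 21 features.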
Table 1. F1 index (%) per class for each feature set and kernel type.
Features and kernel type                         Class0  Class1  Class2  Class3  Class4  Mean1  Mean2
Original attributes, linear kernel                 99.8   100      64.4    99.7    90.1   84.5   88.6
Original + std dev, linear kernel                  99.4   100      81.5    99.7    98.5   93.7   94.9
Original + slopes, linear kernel                   81.0   100      37.6    99.7    98.8   81.7   84.0
Original + std dev + slopes, linear kernel         99.4   100      96.0    99.7    93.6   97.0   97.3
Original attributes, 3rd-degree polynomial kernel  87.0    63.0    39.0    99.8    56.0   55.0   64.0
Original attributes, radial kernel                 42.9     0       0       0       0      0      0
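For reference, a per-class F1 value of the kind reported in the table can be computed from true and predicted labels as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_per_class` and the toy labels are assumptions, and it returns 0 whenever a class has no true positives.

```python
def f1_per_class(true_labels, predicted_labels, target):
    """F1 index (%) of one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0  # no correct detections of this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


# Toy labels, invented for the example (not data from the experiments).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = {c: f1_per_class(y_true, y_pred, c) for c in (0, 1, 2)}
print(scores)
```

Because F1 penalises both missed faults and false alarms, a class with high recall but many false alarms still scores low.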
The best fault diagnosis is obtained using 21 features (the original process variables, their standard
deviations and their slopes) and a linear kernel, although this could change with other process information or
other kinds of processes. Hence, the good capability of the ML&SVM approach for diagnosing faults is
demonstrated for a lab-scale case study.
Acknowledgements: Financial support received from the Generalitat de Catalunya (FI programs) is gratefully
acknowledged.
This work presents the development of a Fault Diagnosis System (FDS) following a Multilabel approach with SVM
(ML&SVM), validated through its application to a heat exchanger operating in batch mode, which requires
hybrid-dynamic modelling. In addition, new attributes were obtained and used to improve the information
representation and the performance of the system.
The results obtained for the diagnosis of the classes and faults induced in this system are close to 100%,
which demonstrates the capability of the ML&SVM system for fault diagnosis in a lab-scale batch case study.