International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480 (Print), ISSN 0976 – 6499 (Online), Volume 4, Issue 3, April 2013, pp. 93-100, © IAEME: www.iaeme.com/ijaret.asp. Journal Impact Factor (2013): 5.8376 (Calculated by GISI), www.jifactor.com

TRAINING THE NEURAL NETWORK USING LEVENBERG-MARQUARDT’S ALGORITHM TO OPTIMIZE THE EVACUATION TIME IN AN AUTOMOTIVE VACUUM PUMP

Vijayashree1*, Kolla Bhanu Prakash2 and T.V. Ananthan3
1, 2, 3 Department of Computer Science and Engineering, Dr. MGR Educational and Research Institute University, Maduravoyal, Chennai 600 095, India

ABSTRACT
Neural networks have been used for engine computations in the recent past. One reason for using neural networks is to capture the accuracy of experimental data while saving computational time, so that system simulations can be performed within a reasonable time frame. The main aim of this study is to optimize and arrive at a design base for a vacuum pump in an automotive engine using the Levenberg-Marquardt (LM) algorithm for Artificial Neural Networks (ANN). Design bases are created from previous products and by benchmarking. Effortless brake application is a preferred comfort feature in automotive applications. To provide an easy and effective feeling, the braking mechanism needs to be assisted with external energy. The vacuum pump that supplies this assistance is optimized using a neural network trained with the LM algorithm to arrive at the optimum evacuation time.

Index Terms: automotive engine, braking system, evacuation time, Levenberg-Marquardt’s (LM) Algorithm, neural networks, vacuum pump.

I. INTRODUCTION
Effortless brake application is a preferred comfort feature in automotive applications.
To provide an easy and effective feeling, the braking mechanism needs to be assisted with external energy. A vane-type vacuum pump serves exactly this purpose: it produces vacuum by evacuating the air in the vacuum booster. This vacuum is used to actuate the booster for the power brakes in diesel-powered and Gasoline Direct Injection (GDI) automobiles. The capacity of the vacuum pump varies with the weight and brake booster capacity of the vehicle. Therefore, it is necessary to have a design base with a proven technique, which will serve as a basis for faster product development.

Neural networks and other machine learning algorithms are increasingly being used for engine applications [1]. These applications can be categorized as either real-time control/diagnostic methods or predictive tools for design purposes. Some applications have even moved downstream of the engine [2]. The present work aims to use the neural network technique with the LM algorithm to arrive at the appropriate evacuation time, which is a critical parameter. The particular task selected is to minimize the evacuation time in a vane-type vacuum pump. The dataset consists of the results of experiments conducted at UCAL Fuel Systems Ltd., Chennai.

II. VACUUM PUMP
The vane-type vacuum pump has a unique profile in which an eccentrically mounted rotor rotates the vane, as shown in Fig. 1. The movement of the vanes creates a pressure difference, which creates vacuum in the brake booster. Air enters the pump through the inlet check valve assembly. Oil is circulated inside the pump to lubricate the rotating parts and to maintain sealing between the high-pressure and low-pressure regions [3, 4, 5]. The air and oil mixture is then expelled from the pump through the reed valve.
The performance of the pump is specified by the evacuation time of a specified tank volume [3]:

$$t = \frac{V_t}{Q} \ln\left(\frac{p_1}{p_2}\right)$$

where $V_t$ is the tank volume, $Q$ is the pumping speed (volumetric flow rate) of the pump, $p_1$ is atmospheric pressure and $p_2$ is the required pressure.

The vane-type vacuum pump is used to produce vacuum by evacuating the air in the vacuum booster. This vacuum is used to actuate the booster for the power brakes in diesel-powered and GDI automobiles. The capacity of the vacuum pump varies with the weight and brake booster capacity of the vehicle. Therefore, it is necessary to have a design base with a proven technique, which will serve as a basis for faster product development. The results obtained from the existing pump were used to train the ANN with the LM algorithm to create the design base for future designs. Figure 1 shows the vacuum pump of capacity 110 cc.

Fig. 1 Photograph of vacuum pump of capacity 110 cc

III. LEVENBERG-MARQUARDT’S ALGORITHM
The LM algorithm is an iterative technique that locates a local minimum of a multivariate function expressed as the sum of squares of several non-linear, real-valued functions. It has become a standard technique for nonlinear least-squares problems, widely adopted in various disciplines for data-fitting applications. LM can be thought of as a combination of steepest descent and the Gauss-Newton method. When the current solution is far from a local minimum, the algorithm behaves like a steepest descent method: slow, but guaranteed to converge. When the current solution is close to a local minimum, it becomes a Gauss-Newton method and exhibits fast convergence.

Input: A vector function $f: \mathbb{R}^m \to \mathbb{R}^n$ with $n \ge m$, a measurement vector $x \in \mathbb{R}^n$ and an initial parameter estimate $p_0 \in \mathbb{R}^m$.
Output: A vector $p^+ \in \mathbb{R}^m$ minimizing $\|x - f(p)\|^2$.

Algorithm:
    k := 0; v := 2; p := p0;
    A := J^T J; ε_p := x − f(p); g := J^T ε_p;
    stop := (‖g‖∞ ≤ ε1); µ := τ · max_{i=1,…,m}(A_ii);
    while (not stop) and (k < kmax)
        k := k + 1;
        repeat
            Solve (A + µI) δ_p = g;
            if (‖δ_p‖ ≤ ε2 ‖p‖)
                stop := true;
            else
                p_new := p + δ_p;
                ρ := (‖ε_p‖² − ‖x − f(p_new)‖²) / (δ_p^T (µ δ_p + g));
                if ρ > 0
                    p := p_new;
                    A := J^T J; ε_p := x − f(p); g := J^T ε_p;
                    stop := (‖g‖∞ ≤ ε1);
                    µ := µ · max(1/3, 1 − (2ρ − 1)³); v := 2;
                else
                    µ := µ · v; v := 2 · v;
                endif
            endif
        until (ρ > 0) or (stop)
    endwhile
    p+ := p;

The above is the Levenberg-Marquardt nonlinear least-squares algorithm. $\rho$ is the gain ratio, defined as the ratio of the actual reduction in the error $\|\epsilon_p\|^2$ that corresponds to a step $\delta_p$ to the reduction predicted for $\delta_p$ by the linear model of Eq. (1). See text and [6, 7] for details. When LM is applied to the problem, the solution of the linear system $(A + \mu I)\delta_p = g$ (the operation enclosed in a rectangular box in the original listing) is carried out by taking into account the sparse structure of the corresponding Hessian matrix $A$.

In the following, $^T$ denotes transposition, and $\|\cdot\|$ and $\|\cdot\|_\infty$ denote the 2-norm and the infinity norm, respectively. Let $f$ be an assumed functional relation which maps a parameter vector $p \in \mathbb{R}^m$ to an estimated measurement vector $\hat{x} = f(p)$, $\hat{x} \in \mathbb{R}^n$. An initial parameter estimate $p_0$ and a measured vector $x$ are provided, and it is desired to find the vector $p^+$ that best satisfies the functional relation $f$ locally, i.e. minimizes the squared distance $\epsilon^T\epsilon$ with $\epsilon = x - \hat{x}$ for all $p$ within a sphere having a certain, small radius.

The basis of the LM algorithm is a linear approximation to $f$ in the neighborhood of $p$. Denoting by $J$ the Jacobian matrix $\partial f(p) / \partial p$, a Taylor series expansion for a small $\|\delta_p\|$ leads to the following approximation:

$$f(p + \delta_p) \approx f(p) + J\delta_p \qquad (1)$$

Like all non-linear optimization methods, LM is iterative. Initiated at the starting point $p_0$, it produces a series of vectors $p_1, p_2, \ldots$ that converge towards a local minimizer $p^+$ for $f$. Hence, at each iteration, it is required to find the step $\delta_p$ that minimizes the quantity

$$\|x - f(p + \delta_p)\| \approx \|x - f(p) - J\delta_p\| = \|\epsilon - J\delta_p\| \qquad (2)$$

The sought $\delta_p$ is thus the solution to a linear least-squares problem: the minimum is attained when $J\delta_p - \epsilon$ is orthogonal to the column space of $J$. This leads to $J^T(J\delta_p - \epsilon) = 0$, which yields the Gauss-Newton step $\delta_p$ as the solution of the so-called normal equations:

$$J^T J \delta_p = J^T \epsilon \qquad (3)$$

Ignoring the second derivative terms, matrix $J^T J$ in Eq. (3) approximates the Hessian of $\frac{1}{2}\epsilon^T\epsilon$ [18]. Note also that $J^T\epsilon$ is along the steepest descent direction, since the gradient of $\frac{1}{2}\epsilon^T\epsilon$ is $-J^T\epsilon$. The LM method actually solves a slight variation of Eq. (3), known as the augmented normal equations:

$$N\delta_p = J^T\epsilon, \quad \text{with } N \equiv J^T J + \mu I \text{ and } \mu > 0 \qquad (4)$$

where $I$ is the identity matrix. The strategy of altering the diagonal elements of $J^T J$ is called damping, and $\mu$ is referred to as the damping term. If the updated parameter vector $p + \delta_p$, with $\delta_p$ computed from Eq. (4), leads to a reduction in the error $\epsilon^T\epsilon$, the update is accepted and the process repeats with a decreased damping term. Otherwise, the damping term is increased, the augmented normal equations are solved again and the process iterates until a value of $\delta_p$ that decreases the error is found. The process of repeatedly solving Eq.
(4) for different values of the damping term until an acceptable update to the parameter vector is found corresponds to one iteration of the LM algorithm.

In LM, the damping term is adjusted at each iteration to assure a reduction in the error. If the damping is set to a large value, matrix $N$ in Eq. (4) is nearly diagonal and the LM update step $\delta_p$ is near the steepest descent direction $J^T\epsilon$. Moreover, the magnitude of $\delta_p$ is reduced in this case, ensuring that excessively large Gauss-Newton steps are not taken. Damping also handles situations where the Jacobian is rank deficient and $J^T J$ is therefore singular [4]. The damping term can be chosen so that matrix $N$ in Eq. (4) is nonsingular and, therefore, positive definite, thus ensuring that the $\delta_p$ computed from it is in a descent direction. In this way, LM can defensively navigate a region of the parameter space in which the model is highly nonlinear. If the damping is small, the LM step approximates the exact Gauss-Newton step. LM is adaptive because it controls its own damping: it raises the damping if a step fails to reduce $\epsilon^T\epsilon$; otherwise it reduces the damping. By doing so, LM is capable of alternating between a slow descent approach when far from the minimum and fast, quadratic convergence in the neighborhood of the minimum [8].

The LM algorithm terminates when at least one of the following conditions is met:
1. The magnitude of the gradient drops below a threshold $\varepsilon_1$.
2. The relative change in the magnitude of $\delta_p$ drops below a threshold $\varepsilon_2$.
3. A maximum number of iterations $k_{max}$ is reached.

The complete LM algorithm is shown in the above pseudocode; more details regarding it can be found in [6]. The initial damping factor is chosen equal to the product of a parameter $\tau$ with the maximum element on the main diagonal of $J^T J$. Indicative values for the user-defined parameters are $\tau = 10^{-3}$, $\varepsilon_1 = \varepsilon_2 = 10^{-2}$, $k_{max} = 100$.

IV. METHODOLOGY OF NEURAL NETWORKS IN VACUUM PUMP PERFORMANCE OPTIMIZATION
The performance of the vacuum pump is determined by the time required to evacuate air from the reservoir. This time depends on parameters such as temperature, oil pressure and rotation speed. Vacuum pump development requires a procedure for developing a pump of any capacity based on the customer requirement. In the first stage, training, the inputs and the desired outputs are given to the NN. The weights are modified to minimize the error between the NN predictions and the expected outputs. Different types of learning algorithms have been developed, but the most common and robust one is back-propagation. The goal of the training is to minimize the error, and consequently to optimize the NN solution. Each iterative step in which the weights are recalculated is called an epoch. When the minimum is achieved, the weights are fixed and the training process ends. Once a neural network has been trained to a satisfactory level, it may be used as a predictive tool for new data. To do this, only the inputs are given to the NN, and the NN predicted outputs are calculated using the previously obtained error-minimizing weights.

V. RESULTS AND DISCUSSION
The dataset used was obtained from UCAL Fuel Systems Ltd, Chennai. There were 4 sets of training data, each set corresponding to a different combination of pump and tank capacity, speed, pressure and evacuation time. There were 21x6 training data points and 4 input features. The target values were the 21x6 evacuation time values, normalized by the minimum possible evacuation time. There were 10 such sets for testing too.
No tuning set needed to be extracted from the training data: because of the large number of training data points, the training error as well as the tuning-set error decreased asymptotically beyond a few hundred epochs, and early stopping did not occur. The MATLAB neural network toolbox was used to build the baseline neural networks. The Levenberg-Marquardt algorithm [9, 10] was used with the back-propagation algorithm. The chosen configuration had twenty-five hidden layers with an optimal 10 neurons having a sigmoid activation function, and an output layer having ten neurons with a linear activation function. The Nguyen-Widrow method was used to initialize the weights. Evacuation time predictions were made using this configuration (baseline case).

The reasons to incorporate a physical model into a neural network are:
1. To make the network more robust. Even if confronted with a set of conditions very different from those encountered in the training data, the network should output realistic results.
2. To reduce dependence on training data, i.e. to enable the network to form a reasonable hypothesis from small datasets.
3. To improve the prediction accuracy.

Table 1 Experimental data for tank capacity 100 cc and pump capacity 3 cc

Temperature    Speed 400    Speed 1000    Speed 1500    Speed 2300
50             3.47         1.97          1.7           1.61
90             3.53         1.98          1.8           1.7
120            3.92         2.08          1.8           1.75
150            4.77         2.16          1.17          1.72

Table 2 ANN result for tank capacity 100 cc and pump capacity 3 cc (hidden layers: 25); entries are evacuation times

Temperature    Speed 400    Speed 1000    Speed 1500    Speed 2300
50             3.47912      1.7302        1.9189        1.60273
90             3.53071      1.98974       1.32223       1.67414
120            3.90548      2.18308       0.84523       1.73175
150            4.90085      1.78111       2.24074       1.67527

The reported error is the mean square error over normalized evacuation time values. It is always the test error, unless otherwise mentioned.
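As a rough illustration of the error measure just described, the following Python sketch computes a mean square error over normalized evacuation times using the values of Tables 1 and 2. It is only a sketch under stated assumptions: the helper names (`flatten`, `normalized_mse`) are hypothetical, and the smallest experimental value in Table 1 is assumed to stand in for the "minimum possible evacuation time", which the text does not specify. It does not reproduce the MATLAB toolbox computation.

```python
# Evacuation times from Table 1 (experimental) and Table 2 (ANN output).
# Rows: temperatures 50, 90, 120, 150; columns: speeds 400, 1000, 1500, 2300.
EXPERIMENTAL = [
    [3.47, 1.97, 1.7, 1.61],
    [3.53, 1.98, 1.8, 1.7],
    [3.92, 2.08, 1.8, 1.75],
    [4.77, 2.16, 1.17, 1.72],
]
ANN_OUTPUT = [
    [3.47912, 1.7302, 1.9189, 1.60273],
    [3.53071, 1.98974, 1.32223, 1.67414],
    [3.90548, 2.18308, 0.84523, 1.73175],
    [4.90085, 1.78111, 2.24074, 1.67527],
]


def flatten(table):
    """Turn a list of rows into one flat list of values."""
    return [value for row in table for value in row]


def normalized_mse(targets, outputs, reference):
    """Mean square error after dividing both series by `reference`."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    squared = [((o - t) / reference) ** 2 for t, o in zip(targets, outputs)]
    return sum(squared) / len(squared)


targets = flatten(EXPERIMENTAL)
outputs = flatten(ANN_OUTPUT)
# Assumption: the smallest experimental value replaces the unspecified
# "minimum possible evacuation time" used for normalization in the paper.
reference = min(targets)
mse = normalized_mse(targets, outputs, reference)
print(f"normalized MSE over {len(targets)} points: {mse:.4f}")
```

Dividing by a single reference value rescales the error without changing which points contribute most to it, so the largest contributions still come from the table cells where the ANN output and the experiment disagree most.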
It was noticed from error plots that most of the error occurred over the -0.2396 region (Fig. 2). The other regions had much smaller errors, and this error was therefore chosen for comparison with the three new methods.

Fig. 2 Error histogram

The mean square error of the model output with respect to the target output is a typical measure of neural network performance. However, it was found that there are practical difficulties in establishing acceptance criteria for the mean square error. Therefore a normalised version of the mean square error was implemented. This normalised mean square error used the nearer specification limit concept, modified to encompass the definition of an acceptable percentage error level. Here, the acceptable error was equated to the typical level of propagated error that one would expect from the instrumentation measuring the engine performance. This was consistent with the idea that it is reasonable not to expect a higher standard of inference from the model than one could expect from direct measurement of the engine performance.

The performance values obtained during training are:
    Performance = 0.1601
    trainPerformance = 8.4504e-008
    valPerformance = 0.4123
    testPerformance = 0.2283

During training, the progress is constantly updated in the training window. Of most interest are the performance, the magnitude of the gradient of performance and the number of validation checks. The magnitude of the gradient and the number of validation checks are used to terminate the training. The gradient will become very small as the training reaches a minimum of the performance. If the magnitude of the gradient is less than 1e-5, the training will stop (Fig. 3). This limit can be adjusted by setting the parameter net.trainParam.min_grad.
The number of validation checks represents the number of successive iterations for which the validation performance fails to decrease. If this number reaches 6 (the default value), the training will stop.

Fig. 3 Gradient plot

The performance plot (Fig. 4) shows the value of the performance function versus the iteration number (epochs). It plots training, validation and test performances. The best validation performance is 0.17081 at epoch 1.

Fig. 4 Performance plot

The training state plot shows the progress of other training variables, such as the gradient magnitude and the number of validation checks (Fig. 5). The error histogram plot shows the distribution of the network errors. The regression plot shows a regression between network outputs and network targets.

Fig. 5 Training regression plots

The three axes represent the training, validation and testing data. The dashed line in each axis represents the perfect result, outputs = targets. The solid line represents the best-fit linear regression line between outputs and targets. The R value is an indication of the relationship between the outputs and targets. If R = 1, there is an exact linear relationship between outputs and targets. If R is close to zero, there is no linear relationship between outputs and targets. For this example, the training data indicates a good fit. The validation and test results also show R values that are greater than 0.9. The scatter plot is helpful in showing that certain data points have poor fits. Here, the R values for the training, validation and test sets and for all three combined are 0.083294, 0.13655, 0.80023 and 0.080557, respectively.
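The R value and the best-fit line described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that R is the usual Pearson correlation coefficient between targets and outputs; the function names are hypothetical, and the sample data are only the speed 400 columns of Tables 1 and 2, not the data sets from which the R values reported above were computed.

```python
import math


def _centered_sums(targets, outputs):
    """Return the means and the centered sums used by both helpers below."""
    n = len(targets)
    if n != len(outputs) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return mean_t, mean_o, cov, var_t, var_o


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between targets and outputs."""
    _, _, cov, var_t, var_o = _centered_sums(targets, outputs)
    if var_t == 0 or var_o == 0:
        raise ValueError("R is undefined for a constant series")
    return cov / math.sqrt(var_t * var_o)


def best_fit_line(targets, outputs):
    """Least-squares line: outputs is approximated by slope * targets + intercept."""
    mean_t, mean_o, cov, var_t, _ = _centered_sums(targets, outputs)
    if var_t == 0:
        raise ValueError("slope is undefined for constant targets")
    slope = cov / var_t
    return slope, mean_o - slope * mean_t


# Illustrative data only: the speed 400 columns of Table 1 and Table 2.
targets_400 = [3.47, 3.53, 3.92, 4.77]
outputs_400 = [3.47912, 3.53071, 3.90548, 4.90085]
r_400 = pearson_r(targets_400, outputs_400)
slope_400, intercept_400 = best_fit_line(targets_400, outputs_400)
print(f"R = {r_400:.4f}, fit: outputs = {slope_400:.4f} * targets + {intercept_400:.4f}")
```

A slope near 1 with an intercept near 0 corresponds to the dashed "outputs = targets" line, so comparing the fitted line with that ideal gives a quick check in addition to R itself.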
VI. CONCLUSION
From the results obtained with the Levenberg-Marquardt algorithm, it can be concluded that the algorithm works quite satisfactorily in optimizing the evacuation time in automotive engines. The optimization has been validated and found to be accurate to within 5%. The deviation of the NN-optimized values from the experimental results was also found to be within 5%.

VII. ACKNOWLEDGEMENT
I wish to acknowledge Mr. J. Suresh Kumar, Deputy General Manager of UCAL Fuel Systems Ltd, Chennai, for his help in conducting the experiments, generating the data set for this project and validating the results on their prototype.

REFERENCES
[1] Indranil Brahma, Yongsheng He and Christopher J. Rutland, “Improvement of Neural Network Accuracy for Engine Simulations”, SAE Paper 2003-01-3227.
[2] He, Y. and Rutland, C.J., “Application of Artificial Neural Network for Integration of Advanced Engine Simulation Methods”, Proceedings of the 2000 Fall Technical Conference of the ASME Internal Combustion Engine Division, ICE-Vol. 35-1, 53-64, Paper No. 2000-ICE-304, 2000.
[3] Chambers, A., Fitch, R. K., Halliday, B. S., “Basic Vacuum Technology”, ISBN 0-75-030495-2, 1998.
[4] Nagendiran, S., Sivanantham, R., and Kumar, J., “Improvement of the Performance of Cam-Operated Vacuum Pump for Multi Jet Diesel Engine”, SAE Technical Paper 2009-01-1462, 2009, doi:10.4271/2009-01-1462.
[5] Nagendiran S R, Arun Subramanian, J Suresh Kumar and Ramalingam Sivanantham, “Designing of Automotive Vacuum Pumps - Development of Mathematical Model for Critical Parameters and Optimization using Artificial Neural Networks”, SAE Paper No. 2012-01-0779.
K. Madsen, H. Nielsen, and O. Tingleff, “Methods for Non-Linear Least Squares Problems”, Technical University of Denmark, 2004. Lecture notes, available at http://www.imm.dtu.dk/courses/02611/nllsq.pdf.
[6] Manolis I.A. Lourakis and Antonis A. Argyros, “Is Levenberg-Marquardt the Most Efficient Optimization Algorithm for Implementing Bundle Adjustment?”, Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05), IEEE Computer Society.
[7] J. Dennis and R. Schnabel, “Numerical Methods for Unconstrained Optimization and Nonlinear Equations”, Classics in Applied Mathematics, SIAM Publications, Philadelphia, 1996.
[8] Indranil Brahma, Yongsheng He and Christopher J. Rutland, “Improvement of Neural Network Accuracy for Engine Simulations”, SAE Paper 2003-01-3227.
[9] Hagan, M.T. and Menhaj, M.B., “Training Feedforward Networks with the Marquardt Algorithm”, IEEE Transactions on Neural Networks, Vol. 5, No. 6, pp. 989-993, 1994.
[10] Pallavi H. Agarwal, Prof. Dr. P.M. George and Prof. Dr. L.M. Manocha, “Comparison of Neural Network Models on Material Removal Rate of C-SiC”, International Journal of Design and Manufacturing Technology (IJDMT), Volume 3, Issue 1, 2012, pp. 1 – 10, ISSN Print: 0976 – 6995, ISSN Online: 0976 – 7002.
[11] Dharmendra Kumar Singh, Dr. Moushmi Kar and Dr. A.S. Zadgaonkar, “Analysis of Generated Harmonics Due to Transformer Load on Power System Using Artificial Neural Network”, International Journal of Electrical Engineering & Technology (IJEET), Volume 4, Issue 1, 2013, pp. 81 – 90, ISSN Print: 0976-6545, ISSN Online: 0976-6553.
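As a closing illustration, the LM pseudocode of Section III can be sketched in plain Python. This is a minimal sketch, not the MATLAB toolbox routine used for the reported results: the function names, the dense Gaussian-elimination solver and the two-parameter exponential test model are all assumptions introduced here for illustration, and the synthetic measurements are generated from the model itself rather than taken from the vacuum pump data.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def norm2(v):
    return math.sqrt(sum(c * c for c in v))


def levenberg_marquardt(f, jac, x, p0, tau=1e-3, eps1=1e-2, eps2=1e-2, kmax=100):
    """Follow the pseudocode of Section III; returns (p, iterations used)."""
    m = len(p0)
    p = list(p0)

    def linearize(q):
        # A = J^T J, eps = x - f(q), g = J^T eps
        jm = jac(q)
        eps = [xi - fi for xi, fi in zip(x, f(q))]
        a = [[sum(r[i] * r[j] for r in jm) for j in range(m)] for i in range(m)]
        g = [sum(r[i] * e for r, e in zip(jm, eps)) for i in range(m)]
        return a, eps, g

    a, eps, g = linearize(p)
    err = sum(e * e for e in eps)
    mu = tau * max(a[i][i] for i in range(m))
    v = 2.0
    stop = max(abs(gi) for gi in g) <= eps1
    k = 0
    while not stop and k < kmax:
        k += 1
        while True:  # "repeat ... until (rho > 0) or (stop)"
            damped = [[a[i][j] + (mu if i == j else 0.0) for j in range(m)]
                      for i in range(m)]
            delta = solve_linear(damped, g)
            if norm2(delta) <= eps2 * norm2(p):
                stop = True
                break
            p_new = [pi + di for pi, di in zip(p, delta)]
            err_new = sum((xi - fi) ** 2 for xi, fi in zip(x, f(p_new)))
            rho = (err - err_new) / sum(d * (mu * d + gi) for d, gi in zip(delta, g))
            if rho > 0:  # step accepted: relinearize and relax the damping
                p, err = p_new, err_new
                a, eps, g = linearize(p)
                stop = max(abs(gi) for gi in g) <= eps1
                mu *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
                v = 2.0
                break
            mu *= v  # step rejected: increase the damping and retry
            v *= 2.0
    return p, k


# Hypothetical test problem: y = a * exp(b * s) with true parameters (2, -1.5).
XS = [0.0, 0.5, 1.0, 1.5, 2.0]


def model(p):
    return [p[0] * math.exp(p[1] * s) for s in XS]


def model_jacobian(p):
    return [[math.exp(p[1] * s), p[0] * s * math.exp(p[1] * s)] for s in XS]


measurements = model([2.0, -1.5])
p_fit, iterations = levenberg_marquardt(
    model, model_jacobian, measurements, [1.0, -1.0], eps1=1e-10, eps2=1e-10
)
print(p_fit, iterations)
```

The default thresholds mirror the indicative values quoted in Section III; the demonstration call passes tighter thresholds so that the recovered parameters can be compared closely with the values used to generate the synthetic measurements.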