VIEWS: 10 PAGES: 8 POSTED ON: 4/12/2010 Public Domain
An Application of Parallel Genetic Programming for Parameter Estimation of Bio-reactor Srinophakun, T., Apinyavisit, S. and Lotangtrakun, P. Department of Chemical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, 10900 Abstract The new parallel genetic algorithm is developed based on island concept and processed on the computer; AMATA. This algorithm is implemented to the problem of parameter estimation which by nature is the optimization style problem. These results show the very good agreement to the experiment of various case studies of BOD, enzyme kinetic. Keywords: Parallel GA; Parameter Estimation 1. Introduction The use of genetic algorithms (GAs) to optimize difficult problems has attracted recent interest in chemistry and other fields (Hibbert, 1992). Genetic algorithms are stochastic search technique based on the mechanism of natural selection and natural genetic (Goldberge (1989)). Genetic algorithms differing from conventional search techniques, starts with an initial set of the random solution called population. Each individual in the population is call chromosome which comprises of a sequence of gene. The „gene‟ in this GA represent decision variables of solution. The application of genetic algorithms for generating initial parameter estimations for the kinetic models of a catalytic process (oxidative methane dehydrodimerization) is described. The aim was to provide suitable starting points for the applied combination of an integration process and a locally converging nonlinear parameter estimation algorithm. The influence of the control parameters of this GA (number of individuals, the mutation rates and the second methods) was studied. Additionally, studies of kinetic model parameters for two different Pbo/alumina catalysts were carried out. The suggested method can be used to estimate suitable values for the model parameters of a complex mathematical model (Moros et al, 1996). A method providing initial estimates for rate constants of nonlinear chemical kinetics is proposed. This method is based on a genetic algorithm. The particular structure of the genetic algorithm adapted to the problem of estimating rate constants is introduced. The rate and certainty of convergence the estimation procedure is analyzed illustrating the proposed strategy by example of its application to a kinetic model of the catalytic oxidative coupling of methane to C2 hydrocarbons. For this example, the maximum likelihood estimate of the rate constants could be found using initial estimates from the genetic algorithm which were further optimized by a Nelder-Meadalgorithm. It is shown that optimization procedure supports the estimation of rate constants for kinetic models without supposition of rate-determining steps (Wolf and Moros, 1996). In this paper we use a new genetic algorithms called DCGA (Shimodaira, 1997) to this goal because they can maintain the diversity of chromosomes in population for global or near global optimum solution. In the DCGA of the population for the next generation are selected from the merged population of parents and their offspring based on a selection probability. This algorithm is extended to parallel style computation by PC Cluster computer named AMATA. 2. Genetic Algorithms Some methods in simple GA limit the performance of algorithm is poor so that Wasanapradit (2000) proposed new method for GA to improve performance described as follows: 2.1 Representation The binary representation is easy to deal with genetic operators but it has some disadvantages when applied to multidimensional, high precision numerical problem. It requires long representation and is difficult to understand. Real number representation can deal with this problem and easy to understand. 2.2 Crossover Arithmetic crossover is used for crossover. This operator is defined as a linear combination of two vectors, which is modified to apply with real number representation. 2.3 Mutation This operator is also different from simple genetic algorithm. The multiposition mutation is used for mutation operator. Before explain this method, the uniform mutation will be introduced. The procedure of uniform mutation is the chromosome which is selected for undergoing mutated with the same strategy as simple GA. This operator requires a single parent v and produces a single offspring v‟. The operator selects a random component k (1, … , q) of vector v = [v1,…, vk,…, vq] and produces v‟ = [v1, . . ., v‟k, . . ., vq]. Where v‟k is random value from the range [vLk, vUk]. Where vLk, vUk are lower and upper bound of variable vk respectively. The mutated gene can be calculated as: U v 'k v L a ( v k v L ) k k (3) Where a is random number from uniform probability distribution. This paper we use multiposition mutation, the mutated chromosome will be set the amount and position of genes undergoing mutation by using mutation rate (Pm). After that, these genes were replaced with the new values by the same method as uniform mutation. 2.4 Selection Shimodaira (1997) developed a diversity control oriented genetic algorithm (DCGA) in order to solve a premature convergence problem in traditional genetic algorithm. In the DCGA, the structures for the next generation were selected form the merged population of parents and their offspring eliminating duplicates based on a selection probability called CPSS (Cross-generational Probabilistic Survival Selection) method. In the CPSS method, chromosomes from offspring and the current population are selected using random numbers based on a selection probability calculated by the following equation: 1 m h ps m (4) L Where ps is the selection probability. L is the length of the string representing the chromosome. h is the hamming distance between the candidate chromosome with the best fitness value. m is the shape coefficient. is the exponent. Wasanapradit (2000) developed method for h and L from the same basic concept for real number representation. Since the binary representation counts the number of bit string as distance, for h, it can analogy to Euclidean distance for real number representation. The Euclidean distance can be calculated with this equation: h v 1 ' 2 v1 ' ... v q v q 2 (5) Where vi is ith gene of candidate chromosome. v i' is ith gene of the fittest chromosome. For L, it is maximum distance between the fittest chromosome and any chromosome in that generation as follows: L = max (hj) : j = 1,…, pop_size (6) With this approach, we can apply CPSS method with the real number representation. 3. Model of Parallel Genetic Algorithm In this paper we propose the modification of Global Parallel Genetic Algorithm (GPGA). This GPGA maintains a single population and the evaluation of the individual and/ or the application of genetic operator are done in parallel. Just like in the serial GA, each individual competes with all the other members of the population and also has a chance to mate with any other individual. In other words, selection and mating are still global, hence the name of the model. The most common operation that is parallelized is the evaluation of the individuals, because the fitness of an individual is independent from the rest of the population and there is no need to communicate during this phase. The evaluation of individuals is parallelized occurs only as each processor receives its subset of individuals to evaluate and when the processors return the fitness values. On a distributed memory computer, the population can be stored in one processor. This “master” processor would be responsible for sending the individuals to the other processors (the “slaves”) for evaluation, collecting the results, and applying the genetic operators to produce the next generation For the modification of GPGA, the following population scheme is proposed: independent populations are associated with nodes, called local populations. Each node executes a single GA, including all the basic genetic operations (selection, crossover, and mutation), and the creation of a new local population. When all nodes are ready with the new generations, each node then sends the best local individual to the master node. The master node selects the best of all received individuals, and broadcasts the best global to all slave nodes. Each independent slave node replaces the worst local individual with the new best global received. In other words, the interchange of information between parallel searches is summarized as the selection of the best global, which replaces the worst of each local population, as shown in Figure 1. Start Master Slave #1 …. Slave #N Initial sub population Initial sub population Initial sub population (Evaluate, selection, (Evaluate, selection, (Evaluate, selection, crossover and mutation) crossover and mutation) crossover and mutation) Master gets best local solution from each node and selects global solution Master Broadcast solutions to each node Master Slave #1 ….. Slave #N Replace worst solutions Replace worst solutions Replace worst solutions with best solutions with best solutions with best solutions (Selection, crossover (Selection, crossover and (Selection, crossover and and mutation) mutation) mutation) Master gets best local solution from each node and selects global solution No Master,Convergence criterion ? Yes Stop Figure 1 Flowchart of the modification of GPGA 4. Computation This paper we use computer program MATLAB (Matrix Laboratory). It is software for developed code of kinetic parameter estimation problem using the estimation method and least square as objective function and new GA as solving method. The code was implemented on personal computer (intel celeron 800 MHz processor 128 KB RAM). The function to be minimized was the sum of squares of the differences between calculated and measured concentration. 5. Application of the new Genetic algorithms to the model examples of chemical kinetics 5.1 Biological oxygen demand problem Data on biological oxygen demand versus time are modeled by the following equation y = k1[1 - exp( - k2t)] (7) Where k1 is the ultimate carbonaceous oxygen demand (mg/L) and k2 is the BOD reaction rate constant (d-1). A set of BOD data were obtain by 3rd year Environmental Engineering students at the Technical University of Crete and given in Table 1. Table 1 BOD Data: Experimental data Time (d) BOD – Experimental Date (mg/L) 1 110 2 180 3 230 4 260 5 280 6 290 7 310 8 330 5.2 Enzyme Kinetics problem Let us consider the determination of two parameters, the maximum reaction (rmax) and the saturation constant (Km) in an enzyme-catalyzed reaction following Michaelis-Menten kinetics. The Michaelis-Menten kinetic rate equation relates the reaction rate ( r ) To the substrate concentrations (s) by 1 1 Km (8) r rm ax rm ax * s The parameters are usually obtained from a series of initial rate experiments performed at various substrate concentrations. Data for the hydrolysis of benzoyl-L-tyrosine ethyl ester (BTEE) by trypsin at 30 c and pH 7.5 are given in Table 2. Table 2 Enzyme Kinetic: Experimental data for the hydrolysis of benzoyl-L-tyrosine ethyl ester (BTEE) by trypsin at 30 c and pH 7.5 S(M) 20 15 10 5.0 2.5 r(M/min) 330 300 260 220 110 6. Results and Discussion For examples 1 and example 2, we set the control parameters of new GA for determination of initial estimates of rate constants (Table 3). Table 3 Control parameters of new GA. Control parameters Values Set of population: 10 Number of iteration step: 1,0000 Probability of crossover, (Pc): 0.3 Probability of mutation, (Pm): 0.1 Probability of CPSS, (PS) m=0.3, =0.3 ( Ps = {(1-m)h/L + m} ): For example 1 we calculated kinetic rate constant with new genetic algorithm. The kinetic rate constant for equation (7), k1 = 334.25 and for equation (6), k2 = 0.3806. The LS objective function is 288.96. When we compared the quadratic convergence of the Gauss-Newton method (Englezos and Kalogerakis, 2001) is shown k1 = 334.3 and k2 = 0.3807. The LS objective function is 288.90. Based on these parameter values. the model calculated BOD values are compared to the experimental data shown in Figure 2. 350 300 Biological oxygen demand 250 Experiment al 200 Calculat ions 150 100 50 0 0 1 2 3 4 5 6 7 8 9 T ime(t) Figure 2 Biological oxygen demand: Experimental data and calculated model for value of BOD. For example 2 we calculated kinetic rate constant with new genetic algorithm (KUPGA). The maximum reaction (rmax) = 420.2 and the saturation constant (km) = 5.705. The LS objective function is 974.582. When we compared the convergence of the Gauss-Newton method without the need for Marquardt‟s modification (Englezos and Kalogerakis, 2001) is shown constant rmax = 420.2 and km = 5.705. The LS objective function is 974.582. Then we compared the reaction rate (r) of the model calculated reaction rate (r) and the experimental data in Figure 3. benzoyl -L- tyrosine ethyl ester by trypsin 350 Enzyme kineticas for the hydrolysis of 300 Experiment 250 KUP GA 200 150 100 50 0 0 5 10 15 20 25 S (M) Figure 3 Enzyme Kinetics: Experimental data for the hydrolysis of benzoyl -L- tyrosine ethyl ester by trypsin and calculated model 7. Conclusion References Englezos, P. and Kalogerakis, N. (2001). Applied parameter estimation for chemical engineers. Marcel Dekker; New York. Draper, N. and Smith, H. (1998). Applied Regression Analysis, 3nd ed.; New York, Wiley. Gallot, J.E., Kapor, M.P. and Kaliaguine, S. (1998). “Kinetic of 2-hexanol and 3-hexanol Oxidation Reaction over TS-1 Catalyst.” AIChE Journal. 44(6): 1438-1454. Goldberg, D. (1989). Genetic Algorithm in search, Optimization and Machine Learning; Adidison-Wesley. Hibbert, D. B. (1992). “A hybrid genetic algorithm for the estimation of kinetic parameters.” Chemometrics and Intelligent Laboratory System 19: 319-329. Moros, R., Kalies, H., Rex, G. and Schaffarczyk, St. (1996). “A genetic algorithm for generating initial parameter estimations for kinetic models of catalytic processes.” Comp. Chem. Engng. 20(10): 1257-1270. Shimodaira, H. (1997). “ A Diversity Control Oriented Genetic Algorithm.” IEEE : 367- 374. Srinivasan, R. and Levi, A. A. (1963). “Kinetics of the Thermal Isomerization of Bicyclo [2, 1, 1] hexane.” J. Amer. Chem. Soc. 85: 3363-3365. Wasanapradit, T. (2000). “Solving Nonlinear Mixed Integer Programming Using Genetic Algorithm.” Master Thesis, KMUTI. Thonburi University. Wolf, D. and Moros, R. (1997). “Estimation rate constants of heterogeous catalytic reactions without supposition of rate determining surface steps-an application of a genetic algorithm.” Chemical Engineering Science 52(7): 1189-1199.