An Application of Parallel Genetic Programming for Parameter by malj


									An Application of Parallel Genetic Programming for Parameter Estimation
                              of Bio-reactor
Srinophakun, T., Apinyavisit, S. and Lotangtrakun, P.
Department of Chemical Engineering, Faculty of Engineering, Kasetsart University, Bangkok, 10900

The new parallel genetic algorithm is developed based on island concept and processed on the
computer; AMATA. This algorithm is implemented to the problem of parameter estimation
which by nature is the optimization style problem. These results show the very good
agreement to the experiment of various case studies of BOD, enzyme kinetic.
Keywords: Parallel GA; Parameter Estimation

1. Introduction
    The use of genetic algorithms (GAs) to optimize difficult problems has attracted recent
interest in chemistry and other fields (Hibbert, 1992). Genetic algorithms are stochastic search
technique based on the mechanism of natural selection and natural genetic (Goldberge
(1989)). Genetic algorithms differing from conventional search techniques, starts with an
initial set of the random solution called population. Each individual in the population is call
chromosome which comprises of a sequence of gene. The „gene‟ in this GA represent
decision variables of solution.
    The application of genetic algorithms for generating initial parameter estimations for the
kinetic models of a catalytic process (oxidative methane dehydrodimerization) is described.
The aim was to provide suitable starting points for the applied combination of an integration
process and a locally converging nonlinear parameter estimation algorithm. The influence of
the control parameters of this GA (number of individuals, the mutation rates and the second
methods) was studied. Additionally, studies of kinetic model parameters for two different
Pbo/alumina catalysts were carried out. The suggested method can be used to estimate
suitable values for the model parameters of a complex mathematical model (Moros et al,
1996). A method providing initial estimates for rate constants of nonlinear chemical kinetics
is proposed. This method is based on a genetic algorithm. The particular structure of the
genetic algorithm adapted to the problem of estimating rate constants is introduced. The rate
and certainty of convergence the estimation procedure is analyzed illustrating the proposed
strategy by example of its application to a kinetic model of the catalytic oxidative coupling of
methane to C2 hydrocarbons. For this example, the maximum likelihood estimate of the rate
constants could be found using initial estimates from the genetic algorithm which were further
optimized by a Nelder-Meadalgorithm. It is shown that optimization procedure supports the
estimation of rate constants for kinetic models without supposition of rate-determining steps
(Wolf and Moros, 1996).
    In this paper we use a new genetic algorithms called DCGA (Shimodaira, 1997) to this
goal because they can maintain the diversity of chromosomes in population for global or near
global optimum solution. In the DCGA of the population for the next generation are selected
from the merged population of parents and their offspring based on a selection probability.
This algorithm is extended to parallel style computation by PC Cluster computer named
2. Genetic Algorithms
    Some methods in simple GA limit the performance of algorithm is poor so that
Wasanapradit (2000) proposed new method for GA to improve performance described as

2.1 Representation
   The binary representation is easy to deal with genetic operators but it has some
disadvantages when applied to multidimensional, high precision numerical problem. It
requires long representation and is difficult to understand. Real number representation can
deal with this problem and easy to understand.

2.2 Crossover
    Arithmetic crossover is used for crossover. This operator is defined as a linear combination
of two vectors, which is modified to apply with real number representation.

2.3 Mutation
    This operator is also different from simple genetic algorithm. The multiposition mutation
is used for mutation operator. Before explain this method, the uniform mutation will be
introduced. The procedure of uniform mutation is the chromosome which is selected for
undergoing mutated with the same strategy as simple GA. This operator requires a single
parent v and produces a single offspring v‟. The operator selects a random component k  (1,
… , q) of vector v = [v1,…, vk,…, vq] and produces v‟ = [v1, . . ., v‟k, . . ., vq]. Where v‟k is
random value from the range [vLk, vUk]. Where vLk, vUk are lower and upper bound of variable
vk respectively. The mutated gene can be
calculated as:
                                       v 'k  v L  a ( v k  v L )
                                                k               k                           (3)
    Where a is random number from uniform probability distribution. This paper we use
multiposition mutation, the mutated chromosome will be set the amount and position of genes
undergoing mutation by using mutation rate (Pm). After that, these genes were replaced with
the new values by the same method as uniform mutation.

2.4 Selection
   Shimodaira (1997) developed a diversity control oriented genetic algorithm (DCGA) in
order to solve a premature convergence problem in traditional genetic algorithm. In the
DCGA, the structures for the next generation were selected form the merged population of
parents and their offspring eliminating duplicates based on a selection probability called
CPSS (Cross-generational Probabilistic Survival Selection) method. In the CPSS method,
chromosomes from offspring and the current population are selected using random numbers
based on a selection probability calculated by the following equation:
                                            1  m h    
                                      ps              m                                  (4)
                                                L        
 ps is the selection probability.
 L is the length of the string representing the chromosome.
 h is the hamming distance between the candidate chromosome with the best fitness value.
 m is the shape coefficient.
  is the exponent.
   Wasanapradit (2000) developed method for h and L from the same basic concept for real
number representation.
    Since the binary representation counts the number of bit string as distance, for h, it can
analogy to Euclidean distance for real number representation. The Euclidean distance can be
calculated with this equation:
                                      h    v   1
                                                        ' 2
                                                      v1                     '
                                                                 ...  v q  v q   
Where vi is ith gene of candidate chromosome.
         v i' is ith gene of the fittest chromosome.
   For L, it is maximum distance between the fittest chromosome and any chromosome in
that generation as follows:
                                            L = max (hj) : j = 1,…, pop_size          (6)
   With this approach, we can apply CPSS method with the real number representation.

3. Model of Parallel Genetic Algorithm
In this paper we propose the modification of Global Parallel Genetic Algorithm (GPGA). This
GPGA maintains a single population and the evaluation of the individual and/ or the
application of genetic operator are done in parallel. Just like in the serial GA, each individual
competes with all the other members of the population and also has a chance to mate with any
other individual. In other words, selection and mating are still global, hence the name of the
model. The most common operation that is parallelized is the evaluation of the individuals,
because the fitness of an individual is independent from the rest of the population and there is
no need to communicate during this phase. The evaluation of individuals is parallelized
occurs only as each processor receives its subset of individuals to evaluate and when the
processors return the fitness values. On a distributed memory computer, the population can be
stored in one processor. This “master” processor would be responsible for sending the
individuals to the other processors (the “slaves”) for evaluation, collecting the results, and
applying the genetic operators to produce the next generation

For the modification of GPGA, the following population scheme is proposed: independent
populations are associated with nodes, called local populations. Each node executes a single
GA, including all the basic genetic operations (selection, crossover, and mutation), and the
creation of a new local population. When all nodes are ready with the new generations, each
node then sends the best local individual to the master node. The master node selects the best
of all received individuals, and broadcasts the best global to all slave nodes. Each independent
slave node replaces the worst local individual with the new best global received. In other
words, the interchange of information between parallel searches is summarized as the
selection of the best global, which replaces the worst of each local population, as shown in
Figure 1.

                                Master                     Slave #1                         ….                   Slave #N
                       Initial sub population      Initial sub population                                 Initial sub population
                        (Evaluate, selection,       (Evaluate, selection,                                  (Evaluate, selection,
                      crossover and mutation)     crossover and mutation)                                crossover and mutation)

                                                      Master gets best local solution from each
                                                         node and selects global solution

                                                          Broadcast solutions to each node

                              Master                      Slave #1                        …..                    Slave #N
                      Replace worst solutions     Replace worst solutions                                 Replace worst solutions
                        with best solutions         with best solutions                                     with best solutions
                       (Selection, crossover     (Selection, crossover and                               (Selection, crossover and
                          and mutation)                  mutation)                                               mutation)

                                                   Master gets best local solution from each node
                                                            and selects global solution

                                                           Master,Convergence criterion ?



                       Figure 1 Flowchart of the modification of GPGA
4. Computation
   This paper we use computer program MATLAB (Matrix Laboratory). It is software for
developed code of kinetic parameter estimation problem using the estimation method and
least square as objective function and new GA as solving method. The code was implemented
on personal computer (intel celeron 800 MHz processor 128 KB RAM). The function to be
minimized was the sum of squares of the differences between calculated and measured

5. Application of the new Genetic algorithms to the model examples of chemical kinetics
5.1 Biological oxygen demand problem
    Data on biological oxygen demand versus time are modeled by the following equation

                                                y = k1[1 - exp( - k2t)]                                                              (7)

    Where k1 is the ultimate carbonaceous oxygen demand (mg/L) and k2 is the BOD reaction
rate constant (d-1). A set of BOD data were obtain by 3rd year Environmental Engineering
students at the Technical University of Crete and given in Table 1.
Table 1 BOD Data: Experimental data

                   Time (d)                                   BOD – Experimental
                                                                 Date (mg/L)
                       1                                             110
                       2                                             180
                       3                                             230
                       4                                             260
                       5                                             280
                       6                                             290
                       7                                             310
                       8                                             330

5.2 Enzyme Kinetics problem
    Let us consider the determination of two parameters, the maximum reaction (rmax) and the
saturation constant (Km) in an enzyme-catalyzed reaction following Michaelis-Menten
kinetics. The Michaelis-Menten kinetic rate equation relates the reaction rate ( r )
    To the substrate concentrations (s) by

                                          1 1      Km
                                                                                            (8)
                                          r rm ax rm ax * s

The parameters are usually obtained from a series of initial rate experiments performed at
various substrate concentrations. Data for the hydrolysis of benzoyl-L-tyrosine ethyl ester
(BTEE) by trypsin at 30 c and pH 7.5 are given in Table 2.

Table 2 Enzyme Kinetic: Experimental data for the hydrolysis of benzoyl-L-tyrosine ethyl
       ester (BTEE) by trypsin at 30 c and pH 7.5

                     S(M)                            20         15      10       5.0   2.5
                   r(M/min)                         330        300     260       220   110

6. Results and Discussion
   For examples 1 and example 2, we set the control parameters of new GA for
determination of initial estimates of rate constants (Table 3).

Table 3 Control parameters of new GA.

        Control parameters                                                    Values
        Set of population:                                                      10
        Number of iteration step:                                             1,0000
        Probability of crossover, (Pc):                                         0.3
        Probability of mutation, (Pm):                                          0.1
        Probability of CPSS, (PS)                                                    m=0.3, =0.3
        ( Ps = {(1-m)h/L + m} ):

   For example 1 we calculated kinetic rate constant with new genetic algorithm. The kinetic
rate constant for equation (7), k1 = 334.25 and for equation (6), k2 = 0.3806. The LS objective
function is 288.96. When we compared the quadratic convergence of the Gauss-Newton
method (Englezos and Kalogerakis, 2001) is shown k1 = 334.3 and k2 = 0.3807. The LS
objective function is 288.90. Based on these parameter values. the model calculated BOD
values are compared to the experimental data shown in Figure 2.


                 Biological oxygen demand

                                                                                     Experiment al
                                                                                     Calculat ions




                                                  0   1   2   3   4      5   6   7        8          9

                                                                  T ime(t)

Figure 2 Biological oxygen demand: Experimental data and calculated model for value of

  For example 2 we calculated kinetic rate constant with new genetic algorithm (KUPGA).
The maximum reaction (rmax) = 420.2 and the saturation constant (km) = 5.705. The LS
objective function is 974.582. When we compared the convergence of the Gauss-Newton
method without the need for Marquardt‟s modification (Englezos and Kalogerakis, 2001) is
shown constant rmax = 420.2 and km = 5.705. The LS objective function is 974.582. Then we
compared the reaction rate (r) of the model calculated reaction rate (r) and the experimental
data in Figure 3.
                benzoyl -L- tyrosine ethyl ester by trypsin

                 Enzyme kineticas for the hydrolysis of       300
                                                              250                              KUP GA





                                                                    0   5   10            15   20           25

                                                                                 S (M)

Figure 3 Enzyme Kinetics: Experimental data for the hydrolysis of benzoyl -L- tyrosine ethyl
                          ester by trypsin and calculated model

7. Conclusion

Englezos, P. and Kalogerakis, N. (2001). Applied parameter estimation for chemical
   engineers. Marcel Dekker; New York.
Draper, N. and Smith, H. (1998). Applied Regression Analysis, 3nd ed.; New York, Wiley.
Gallot, J.E., Kapor, M.P. and Kaliaguine, S. (1998). “Kinetic of 2-hexanol and 3-hexanol
   Oxidation Reaction over TS-1 Catalyst.” AIChE Journal. 44(6): 1438-1454.
Goldberg, D. (1989). Genetic Algorithm in search, Optimization and Machine Learning;
Hibbert, D. B. (1992). “A hybrid genetic algorithm for the estimation of kinetic parameters.”
   Chemometrics and Intelligent Laboratory System 19: 319-329.
Moros, R., Kalies, H., Rex, G. and Schaffarczyk, St. (1996). “A genetic algorithm for
   generating initial parameter estimations for kinetic models of catalytic processes.” Comp.
   Chem. Engng. 20(10): 1257-1270.
Shimodaira, H. (1997). “ A Diversity Control Oriented Genetic Algorithm.” IEEE : 367-
Srinivasan, R. and Levi, A. A. (1963). “Kinetics of the Thermal Isomerization of Bicyclo [2,
   1, 1] hexane.” J. Amer. Chem. Soc. 85: 3363-3365.
Wasanapradit, T. (2000). “Solving Nonlinear Mixed Integer Programming Using Genetic
   Algorithm.” Master Thesis, KMUTI. Thonburi University.
Wolf, D. and Moros, R. (1997). “Estimation rate constants of heterogeous catalytic reactions
   without supposition of rate determining surface steps-an application of a genetic
   algorithm.” Chemical Engineering Science 52(7): 1189-1199.

To top