                                     Proceedings of The National Conference
                                     on Undergraduate Research (NCUR) 2006
                                  The University of North Carolina at Asheville
                                            Asheville, North Carolina
                                               April 6 – 8, 2006

     A Heuristic Rule for Varying Exploration and Convergence in Iterated
                              Genetic Algorithms

                                      Bryan Culbertson, Mark Kokoska

                                      Department of Computer Science
                                             Lafayette College
                                              1 Markle Hall
                                         Easton, PA 18042. USA

                                      Faculty Advisor: Chun Wai Liew


Current techniques used by most genetic algorithms do not effectively manage real world optimization problems.
Our approach, however, applies an iterated genetic algorithm that uses new patterns of convergence and exploration
to quickly converge on a suboptimal but satisfactory (satisficing) answer, and then explore more possible solutions.
We control this process of convergence and exploration by altering the number of individuals transferred from each
generation. The number of individuals that we reseed varies inversely with the number of iterations that have
occurred. In such a way, exploration increases as more iterations are performed. We examined the performance of
our algorithm by optimizing two biological problems each with complex search spaces. The first such model
involves optimizing eight parameters that describe the angles of various bones within a snake's jaw to find the
maximum cross sectional area of the jaw's gap. The second model involves determining sixteen parameter values
that minimize the difference between the calculated swimming motion of a sunfish generated by the model and the
actual motion deduced from video of the modeled sunfish swimming in a flume. The difficulties in optimizing this
model for accuracy include the size of the search space, the interdependence of variables, and the number of
unusable points. These obstacles manifest themselves in an irregularly shaped landscape that standard optimization
algorithms cannot easily navigate. Our algorithm's patterns of exploration and convergence, coupled with other
mechanics for moving from one iteration to the next, make the iterated genetic algorithm more effective.
Results indicate that our algorithm reaches a satisficing answer in 29% less time than previous algorithms and
generates accurate solutions more consistently. Our approach of converging before exploration
could potentially be applied to any such sufficiently complex optimization problem. The method has the potential to
be used to optimize a growing field of practical problems in real world domains and applications.
Keywords: Rule, Exploration, Genetic Algorithms

1. Existing Work
Genetic algorithms (GA) optimize a function using a controlled stochastic process that mimics the process of natural
selection and evolution. Genetic algorithms are most useful when applied to problems that cannot be solved
analytically and are too large to solve iteratively. While educated guesses can approximate workable values, many
problems require each parameter to be precise in order to obtain accurate results. Also, with such a large number
of combinations, it is impractical to iteratively test a problem until good values are found. Complicated optimization
problems have been shown to be best solved using genetic algorithms [3].
  As attempts are made to solve more complex problems in chemistry, construction, aerodynamics, and other areas, it
is often found that practical, real world, satisficing answers are needed in a reasonable amount of time. The
complexity of these problems includes very high dimensionality, discontinuous search spaces, and highly
interdependent variables. Much of the current effort focuses on improving GA's to more effectively handle these
complex facets.
  These objectives are the hallmarks of current GA research that focus on trying to find better answers, more
reliably, in less time. Any one of these goals is suitable for advancing current techniques. The danger in attempting
to make a GA more streamlined is that it may converge to a local minimum or maximum rather than a global one. This
creates a difficult balance for GA's to straddle: finding as good an answer as quickly as possible without becoming
stuck in local optima.
  One approach for improving the performance of a GA in a shorter amount of time is the Micro-GA [1]. This GA uses
very small populations, as low as five individuals, in an attempt to converge quickly. Micro-GA's have been
demonstrated to be faster than traditional GA's but have not been extensively tested on the large, complex domain
problems that are becoming more pervasive today. The ultra-small nature of the micro-GA may make it more likely to
rapidly converge on a local optimum.
  The existing work on GA's shows this trend of trying to explore more area in the search space while still
converging on an answer in reasonable time. The micro-GA is one approach that drives as hard as possible for
convergence.
The heuristic rules developed in this paper attempt to further optimize a GA to balance its rates of exploration and
convergence.

2. Previous Work
The heuristics in this paper are built on top of a meta-control mechanism for GA's known as Iterated Genetic
Algorithms (IGA). The concept of an IGA is to run a standard GA in the normal way for short durations instead of
an entire run, then between iterations select which individuals will be used to start the next iteration. In this
way the IGA tries to control the flow of individuals from one generation, or iteration, to the next.
  A specific IGA, GAITER [2], performs runs on each generation using an already optimized GA called GADO, or
Genetic Algorithm for Continuous Design Optimization [3]. One of the primary strengths of GADO is its ability to
rapidly converge. After each generation has occurred, GAITER clusters all of the individuals into groups based on
the quality of their answers, then redistributes a number of individuals from different clusters to the next
generation, combined with a number of random individuals. The number of random individuals in the original GAITER
varies based on a set triggering rule.
  The original rule for varying the number of individuals to be reseeded is based on the occurrence of stagnation. If
the population of one iteration changes by less than a set tolerance, normally two percent, the number of random
individuals is increased by twenty, up to a maximum of all but twenty individuals being randomly generated. If
instead there is a change of more than two percent, twenty fewer random individuals are seeded into the next
iteration. In this way GAITER attempts to alter its amount of exploration and convergence based on the results it
finds within the search space. In the original work GAITER found answers in as little as eight percent of the time
taken by GADO.
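  This original rule can be sketched as follows. The function name, its arguments, and the use of relative change in the best measure of merit as the stagnation signal are our own assumptions for illustration, not details of the actual GAITER implementation:

```python
def original_reseed_count(prev_merit, curr_merit, n_random, pop_size,
                          tolerance=0.02, step=20):
    """Sketch of GAITER's original rule for adjusting the number of
    random individuals seeded into the next iteration (hypothetical
    signature; the real implementation may differ)."""
    # Relative change in the best measure of merit between iterations;
    # a change below the tolerance is treated as stagnation.
    change = abs(curr_merit - prev_merit) / abs(prev_merit)
    if change < tolerance:
        # Stagnation: add `step` more random individuals to force
        # exploration, capped so at least `step` individuals carry over.
        return min(n_random + step, pop_size - step)
    # Progress: back off exploration by the same step.
    return max(n_random - step, 0)
```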

3. Thesis
The original GAITER was so successful because of its use of heuristic rules to guide the stochastic elements of the
GA. The rules for clustering and the decision rule for how many random individuals to reseed into the next
generation both provide more information to the GA enabling it to make more intelligent decisions about
exploration and convergence.
  To offer increased performance in complicated search spaces, we propose Negative Linear Reseeding (NLR), a new
decision rule, based on additional information, for how many individuals to bring to the next generation. In
addition to the measure of merit of the preceding generation, the algorithm reseeds based on the number of
generations that have already occurred.
  Most genetic algorithms start out by exploring the search space and, after finding a promising possibility,
converge on that region of the search space in order to refine the answer. In general, this principle works well in
small dimensions because an algorithm can sample a significant portion of the search space before choosing a
section to focus on. However, when a problem has a high number of dimensions there are too many trends to
explore any significant portion of them, so the genetic algorithm has a low chance of converging on a global
optimum.
  Instead of following this standard search pattern, Negative Linear Reseeding follows an inverted scheme. NLR
starts out by converging on every promising trend so that it can quickly discover how soon that path stagnates at a
local optimum. This way the algorithm has a clearer idea of the trends in the search space. As time progresses, our
new algorithm gradually explores more of the search space, which is important since the algorithm has a greater
opportunity to stagnate toward the end of a run.
  NLR works well in high dimensional, complex search spaces where more exploration is needed to find better
answers, particularly to break out of the ever-present local optima that the GA tends to become trapped in.
   If a condition indicating stagnation occurs (a change of less than two percent), Negative Linear Reseeding
carries over the number of individuals given by equation (1), which varies inversely with the number of runs that
have occurred and proportionally with the total number of individuals in the population. Otherwise it reseeds zero
random individuals.

   ReseedPop = MaxPopulation * ((MaxGenerations + 1 - currentIteration) / MaxGenerations)                 (1)

  Following equation (1), the algorithm scales the number of individuals to be reseeded so as to encourage more
exploration late in a run, when stagnation at a local optimum is more likely to have occurred.
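  Under the reading that ReseedPop counts the individuals carried over from the previous iteration (with the remainder of the population seeded randomly), equation (1) can be sketched as below. The function name, the boolean stagnation flag, and the rounding of the result are our own assumptions:

```python
def nlr_carryover(max_population, max_generations, current_iteration,
                  stagnated):
    """Sketch of Negative Linear Reseeding: how many individuals to
    carry over to the next iteration (equation 1)."""
    if not stagnated:
        # Still making progress: carry the entire population forward.
        return max_population
    # Equation (1): the carried-over count shrinks linearly as the
    # iteration count grows, so the share of random individuals (and
    # therefore exploration) grows toward the end of the run.
    fraction = (max_generations + 1 - current_iteration) / max_generations
    return round(max_population * fraction)
```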

4. Experiments
The Negative Linear Reseeding algorithm was tested against the original GAITER in two problem domains. Both
domains embody the difficult problems for GA's including high dimensionality and incomplete search spaces.
  The first experimental domain seeks the maximum gape of a snake's jaw. The snake jaw problem involves
eight dimensions that describe the angles between the bones of the snake's mouth. In this case the complexity of
solving the problem is compounded by the fact that the variables are strongly related to each other and some points
are unrealizable, as a snake simply cannot move its jaw to certain positions. This sort of complexity is the most
difficult for a GA to successfully navigate.
  In addition, Negative Linear Reseeding was tested on a fluid model of a swimming fish. In this fitness function the
parameters describe the motion of the fish as it interacts with vortex points within the water; a comparison is then
made between the calculated fish model and a video of a real fish to determine how close to the real world the
model's parameters are. This complex fluid dynamics model involves sixteen parameters to describe the actions of
the fish. This even higher dimensional problem is more complex and is another good test of the robustness of our
algorithm's changes to GAITER's selection process.
   The snake algorithm was evaluated on four separate instances of test sets for both GAITER and Negative Linear
Reseeding. In each instance 252 different sets of conditions for the snake were tested. From this large number of
instances we can form accurate statistics with which to compare the average performance of GAITER against
Negative Linear Reseeding.
  The standard metric of comparison for this type of problem is the number of evaluations a run takes to reach a set
percentage of the maximum answer found by any algorithm for that particular test.
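  This metric can be sketched as a small helper (a hypothetical function of our own; it assumes the best-so-far merit is recorded after every evaluation):

```python
def evaluations_to_reach(history, best_overall, fraction):
    """Return how many evaluations a run needed to first reach
    `fraction` of the best answer found by any algorithm.

    history: best-so-far merit values, one entry per evaluation.
    """
    target = fraction * best_overall
    for count, merit in enumerate(history, start=1):
        if merit >= target:
            return count  # evaluations needed to hit the target
    return None  # the run never reached the target fraction
```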
  Of note in these tests, Negative Linear Reseeding decreases the number of evaluations by approximately two
percent in our test data. In addition, we note a twenty-two percent reduction in variance for Negative Linear
Reseeding at the higher measures of merit (90, 95, 98); this seems to indicate that the Negative Linear Reseeding
algorithm is more consistent than the original GAITER.
  It also seems prudent to examine this large number of test cases for the best answers found, as finding better
answers is another goal of optimizing genetic algorithms. So we also examine the number of best answers found by
each of the instances of the two algorithms.
  The important information here is that Negative Linear Reseeding appears to reach the 100% answer more
frequently, and when it reaches that answer it appears to reach it earlier than the GAITER runs. The 100% best
measures of merit are not vastly superior to the 98% that all instances reach, but the fact that Negative Linear
Reseeding does appear to be pushing further and faster is encouraging for our original hypothesis.

                                       Figure 1. NLR improvement over basic GAITER

Results for the fish show that, on average, we reach 98% of the best solution 26% sooner than GAITER, while at
the same time taking 1% longer to reach 75% of the best solution. These results suggest two possible explanations.
The first is that the algorithm is performing as predicted: because of our inverted search's lack of exploration in
the beginning, it spends many of its early evaluations focusing on less than optimum sections of the search space,
hence its poor performance in reaching middle solutions; later in the run, however, it can explore when exploration
is more profitable. The second explanation is that exploration is superior to convergence, so constant exploration
should be performed in lieu of convergence. Further experiments that evaluated this second explanation showed
worse performance.
  Since the largest improvement came during the time when GAITER was only carrying over a few parameter sets,
we performed a test that simulated this environment. When the process was making progress we carried over all of
the parameter sets, just as in the convergence-to-exploration idea. However, when the process was stagnating, we
carried over only 5% of the parameter sets, just as at the end of the convergence-to-exploration process. This
provided an environment of constant exploration. The results of this test showed very poor performance from the
exploration-only idea. This suggests that exploration is important to discovering good solutions, and that it is more
valuable when performed at the end of a run as opposed to the beginning.

5. Conclusion
Our original hypothesis was that we could improve the performance of genetic algorithms with heuristic rules
incorporating additional information, and that we could improve genetic algorithms' performance in complex
search spaces by encouraging more exploration, in the form of more random individuals, as the number of
iterations increased.
  The experiments demonstrate a marked improvement in our algorithm compared to the original GAITER. This
suggests not only that Negative Linear Reseeding can be used to improve the performance of genetic algorithms
but, more importantly, it provides support for our hypothesis. Future research into the optimization of genetic
algorithms should therefore pursue additional rules based on more information and try to explore more in complex
landscapes to reduce stagnation.

6. References

1. K. Krishnakumar. "Micro-genetic algorithms for stationary and nonstationary function optimization." Proc. SPIE
Vol. 1196, Intelligent Control and Adaptive Systems, pp. 289-296, February 1990.
2. C. W. Liew and M. Lahiri. "Exploration or Convergence? Another Meta-Control Mechanism for GAs."
Proceedings of the Eighteenth International Florida AI Society Conference (FLAIRS), May 2005.
3. K. Rasheed. "GADO: A Genetic Algorithm for Continuous Design Optimization." Technical Report DCS-TR-352,
Department of Computer Science, Rutgers University, New Brunswick, NJ, 1998. Ph.D. Thesis.