CC282 Machine Learning
Lecture 6: Genetic Algorithms
R. Palaniappan, 2008
                        Lecture 06 – Outline

•   Introduction
•   GA terminology
•   GA basic description
•   Encoding of chromosomes
•   Selection operator in GA
•   Crossover and mutation operators in GA
•   Applications
    – Evolving ANN
    – Genetic Programming
• Toy example
• Advantages and disadvantage of GA

        Genetic Algorithm (GA) - Introduction

• GA is a part of evolutionary computation (EC)
• GA is inspired by Darwin’s theory of evolution – problems are solved by an evolutionary process resulting in the survival of the fittest
• EC was introduced in the 1960s by Rechenberg
• J. Holland invented GA in the 1970s
• J. Koza used GA to evolve programs (genetic programming, GP) in 1992



             Genetic Algorithm (GA) - Terminology

Living organisms consist of cells. Cells contain DNA carrying the genetic material of the organism, defining its traits
• Chromosomes are strings of DNA and serve as a model for the whole organism (genetic material)
• Genes are blocks of DNA of which the chromosomes consist. It can be said that each gene encodes a trait or feature
• Alleles are the possible values for a trait (i.e. for a gene)
• Genome – a complete set of genetic material (i.e. all chromosomes); in GA this is called a population
• Crossover is the operation in which genes from parents combine to form a whole new chromosome during reproduction, producing offspring
• Mutation is when some elements of the genetic material are changed (normally through a random procedure)
• Fitness of an organism is measured by its degree of success/failure in survival
Hypothesis/search space - revisited

• Each point in the hypothesis/search space is a possible solution and has a fitness value
• Fitness measures how good the solution is
• Fitness in this case is the opposite of an error measure
• GA searches for the best/optimal solution, though there is no guarantee that it will find it
• GA finds a solution in an evolutionary manner
• Other similar methods are hill climbing, tabu search and simulated annealing
                               GA – Basic description
Steps in brief:
• GA begins with an initial population, i.e. a set of solutions/chromosomes
• Fitness of each chromosome is computed
• Selection operators are applied that favour fitter chromosomes
• Crossover is applied – with the hope that by recombination of parents, the offspring produced may be fitter than the parents -> chromosomes recombine to produce offspring
• Mutation operator is applied
• Assess the fitness of the new population – stop if the optimal solution is achieved or if the maximum generation number is reached
• Else, repeat to the next generation with selection, crossover and mutation operators

Flowchart: START -> randomly generate an initial population -> evaluate fitness of each individual -> terminate? (yes -> STOP) -> no: select individuals to mate -> generate offspring by crossover with probability Pc -> generate offspring by mutation with probability Pm -> replace old population with the new one -> back to fitness evaluation
                                         The GA algorithm
GA(Fitness, Fitness_threshold, max_generation, popsize, Pc, Pm)
Fitness: a function that assigns an evaluation score, given a hypothesis
Fitness_threshold: a threshold specifying the termination criterion
max_generation: the maximum generation number at which to terminate GA
popsize: the size of the population
Pc: crossover probability, i.e. the fraction of the population to be replaced by the crossover operator at each generation
Pm: mutation probability, i.e. the fraction of the population to be replaced by the mutation operator at each generation
•       Initialise population: P ← generate popsize random hypotheses
•       Evaluate: for each h in P, compute Fitness(h)
•       While max_h Fitness(h) < Fitness_threshold and generation < max_generation:
        1. Selection: select popsize members of P (with replacement) to add to Pnext
        2. Crossover: pairs of hypotheses are randomly selected using Pc. For each pair
        <h1, h2>, produce two offspring by applying the crossover operator. Add all
        offspring to Pnext
        3. Mutate: invert a randomly selected bit in random members of Pnext with
        probability Pm
        4. Update: P ← Pnext
        5. Evaluate: for each h in P, compute Fitness(h)
•       Return the hypothesis from P that has the highest fitness
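
A minimal Python sketch of this algorithm, assuming a bit-string encoding and strictly positive fitness values (the function and variable names are illustrative, not from the slides):

    import random

    def ga(fitness, fitness_threshold, max_generation, popsize, pc, pm, nbits):
        # Initialise population: popsize random bit-string hypotheses
        pop = [[random.randint(0, 1) for _ in range(nbits)] for _ in range(popsize)]
        for generation in range(max_generation):
            if max(fitness(h) for h in pop) >= fitness_threshold:
                break
            # Selection: fitness proportionate, with replacement
            weights = [fitness(h) for h in pop]
            pnext = [random.choices(pop, weights=weights)[0][:] for _ in range(popsize)]
            # Crossover: recombine Pc*popsize members in pairs at a random point
            for i in range(0, int(pc * popsize) - 1, 2):
                point = random.randrange(1, nbits)
                pnext[i][point:], pnext[i + 1][point:] = pnext[i + 1][point:], pnext[i][point:]
            # Mutation: flip each bit with probability Pm
            for h in pnext:
                for j in range(nbits):
                    if random.random() < pm:
                        h[j] = 1 - h[j]
            pop = pnext
        return max(pop, key=fitness)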




        GA – Some preliminary design questions
• Encoding
   – GA operates on a coding of the parameters rather than the parameters themselves
   – These codings are called chromosomes and are strings of values which represent potential solutions to the given problem
   – The encoding could be binary, decimal or continuous – which to use?
• Constraints – are there any constraints on the gene values?
• Fitness – how to obtain the fitness of each chromosome?
• Selection – how to select candidate chromosomes?
• The other two operators – how to perform crossover and mutation?

            Chromosomes – binary representation
• Chromosomes are mostly represented by a string of bits
• Each bit/group of bits represents some characteristic/attribute/feature
• The values each feature can take are checked
    – represent each feature with enough bits to cover all its possible values

• Recall the play-tennis example:
• Wind: {strong, weak} can be represented by two bits
• Example: Wind = strong is {10}, Wind = weak is {01}, Wind = strong or weak is {11}
• Outlook: {cloudy, rainy, sunny} can be represented by three bits, e.g. Outlook = cloudy or rainy is represented as 110
• So, for a rule such as (Outlook = cloudy OR rainy) AND (Wind = strong), the chromosome representation is 11010
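
A small Python sketch of this encoding (the bit layout and helper name are assumptions for illustration):

    # Bit positions: outlook = cloudy, rainy, sunny; wind = strong, weak
    OUTLOOK = {'cloudy': 0, 'rainy': 1, 'sunny': 2}
    WIND = {'strong': 3, 'weak': 4}

    def encode_rule(outlook_values, wind_values):
        bits = ['0'] * 5
        for v in outlook_values:
            bits[OUTLOOK[v]] = '1'   # 1 = this value is allowed by the rule
        for v in wind_values:
            bits[WIND[v]] = '1'
        return ''.join(bits)

    # (Outlook = cloudy OR rainy) AND (Wind = strong)
    print(encode_rule({'cloudy', 'rainy'}, {'strong'}))  # -> '11010'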

          Binary and decimal coding chromosomes
• Let us consider a more general situation
• Assume we have three variables, x, y and z
• Decimal coding is simply the integer values for genes, eg: x=35, y=191, z=5

•   Binary coding – the genes are coded in binary form
•   Let us assume that these variables can take integer values from 0 to 255
•   So, we need 8 bits for each variable (i.e. gene)
•   If x =35, y=191, z=5, we have
     – x=00100011, y=10111111, z=00000101
     – And the chromosome 001000111011111100000101
• But why go through the hassle of representing integers using binary
  coding?
     – Answer (see Exercise 6, question 4)



                       Continuous coding chromosomes
•   But what if we want genes to represent continuous values, e.g. x=0.67, y=1.56, z=3.45?
•   Solution: use a binary chromosome with approximation, or use continuous-valued chromosomes
•   We will not cover continuous-valued chromosomes in this course
     –   as they require special types of GA operators

•   Binary chromosome with approximation, e.g. x=0.145 (assume 8 bits per gene)
     –   Use the general equations:

         x_continuous = (x_decimal - x_min) / (x_max - x_min)
         x_decimal = round(x_continuous × (x_max - x_min) + x_min)

     –   With 8 bits, x_max = 255 and x_min = 0
     –   0.145 × 255 = 36.975; round this to 37, so x = 00100101
     –   So, x = 00100101 is an approximation of x = 0.145
     –   More bits will improve the approximation, but computation becomes more time consuming
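
A short Python sketch of this approximation (the helper names are mine):

    def to_decimal(x_continuous, nbits=8, x_min=0):
        # Map a continuous value in [0, 1] onto the integer gene range
        x_max = 2 ** nbits - 1
        return round(x_continuous * (x_max - x_min) + x_min)

    def to_continuous(x_decimal, nbits=8, x_min=0):
        x_max = 2 ** nbits - 1
        return (x_decimal - x_min) / (x_max - x_min)

    d = to_decimal(0.145)       # 0.145 * 255 = 36.975 -> 37
    print(format(d, '08b'))     # '00100101'
    print(to_continuous(d))     # ~0.14510, the approximation of 0.145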

Fitness function and gene constraints – an example
•   Let us consider a linear programming problem, which arises naturally in production planning:
•   Suppose a particular Ford plant can build Escorts at the rate of one per minute, Explorers at the rate of one every 2 minutes, and Lincoln Navigators at the rate of one every 3 minutes. The vehicles get 8, 5 and 4 miles per litre, respectively, and Parliament mandates that the average fuel economy of the vehicles produced be at least 6 miles per litre. Ford loses £1000 on each Escort, but makes a profit of £5000 on each Explorer and £15,000 on each Navigator. What is the maximum profit this Ford plant can make in one 8-hour day?

•   The fitness function here is the cost function, i.e. the profit Ford can make by building x Escorts, y Explorers and z Navigators
•   And we want to maximise it

•   The fitness function is f = -1000x + 5000y + 15000z




                                  Gene constraints
• Using the same example as on the previous slide:
• The constraints arise from the production times and the Parliament mandate on fuel economy

• There are 480 minutes in an 8-hour day, and so the production times for the vehicles lead to the following limit:
    x + 2y + 3z ≤ 480

• The average fuel economy restriction can be written:
    8x + 5y + 4z ≥ 6(x + y + z), which simplifies to 2x - y - z ≥ 0

• There is an additional implicit constraint that the variables are all non-negative:
    x, y, z ≥ 0
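
One common way to combine such a fitness function with gene constraints is to penalise infeasible chromosomes; a Python sketch under that assumption (the penalty scheme and names are mine, not from the slides):

    def fitness(x, y, z):
        # Profit: Ford loses £1000 per Escort, gains £5000/£15000 per Explorer/Navigator
        profit = -1000 * x + 5000 * y + 15000 * z
        # Constraint violations (each term is 0 when the constraint is satisfied)
        violation = (max(0, x + 2 * y + 3 * z - 480)          # production time <= 480 minutes
                     + max(0, -(2 * x - y - z))               # fuel economy: 2x - y - z >= 0
                     + max(0, -x) + max(0, -y) + max(0, -z))  # non-negativity
        return profit - 1_000_000 * violation                 # heavy penalty if infeasible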


                                          Selection
• The selection (aka reproduction) operator is applied many times to produce a mating pool for the new population
• There are a number of ways to do selection, each ensuring that members of the population are drawn with the correct probability:
   –   Roulette wheel (fitness proportionate) selection
   –   Tournament selection
   –   Steady-state selection
   –   Rank selection
   –   Elitism




         Roulette wheel (fitness proportionate) selection
• Chromosomes are selected according to their proportionate fitness:

    Fitness_proportionate(h_i) = Fitness(h_i) / Σ_{j=1}^{popsize} Fitness(h_j)

• The higher their fitness, the more chance they have of being selected
• Sampling can be viewed as playing a game of roulette where the pocket sizes are proportional to the probability of selecting a particular individual
• Each new member of the population is drawn independently when the roulette wheel is spun randomly
• In a computer, this spin is done using a randomly generated number in [0,1]
• But the best solution found so far may be lost, e.g. Pnext = {B, B, C}

Example:
• fitness of chromosome A = 6.0 (180° of the wheel)
• fitness of chromosome B = 4.0 (120°)
• fitness of chromosome C = 2.0 (60°)
• The random number generated is 0.29 (about 104.4°), so chromosome A is selected; repeat this process two more times to obtain three chromosomes for Pnext
• Since there is the possibility of {A, A, A} for Pnext, this could result in ‘overcrowding’
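
A minimal Python sketch of roulette wheel selection, matching the example above (the helper name is mine):

    import random

    def roulette_select(pop, fitnesses):
        # Spin: a random point on a wheel whose pockets are sized by fitness
        spin = random.random() * sum(fitnesses)
        cumulative = 0.0
        for chromosome, f in zip(pop, fitnesses):
            cumulative += f
            if spin <= cumulative:
                return chromosome
        return pop[-1]  # guard against floating-point round-off

    # Slide example: fitnesses 6, 4, 2; spin 0.29 * 12 = 3.48 falls in A's pocket
    pnext = [roulette_select(['A', 'B', 'C'], [6.0, 4.0, 2.0]) for _ in range(3)]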
                                       Selection (ctd)
•   Tournament selection (a sketch follows this list)
     – Pick a few chromosomes (say, popsize/4 chromosomes) at random from the population
     – From these, select the fittest one (i.e. the one with the highest fitness), replace the rest and repeat the process popsize times
     – This method can retain some good chromosomes while giving weaker chromosomes a chance to take part in mating

•   Steady-state selection
     – A few good chromosomes (with high fitness) are selected to replace a few bad chromosomes (with low fitness)
     – The rest of the population (the ones with in-between fitness) are selected by other methods, or all are selected to remain in Pnext
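
A minimal Python sketch of tournament selection (the function name is mine; k would be, say, popsize/4 as above):

    import random

    def tournament_select(pop, fitnesses, k):
        # Pick k chromosomes at random and return the fittest of them
        contenders = random.sample(range(len(pop)), k)
        return pop[max(contenders, key=lambda i: fitnesses[i])]

    # Fill the mating pool by running the tournament popsize times:
    # pnext = [tournament_select(pop, fitnesses, len(pop) // 4) for _ in range(len(pop))]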




                                         Selection (ctd)
• Rank selection
   – The other selection methods will have problems if the fitness values differ a lot
   – For example, if the best chromosome's fitness takes up 90% of the roulette wheel, then the other chromosomes will have very few chances to be selected
   – Rank selection first ranks the population; then every chromosome receives a fitness from this ranking (i.e. the probability of selection is proportional to rank)
   – The worst will have fitness 1, the second worst 2, etc., and the best will have fitness N (the number of chromosomes in the population)
   – Then, using these new fitness values, roulette wheel selection is performed
   – Using this, all the chromosomes have a fair chance to be selected
   – But this method can lead to slower convergence, because the best chromosomes do not differ so much from the other ones
   (Figure from http://cs.felk.cvut.cz/~xobitko/ga/selection.html)
• Elitism
   – First copies the best chromosome (or a few best chromosomes) to the new population
   – The rest is done using any other selection method, normally roulette wheel
   – Can very rapidly increase the performance of GA, as it prevents losing the best solution found
                                           Crossover
• Even though reproduction (selection) increases the percentage of better-fitness chromosomes, the procedure is essentially sterile; it cannot create new and better chromosomes
• This function is left to crossover and, to a lesser but critical extent, to mutation
• The crossover process simulates the exchange of genetic material that occurs during biological reproduction
• In this process, pairs in the breeding population are mated randomly with a crossover rate, Pc
• Typical crossover properties: an offspring inherits the features common to both parents, along with the ability to inherit two completely different features
• Popular crossover techniques: one-point, two-point and uniform crossover




                                   Crossover (ctd)
• First, randomly select a pair of parents (i.e. two chromosomes)
• Perform crossover (swapping of bits) to obtain offspring; repeat this process Pc × popsize / 2 times, with already-used parent chromosomes not included
• Example: if Pc = 0.5 and popsize = 20, then do crossover 5 times

• Single-point and two-point crossover (a sketch follows): bits after a single randomly chosen crossover point, or between two crossover points, are swapped between the parents

(Figures: single-point crossover and two-point crossover, with the crossover points marked)
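
A minimal Python sketch of both schemes (the function names are mine; parents can be strings or lists of bits):

    import random

    def one_point_crossover(p1, p2):
        point = random.randrange(1, len(p1))                 # single crossover point
        return p1[:point] + p2[point:], p2[:point] + p1[point:]

    def two_point_crossover(p1, p2):
        a, b = sorted(random.sample(range(1, len(p1)), 2))   # two distinct points
        return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]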




                               Crossover (ctd)
• The uniform crossover scheme works as follows
• A randomly generated bit string called the crossover mask generalises the process
• A bit value of 1 in the mask indicates that the corresponding bits in the parents are to be exchanged, while a 0 bit indicates no interchange
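
A minimal Python sketch of uniform crossover with a random mask (the function name is mine):

    import random

    def uniform_crossover(p1, p2):
        mask = [random.randint(0, 1) for _ in p1]   # random crossover mask
        # 1 in the mask: exchange the corresponding bits; 0: keep them unchanged
        c1 = [y if m else x for x, y, m in zip(p1, p2, mask)]
        c2 = [x if m else y for x, y, m in zip(p1, p2, mask)]
        return c1, c2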




                                              Mutation
•   Mutation consists of making small alterations to the values of one or more genes in a chromosome
•   Mutation randomly perturbs the population’s characteristics, and prevents evolutionary dead ends
•   Most mutations are damaging rather than beneficial, and hence the mutation rate must be low to avoid the destruction of the species
•   It works by randomly selecting a bit in the string, with a certain mutation rate, and reversing its value
•   Mutation is applied to a randomly chosen bit in a randomly chosen chromosome
•   If Pm is 0.01, with a popsize of 20 chromosomes of 18 bits each, then mutation is repeated 0.01 × 18 × 20 = 3.6 ≈ 4 times

(Figure: mutation example, for a randomly chosen bit in a randomly chosen chromosome)
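
A minimal Python sketch of this bit-flip mutation (the function name is mine; chromosomes are lists of bits, mutated in place):

    import random

    def mutate(pop, pm):
        # Number of flips = Pm x bits per chromosome x popsize (0.01*18*20 = 3.6 ~ 4)
        n_flips = round(pm * len(pop[0]) * len(pop))
        for _ in range(n_flips):
            h = random.choice(pop)           # randomly chosen chromosome
            j = random.randrange(len(h))     # randomly chosen bit
            h[j] = 1 - h[j]                  # reverse its value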




                                Applications

• The possible applications of genetic algorithms are immense
• Any problem that has a large search domain could be suitably tackled by GA
• We shall explore (very briefly) the use of GA to evolve neural network weights and to evolve functions/programs in genetic programming
• We’ll also look at a simple toy example


    Evolving NN weights using GA – a simple example
•   GA has been used successfully to evolve NN weights
•   GA is suitable for evolving the weights of a neural network – standard learning techniques such as backpropagation would take thousands upon thousands of iterations to converge
•   But GA could (given the appropriate direction) evolve suitable weights within a hundred or so iterations

•   Example
•   Obtain the weights of a perceptron unit for learning the OR function (we saw this in the previous lecture)
•   But rather than using backpropagation to update the weights, we can use GA

(Figure: a simple artificial neuron model with inputs x0 = 1, x1, x2, weights w0, w1, w2, summation z and output y)


     Evolving NN weights using GA – a simple example
1.   Initial parameters
     –   Fitness function: 1/MSE of desired to actual output; GA will maximise this fitness function:

         Fitness = 1/MSE = 1 / ( (1/4) Σ_{i=1}^{4} (y_i - d_i)² )

     –   Coding, binary approximation: w1, w2 and w0 weights, say with 6 bits each, so the chromosome length is 18
     –   popsize = 20, i.e. 20 chromosomes, initially generated randomly
     –   Pc = 0.5, Pm = 0.01
     –   MSE_limit = 0.1, so fitness_threshold = 10; max_generation = 100

2.   Gene constraints: w1, w2 and w0 in the range [-1, 1]
3.   Apply selection (say, tournament selection), crossover (say, one-point) and mutation to produce a new population
4.   Repeat step 3 until convergence to an acceptable solution (fitness > fitness_threshold or generation > max_generation)
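
A Python sketch of how one chromosome's fitness might be computed, assuming a hypothetical linear decoding of each 6-bit gene onto the constrained range [-1, 1] (the slides do not fix the decoding, so this is illustrative):

    # 18-bit chromosome = three 6-bit genes; {0..63} mapped linearly onto [-1, 1]
    def decode(bits):
        genes = [bits[i:i + 6] for i in (0, 6, 12)]
        return [int(g, 2) / 63 * 2 - 1 for g in genes]    # w1, w2, w0

    OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

    def fitness(bits):
        w1, w2, w0 = decode(bits)
        mse = 0.0
        for (x1, x2), d in OR_DATA:
            y = 1 if w1 * x1 + w2 * x2 + w0 > 0 else 0    # step-activation perceptron
            mse += (y - d) ** 2
        mse /= 4
        return 1 / mse if mse > 0 else float('inf')       # fitness = 1/MSE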
           Genetic programming (GP) – An example
•   In programming languages such as LISP, mathematical expressions are written not in standard (infix) notation but in prefix notation
     – Examples:
       + 1 2         : 1+2
       * + 1 2 2     : (1+2)*2
       * + - 2 1 4 9 : ((2-1)+4)*9
     – Notice the difference between the left-hand side and the right? Apart from the order being different, there is no use of parentheses
     – The prefix method makes life a lot easier for programmers and compilers alike, because order precedence is not an issue
•   You can build expression trees out of these strings, which can then be easily evaluated (the trees for the three expressions above were shown as figures)
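
A small Python sketch of evaluating such prefix strings directly (the function name is mine):

    def eval_prefix(tokens):
        # Recursively consume one expression from the front of the token list
        t = tokens.pop(0)
        if t in ('+', '-', '*', '/'):
            a = eval_prefix(tokens)
            b = eval_prefix(tokens)
            return {'+': a + b, '-': a - b, '*': a * b, '/': a / b}[t]
        return float(t)

    print(eval_prefix('* + 1 2 2'.split()))      # (1+2)*2 = 6.0
    print(eval_prefix('* + - 2 1 4 9'.split()))  # ((2-1)+4)*9 = 45.0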




      Genetic programming (GP) –An example (ctd)

• Given numerical data and primitive functions, but no expression joining the data with the primitive functions, a genetic algorithm can be used to evolve an expression tree that fits the data very closely
• By “splicing” and “grafting” the trees, evaluating the resulting expression on the data and testing how well it fits, the fitness function can return how close the expression is
• The limitation of genetic programming lies in the huge search space the GA has to search – an infinite number of equations
• Therefore, normally before running a GA to search for an equation, the user tells the program which primitive functions to search under



      Genetic programming (GP) – An example (ctd)
• Assume we have data like the following and we wish to obtain the function
  that maps z using x and y

                             x                     y                      z
                            0.1                   0.5                  0.81
                            0.3                   0.4                  0.99
                            0.6                   0.2                  1.31
                              .                     .                     .
                              .                     .                     .
                              .                     .                     .
                            0.4                   0.5                  1.20


• Assume the only available primitive functions are sin, +, sqr and sqrt
• GP will splice and graft the trees using these primitive functions with the
  fitness function to minimise prediction error of z using x and y data as
  above

       Genetic programming (GP) – example (ctd)
• Crossover in GP swaps randomly chosen subtrees between two parent expression trees (see the crossover example figure)

• Mutation randomly changes the primitive function

• The actual function is
    z = sin(x) + x^2 + y

(Figure: crossover example)




                                     Toy example

• Consider: a + 2b + 3c + 4d = 30, where a, b, c, d are positive integers
• Use GA to find a, b, c and d
   – Assume decimal coding is used
   – Choose say 5 random initial solution sets (i.e. popsize=5) forming the
     initial population with the constraint 1 ≤ a, b, c, d ≤ 30

                              Chromosome                              (a, b, c, d)
                                     1                               (1, 28, 15, 3)
                                     2                               (14, 9, 2, 4)
                                     3                               (13, 5, 7, 3)
                                     4                              (23, 8, 16, 19)
                                     5                               (9, 13, 5, 2)




                                            Example (ctd)

•   Calculate the fitness value of each chromosome, i.e. calculate the absolute difference of each expression from 30; the inverse of this will be our fitness value
•   E.g. chromosome 1: expression = 1 + 2×28 + 3×15 + 4×3 = 114

                            Chromosome                Absolute diff             Fitness value
                                   1                  |114-30|=84                   1/84
                                   2                   |54-30|=24                   1/24
                                   3                   |56-30|=26                   1/26
                                   4                 |163-30|=133                   1/133
                                   5                   |58-30|=28                   1/28

     –   Since expression values with a lower absolute difference are closer to the desired answer (30), these values are more desirable
     –   So, take the inverse of the absolute difference as the fitness value
     –   Now, GA will try to maximise the fitness values
     –   In order to create a system where chromosomes with more desirable fitness values are more likely to be chosen as parents, we have to do selection
     –   Assume we use the roulette wheel (fitness proportionate) method

                                            Example (ctd)

•   Calculate the fitness proportion (likelihood) of each chromosome being picked/selected as a parent, e.g. take the sum of all the fitness values (0.135266) and calculate the percentages from there
•   Use:

        Fitness_proportionate(h_i) = Fitness(h_i) / Σ_{j=1}^{popsize} Fitness(h_j)

                                 Chromosome                          Fitness proportion
                                         1                        (1/84)/0.135266 = 8.80%
                                         2                        (1/24)/0.135266 = 30.8%
                                         3                        (1/26)/0.135266 = 28.4%
                                         4                       (1/133)/0.135266 = 5.56%
                                         5                        (1/28)/0.135266 = 26.4%
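
These proportions can be checked with a few lines of Python:

    fitnesses = [1 / 84, 1 / 24, 1 / 26, 1 / 133, 1 / 28]
    total = sum(fitnesses)                        # 0.135266...
    for i, f in enumerate(fitnesses, start=1):
        print(f'chromosome {i}: {f / total:.2%}')
    # chromosome 1: 8.80%, 2: 30.80%, 3: 28.43%, 4: 5.56%, 5: 26.40%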




                                    Example (ctd)

• Spin the roulette wheel 5 times
• Assume the result was

                            Spin                        Chromosome selected after
                                                        spinning the roulette wheel
                              1                                       1
                              2                                       2
                              3                                       5
                              4                                       5
                              5                                       3

• Since chromosome 4 had a poor fitness, its chances of survival were slim, and it died out in the selection process


                                          Example (ctd)

•   Do crossover, say single-point
•   The offspring of each pair of parents contains the genetic information of both father and mother

For example:
• if a father has the solution set a1, b1, c1, d1 and a mother has the solution set a2, b2, c2, d2, then there are three pairs of possible crossed-over offspring (| = crossover point):

      Father chromosome                  Mother chromosome                           Offspring

      a1 | b1, c1, d1                    a2 | b2, c2, d2                             a1, b2, c2, d2 or a2, b1, c1, d1

      a1, b1 | c1, d1                    a2, b2 | c2, d2                             a1, b1, c2, d2 or a2, b2, c1, d1

      a1, b1, c1 | d1                    a2, b2, c2 | d2                             a1, b1, c1, d2 or a2, b2, c2, d1




                                            Example (ctd)

• Assume that through random parent selections, we have the following parent chromosomes
• Applying crossover to our example to produce one offspring for each pair of parents (assuming the crossover points are chosen randomly):
• Note: normally there would be two offspring per pair of parents, but for simplicity of discussion assume only one offspring is produced here

           Father chromosome                       Mother chromosome                             Offspring

           (13 | 5, 7, 3)                          (1 | 28, 15, 3)                               (13, 28, 15, 3)
           (9, 13 | 5, 2)                          (14, 9 | 2, 4)                                (9, 13, 2, 4)
           (13, 5, 7 | 3)                          (9, 13, 5 | 2)                                (13, 5, 7, 2)
           (14 | 9, 2, 4)                          (9 | 13, 5, 2)                                (14, 13, 5, 2)
           (13, 5 | 7, 3)                          (9, 13 | 5, 2)                                (13, 5, 5, 2)




                                         Example (ctd)

• Apply mutation to a randomly chosen chromosome, say gene a in chromosome 1
• Mutation here replaces the randomly selected gene value with a random value in the range 0 to 30:
    (13, 28, 15, 3) → (8, 28, 15, 3)

• Recalculate the fitness values of the offspring, representing the new generation:

                         Offspring chromosome       Absolute difference          Fitness value
                         (8, 28, 15, 3)             |121-30|=91                  1/91
                         (9, 13, 2, 4)              |57-30|=27                   1/27
                         (13, 5, 7, 2)              |52-30|=22                   1/22
                         (14, 13, 5, 2)             |63-30|=33                   1/33
                         (13, 5, 5, 2)              |46-30|=16                   1/16




                       Example - Commentary

• The average fitness value of the offspring chromosomes is about 0.037, while the average fitness value of the parent chromosomes was about 0.027
• Progressing at this rate, one chromosome should eventually reach a very high fitness value (i.e. when the absolute difference is close to 0), which is when an optimal solution is found
• If you try simulating this yourself, you may actually get a lower average fitness on some generations, but in the long run the fitness levels will increase
• For systems where the population is larger (say 50, instead of 5), the fitness levels should approach the desired level more steadily and stably, i.e. nearly every generation will have better solutions than the previous ones




                  GA strengths and weaknesses
Advantages
• Often achieves good results
• In most cases, a fitness function can be designed easily to fit the hypothesis (solution)
• Can be easily hybridised with many other ML algorithms to yield improved results
• There are no hard and fast rules; many users use variations freely in their applications
Disadvantage
• There is no guarantee that GA converges to the optimal solution
    – because of incomplete searches
    – because of hypothesis crowding, i.e. most chromosomes become similar, the fitness is high but not the best, and GA can’t progress further due to lack of variety




                        Lecture 6: Study guide

At the end of this section, you should be able to:
• Define chromosome, gene, allele, crossover, mutation, fitness function
• Describe how GAs work using a flowchart or an algorithm
• Explain how chromosomes and hypotheses are represented in GA, i.e. coding in GA
• Estimate the fitness of a given population
• Describe chromosome selection mechanisms
• Perform crossover between two chromosomes using single-point, two-point and uniform masks
• Perform mutation
• Explain how GA can be used to evolve NN weights
• State the main advantages and disadvantage of GA



