GENETIC ALGORITHMS AND GENETIC PROGRAMMING

Document Sample
GENETIC ALGORITHMS AND GENETIC PROGRAMMING Powered By Docstoc
					GENETIC ALGORITHMS AND
 GENETIC PROGRAMMING
           John R. Koza
    Consulting Professor (Medical Informatics)
             Department of Medicine
                School of Medicine
               Consulting Professor
      Department of Electrical Engineering
              School of Engineering
               Stanford University
            Stanford, California 94305

            koza@stanford.edu
http://www.smi.stanford.edu/people/koza/
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
      GENETIC ALGORITHM (GA)

Generation 0             Generation 1

Individuals    Fitness   Offspring

011            $3        111

001            $1        010

110            $6        110

010            $2        010
     HAMBURGER RESTAURANT
           PROBLEM

• Price
  1 = $ 0.50 price
  0 = $10.00 price
• Drink
  1 = Coca Cola
  0 = Wine
• Ambiance
  1 = Fast snappy service
  0 = Leisurely service with tuxedoed waiter
CHROMOSOME (GENOME) OF THE
     GLOBAL OPTIMUM


McDONALD's

    1        1    1
        THE SEARCH SPACE
                 1   000
                 2   001
                 3   010
                 4   011
                 5   100
                 6   101
                 7   110
                 8   111

• Alphabet size K=2, Length L=3
• Size of search space: KL=2L=23=8
 IMPRACTICALITY OF RANDOM
   OR ENUMERATIVE SEARCH


• 81-bit problems are very small for GA
• However, even if L is as small as 81, 281 ~
  1027 = number of nanoseconds since the
  beginning of the universe 15 billion years
  ago
GA FLOWCHART
           GENERATION 0

      Generation 0
1     011 3
2     001 1
3     110 6
4     010 2
Total
Worst
Average
Best
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
    PROBABILISTIC SELECTION
       BASED ON FITNESS

• Better individuals are preferred
• Best is not always picked
• Worst is not necessarily excluded
• Nothing is guaranteed
• Mixture of greedy exploitation and
  adventurous exploration
• Similarities to simulated annealing (SA)
PROBABILISTIC SELECTION
   BASED ON FITNESS


       0.17
                0.25



                       0.08
          0.5
     DARWINIAN FITNESS
  PROPORTIONATE SELECTION

       Generation 0    Mating pool
1      011 3     .25   011     3
2      001 1     .08   110     6
3      110 6     .50   110     6
4      010 2     .17   010     2
Total      12                  17
Worst      1                   2
Average    3.00                4.5
Best       6                   6
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
      MUTATION OPERATION
• Parent chosen probabilistically based on
  fitness     Parent
               010
• Mutation point chosen at random
               Parent
               --0
• One offspring
               Offspring
               011
 AFTER MUTATION OPERATION

       Generation 0    Mating pool Generation 1
1      011 3     .25   011     3
2      001 1     .08   110     6
3      110 6     .50   110     6
4      010 2     .17   010     2   --- 011 3
Total      12                  17
Worst      1                   2
Average    3.00                4.5
Best       6                   6
       CROSSOVER OPERATION


• 2 parents chosen probabilistically based on
  fitness
              Parent 1 Parent 2
                011       110
     CROSSOVER (CONTINUED)
• Interstitial point picked at random
        Fragment 1    Fragment 2
            01-           11-
• 2 remainders
         Remainder 1 Remainder 2
            --1         --0
• 2 offspring produced by crossover
        Offspring 1   Offspring 2
             111          010
AFTER CROSSOVER OPERATION

       Generation 0    Mating pool Generation 1
1      011 3     .25   011     3   2   111 7
2      001 1     .08   110     6   2   010 2
3      110 6     .50   110     6
4      010 2     .17   010     2
Total      12                  17
Worst      1                   2
Average    3.00                4.5
Best       6                   6
      AFTER REPRODUCTION
          OPERATION
       Generation 0 Mating pool Generation 1
1      011 3     .25
2      001 1     .08
3      110 6     .50 110    6   --- 110 6
4      010 2     .17
Total      12               17
Worst      1                2
Average    3.00             4.5
Best       6                6
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
            GENERATION 1

       Generation 0    Mating pool   Generation 1
1      011 3     .25   011     3     2   111 7
2      001 1     .08   110     6     2   010 2
3      110 6     .50   110     6     --- 110 6
4      010 2     .17   010     2     --- 011 3
Total      12                  17             18
Worst      1                   2              2
Average    3.00                4.5            4.5
Best       6                   6              7
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
  DEFINITION OF THE GENETIC
       ALGORITHM (GA)

The genetic algorithm is a probabalistic search
  algorithm that iteratively transforms a set
  (called a population) of mathematical objects
  (typically fixed-length binary character
  strings), each with an associated fitness value,
  into a new population of offspring objects using
  the Darwinian principle of natural selection
  and using operations that are patterned after
  naturally occurring genetic operations, such as
  crossover (sexual recombination) and mutation.
       PROBABILISTIC STEPS

• The initial population is typically random
• Probabilistic selection based on fitness
  - Best is not always picked
  - Worst is not necessarily excluded
• Random picking of mutation and
  crossover points
• Often, there is probabilistic scenario as
  part of the fitness measure
ANTENNA DESIGN
            ANTENNA DESIGN

• The problem (Altshuler and Linden 1998) is to
  determine the x-y-z coordinates of the 3-
  dimensional position of the ends (X1, Y1, Z1, X2,
  Y2, Z2,… , X7, Y7, Z7) of 7 straight wires so that
  the resulting 7-wire antenna satisfies certain
  performance requirements
• The first wire starts at feed point (0, 0, 0) in the
  middle of the ground plane
• The antenna must fit inside the 0.5 cube
           ANTENNA GENOME

X1    Y1      Z1      X2      Y2      Z2      …
+0010 -1110   +0001 +0011 -1011       +0011 …

 • 105-bit chromosome (genome)
 • Each x-y-z coordinate is represented by 5 bits
   (4-bit granularity for data plus a sign bit)
 • Total chromosome is 3  7  5 = 105 bits
         ANTENNA FITNESS

• Antenna is for ground-to-satellite
  communications for cars and handsets
• We desire near-uniform gain pattern 10
  above the horizon
• Fitness is measured based on the
  antenna's radiation pattern. The
  radiation pattern is simulated by
  National Electromagnetics Code (NEC)
         ANTENNA FITNESS

• Fitness is sum of the squares of the
  difference between the average gain and
  the antenna's gain
• Sum is taken for angles  between -90
  and +90 and all azimuth angles  from
  0 to 180
• The smaller the value of fitness, the
  better
GRAPH OF ANTENNA FITNESS
U. S. PATENT 5,719,794
10-MEMBER TRUSS
          10-MEMBER TRUSS
• Prespecified topological arrangement of the 10
  members, the load, and the wall (Goldberg and
  Samtani 1986)
• Truss has 10 members (6 are length of 30 feet
  and 4 are length 30√2 = 41 feet)
• The problem is to determine the cross-sectional
  areas (A1, … , A10) of each of the 10 members
  so as to minimize weight of the material for a
  truss that supports the 2 loads
• The weight is based on volume (i.e., cross-
  sectional area  length)
                   TRUSS GENOME

A1       A2   A3    A4    A5    A6    A7    A8   A9   A10
0010 1110 0001 0011 1011 0011 1111 0011 0011 1010


     •   40-bit chromosome (genome)
     •   4-bit granularity for truss diameters
     •   0000 = smallest diameter
     •   1111 = largest diameter
     •   Total chromosome is 4  10 = 40 bits
           TRUSS FITNESS
• Two-part (multiobjective) fitness
  measure
  – First, fitness is computed by taking the sum,
    over the 10 members, of the cross-sectional
    area of each member times the length of
    each member (30 feet or 30√2 = 41 feet).
  – Second, a penalty (up to 10%) is imposed for
    violating the stress constraints. Stresses are
    computed using standard mechanical
    engineering techniques.
• The smaller the total fitness, the better
CELLULAR AUTOMATA
      STATE TRANSITION TABLE

#     WWW   WW   W   X   E   EE   EEE   Rule
0     0     0    0   0   0   0    1     a0
1     0     0    0   0   0   1    0     a1
2     0     0    0   0   1   0    0     a2
3     0     0    0   0   1   1    0     a3
4     0     0    1   1   0   0    0     a4
…     …     …    …   …   …   …    …     …
127   1     1    1   1   1   1    1     a127
     CELLULAR AUTOMATA

A0   A1   A2          …         A127
a0   a1   a2          …         a127



• 128-bit chromosome (genome)
PROBLEM-SPECIFIC GENOMES

      N  M GENOME

      1   1   0   1   1   1   1
      1   1   1   1   0   1   1
      1   0   1   1   1   0   0
      1   1   1   1   1   1   1
      1   1   0   1   1   1   1
      1   1   1   1   0   1   1
      1   1   0   1   1   1   0
   GENETIC ALGORITHM USING
   VARIABLE-LENGTH STRINGS

 • 5-WIRE ANTENNA (5  15 = 75 bits)
 X1     Y1     Z1     …    X5     Y5     Z5
+0010 -1110   +0001   …   +0010 -1110   +0001


 • 4-WIRE ANTENNA (4  15 = 60 bits)
 X1     Y1      Z1    …    X4     Y4      Z4
+1010 -0110 +1101     …   +1010 -0110 +1001
GENETIC PROGRAMMING
            THE CHALLENGE


"How can computers learn to solve problems without
  being explicitly programmed? In other words, how
  can computers be made to do what is needed to be
  done, without being told exactly how to do it?"

  Attributed to Arthur Samuel (1959)
           CRITERION FOR SUCCESS


"The aim [is] ... to get machines to exhibit behavior, which
  if done by humans, would be assumed to involve the use
  of intelligence.―

     Arthur Samuel (1983)
            REPRESENTATIONS

• Decision trees        • Binary decision
                          diagrams
• If-then production
                        • Formal grammars
  rules
                        • Coefficients for
• Horn clauses            polynomials
• Neural nets           • Reinforcement
• Bayesian networks       learning tables
• Frames                • Conceptual clusters
• Propositional logic   • Classifier systems
A COMPUTER PROGRAM
  GENETIC PROGRAMMING (GP)

• GP applies the approach of the genetic
  algorithm to the space of possible
  computer programs
• Computer programs are the lingua
  franca for expressing the solutions to a
  wide variety of problems
• A wide variety of seemingly different
  problems from many different fields can
  be reformulated as a search for a
  computer program to solve the problem.
       GP  MAIN POINTS
• Genetic programming now routinely delivers
  high-return human-competitive machine
  intelligence.
• Genetic programming is an automated
  invention machine.
• Genetic programming has delivered a
  progression of qualitatively more substantial
  results in synchrony with five approximately
  order-of-magnitude increases in the
  expenditure of computer time.
   DEFINITION OF “HIGH-
         RETURN”

The AI ratio (the ―artificial-to-intelligence‖
 ratio) of a problem-solving method as the
 ratio of that which is delivered by the
 automated operation of the artificial
 method to the amount of intelligence that
 is supplied by the human applying the
 method to a particular problem
          DEFINITION OF
           “ROUTINE”
A problem solving method is routine if it is
 general and relatively little human effort
 is required to get the method to
 successfully handle new problems within
 a particular domain and to successfully
 handle new problems from a different
 domain.
         CRITERIA FOR
    ―HUMAN-COMPETITIVENESS‖

• Previously patented, an improvement over a
  patented invention, or patentable today
• Publishable in its own right as a new scientific
  result independent of the fact that the result was
  mechanically created
• Holds it own in regulated competition against
  humans (or programs)
• 5 other similar criteria that are ―arms-length‖
  from the fields of AI, ML, GP
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
GP FLOWCHART
  A COMPUTER PROGRAM IN C
int foo (int time)
{
   int temp1, temp2;
   if (time > 10)
       temp1 = 3;
   else
       temp1 = 4;
   temp2 = temp1 + 1 + 2;
   return (temp2);
}
OUTPUT OF C PROGRAM
      Time   Output
       0       6
       1       6
       2       6
       3       6
       4       6
       5       6
       6       6
       7       6
       8       6
       9       6
       10      6
       11      7
       12      7
        PROGRAM TREE




(+ 1 2 (IF (> TIME 10) 3 4))
CREATING RANDOM PROGRAMS
    CREATING RANDOM PROGRAMS


• Available functions
  F = {+, -, *, %, IFLTE}
• Available terminals
  T = {X, Y, Random-Constants}
• The random programs are:
  – Of different sizes and shapes
  – Syntactically valid
  – Executable
      GP GENETIC OPERATIONS


•   Reproduction
•   Mutation
•   Crossover (sexual recombination)
•   Architecture-altering operations
MUTATION OPERATION
            MUTATION OPERATION

• Select 1 parent probabilistically based on fitness
• Pick point from 1 to NUMBER-OF-POINTS
• Delete subtree at the picked point
• Grow new subtree at the mutation point in same
  way as generated trees for initial random
  population (generation 0)
• The result is a syntactically valid executable
  program
• Put the offspring into the next generation of the
  population
CROSSOVER OPERATION
           CROSSOVER OPERATION


• Select 2 parents probabilistically based on fitness
• Randomly pick a number from 1 to NUMBER-OF-
  POINTS for 1st parent
• Independently randomly pick a number for 2nd parent
• The result is a syntactically valid executable program
• Put the offspring into the next generation of the
  population
• Identify the subtrees rooted at the two picked points
        REPRODUCTION OPERATION


• Select parent probabilistically based on
  fitness
• Copy it (unchanged) into the next
  generation of the population
    FIVE MAJOR PREPARATORY STEPS FOR
                   GP




•   Determining the set of terminals
•   Determining the set of functions
•   Determining the fitness measure
•   Determining the parameters for the run
•   Determining the method for designating a result and
    the criterion for terminating a run
ILLUSTRATIVE GP RUN
SYMBOLIC REGRESSION

    Independent Dependent
    variable X  variable Y
    -1.00       1.00
    -0.80       0.84
    -0.60       0.76
    -0.40       0.76
    -0.20       0.84
    0.00        1.00
    0.20        1.24
    0.40        1.56
    0.60        1.96
    0.80        2.44
    1.00        3.00
                  PREPARATORY STEPS

    Objective:          Find a computer program with one input
                        (independent variable X) whose output
                        equals the given data
1   Terminal set:       T = {X, Random-Constants}


2   Function set:       F = {+,     -,   *, %}


3   Fitness:            The sum of the absolute value of the
                        differences   between     the    candidate
                        program’s output and the given data
                        (computed over numerous values of the
                        independent variable x from –1.0 to +1.0)
4   Parameters:         Population size M = 4


5   Termination:        An individual emerges whose sum of
                        absolute errors is less than 0.1
     SYMBOLIC REGRESSION

POPULATION OF 4 RANDOMLY CREATED
   INDIVIDUALS FOR GENERATION 0
    SYMBOLIC REGRESSION x2 + x + 1


FITNESS OF THE 4 INDIVIDUALS IN GEN 0




 x+1        x2 + 1      2             x

 0.67       1.00       1.70          2.67
         SYMBOLIC REGRESSION x2 + x + 1

                 GENERATION 1




                                                     Second offspring
                                First offspring of   of crossover of
                Mutant of (c)   crossover of (a)     (a) and (b)
                                and (b)               picking ―+‖ of
Copy of (a)     picking ―2‖     picking ―+‖ of       parent (a) and
                as mutation     parent (a) and
                point                                left-most ―x‖ of
                                left-most ―x‖ of     parent (b) as
                                parent (b) as        crossover points
                                crossover points
CLASSIFICATION
      GP TABLEAU – INTERTWINED
              SPIRALS
    Objective:    Create a program to classify a given
                  point in the x-y plane to the red or
                  blue spiral
1   Terminal set: T = {X,Y,Random-Constants}
2   Function set: F = {+,-,*,%,IFLTE,SIN,COS}
3   Fitness:     The number of correctly classified
                 points (0 – 194)
4   Parameters: M = 10,000. G = 51
5   Termination: An individual program scores 194
WALL-FOLLOWER
FITNESS
BEST OF GENERATION 57
BOX MOVER – BEST OF GEN 0
     BOX MOVER
GEN 45 – FITNESS CASE 1
TRUCK BACKER UPPER
         TRUCK BACKER UPPER
• 4-Dimensional control problem
   –   horizontal position, x
   –   vertical position, y
   –   angle between trailer and horizontal, t
   –   angle between trailer and cab, d
• One control variable (steering wheel turn
  angle)
• State transition equations map the 4 state
  variables into 1 output (the control variable)
• Simulation run over many initial conditions
  and over hundreds of time steps
GENETIC PROGRAMMING: ON THE
PROGRAMMING OF COMPUTERS BY
 MEANS OF NATURAL SELECTION
          (Koza 1992)
      2 MAIN POINTS FROM 1992 BOOK

• Virtually all problems in artificial intelligence,
  machine learning, adaptive systems, and
  automated learning can be recast as a search
  for a computer program.
• Genetic programming provides a way to
  successfully conduct the search for a computer
  program in the space of computer programs.
        SOME RESULTS FROM 1992 BOOK

•   Intertwined Spirals      •   Randomizer
•   Truck Backer Upper       •   Cellular Automata
•   Broom Balancer           •   Task Prioritization
•   Wall Follower            •   Image Compression
•   Box Mover
                             •   Econometric Equation
•   Artificial Ant
                             •   Optimization
•   Differential Games
•   Inverse Kinematics       •   Boolean Function Learning
•   Central Place Foraging   •   Co-Evolution of Game-
•   Block Stacking               Playing Strategies
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
            COMPUTER PROGRAMS




• Subroutines provide one way to REUSE code  possibly
  with different instantiations of the dummy variables
  (formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
  executing code
         SYMBOLIC REGRESSION

Fitness case L0   W0   H0   L1   W1   H1   Dependent variable D
1           3     4    7    2    5    3                       54
2           7     10   9    10   3    1                      600
3           10    9    4    8    1    6                      312
4           3     9    5    1    6    4                      111
5           4     3    2    7    6    1                      -18
6           3     3    1    9    5    4                     -171
7           5     9    9    1    7    6                      363
8           1     2    9    3    9    2                      -36
9           2     6    8    2    6    10                     -24
10          8     1    10   7    5    1                       45
EVOLVED SOLUTION




(- (* (* W0 L0) H0)
   (* (* W1 L1) H1))
DIFFERENCE IN VOLUMES




    D = L0W0H0 – L1W1H1
AUTOMATICALLY DEFINED
   FUNCTION volume
   AUTOMATICALLY DEFINED
      FUNCTION volume

(progn
 (defun volume (arg0 arg1 arg2)
  (values
   (* arg0 (* arg1 arg2))))
 (values
   (- (volume L0 W0 H0)
       (volume L1 W1 H1))))
    AUTOMATICALLY DEFINED
         FUNCTIONS

• ADFs provide a way to REUSE code
• Code is typically reused with different
  instantiations of the dummy variables
  (formal parameters)
          ADDITION OF V0 AND V1

Fitness case L0   W0   H0   L1   W1   H1   V0    V1    D
1           3     4    7    2    5    3    84    30    54
2           7     10   9    10   3    1    630   30    600
3           10    9    4    8    1    6    360   48    312
4           3     9    5    1    6    4    135   24    111
5           4     3    2    7    6    1    24    42    -18
6           3     3    1    9    5    4    9     180   -171
7           5     9    9    1    7    6    405   42    363
8           1     2    9    3    9    2    18    54    -36
9           2     6    8    2    6    10   96    120   -24
10          8     1    10   7    5    1    80    35    45
DIVIDE AND CONQUER
       DIVIDE AND CONQUER


• Decompose a problem into sub-problems
• Solve the sub-problems
• Assemble the solutions of the sub-
  problems into a solution for the overall
  problem
CHANGE OF REPRESENTATION
 CHANGE OF REPRESENTATION


• Identify regularities
• Change the representation
• Solve the overall problem
      ADF IMPLEMENTATION
• Each overall program in population includes
  – a main result-producing branch (RPB) and
  – function-defining branch (i.e., automatically defined
    function, ADF)
• In generation 0, create random programs with
  different ingredients for the RPB and the ADF
  – Terminal set for ADF typically contains dummy
    arguments (formal parameters), such as ARG0,
    ARG1, …
  – Function set of the RPB contains ADF0
  – ADFs are private and associated with a particular
    individual program in the population
           ADF MUTATION

• Select parent probabilistically on the
  basis of fitness
• Pick a mutation point from either RPB or
  an ADF
• Delete sub-tree rooted at the picked point
• Grow a new sub-tree at the picked point
  composed of the allowable ingredients
  appropriate for the picked point
• The offspring is a syntactically valid
  executable program
            ADF CROSSOVER

• Select parent probabilistically on the basis of
  fitness
• Pick a crossover point from either RPB or an
  ADF of the FIRST patent
• The choice of crossover point in the SECOND
  parent is RESTRICTED to the picked RPB or
  to the picked ADF
• The sub-trees are swapped
• The offspring are syntactically valid executable
  programs
GENETIC PROGRAMMING II: AUTOMATIC
 DISCOVERY OF REUSABLE PROGRAMS
            (Koza 1994)
        MAIN POINTS OF 1994 BOOK


• Scalability is essential for solving non-trivial
  problems in artificial intelligence, machine
  learning, adaptive systems, and automated
  learning
• Scalability can be achieved by reuse
• Genetic programming provides a way to
  automatically discover and reuse subprograms
  in the course of automatically creating computer
  programs to solve problems
            COMPUTER PROGRAMS




• Subroutines provide one way to REUSE code  possibly
  with different instantiations of the dummy variables
  (formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
  executing code
                     MEMORY




Settable   Indexed   Matrix   Relational memory
(named)    vector    memory
variables memory
    LANGDON'S DATA STRUCTURES


•   Stacks
•   Queues
•   Lists
•   Rings
            COMPUTER PROGRAMS




• Subroutines provide one way to REUSE code  possibly
  with different instantiations of the dummy variables
  (formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
  executing code
    AUTOMATICALLY DEFINED
        ITERATION (ADI)
• The overall program includes an iteration-
  performing branch (IPB) in addition to a
  result-producing branch (RPB) and function-
  defining branches (ADF)
• There are no infinite loops because the iteration
  is performed over a known, fixed set
   – protein or DNA sequence (of varying length)
   – time-series data
   – two-dimensional array of pixels
• Memory is usually involved and is used to
  communicate between IPB, RPB, and ADF
      TRANSMEMBRANE SEGMENT
       IDENTIFICATION PROBLEM




Goal is to classify a given protein segment
  as being a transmembrane domain or
 non-transmembrane area of the protein
         TRANSMEMBRANE SEGMENT
          IDENTIFICATION PROBLEM
(progn
    (defun ADF0 ()
(ORN (ORN (ORN (I?) (H?)) (ORN (P?) (G?))) (ORN (ORN (ORN (Y?)
   (N?)) (ORN (T?) (Q?))) (ORN (A?) (H?))))))
    (defun ADF1 ()
(values (ORN (ORN (ORN (A?) (I?)) (ORN (L?) (W?))) (ORN (ORN
   (T?) (L?)) (ORN (T?) (W?))))))
    (defun ADF2 ()
(values (ORN (ORN (ORN (ORN (ORN (D?) (E?)) (ORN (ORN (ORN
   (D?) (E?)) (ORN (ORN (T?) (W?)) (ORN (Q?) (D?)))) (ORN (K?)
   (P?)))) (ORN (K?) (P?))) (ORN (T?) (W?))) (ORN (ORN (E?)
   (A?)) (ORN (N?) (R?))))))
 (progn (loop-over-residues
       (SETM0 (+ (- (ADF1) (ADF2)) (SETM3 M0))))
 (values (% (% M3 M0) (% (% (% (- L -0.53) (* M0 M0)) (+ (% (%
   M3 M0) (% (+ M0 M3) (% M1 M2))) M2)) (% M3 M0))))))
       TRANSMEMBRANE SEGMENT
       IDENTIFICATION PROBLEM




• in-sample correlation of 0.976
• out-of-sample correlation of 0.968
• out-of-sample error rate 1.6%
    AUTOMATICALLY DEFINED
         LOOP (ADL)


•   loop initialization branch, LIB
•   loop condition branch, LCB
•   loop body branch, LBB
•   loop update branch, LUB
ADL
            COMPUTER PROGRAMS




• Subroutines provide one way to REUSE code  possibly
  with different instantiations of the dummy variables
  (formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
  executing code
    AUTOMATICALLY DEFINED
       RECURSION (ADR)


•   recursion condition branch, RCB
•   recursion body branch, RBB
•   recursion update branch, RUB
•   recursion ground branch, RGB
ADR
              HUMAN-COMPETITIVE RESULTS
               (NOT RELATED TO PATENTS)

Transmembrane segment identification problem for proteins
Motifs for D–E–A–D box family and manganese superoxide dismutase family of proteins
Cellular automata rule for Gacs-Kurdyumov-Levin (GKL) problem
Quantum algorithm for the Deutsch-Jozsa “early promise” problem
Quantum algorithm for Grover’s database search problem
Quantum algorithm for the depth-two AND/OR query problem
Quantum algorithm for the depth-one OR query problem
Protocol for communicating information through a quantum gate
Quantum dense coding
Soccer-playing program that won its first two games in the 1997 Robo Cup competition
Soccer-playing program that ranked in the middle of field in 1998 Robo Cup competition
Antenna designed by NASA for use on spacecraft
Sallen-Key filter
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
GENETIC PROGRAMMING III: DARWINIAN
  INVENTION AND PROBLEM SOLVING
    (Koza, Bennett, Andre, Keane 1999)
SUBROUTINE DUPLICATION
SUBROUTINE CREATION
SUBROUTINE DELETION
ARGUMENT DUPLICATION
ARGUMENT DELETION
        16 ATTRIBUTES OF A SYSTEM FOR
     AUTOMATICALLY CREATING COMPUTER
                  PROGRAMS

• Starts with "What needs   • Self-organization of
  to be done"                 hierarchies
• Tells us "How to do it"   • Automatic determination of
• Produces a computer         program architecture
  program                   • Wide range of
• Automatic determination     programming constructs
  of program size           • Well-defined
• Code reuse                • Problem-independent
• Parameterized reuse       • Wide applicability
• Internal storage          • Scalable
• Iterations, loops, and    • Competitive with human-
  recursions                  produced results
GENETIC PROGRAMMING PROBLEM
         SOLVER (GPPS)
   AUTOMATIC SYNTHESIS OF
   BOTH THE TOPOLOGY AND
SIZING OF ANALOG ELECTRICAL
    CIRCUITS BY MEANS OF
   DEVELOPMENTAL GENETIC
        PROGRAMMING
AUTOMATED CIRCUIT SYNTHESIS


• The topology of a circuit includes specifying the
  gross number of components in the circuit, the
  type of each component (e.g., a capacitor), and
  a netlist specifying where each lead of each
  component is to be connected.
• Sizing involves specifying the values (typically
  numerical) of each of the circuit's components.
COMPONENT-CREATING FUNCTIONS


•   Resistor R function
•   Capacitor C function
•   Inductor L function
•   Diode D function
•   Transistor Q function (3-leaded)
COMPONENT-CREATING FUNCTIONS
TOPOLOGY-MODIFYING FUNCTIONS



• SERIES division
• PARALLEL division
• VIA
• FLIP
TOPOLOGY-MODIFYING FUNCTIONS
 DEVELOPMENT-CONTROLLING
        FUNCTIONS


• END function
• NOP (No Operation) function
• SAFE_CUT function
THE INITIAL CIRCUIT
        DEVELOPMENTAL GP




(LIST (C (– 0.963 (– (– -0.875 -0.113)
  0.880)) (series (flip end) (series
  (flip end) (L -0.277 end) end) (L (– -
  0.640 0.749) (L -0.123 end)))) (flip
  (nop (L -0.657 end)))))
CAPACITOR-CREATING FUNCTION


(LIST (C (– 0.963 (– (– -0.875
 -0.113) 0.880)) (series (flip
 end) (series (flip end) (L –
0.277 end) end) (L (– -0.640
 0.749) (L -0.123 end)))) (flip
 (nop (L -0.657 end)))))
CAPACITOR-CREATING FUNCTION
  SERIES DIVISION FUNCTION

(LIST (C (– 0.963 (– (– -0.875
 -0.113) 0.880)) (series (flip
 end) (series (flip end) (L –
0.277 end) end) (L (– -0.640
 0.749) (L -0.123 end)))) (flip
 (nop (L -0.657 end)))))
SERIES DIVISION
DEVELOPMENTAL GP
   EVALUATION OF FITNESS

                                   z0
                           IN             OUT
                +
                            Embryonic Circuit

Program Tree


       Fully Designed Circuit (NetGraph)


        Circuit Netlist (ascii)


    Circuit Simulator (SPICE)


      Circuit Behavior (Output)

               Fitness
DESIRED BEHAVIOR OF A LOWPASS
           FILTER
EVOLVED CAMPBELL FILTER




      U. S. patent 1,227,113
        George Campbell
American Telephone and Telegraph
               1917
     EVOLVED ZOBEL FILTER




          U. S. patent 1,538,964
                Otto Zobel
American Telephone and Telegraph Company
                   1925
EVOLVED SALLEN-KEY FILTER
EVOLVED DARLINGTON EMITTER-
     FOLLOWER SECTION




       U. S. patent 2,663,806
        Sidney Darlington
    Bell Telephone Laboratories
               1953
NEGATIVE FEEDBACK
HAROLD BLACK’S RIDE ON THE
   LACKAWANNA FERRY




       Courtesy of Lucent Technologies
             20th-CENTURY PATENTS
Campbell ladder topology for filters
Zobel “M-derived half section” and “constant K” filter sections
Crossover filter
Negative feedback
Cauer (elliptic) topology for filters
PID and PID-D2 controllers
Darlington emitter-follower section and voltage gain stage
Sorting network for seven items using only 16 steps
60 and 96 decibel amplifiers
Analog computational circuits
Real-time analog circuit for time-optimal robot control
Electronic thermometer
Voltage reference circuit
Philbrick circuit
NAND circuit
Simultaneous synthesis of topology, sizing, placement, and routing
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
SIX POST-2000 PATENTED
      INVENTIONS
EVOLVED HIGH CURRENT LOAD CIRCUIT
REGISTER-CONTROLLED CAPACITOR
           CIRCUIT
LOW-VOLTAGE CUBIC CIRCUIT
VOLTAGE-CURRENT-CONVERSION
          CIRCUIT
LOW-VOLTAGE BALUN CIRCUIT
TUNABLE INTEGRATED ACTIVE FILTER
21st-CENTURY PATENTED INVENTIONS



Low-voltage balun circuit
Mixed analog-digital variable capacitor circuit
High-current load circuit
Voltage-current conversion circuit
Cubic function generator
Tunable integrated active filter
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
 NOVELTY-DRIVEN EVOLUTION

• Two factors in fitness measure
  – Circuit’s behavior in the frequency domain
  – Largest number of nodes and edges (circuit
    components) of a subgraph of the given circuit that
    is isomorphic to a subgraph of a template
    representing the prior art. Graph isomorphism
    algorithm with the cost function being based on the
    number of shared nodes and edges (instead of just
    the number of nodes).
 NOVELTY-DRIVEN EVOLUTION


• For circuits not scoring the maximum
  number of hits (101), the fitness of a
  circuit is the product of the two factors.
• For circuits scoring 101 hits (100%-
  compliant individuals), fitness is the
  number of shared nodes and edges
  divided by 10,000.
PRIOR ART TEMPLATE
NON-INFRINGING SOLUTION NO. 1
NON-INFRINGING SOLUTION NO. 5
GP AS AN INVENTION MACHINE
CIRCUIT-CONSTRUCTING PROGRAM TREE
             WITH ADFs
LOWPASS FILTER WITH ADFs
ADF0
  AUTOMATIC SYNTHESIS OF
     CIRCUIT LAYOUT

INCLUDING THE PLACEMENT OF
COMPONENTS AND ROUTING OF
           WIRES

 ALONG WITH THE TOPOLOGY
        AND SIZING
           CIRCUIT LAYOUT

• Circuit placement involves the assignment of
  each of the circuit's components to a particular
  physical location on a printed circuit board or
  silicon wafer.
• Routing involves the assignment of a particular
  physical location to the wires between the leads
  of the circuit's components.
LAYOUT
LAYOUT — GENERATION 0
  100%-COMPLIANT LOWPASS FILTER

GENERATION 25 WITH 5 CAPACITORS AND
   11 INDUCTORS  AREA OF 1775.2
  100%-COMPLIANT LOWPASS FILTER

GENERATION 30 WITH 10 INDUCTORS AND
    5 CAPACITORS  AREA OF 950.3
 100%-COMPLIANT LOWPASS FILTER

BEST-OF-RUN CIRCUIT OF GENERATION
    138 WITH 4 INDUCTORS AND 4
    CAPACITORS  AREA OF 359.4
LAYOUT  60 DB AMPLIFIER
AUTOMATIC SYNTHESIS
OF BOTH THE TOPOLOGY
    AND TUNING OF
    CONTROLLERS
PROGRAM TREE FOR A
   CONTROLLER
         CONTROLLER BLOCKS

• gain                 •   lead
• integrator           •   lag
• differentiator       •   two-parameter lag
• adder                •   absolute value
• subtractor           •   limiter
• multiplier           •   divider
• differential input   •   delay
  integrators          •   conditional operators
• inverter                 (switches)
      FUNCTION SET FOR
    CONTROLLER SYNTHESIS


F = {GAIN,INVERTER,LEAD,LAG,LAG2,
  DIFFERENTIAL_INPUT_INTEGRATOR,
  DIFFERENTIATOR, ADD_SIGNAL,
  SUB_SIGNAL,ADD_3_SIGNAL,ADF0,
  ADF1,ADF2,ADF3,ADF4}
      TERMINAL SET FOR
    CONTROLLER SYNTHESIS


T = { REFERENCE_SIGNAL,
  CONTROLLER_OUTPUT,
  PLANT_OUTPUT}
    CONSTRAINED SYNTACTIC
         STRUCTURE

• A grammar that specifies what functions
  and terminals are allowed as arguments
  to particular functions
• For example, the first argument of the
  GAIN function must be a value-setting
  subtree whereas the second can be from
  the general pool of functions
• Also known as strong typing
TWO-LAG PLANT
           8 FITNESS CASES

• 8 elements of the fitness              measure
  represent 2  2  2 choices:
  – 2 different values of the plant's internal gain,
    K (1.0 and 2.0),
  – 2 different values of the plant's time constant
     (0.5 and 1.0),
  – 2 different values for the height of the
    reference signal (rising from 0 to 1 volts or
    from 0 to 1 microvolts at t = 100 milliseconds
               FITNESS MEASURE

• For each of these 8 fitness cases, a transient
  analysis (time domain) is performed using the
  SPICE simulator.
• The contribution to fitness for the 8 elements is


  – Integral of time-weighted absolute error (ITAE)
  – e(t) is difference between plant output and reference signal.
  – Multiplication by B (106 or 1) makes both reference signals equally
    influential.
  – Additional weighting function, A, heavily penalizes non-compliant
    amounts of overshoot. A weights all variations up to 2% above the
    reference signal by 1.0, but bigger variations by 10.0.
EVOLVED CONTROLLER FOR TWO-LAG
            PLANT
LESS ITAE AND OVERSHOOT
BETTER DISTURBANCE REJECTION
REVERSE ENGINEERING OF METABOLIC
           PATHWAYS
EVOLVED PATHWAY
ANTENNA SYNTHESIS USING GP




   (PROGN3
     (TURN-RIGHT 0.125)
     (LANDMARK
       (REPEAT 2
         (PROGN2
           (DRAW 1.0 HALF-MM-WIRE)
           (DRAW 0.5 NO-WIRE)))
     (TRANSLATE-RIGHT 0.125 0.75))
USING A TURTLE TO DRAW TWO-
    DIMENSIONAL ANTENNA
BEST-OF-RUN ANTENNA FROM
      GENERATION 90
3-DIMENSIONAL ANTENNA
    NASA EVOLVED ANTENNA




To be on satellite to be launched in 2004
OTHER STRUCTURES
GENETIC NETWORK FOR lac
         operon
         EVOLVED NETWORK
(IF (< LACTOSE_LEVEL 9.139 ) (IF (<
REPRESSOR_LEVEL 6.270 ) (IF (> GLUCOSE_LEVEL
5.491 ) 2.02 (IF (< CAP_LEVEL 0.639 ) 2.033 (IF
(< CAP_LEVEL 4.858 ) (IF (> LACTOSE_LEVEL 2.511 )
(IF (> CAP_LEVEL 7.807 ) 5.586 (IF (>
LACTOSE_LEVEL 2.114 ) 1.978 2.137 ) ) 0.0 ) (IF
(> REPRESSOR_LEVEL 4.015 ) 0.036 (IF (<
GLUCOSE_LEVEL 5.128 ) 10.0 (IF (< REPRESSOR_LEVEL
4.268 ) 2.022 9.122 ) ) ) ) ) ) (IF (> CAP_LEVEL
0.842 ) 0.0 5.97 ) ) (IF (< CAP_LEVEL 1.769 )
2.022 (IF (< GLUCOSE_LEVEL 2.382 ) (IF (>
LACTOSE_LEVEL 1.256 ) (IF (> LACTOSE_LEVEL 1.933
) (IF (> GLUCOSE_LEVEL 2.022 ) (IF (<
GLUCOSE_LEVEL 5.183 ) 6.323 (IF (> CAP_LEVEL
1.208 ) 9.713 0.842 ) ) 10.0 ) (IF (>
GLUCOSE_LEVEL 6.270 ) 2.109 ) 1.965 ) ) 0.665 )
1.982 ) ) )
      IN C-STYLE PSEUDO CODE

if(LACTOSE_LEVEL < 9.139)   else
                            {
{
                                if(CAP_LEVEL < 1.769)
    if(REPRESSOR_LEVEL <        {
   6.270)                           LAC_mRNA_LEVEL =
    {                          2.022;
        LAC_mRNA_LEVEL =        }
   2.022;                       else
    }                           {
    else                            if(GLUCOSE_LEVEL <
    {                          2.382)
                                    {
        LAC_mRNA_LEVEL =
                                         LAC_mRNA_LEVEL
   0.0;                        = 10.0;
    }                               }
}                                   else
                                    {
                                         LAC_mRNA_LEVEL
                               = 1.982;
                            } } }
 PARAMETERIZED TOPOLOGIES



• One of the most important characteristics
  of computer programs is that they
  ordinarily contain inputs (free variables)
  and conditional operations
PARAMETERIZED TOPOLOGY FOR
      LOWPASS FILTER
PARAMETERIZED TOPOLOGY FOR
      HIGHPASS FILTER
PARAMETERIZED TOPOLOGY FOR
GENERAL-PURPOSE CONTROLLER
EVOLVED EQUATIONS FOR GENERAL-
     PURPOSE CONTROLLER
EVOLVED EQUATIONS FOR GENERAL-
     PURPOSE CONTROLLER
              PATENTABLE NEW INVENTIONS



PID tuning rules that outperform the Ziegler-Nichols and Åström-Hägglund tuning rules
General-purpose controllers outperforming Ziegler-Nichols and Åström-Hägglund rules
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP



•   Toy problems
•   Human-competitive non-patent results
•   20th-century patented inventions
•   21st-century patented inventions
•   Patentable new inventions
PARALLELIZATION WITH SEMI-ISOLATED
         SUBPOPULATIONS
      GP PARALLELIZATION

• Like Hormel, Get Everything Out of the
  Pig, Including the Oink
• Keep on Trucking
• It Takes a Licking and Keeps on Ticking
• The Whole is Greater than the Sum of
  the Parts
               PETA-OPS


• Human brain operates at 1012 neurons
  operating at 103 per second = 1015 ops per
  second
• 1015 ops = 1 peta-op = 1 bs (brain second)
 EVOLVABLE HARDWARE
CORNER OF XILINX XC6216
FUNCTION UNIT FOR CELL OF
      XILINIX XC6216
        SORTING NETWORK
A       B       C




                          D
H   G       F    E
EVOLVED SORTING NETWORK
                             GP 1987–2002

System               Dates       Speed-up     Human-      Problem Category
                                 over first   competitive
                                 system       results
Serial LISP          1987–1994   1 (base)     0           toy problems
64 transputers       1994–1997   9            2           human-competitive
                                                          results not related to
                                                          patented inventions
64 PowerPC’s         1995–2000   204          12          20th-century
                                                          patented inventions
70 Alpha’s           1999–2001   1,481        2           20th-century
                                                          patented inventions
1,000 Pentium II’s   2000–2002   13,900       12          21st-century
                                                          patented inventions
4-week runs on       2002-2003   130,000      2           patentable new
1,000 Pentium II’s                                        inventions
   PROMISING GP APPLICATION AREAS


• Problem areas involving many variables
  that are interrelated in highly non-linear
  ways
• Inter-relationship of variables is not well
  understood
• Discovery of the size and shape of the
  solution is a major part of the problem
• "Black art" problems
  PROMISING GP APPLICATION AREAS
           (CONTINUED)


• Areas where you simply have no idea
  how to program a solution, but where
  you know what you want
  PROMISING GP APPLICATION AREAS
           (CONTINUED)


• Problem areas where a good approximate
  solution is satisfactory
    design
    control
    bioinformatics
    classification
    data mining
    system identification
    forecasting
   PROMISING GP APPLICATION AREAS
            (CONTINUED)

 Areas where large computerized databases are
  accumulating and computerized techniques are
  needed to analyze the data
    genome, protein, microarray data
    satellite image data
    astronomical data
    petroleum databases
    financial databases
    medical records
    marketing databases
   PROMISING GP APPLICATION AREAS
            (CONTINUED)


 Areas for which humans find it very
  difficult to write good programs
    parallel computers
    cellular automata
    multi-agent strategies
    field-programmable game arrays
    digital signal processors
    swarm intelligence
CHARACTERISTICS SUGGESTING
        USE OF GP

(1) discovering the size and shape of the solution,
(2) reusing substructures,
(3) discovering the number of substructures,
(4) discovering the nature of the hierarchical references among
    substructures,
(5) passing parameters to a substructure,
(6) discovering the type of substructures (e.g., subroutines, iterations,
    loops, recursions, or storage),
(7) discovering the number of arguments possessed by a substructure,
(8) maintaining syntactic validity and locality by means of a
    developmental process, or
(9) discovering a general solution in the form of a parameterized
    topology containing free variables
        DESIGNING A GIRAFFE

•   Long neck
•   Long tongue
•   Vegetable-digesting enzymes in stomach
•   4 legs
•   Long legs
•   Brown coloration
    THE DESIGN OF A GOOD GIRAFE

Neck       Tongue     Carnivorous? Number Leg        Coloration
length     length                  of legs length


15.11      14 inches No           4         9.96 feet Brown
feet
Floating   Floating   Boolean     Integer   Floating Categorical
point      point                            point
    NON-LINEARITY — GIRAFE

• Taken one-by-one, some gene values found in a
  giraffe, such as the long neck contribute (alone)
  negatively to fitness
   – requires considerable material to construct
   – requires considerable energy to maintain
   – prone to injury (thereby hurting rate of survival and
     reproduction)
• Thus, maximizing any one variable will not
  lead to the global optimum solution
  NON-LINEARITY (CONTINUED)


• When the variables are taken in pairs
  (there are 15 possible pairs), many
  combinations of pairs (e.g., Long neck
  and long tongue) are doubly detrimental
 NON-LINEARITY (CONTINUED)

• But, certain combinations of traits, when taken
  together, are "co-adapted sets of alleles" that
  yield a very fit animal for eating high acacia
  leaves in the jungle environment, having good
  camouflage, having high escape velocity when
  faced with predators, and exploiting a niche
  (and avoiding competition) with other animals
  feeding on low-hanging vegetation
SEARCH METHODS IN GENERAL

•   Initial structure(s)
•   Fitness measure
•   Operations for creating new structures
•   Parameters
•   Termination criterion and method of
    designating the result
SPACE WITH MANY LOCAL OPTIMA
          SEARCH METHODS

• Blind random search does not use acquired
  information in deciding on the future direction
  of the search
• Hill combing and gradient descent use acquired
  information; however, they are prone to
  becoming trapped on local optima
• The previous point is especially true for non-
  trivial search spaces
 7 DIFFERENCES BETWEEN GP
AND ARTIFICIAL INTELLIGENCE
   AND MACHINE LEARNING
        APPROACHES
         REPRESENTATION


Genetic programming overtly conducts it
search for a solution to the given problem
in program space
    ROLE OF POINT-TO-POINT
   TRANSFORMATIONS IN THE
           SEARCH

Genetic programming does not conduct its
search by transforming a single point in the
search space into another single point, but
instead transforms a set of points into
another set of points
ROLE OF HILL CLIMBING IN THE
          SEARCH

Genetic programming does not rely
exclusively on greedy hill climbing to
conduct its search, but instead allocates a
certain number of trials, in a principled
way, to choices that are known to be
inferior
 DETERMINISM IN THE SEARCH


Genetic programming conducts its search
probabilistically
      ROLE OF AN EXPLICIT
       KNOWLEDGE BASE


Genetic programming does NOT make use
of a knowledge base
ROLE OF FORMAL LOGIC IN THE
          SEARCH


Genetic programming does not utilize
formal logic in it’s search strategy.
  Contradictory alternatives are created
  and actively maintained.
      UNDERPINNINGS OF THE
          TECHNIQUE


Biologically inspired
             TURING (1948)

Turing made the connection between
searches and the challenge of getting a
computer to solve a problem without
explicitly programming it in his 1948 essay
―Intelligent Machines‖

  "Further research into intelligence of
   machinery will probably be very greatly
   concerned with 'searches' ... ―
 TURING’S 3 APPROACHES TO
MACHINE INTELLIGENCE (1948)



      LOGIC-BASED SEARCH

One approach that Turing identified is a
search through the space of integers
 representing candidate computer
 programs.
    TURING’S 3 APPROACHES
         (CONTINUED)


        CULTURAL SEARCH

A second approach is the "cultural search―
which relies on knowledge and expertise
acquired over a period of years from
others (akin to present-day knowledge-
based systems).
   TURING’S 3 APPROACHES
        (CONTINUED)


GENETICAL OR EVOLUTIONARY
          SEARCH

"There is the genetical or evolutionary
search by which a combination of genes is
looked for, the criterion being the survival
value.―
             TURING (1950)

From Turing’s 1950 paper "Computing
Machinery and Intelligence" …

  ―We cannot expect to find a good child-
   machine at the first attempt. One must
   experiment with teaching one such machine
   and see how well it learns. One can then try
   another and see if it is better or worse.
   There is an obvious connection between this
   process and evolution, by the identifications‖
   TURING (1950) (CONTINUED)


―Structure of the child machine =
Hereditary material

―Changes of the child machine =
Mutations

―Natural selection =
  Judgment of the experimenter‖