# Artificial Intelligence Methods, University of Nottingham Malaysia

```
Artificial Intelligence Methods
G52AIM
University of Nottingham
Malaysia campus

Andrzej Bargiela    2007/2008
Genetic Programming (GP)
Automatic programming
Program synthesis or
Program induction

One of the central challenges of computer science is:
To get a computer to do what needs to be done,
without telling it how to do it.
In essence, this is the beginning of computer programs that program
themselves.

Genetic programming is the application of
evolutionary theory to computer programming.
Introduction:
What is a Computer Program?
   A computer program is an entity that receives inputs, performs
computations, and produces outputs.
    Computer programs can:
   perform basic arithmetic and conditional computations on variables of
various types (including integer, floating-point, and Boolean
variables),
   perform iterations and recursions,
   store intermediate results in memory,
   organize groups of operations into reusable subroutines,
   pass information to subroutines in the form of dummy variables
(formal parameters),
   receive information from subroutines in the form of return values,
and
   organize subroutines and a main program into a hierarchy.

2007/2008                    G52AIM Artificial Intelligence Methods
Introduction: Genetic Programming
(a branch of genetic algorithms)

   Genetic programming addresses this challenge by providing a
method for automatically creating a working computer program
from a high-level statement of the problem.

   Genetic programming is a domain-independent method that
genetically breeds a population of computer programs to solve a
problem.
   Genetic programming iteratively transforms a population of
computer programs into a new generation of programs by
applying analogs of naturally occurring genetic operations.
   The genetic operations include crossover (recombination),
mutation, reproduction, and architecture-altering operations.

Introduction: GP Quick Overview
   Developed: USA in the 1990s
   Early names: J. Koza
   Typically applied to:
      machine learning tasks (prediction, classification…)
   Attributed features:
      competes with neural nets and alike
      needs huge populations (thousands)
      slow
   Special:
      non-linear chromosomes: trees, graphs
      mutation possible but not necessary (disputed!)

GP Technical Summary Tableau

Representation        Tree structures
Recombination         Exchange of subtrees
Mutation              Random change in trees
Parent selection      Fitness proportionate
Survivor selection    Generational replacement

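# Illustrative sketch (Python, not part of the original slides): one way to
# read "Recombination: exchange of subtrees" from the tableau. Trees are
# assumed to be nested tuples of the form (function, child, child); the helper
# names paths, get, put and crossover are hypothetical, chosen for this example.
import random


def paths(tree, prefix=()):
    """Return the path (tuple of child indices) to every node in the tree."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found


def get(tree, path):
    """Return the subtree reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def put(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], subtree),) + tree[i + 1:]


def crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one of b (two offspring)."""
    pa, pb = rng.choice(paths(a)), rng.choice(paths(b))
    return put(a, pa, get(b, pb)), put(b, pb, get(a, pa))


PARENT_A = ("+", ("*", 2, "x"), 3)
PARENT_B = ("-", "y", ("/", "x", 5))
CHILD_A, CHILD_B = crossover(PARENT_A, PARENT_B, random.Random(0))
# Tuples are immutable, so the parents are left untouched, and the two
# offspring together contain exactly as many nodes as the two parents.
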
Starting Point for GP
   A run of genetic programming is a competitive search
among a diverse population of programs composed
of the available functions and terminals
   Genetic programming starts from a high-level
statement of the requirements of a problem and
attempts to produce a computer program that solves
the problem.
   The human user communicates the high-level
statement of the problem to the genetic
programming system by performing five
well-defined preparatory steps.

To Specify the GP Ingredient

5 Preparatory Steps of
Genetic Programming
The human user specifies:

(1) the set of terminals (e.g., the independent variables of the
problem, zero-argument functions, and random constants) for
each branch of the to-be-evolved program,

(2) the set of primitive functions for each branch of the
to-be-evolved program,

(3) the fitness measure (for explicitly or implicitly measuring the
fitness of individuals in the population),

(4) certain parameters for controlling the run, and

(5) the termination criterion and the method for designating the
result of the run.

Function Set & Terminal Set
(The components, or alphabet, from which the programs are built)

   The identification of the function set and terminal set for a
particular problem (or category of problems) is usually a
straightforward process.
For some problems:
   The function set may consist of merely the arithmetic functions
of addition, subtraction, multiplication, and division as well as a
conditional branching operator.
   The terminal set may consist of the program’s external inputs
(independent variables) and numerical constants.
   This function set and terminal set is useful for a wide variety of
problems (and corresponds to the basic operations found in virtually
every general-purpose digital computer).

T&F and Fitness Measure
   The first two preparatory steps (Set of
Functions and Terminals) define the
search space
   whereas the fitness measure implicitly
specifies the search’s desired goal.

Ways for Measuring Fitness
The fitness of a program can be measured in terms of:
1.   the amount of error between its output
and the desired output,
2.   the amount of time (fuel, money, etc.) required to
bring a system to a desired target state,
3.   the accuracy of the program in recognizing patterns
or classifying objects into classes,
4.   the payoff that a game-playing program produces,
or
5.   the compliance of a complex structure (such as an
antenna, circuit, or controller) with user-specified
design criteria.
6.   More…

T&F: Examples

   Arithmetic formula   2·π + ((x + 3) − y / (5 + 1))
   Logical formula      (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))
   Program              i = 1;
                        while (i < 20)
                        {
                            i = i + 1
                        }

Trees are a universal form for representation
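
# Illustrative sketch (Python, not part of the original slides): the
# arithmetic formula 2*pi + ((x + 3) - y / (5 + 1)) written as a parse tree
# and evaluated recursively. The nested-tuple encoding and the names
# FUNCTIONS, TREE, evaluate and ENV are assumptions made for this example.
import math
import operator

# Function set: each symbol maps to a two-argument Python function.
FUNCTIONS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv}

# Internal nodes are (function, left, right); leaves are terminals.
TREE = ("+",
        ("*", 2, "pi"),
        ("-", ("+", "x", 3), ("/", "y", ("+", 5, 1))))


def evaluate(node, env):
    """Evaluate a parse tree given the variable bindings in env."""
    if isinstance(node, tuple):
        symbol, left, right = node
        return FUNCTIONS[symbol](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]  # variable or named constant
    return node           # numeric constant


ENV = {"x": 1.0, "y": 6.0, "pi": math.pi}
RESULT = evaluate(TREE, ENV)  # same value as 2*pi + ((1 + 3) - 6 / 6)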

T&F:
Tree based representation

[Figure: parse tree of the arithmetic formula 2·π + ((x + 3) − y / (5 + 1))]

T&F:
Tree based representation

[Figure: parse tree of the logical formula (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))]

T&F:
Tree based representation

[Figure: parse tree of the program]
i = 1;
while (i < 20)
{
    i = i + 1
}

Credit Scoring: Problem
   Bank wants to distinguish good from bad loan applicants
   Model needed that matches historical data

ID      No of children   Salary   Marital status   OK?
ID-1    2                45000    Married          0
ID-2    0                30000    Single           1
ID-3    1                40000    Divorced         1
…

Credit Scoring:
Rule Generation
A possible model (NOC = number of children, S = salary):
   IF (NOC = 2) AND (S > 80000) THEN good ELSE bad
In general:
   IF formula THEN good ELSE bad
 Only unknown is the right formula, hence:
 Our search space (phenotypes) is the set of formulas
 Natural fitness of a formula: percentage of well classified cases
of the model it stands for
 Natural representation of formulas (genotypes) is: parse trees

T&F: Tree Rep. of a Rule
IF (NOC = 2) AND (S > 80000) THEN good
Tree representation
              AND
            /     \
          =         >
        /   \     /   \
      NOC    2   S    80000
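
# Illustrative sketch (Python, not part of the original slides): the rule
# IF (NOC = 2) AND (S > 80000) THEN good ELSE bad as a parse tree, scored by
# the percentage of correctly classified cases. Assumptions: OK = 1 is read
# as "good", only the three table rows shown on the slide are used, and all
# names (FUNCTIONS, RULE, CASES, evaluate, fitness) are hypothetical.
import operator

FUNCTIONS = {"AND": lambda a, b: a and b, "=": operator.eq, ">": operator.gt}

RULE = ("AND", ("=", "NOC", 2), (">", "S", 80000))

# Rows ID-1, ID-2 and ID-3 of the historical data.
CASES = [
    {"NOC": 2, "S": 45000, "OK": 0},
    {"NOC": 0, "S": 30000, "OK": 1},
    {"NOC": 1, "S": 40000, "OK": 1},
]


def evaluate(node, case):
    """Evaluate a rule tree against one row of the data."""
    if isinstance(node, tuple):
        symbol, left, right = node
        return FUNCTIONS[symbol](evaluate(left, case), evaluate(right, case))
    if isinstance(node, str):
        return case[node]  # attribute of the applicant
    return node            # numeric constant


def fitness(rule, cases):
    """Percentage of cases where the rule's verdict matches the OK column."""
    hits = sum(int(bool(evaluate(rule, c))) == c["OK"] for c in cases)
    return 100.0 * hits / len(cases)


SCORE = fitness(RULE, CASES)
# On these three rows the rule labels every applicant "bad", so it matches
# only ID-1; a GP run would search for formulas with a higher score.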

GP for Specific Problems
For many other problems, the
ingredients include
specialized functions and
terminals
such as:

Programming a Mopping Robot
   If the goal is to get genetic programming to
automatically program a robot to mop the
entire floor of an obstacle-laden room,
the human user must tell genetic
programming what the robot is capable of
doing.
   For example, the robot may be capable of executing
functions such as
moving, turning, and swishing the mop.

Synthesizing an Analog Electrical Circuit
   The function set may enable genetic
programming to construct circuits from
components such as transistors, capacitors,
and resistors.
   Once the human user has identified the
primitive ingredients for a problem of circuit
synthesis, the function set and terminal set
can be used to automatically synthesize an
amplifier, computational circuit, active filter,
voltage reference circuit, or any other circuit
composed of these ingredients.

Synthesizing an Amplifier
   If the goal is to get genetic programming to
automatically synthesize an amplifier, then
the fitness function is the mechanism for
telling genetic programming to synthesize a
circuit that amplifies an incoming signal (as
opposed to, say, a circuit that suppresses the
low frequencies of an incoming signal or a
circuit that computes the square root of the
incoming signal).

Example: Driving a Car
   Genetic programming works well for problems that have no ideal solution,
such as a program that drives a car.
   There is no one solution to driving a car.
   Some solutions drive safely at the expense of time, while others drive fast at a
high safety risk.
   Therefore, driving a car consists of making compromises of speed versus safety,
as well as many other variables.
   In this case genetic programming will find a solution that attempts to
compromise and be the most efficient solution given a large list of variables.
   The program will find one solution for a smooth concrete highway, while it will
find a totally different solution for a rough unpaved road.

(The fourth preparatory step)

   The fourth preparatory step entails specifying the
control parameters for the run.
   The most important control parameter is the
population size.

   Other control parameters include the probabilities of
performing the genetic operations, the maximum size
for programs, and other details of the run.

Initialisation
   Maximum initial depth of trees Dmax is set
   Full method (each branch has depth = Dmax):
      nodes at depth d < Dmax randomly chosen from function set F
      nodes at depth d = Dmax randomly chosen from terminal set T
   Grow method (each branch has depth ≤ Dmax):
      nodes at depth d < Dmax randomly chosen from F ∪ T
      nodes at depth d = Dmax randomly chosen from T
   Common GP initialisation: ramped half-and-half, where
grow & full method each deliver half of initial population
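
# Illustrative sketch (Python, not part of the original slides): the full and
# grow methods, and an initial population in which each method delivers half
# of the trees. The function set F, the terminal set T and every name below
# are assumptions chosen for this example; the root is taken to be at depth 0.
import random

F = ["+", "-", "*"]   # binary functions
T = ["x", "y", 1, 2]  # terminals


def generate(method, d_max, rng, depth=0):
    """Build one random tree; method is "full" or "grow"."""
    if depth == d_max:
        return rng.choice(T)        # depth d = Dmax: terminal only
    if method == "full":
        symbol = rng.choice(F)      # depth d < Dmax: function only
    else:
        symbol = rng.choice(F + T)  # depth d < Dmax: from F union T
    if symbol in T:
        return symbol               # grow may stop a branch early
    return (symbol,
            generate(method, d_max, rng, depth + 1),
            generate(method, d_max, rng, depth + 1))


def branch_depths(tree, depth=0):
    """Return the depth of every leaf, i.e. the length of every branch."""
    if not isinstance(tree, tuple):
        return [depth]
    return [d for child in tree[1:] for d in branch_depths(child, depth + 1)]


def half_and_half(size, d_max, rng):
    """Alternate between full and grow so that each supplies half the trees."""
    return [generate("grow" if i % 2 else "full", d_max, rng)
            for i in range(size)]


POPULATION = half_and_half(10, 3, random.Random(42))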

Termination
(The fifth preparatory step)

   The fifth preparatory step consists of specifying the termination
criterion and the method of designating the result of the run.
   The termination criterion may include a maximum
number of generations to be run as well as a
problem-specific success predicate.
   In practice, one may manually monitor and manually
terminate the run when the values of fitness for
numerous successive best-of-generation individuals
appear to have reached a plateau.
   The single best-so-far individual is then harvested
and designated as the result of the run.

4 Steps for Running GP
(to solve a problem)

1) Generate an initial population of random compositions of the
functions and terminals of the problem (computer programs).
2) Execute each program in the population and assign it a fitness value
according to how well it solves the problem.
3) Create a new population of computer programs.
   i) Copy the best existing programs.
   ii) Create new computer programs by mutation.
   iii) Create new computer programs by crossover (sexual reproduction).
4) The best computer program that appeared in any generation, the
best-so-far solution, is designated as the result of genetic programming
[Koza 1992].
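
# Illustrative sketch (Python, not part of the original slides): the four
# steps above as a minimal loop. Assumptions made for this example only: the
# task is fitting the target x*x + x on five points, fitness is total
# absolute error (0 is best), parents are drawn with probability
# proportional to 1 / (1 + error), only the single best program is copied,
# and offspring above MAX_NODES are replaced by their first parent.
import random

F = {"+": lambda a, b: a + b,
     "-": lambda a, b: a - b,
     "*": lambda a, b: a * b}
T = ["x", 1.0]
CASES = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
MAX_NODES = 60


def grow(rng, depth):
    """Random tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(T)
    return (rng.choice(list(F)), grow(rng, depth - 1), grow(rng, depth - 1))


def run(tree, x):
    """Execute a program (parse tree) on input x."""
    if isinstance(tree, tuple):
        return F[tree[0]](run(tree[1], x), run(tree[2], x))
    return x if tree == "x" else tree


def error(tree):
    """Step 2: how well the program solves the problem."""
    return sum(abs(run(tree, x) - y) for x, y in CASES)


def paths(tree, prefix=()):
    """Paths (tuples of child indices) to every node of the tree."""
    out = [prefix]
    if isinstance(tree, tuple):
        for i in (1, 2):
            out += paths(tree[i], prefix + (i,))
    return out


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def put(tree, path, new):
    """Copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]


def evolve(rng, size=100, generations=20, p_mutation=0.1):
    population = [grow(rng, 4) for _ in range(size)]            # step 1
    best = min(population, key=error)
    for _ in range(generations):
        weights = [1.0 / (1.0 + error(t)) for t in population]  # step 2
        offspring = [best]                                      # step 3 i
        while len(offspring) < size:
            a, b = rng.choices(population, weights=weights, k=2)
            if rng.random() < p_mutation:
                donor = grow(rng, 2)                            # step 3 ii
            else:
                donor = get(b, rng.choice(paths(b)))            # step 3 iii
            child = put(a, rng.choice(paths(a)), donor)
            offspring.append(child if len(paths(child)) <= MAX_NODES else a)
        population = offspring
        best = min(population + [best], key=error)              # step 4
    return best


BEST = evolve(random.Random(0))
# BEST is the best-so-far program; its error can be no worse than that of the
# best program in the initial random population.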

Running Genetic Programming
   After the human user has performed the preparatory
steps for a problem, the run of genetic programming
can be launched.
   Once the run is launched, a series of well-defined,
problem-independent executional steps (that is, the
flowchart of genetic programming) is executed.
   Important Note: Genetic programming is problem-
independent in the sense that the flowchart
specifying the basic sequence of executional steps is
not modified for each new run or each new problem.

Flowchart (Executional Steps) of
Genetic Programming

    There is usually no discretionary human intervention
or interaction during a run of genetic programming
(although a human user may exercise judgment as
to whether to terminate a run).
    The flowchart shows the genetic operations of
1.   crossover (the flowchart uses the two-offspring version of the crossover operation),
2.   reproduction,
3.   mutation, and
4.   the architecture-altering operations.

The figure below is a flowchart showing the executional steps of a run of genetic programming.

Acknowledgements

Most of the lecture slides