
Genetic Programming Tutorial

Byoung-Tak Zhang
Artificial Intelligence Lab (SCAI)
Dept. of Computer Engineering
Seoul National University
btzhang@scai.snu.ac.kr

This material is available online at http://scai.snu.ac.kr/~btzhang/

The Second Asia Pacific Conference on Simulated Evolution and Learning (SEAL98)
Canberra, Australia
Tuesday, 24 November 1998, 2:00 - 5:30 PM

Outline
• Introduction
  - Background on Evolutionary Algorithms (EAs)
• Genetic Programming (GP)
  - Representation, Genetic Operators, Running a GP
• GP Applications
  - AI, Alife, Engineering, Science
• Advanced Topics
  - Variants of Genetic Programming
  - Techniques for Enhancing GP Performance
• Guidelines
  - Promising Application Areas
  - Research Issues
• Further Information on GP

Introduction

Evolutionary Algorithms (EAs)
• A computational model inspired by natural evolution and genetics
• Proved useful for search, machine learning and optimization
• Population-based search (vs. point-based search)
• Probabilistic search (vs. deterministic search)
• Collective learning (vs. individual learning)
• Balance of exploration (global search) and exploitation (local search)

Analogy to Evolutionary Biology
• Individual (Chromosome) = Possible solution
• Population = A collection of possible solutions
• Fitness = Goodness of solutions
• Selection (Reproduction) = Survival of the fittest
• Crossover = Recombination of partial solutions
• Mutation = Alteration of an existing solution

Simulated Evolution
[Figure: the simulated-evolution cycle. Labels: population (chromosomes), decoded strings, evaluation (fitness), selection, reproduction (mating pool), parents, mating, genetic operators (manipulation), offspring, new generation.]

Canonical Evolutionary Algorithm

    begin
        t = 0                              /* generation */
        initialize P(t)                    /* population */
        evaluate P(t)
        while (not termination-condition) do
        begin
            t = t + 1
            select P(t) from P(t-1)        /* selection */
            crossover-mutate P(t)          /* genetic operators */
            evaluate P(t)                  /* fitness function */
        end
    end

Variants of Evolutionary Algorithms
• Evolutionary Programming (EP)
  - Fogel et al., 1960's
  - FSMs, mutation only, tournament selection
• Evolution Strategy (ES)
  - Rechenberg and Schwefel, 1960's
  - Real values, mainly mutation, ranking selection
• Genetic Algorithm (GA)
  - Holland et al., 1970's
  - Bitstrings, mainly crossover, proportionate selection
• Genetic Programming (GP)
  - Koza, 1992
  - Trees, mainly crossover, proportionate selection
• Others

Genetic Operators for Bitstring Chromosomes
• Reproduction: make copies of a chromosome (the fitter the chromosome, the more copies)
    10000100  ->  10000100
                  10000100
• Crossover: exchange subparts of two chromosomes
    100|00100  ->  10011111
    111|11111      11100100
• Mutation: randomly flip some bits
    00000100  ->  00000000

Selection
  1. Create random initial population
  2. Evaluate population
  3. Select individuals for variation
  4. Vary
  5. Insert into population, then return to step 2

Selection Schemes
• Proportionate selection
  - Reproduce offspring in proportion to fitness f_i.
• Ranking selection
  - Select individuals according to rank(f_i).
• Tournament selection
  - Choose q individuals at random, the best of which survives.
• Generational vs. steady-state
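
The canonical loop, the bitstring operators, and tournament selection described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the tutorial: the OneMax objective (count the 1-bits), the parameter values, and all function names are assumptions chosen to keep the example small.

```python
import random

def fitness(bits):
    # Assumed toy objective (OneMax): the number of 1-bits in the chromosome.
    return bits.count("1")

def tournament(pop, q, rng):
    # Tournament selection: choose q individuals at random, the best survives.
    return max(rng.sample(pop, q), key=fitness)

def one_point_crossover(a, b, point):
    # Exchange the subparts of two chromosomes that follow the crossover point.
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, pm, rng):
    # Flip each bit independently with probability pm.
    flip = {"0": "1", "1": "0"}
    return "".join(flip[c] if rng.random() < pm else c for c in bits)

def evolve(n_bits=16, pop_size=30, generations=40, pc=0.7, pm=0.02, q=3, seed=0):
    rng = random.Random(seed)
    # initialize P(0)
    pop = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            # select P(t) from P(t-1)
            a, b = tournament(pop, q, rng), tournament(pop, q, rng)
            # crossover-mutate P(t)
            if rng.random() < pc:
                a, b = one_point_crossover(a, b, rng.randrange(1, n_bits))
            nxt += [mutate(a, pm, rng), mutate(b, pm, rng)]
        pop = nxt[:pop_size]  # generational replacement
    return max(pop, key=fitness)
```

Calling `one_point_crossover("10000100", "11111111", 3)` reproduces the crossover example on the slide. Swapping `tournament` for a rank-based or fitness-proportionate chooser changes the selection scheme without touching the rest of the loop.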

Theory of Bitstring EAs
• Assumptions
  - Bitstrings of fixed size
  - Proportionate selection
• Definitions
  - Schema H: a set of substrings (e.g., H = 1**0)
  - Order o: number of fixed positions (FP) (e.g., o(H) = 2)
  - Defining length d: distance between the leftmost FP and the rightmost FP (e.g., d(H) = 3)

Schema Theorem [Holland, 1975]

$$ m(H, t+1) \;\ge\; m(H, t)\,\frac{f(H, t)}{\bar{f}(t)}\left[1 - p_c\,\frac{d(H)}{n-1}\right](1 - p_m)^{o(H)} $$

  m(H, t)     Number of members of H at generation t
  f(H, t)     Average fitness of the members of H at generation t
  \bar{f}(t)  Average fitness of the population at generation t
  p_c, p_m    Probability of crossover and mutation, respectively
  n           Length of the bitstring

• Interpretation: fit, short, low-order schemata (or building blocks) grow exponentially.

Some Applications of EAs
• Optimization (e.g., numerical optimization, VLSI circuit design, gas turbine design, factory scheduling)
• Automatic Programming (e.g., automatic induction of LISP programs, evolving optimal sorting algorithms)
• Complex Data Analysis and Time-Series Prediction (e.g., prediction of "chaotic" technical systems, financial market prediction, protein-structure analysis)
• Machine and Robot Learning (e.g., rule induction for expert systems, evolutionary learning of neural networks, cooperation of multiple mobile agents, robot navigation)

Genetic Programming (GP)

GP Trees
• Genetic programming uses variable-size tree representations rather than fixed-length strings of binary values.
• Program tree = S-expression = LISP parse tree
• Tree = Functions (Nonterminals) + Terminals

GP Tree: An Example
  S-expression: (+ 1 2 (IF (> TIME 10) 3 4))
  Terminals = {1, 2, 3, 4, 10, TIME}
  Functions = {+, >, IF}

            +
          / | \
         1  2  IF
              / | \
             >  3  4
            / \
        TIME   10

A GP Tree for Kepler's Law
• GP-tree representation of Kepler's third law: P^2 = cA^3

    PROGRAM ORBITAL_PERIOD
    C # Mars #
      A = 1.52
      P = SQRT(A * A * A)
      PRINT P
    END ORBITAL_PERIOD

    (defun orbital_period ()
      ; Mars ;
      (setf A 1.52)
      (sqrt (* A (* A A))))

        SQRT
         |
         *
        / \
       A   *
          / \
         A   A

GP as Automatic Programming
• GP evolves a program for solving a class of problem instances. The solution found by GP is a program that solves many problem instances.
• GP is an automatic programming method.
[Figure: fitness cases { Xi -> Yi } are given to genetic programming, which produces a GP program; the GP program maps a problem instance Xi to a solution Yi.]

Genetic Programming Procedure
  1. Choose a set of possible functions and terminals for the program: F = {+, -, *, /, √}, T = {A}.
  2. Generate an initial population of random trees (programs) using the set of possible functions and terminals.
  3. Calculate the fitness of each program in the population by running it on a set of "fitness cases" (a set of inputs for which the correct output is known).
  4. Apply selection, crossover, and mutation to the population to form a new population.
  5. Repeat steps 3 and 4 for some number of generations.

Setting Up for a GP Run
  1. The set of terminals
  2. The set of functions
  3. The fitness measure
  4. The algorithm parameters
     - population size, maximum number of generations
     - crossover rate and mutation rate
     - maximum depth of GP trees, etc.
  5. The method for designating a result and the criterion for terminating a run.

Crossover: Subtree Exchange
[Figure: example of subtree crossover between two parent trees built from the functions +, -, ×, /, √ and the terminals a, b.]

Mutation
[Figure: example of mutation on a tree built from the same functions and terminals.]

Example GP Run: Majority
• Problem: Given five binary inputs x1, x2, …, x5, return y = 1 if three or more of the xi are 1 and y = 0 otherwise.
• Fitness cases given (20 out of 32):

    x1 x2 x3 x4 x5 ==> y
    --------------------
    0  0  0  0  0  ==> 0
    0  0  0  0  1  ==> 0
    0  0  1  0  1  ==> 0
    0  0  1  1  0  ==> 0
    0  0  1  1  1  ==> 1
    …
    1  1  0  0  0  ==> 0
    1  1  0  0  1  ==> 1
    1  1  1  0  1  ==> 1

Majority: Best Program at Generation 0
  (OR (AND (OR x5 (OR x1 x1)) x3) x4)
  Training error = 4/20
  Generalization error = 8/32

Majority: Best Program at Generation 1
  (OR x2 (AND x4 x1))
  Training error = 3/20
  Generalization error = 8/32

Majority: Best Program at Generation 11
  (OR (AND x4 x1) (AND x2 x3))
  Training error = 2/20
  Generalization error = 6/32

Majority: Best Program at Generation 17
  (OR (AND x3 x2) (AND x4 (OR x1 x5)))
  Training error = 1/20
  Generalization error = 5/32

Majority: Evolution of Fitness Values
[Figure: fitness values plotted over the generations of the run.]

Genetic Programming Applications

A List of GP Applications
• Genetic programming has been applied to a wide range of problems in artificial intelligence, artificial life, engineering, and science, including the following:
  - Symbolic Regression
  - Multi-Agent Strategies
  - Simulated Robotic Soccer
  - Time Series Prediction
  - Circuit Design
  - Evolving Neural Networks

Symbolic Regression [Koza, 1998]
• Given: a set of N data points

$$ D = \{ (x_i, y_i) \mid i = 1, \ldots, N \} $$

  Find: a symbolic expression of the function f that minimizes the error measure

$$ E_f(D) = \sum_{i=1}^{N} \left( y_i - f(x_i) \right)^2 $$

• Useful for system identification, model building, empirical discovery, data mining, and time series prediction.

Symbolic Regression: Fitness Cases

    Independent Variable X    Dependent Variable Y
    -1.0                       0.0
    -0.9                      -0.1629
    -0.8                      -0.2624
    -0.7                      -0.3129
    -0.6                      -0.3264
    -0.5                      -0.3125
    -0.4                      -0.2784
    -0.3                      -0.2289
    -0.2                      -0.1664
    -0.1                      -0.0909
     0                         0.0
     0.1                       0.1111
     0.2                       0.2494
     0.3                       0.4251
     0.4                       0.6496
     0.5                       0.9375
     0.6                       1.3056
     0.7                       1.7731
     0.8                       2.3616
     0.9                       3.0951
     1.0                       4.0000

Symbolic Regression: Experimental Setup

    Objective:             Find a function of one independent variable, in symbolic form, that fits a given sample of 21 (x_i, y_i) data points.
    Terminal set:          x (the independent variable).
    Function set:          +, -, *, %, SIN, COS, EXP, RLOG
    Fitness cases:         The given sample of 21 data points (x_i, y_i), where the x_i come from the interval [-1, +1].
    Raw fitness:           The sum, taken over the 21 fitness cases, of the absolute value of the difference between the value of the dependent variable produced by the individual program and the target value y_i of the dependent variable.
    Standardized fitness:  Equals raw fitness.
    Hits:                  Number of fitness cases (0-21) for which the value of the dependent variable produced by the individual program comes within 0.01 of the target value y_i of the dependent variable.
    Wrapper:               None.
    Parameters:            M = 500, G = 51
    Success predicate:     An individual program scores 21 hits.

Symbolic Regression: Generation 0
• Median individual with raw fitness of 23.67:
    (COS (COS (+ (- (* x x) (% x x)) x)))

Symbolic Regression: Generation 34
• Best-of-run individual with raw fitness of 0.00 (100% correct):
    (+ x (* (+ x (* (* (+ x (- (COS (- x x)) (- x x))) x) x)) x))
• Equivalent to x^4 + x^3 + x^2 + x

Symbolic Regression: Observations
• GP works on this problem.
• The answer is algebraically correct (hence no further cross validation is needed).
• It's not how a human programmer would have written it.
  - Not parsimonious
• The extraneous functions (SIN, EXP, RLOG, and, effectively, COS) are all absent in the best individual of generation 34.
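
To make the fitness measure concrete, the sketch below evaluates the generation-34 S-expression on the 21 fitness cases and computes raw fitness and hits as defined in the experimental setup. It is an illustration under stated assumptions: the nested-tuple tree encoding and the function names are hypothetical, the target values are recomputed from x^4 + x^3 + x^2 + x rather than read from the table, and `%` is taken to be protected division that returns 1 when the divisor is 0.

```python
import math

def pdiv(a, b):
    # Protected division (%): assumed to return 1 when the divisor is 0.
    return 1.0 if b == 0 else a / b

FUNCS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": pdiv,
    "SIN": math.sin,
    "COS": math.cos,
}

def evaluate(tree, x):
    # A tree is the terminal "x", a numeric constant, or a tuple (function, args...).
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, *args = tree
    return FUNCS[op](*(evaluate(arg, x) for arg in args))

# (+ x (* (+ x (* (* (+ x (- (COS (- x x)) (- x x))) x) x)) x))
INNER = ("+", "x", ("-", ("COS", ("-", "x", "x")), ("-", "x", "x")))
BEST = ("+", "x", ("*", ("+", "x", ("*", ("*", INNER, "x"), "x")), "x"))

def target(x):
    return x**4 + x**3 + x**2 + x

XS = [i / 10 for i in range(-10, 11)]  # 21 points in [-1, +1]

def raw_fitness(tree):
    # Sum of absolute errors over the fitness cases.
    return sum(abs(evaluate(tree, x) - target(x)) for x in XS)

def hits(tree):
    # Number of fitness cases where the output comes within 0.01 of the target.
    return sum(abs(evaluate(tree, x) - target(x)) <= 0.01 for x in XS)
```

Here `hits(BEST)` counts all 21 cases, because COS is only ever applied to x - x = 0 and the whole expression collapses to the target polynomial.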

Multi-Agent Strategies [Bennett III, 1996]
• The Foraging Problem
  - 32×32 grid for the ant colony world
  - Two food locations with 72 food pellets (black)
  - The nine grid locations of the nest (gray)
• Objective
  - Find a multi-agent parallel algorithm that causes efficient central-place foraging behavior in the ant colony.

Multi-Agent Strategies: Fitness Function

$$ \frac{\sum_{n} \left( t_{food} \cdot f_{food} \right) + \sum_{m} \left( t_{max} \cdot f_{max} \cdot d_{food} \right)}{1{,}000{,}000} $$

  n       = Number of food pellets transported to the nest
  t_food  = Number of time steps elapsed when the food pellet arrived at the nest
  f_food  = Number of sequential IF functions executed by the ant who transported the food pellet
  m       = Number of food pellets not transported to the nest
  t_max   = Maximum allotted time step = 4,000
  f_max   = Maximum possible value of f_food = 400,000
  d_food  = Manhattan distance between food pellet and nest
  p_max   = Maximum number of points per agent = 100

Multi-Agent Strategies: Experimental Setup
• Function set
  - IF_FOOD_HERE, IF_FOOD_FORWARD, IF_CARRYING_FOOD, IF_NEST_HERE, IF_FACING_NEST, IF_SMELL_FOOD, IF_SMELL_PHEROMONE, IF_PHEROMONE_FORWARD
• Terminal set
  - MOVE_FORWARD, TURN_RIGHT, TURN_LEFT, MOVE_RANDOM, GRAB_FOOD, UNCONDITIONAL_DROP_PHEROMONE, NO_ACTION
• Parameters
  - Population size: M = 64,000
  - Maximum number of generations: G = 100

Multi-Agent Strategies: Results
• The best individual of the run appeared in generation 90, had a fitness value of 7.4, and scored 144 hits.
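
The foraging fitness function can be written out as a short calculation. This sketch rests on one reading of the slide: the two sums are added and the total is divided by 1,000,000, which is an assumption about how the formula was laid out. The function name and the input format are hypothetical; t_max and f_max are the values stated on the slide.

```python
T_MAX = 4_000        # maximum allotted time step (from the slide)
F_MAX = 400_000      # maximum possible value of f_food (from the slide)
SCALE = 1_000_000    # assumed divisor of the summed terms

def foraging_fitness(delivered, undelivered):
    """Fitness of one colony run.

    delivered:   list of (t_food, f_food) pairs, one per pellet brought to the nest
    undelivered: list of d_food values (Manhattan distance to the nest),
                 one per pellet that was not brought to the nest
    """
    transported = sum(t_food * f_food for t_food, f_food in delivered)
    penalty = sum(T_MAX * F_MAX * d_food for d_food in undelivered)
    return (transported + penalty) / SCALE
```

Under this reading a single undelivered pellet one step from the nest costs 1,600, far more than any delivered pellet can, so a run that delivers every pellet quickly and with few IF evaluations ends up with a small value.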

Simulated Robotic Soccer [Cho and Zhang, 1998]
• Environment for Dash-and-Dribble Behavior
  - 22×14 grid soccer field
  - a ball and a target position
  - 4 offensive robots (moving in 8 directions)
  - 11 opponent robots (obstacles)

Robot Soccer: Fitness Function
• For dashing behavior to the ball

$$ f_1 = \sum_{r=1}^{4} \left\{ c_1 \max(X_r, Y_r) + c_2 S_r + c_3 C_r - c_4 M_r + K \right\} $$

• For dribbling behavior to the target position

$$ f_2 = \sum_{r=1}^{4} \left\{ c_1 \max(X_r, Y_r) + c_2 S_r + c_3 C_r - c_4 M_r + c_5 A_r + K \right\} $$

    Symbol   Description
    X_r      x-axis distance between target and robot r
    Y_r      y-axis distance between target and robot r
    S_r      number of steps moved by robot r
    C_r      number of collisions made by robot r
    M_r      distance between starting and final position of robot r
    A_r      penalty for moving away from other robots
    c_i      coefficient for factor i
    K        positive constant

Robot Soccer: Experimental Setup

    Parameter          Value
    Terminal set       FORWARD, AVOID, RANDOM-MOVE, STOP, TURN-TARGET, TURN-BALL
    Function set       IF-BALL, IF-ROBOT, IF-TARGET, IF-OPPONENT, PROG2, PROG3
    Fitness cases      20 training worlds, 20 test worlds
    Robot world        32 by 32 grid, 64 obstacles, 1 ball to dribble
    Population size    100
    Max generation     200
    Crossover rate     1.0
    Mutation rate      0.1
    Max tree depth     10
    Selection scheme   truncation selection with elitism

Robot Soccer: Cooperative Behaviors of Robots
[Figure: robot trajectories for a training case and a test case.]
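
The two robot-soccer fitness functions differ only in the extra term c5 * A_r, which a small sketch makes visible. The tutorial does not give the coefficient values or K, so the defaults below (all coefficients 1, K = 0) are placeholders; the `RobotStats` record and the function names are also hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RobotStats:
    x_dist: float        # X_r: x-axis distance between target and robot r
    y_dist: float        # Y_r: y-axis distance between target and robot r
    steps: float         # S_r: number of steps moved by robot r
    collisions: float    # C_r: number of collisions made by robot r
    moved: float         # M_r: distance between starting and final position
    away: float = 0.0    # A_r: penalty for moving away from other robots

def dash_fitness(robots, c=(1.0, 1.0, 1.0, 1.0), K=0.0):
    # f1: sum over robots of c1*max(X,Y) + c2*S + c3*C - c4*M + K
    c1, c2, c3, c4 = c
    return sum(
        c1 * max(r.x_dist, r.y_dist) + c2 * r.steps + c3 * r.collisions
        - c4 * r.moved + K
        for r in robots
    )

def dribble_fitness(robots, c=(1.0, 1.0, 1.0, 1.0, 1.0), K=0.0):
    # f2: the same terms as f1 plus c5*A for each robot
    c5 = c[4]
    return dash_fitness(robots, c[:4], K) + sum(c5 * r.away for r in robots)
```

With four identical robots at distances (3, 5), 10 steps, 2 collisions, a displacement of 4 and an away-penalty of 1, the placeholder coefficients give f1 = 52 and f2 = 56.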

Time Series Prediction [Oakley, 1996]
• Given: τ previous values in a time series

$$ \mathbf{x}(t) = \left( x(t), x(t-1), \ldots, x(t-\tau) \right) $$

  Find: a function f which predicts the next value of the series

$$ x(t+1) = f(\mathbf{x}(t)) = f\left( x(t), x(t-1), \ldots, x(t-\tau) \right) $$

• Examples: logistic map, sun-spots, stock price index, currency exchange rate

Time Series Prediction: Example
• The Mackey-Glass delay differential series

$$ \frac{dx_t}{dt} = \frac{b\,x_{t-\Delta}}{1 + x_{t-\Delta}^{\,c}} - a\,x_t $$

  a = 0.1, b = 0.2, c = 10.0, and Δ = 30.0

Time Series Prediction: Experimental Setup

    Objective                                     Predict next 65 points at 5 places in series
    Terminal set                                  Embedded data at t = 1, 2, 3, 4, 5, 6, 11, 16, 21, 31; R
    Function set                                  +, -, %, *
    Fitness cases                                 Actual members of the data series
    Raw fitness                                   Sum over the 325 fitness cases of squared error between predicted and actual points
    Standardized fitness                          Same as raw fitness
    Hits                                          Predicted and actual points are within 0.001 of each other
    Wrapper                                       None
    Parameters                                    M = 500, G = 51
    Success predicate                             None
    Max. depth of new individuals                 6
    Max. depth of new subtrees for mutants        4
    Max. depth of individuals after crossover     17
    Fitness-proportionate reproduction fraction   0.1
    Crossover at any point fraction               0.2
    Crossover at function points fraction         0.7
    Selection method                              Fitness-proportionate (by normalized fitness)
    Generation method                             Ramped half-and-half

Time Series Prediction: Results
[Figure: frequency distribution of generations at which the fittest S-expression was found.]

Summary of the fittest S-expressions

    Series                        Mackey-Glass
    Number                        50
    Mean generations to fittest   13.38
    Std. dev. of generation       16.46
    Median generations            5
    No. of generations ≥ 25       10
    No. of generations ≥ 40       9
    Mean best fitness             10.22
    Std. dev. of best fitness     3.371
    Median best fitness           10.51
    No. of duplicate fitnesses    3
    Overall best fitness          3.851
    Typical linear fitness        11.44
    Mean left parentheses         9.120
    Std. dev. of left parens.     17.64
    Median left parens.           3
    No. left parens ≥ 10          8
    No. left parens ≥ 20          6

Circuit Design [Koza et al., 1997]

Circuit Design: Functions
• Component-creating functions
  - Resistor R, capacitor C, inductor L
  - Diode D, transistor QT0, logical AND0 function
• Connection-creating functions
  - SERIES division function
  - PSS and PSL parallel division functions
  - STAR1 division function
  - VIA0 function

Circuit Design: Fitness Evaluation
  Program tree + embryonic circuit (IN, z0, OUT)
    -> Fully designed circuit (NetGraph)
    -> Circuit netlist (ascii)
    -> Circuit simulator (SPICE)
    -> Circuit behavior (output)
    -> Fitness

Evolving Neural Networks [Zhang et al., 1993, 1995]
• Genetic operators are used to adapt
  - Connection weights
  - Network topology
  - Network size
  - Neuron types
  using the neural tree representation scheme.

Evolving Neural Networks: Method
  1. Generate M networks.
  2. Evaluate fitness of nets (fitness function).
  3. Acceptable net found? If yes, STOP.
  4. If no, select fitter networks (selection strategy).
  5. Create M new networks (genetic operators) and return to step 2.

Evolving Neural Networks: Neural Tree Representation
• Neural trees are used as the genotype for the evolution of neural networks.
• Nonterminal nodes: neural units
• Terminal nodes: input units
• Root node: output unit
• Links: connection weights w_ij from j to i
• Layer of node i: path length of the longest path to a terminal node of the subtrees of i.

Evolving Neural Networks: A Neural Tree
[Figure: a neural tree whose terminal nodes are drawn from the inputs x1, …, x7.]

Evolving Neural Networks: Features of Neural Trees
• Expressiveness: arbitrary feedforward networks of heterogeneous neurons can be represented by neural trees.
• Parsimony: sparse networks with partial connectivity
• En/decoding: genotype and phenotype equivalent in functionality
• Examples: sigma-pi neural networks.

Evolving Neural Trees: Structural Adaptation by Crossover
• Neuron type, topology, size and shape of networks are adapted by crossover.

Evolutionary Neural Trees: Results for Mackey-Glass Time Series

$$ \frac{dx(t)}{dt} = \frac{a\,x(t-\Delta)}{1 + x^{10}(t-\Delta)} - b\,x(t) $$

$$ x(t+10) = f\left( x(t), x(t+1), \ldots, x(t+9) \right) $$

Evolutionary Neural Trees: Results for Mackey-Glass Data
[Figure: results for the Mackey-Glass data.]

Evolutionary Neural Trees: Neural Trees Evolved
[Figure: neural trees evolved.]

Evolutionary Neural Trees: Results for Far-Infrared NH3 Laser
[Figure: results for the far-infrared NH3 laser data.]

Evolutionary Neural Trees: Comparison to Back-Propagation Networks

    Method              Hidden Units   Num. Weights   Training Error   Prediction Error
    Neural trees        30             153            0.52             0.58
    Backpropagation 1   100            601            0.53             0.56
    Backpropagation 2   300            1801           0.69             0.84

Evolutionary Neural Trees: Performance for Test Data
[Figure: target function, computed approximation, and the difference between true values and predicted values for the test data.]

Advanced Topics

Variants of Genetic Programming
• Stack-based GP
• Strongly-typed GP
• Linear GP
• Ontogenetic GP
• Cellular GP
• Breeder GP

Stack-Based GP [Perkis, 1994]
• Expressions in trees can be rewritten in postfix notation.
    Y COS 5 * X X Y - / +
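
A postfix program such as `Y COS 5 * X X Y - / +` can be executed with a single stack, which is the idea behind stack-based GP. The sketch below is a hypothetical illustration rather than Perkis's implementation: the function names are invented, division is assumed to be protected, and an operator that finds too few arguments on the stack is assumed to be skipped so that every token sequence is executable.

```python
import math

def pdiv(a, b):
    # Assumed protected division: return 1 when the divisor is 0.
    return 1.0 if b == 0 else a / b

BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": pdiv,
}
UNARY = {"COS": math.cos}

def run_postfix(program, env):
    """Evaluate a whitespace-separated postfix program with variables from env."""
    stack = []
    for tok in program.split():
        if tok in BINARY:
            if len(stack) >= 2:          # assumed rule: skip if arguments are missing
                b = stack.pop()
                a = stack.pop()
                stack.append(BINARY[tok](a, b))
        elif tok in UNARY:
            if stack:
                stack.append(UNARY[tok](stack.pop()))
        elif tok in env:
            stack.append(env[tok])       # variable
        else:
            stack.append(float(tok))     # numeric constant
    return stack[-1] if stack else None
```

For X = 3 and Y = 1 the example program computes 5·cos(Y) + X / (X - Y). Because any sequence of tokens runs without error under the skip rule, crossover can cut and splice programs at arbitrary positions.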

Strongly-Typed GP [Montana, 1995]
• STGP = Strongly Typed Genetic Programming
• Motivation
  - Don't create and evaluate trees that are syntactically illegal (or at least silly) with respect to the data.
  - Provide a good way to specify constraints from the input space.
• STGP only really makes sense if the input data is typed.
• Mutation and crossover must now respect the type constraints.
• Generic functions: argument types determine the return type.
• Generic data-types: e.g. "List-of-?" where "?" is instantiated at runtime.

Linear GP [Nordin and Banzhaf, 1993]
[Figure: a tree-based genome next to a linear genome. The linear genome is a sequence of bit-coded instructions such as a=a+x, b=a+c, c=b+6, a=b+7.]

Linear GP: Binary GP & Compiling GP
[Figure: in binary GP, bit strings are turned into high-level language constructs by a genotype-phenotype mapping and then compiled; in compiling GP, the bit strings are themselves the executable code.]

Linear GP: Crossover in CGP
[Figure: two parent programs, framed by save ... restore, return and built from instructions such as a=0, b=0, a=a+b, c=a+7, d=c*b, c=b-a, a=b+d, exchange instruction segments to produce two children.]

Linear GP: CGP Crossover in Bitstrings
• Crossover between instructions
  - The crossover point falls on the boundary between two 32-bit instructions.
• Crossover within instructions
  - The crossover point falls inside a 32-bit instruction (op-code, operand 1, operand 2); protected fields are left intact.

Linear GP: Mutation in CGP
• Mutation in operands
  - Point mutation inside an operand field of a 32-bit instruction (op-code, operand 1, operand 2). Is the register address allowed? Is the constant value allowed?
• Mutation in op-code
  - Point mutation inside the op-code field. Is the op-code allowed?
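
The linear-genome idea can be illustrated with register instructions held in a list. The sketch below is a simplified stand-in for compiling GP: it interprets tuples instead of manipulating 32-bit machine instructions, and the register names, operator set, and function names are assumptions. Crossover cuts only at instruction boundaries, and mutation draws replacement values only from legal registers, operators, and constants, which mirrors the "allowed?" checks on the slides.

```python
import random

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
WRITABLE = ["a", "b", "c", "d"]          # registers an instruction may write
READABLE = WRITABLE + ["x"]              # registers an instruction may read

# Linear genome from the slide: a=a+x, b=a+c, c=b+6, a=b+7
PROG = [("a", "+", "a", "x"), ("b", "+", "a", "c"), ("c", "+", "b", 6), ("a", "+", "b", 7)]

def run(program, x=0):
    # Execute instructions (dest, op, src1, src2) in order; the result is register a.
    regs = {"a": 0, "b": 0, "c": 0, "d": 0, "x": x}
    for dest, op, s1, s2 in program:
        v1 = regs[s1] if isinstance(s1, str) else s1
        v2 = regs[s2] if isinstance(s2, str) else s2
        regs[dest] = OPS[op](v1, v2)
    return regs["a"]

def crossover(p1, p2, rng):
    # Two-point crossover between instructions: swap one segment of each parent.
    i1, j1 = sorted(rng.sample(range(len(p1) + 1), 2))
    i2, j2 = sorted(rng.sample(range(len(p2) + 1), 2))
    return p1[:i1] + p2[i2:j2] + p1[j1:], p2[:i2] + p1[i1:j1] + p2[j2:]

def mutate(program, rng):
    # Point mutation of one field of one instruction, restricted to legal values.
    i = rng.randrange(len(program))
    dest, op, s1, s2 = program[i]
    field = rng.randrange(4)
    if field == 0:
        dest = rng.choice(WRITABLE)
    elif field == 1:
        op = rng.choice(sorted(OPS))
    elif field == 2:
        s1 = rng.choice(READABLE)
    else:
        s2 = rng.choice(READABLE + list(range(10)))
    return program[:i] + [(dest, op, s1, s2)] + program[i + 1:]
```

Because every instruction is a complete unit, children produced by `crossover` always run, whatever the cut points. The real system has to enforce the same property at the bit level.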

Ontogenetic GP [Spector and Stoffel, 1996]
• Phylogeny: development of a population over evolutionary time
• Ontogeny: development of an individual over its lifetime
• Linear genome of GP terminals and non-terminals
• Addition of ontogenetic operators
  - segment-copy copies part of the linear program over another part of the program
  - shift-left rotates the program to the left
  - shift-right rotates the program to the right
[Figure: example linear programs built from push-x, dup, noop, +, -, *, %, shift-left and shift-right.]

Cellular GP [Gruau, 1992]
• GP trees (genotype) are used to construct neural networks (phenotype).
• The fitness of the genotype is measured through the performance of the phenotype on the desired task.

Breeder GP (BGP) [Zhang and Muehlenbein, 1993, 1995]
  ES (real-vector), GA (bitstring), GP (tree)
    -> Muehlenbein et al. (1993): Breeder GA (BGA) (real-vector + bitstring)
    -> Zhang et al. (1993): Breeder GP (BGP) (tree + real-vector + bitstring)

Breeder GP: Motivation for GP Theory
• In GP, parse trees of Lisp-like programs are used as chromosomes.
• The performance of programs is evaluated by training error, and the program size tends to grow as training error decreases.
• The eventual goal of learning is to get a small generalization error, and the generalization error tends to increase as program size grows.
• How to control the program growth?
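
The three ontogenetic operators listed for Ontogenetic GP act on the linear program itself, so they are easy to sketch as list operations. The tutorial names the operators but not their arguments, so everything beyond the names is an assumption here: rotation is by one position, and `segment_copy` takes a hypothetical source index, destination index, and length, truncating the copy at the end of the program.

```python
def shift_left(program):
    # Rotate the program one position to the left (assumed step size of 1).
    return program[1:] + program[:1]

def shift_right(program):
    # Rotate the program one position to the right (assumed step size of 1).
    return program[-1:] + program[:-1]

def segment_copy(program, src, dst, length):
    # Copy program[src:src+length] over the instructions starting at dst.
    # The program length never changes; the copy is cut off at the end.
    out = list(program)
    segment = program[src:src + length]
    for k, instr in enumerate(segment):
        if dst + k < len(out):
            out[dst + k] = instr
    return out
```

For example, `shift_left(["push-x", "dup", "*"])` gives `["dup", "*", "push-x"]`. All three functions return a new list, so a program can modify a copy of itself during its lifetime while the inherited genome stays unchanged.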

Breeder GP: MDL-Based Fitness Functions

$$ F(A \mid D) = F_D + F_A = \beta\,E(D \mid A) + \alpha\,C(A) $$

  E(D | A)   Training error of program A for data set D
  C(A)       Structural complexity of program A
  α, β       Relative importance to be controlled

Breeder GP: Adaptive Occam Method [Zhang et al., 1995]

$$ F_i(t) = E_i(t) + \alpha(t)\,C_i(t) $$

$$ \alpha(t) = \begin{cases} \dfrac{1}{N^2}\,\dfrac{E_{best}(t-1)}{C_{best}(t)} & \text{if } E_{best}(t-1) > \varepsilon \\[1.5ex] \dfrac{1}{N^2}\,\dfrac{1}{E_{best}(t-1)\,C_{best}(t)} & \text{otherwise} \end{cases} $$

  ε              Desired performance level in error
  E_best(t-1)    Training error of the best program at generation t-1
  C_best(t)      Complexity of the best program at generation t

Guidelines

Promising Application Areas
• Problem areas where a good approximate solution (but not necessarily an optimal solution) is satisfactory (e.g., AI and AL applications).
• Problem areas where discovery of functional structure (as opposed to parameter estimation) is a major part of the problem (e.g., symbolic regression).
• Problem areas involving many variables whose inter-relationship is not well understood (e.g., structural design).

Promising Application Areas (Cont'd)
• Problem areas where data are observable but the underlying structure is not known (e.g., discovery of rules in data).
• Problem areas where primitive functions can be guessed but their combinations are not well understood (e.g., circuit design).
• Problem areas where programming by hand is difficult (e.g., multi-agent strategies).

Research Issues (1/3)
• Speed-up methods for GP runs
  - Parallel implementation of GP [Koza et al. 96] [Stoffel & Spector 96]
  - Training subset selection [Gathercole & Ross 97] [Zhang & Cho 98] [Zhang & Joung 98]
• Issues of introns and program growth control
  - Introns and bloat [Langdon 97] [Rosca & Ballard 97] [Soule & Foster 97] [Banzhaf 97]
  - Fixed complexity penalty [Iba et al. 94] [Rosca et al. 97]
  - Adaptive Occam method for controlling bloat [Zhang & Muehlenbein 93, 95]

Research Issues (2/3)
• Finding and exploiting parameterizable submodules
  - ADF [Koza 94] [O'Reilly 96]
  - GLiB [Angeline 93]
  - AR [Rosca 94], ARL [Rosca & Ballard 96]
  - Libraries [Teller & Veloso 95] [Zhang et al. 97]
  - ADM [Spector 96]
  - Architecture Altering Operations [Koza 95]

Research Issues (3/3)
• Intelligent crossover and mutation [Luke and Spector 97] [Angeline 97] [Poli and Langdon 98]
• Handling vectors and complex data structures [Langdon 98]
• Automatic setup of GP parameters [Angeline 96]
• Employing more general program constructs, such as recursion, iteration, and internal states.

Further Information

Web Sites and E-mail Lists
• Web sites
  - Genetic programming home page: http://www.genetic-programming.org/
• Genetic programming (GP) list
  - To subscribe, send an e-mail message to: Genetic-Programming-Request@CS.Stanford.Edu
  - The body of the message must consist of exactly the words: subscribe genetic-programming
• GP bibliography
  - William Langdon of the University of Birmingham maintains a bibliography on GP at http://www.cs.bham.ac.uk/~wbl

Texts on GP
• Genetic Programming: On the Programming of Computers by Means of Natural Selection, John Koza, MIT Press, 1992.
• Genetic Programming II: Automatic Discovery of Reusable Programs, John Koza, MIT Press, 1994.
• Genetic Programming: An Introduction, Wolfgang Banzhaf et al., Morgan Kaufmann Publishers, 1998.
• Genetic Programming and Data Structures: Genetic Programming + Data Structures = Automatic Programming, William B. Langdon, Kluwer, 1998.

Upcoming GP-Related Conferences
• Genetic and Evolutionary Computation Conference (GECCO-99)
  - http://www-illigal.ge.uiuc.edu/gecco/
• Second European Conference on Genetic Programming (EuroGP-99)
  - http://www.cs.bham.ac.uk/~rmp/eebic/eurogp99
• IEEE Congress on Evolutionary Computation (CEC-99)
  - http://garage.cps.msu.edu/cec99/

GP Conference/Workshop Proceedings
• Proceedings of GP Conferences
  - J. Koza et al. (Eds.) Genetic Programming 1996: Proceedings of the First Annual Conference, July 28-31, 1996, Stanford University, MIT Press, 1996.
  - J. Koza et al. (Eds.) Genetic Programming 1997: Proceedings of the Second Annual Conference, July 13-16, 1997, Stanford University, Morgan Kaufmann, 1997.
  - J. Koza et al. (Eds.) Genetic Programming 1998: Proceedings of the Third Annual Conference, July 22-25, 1998, University of Wisconsin, Madison, Morgan Kaufmann, 1998.
• Proceedings of EuroGP Workshops
  - W. Banzhaf, R. Poli, M. Schoenauer, and T. C. Fogarty (Eds.) Genetic Programming: First European Workshop, EuroGP'98, April 1998, Paris, France, Lecture Notes in Computer Science, Volume 1391, Springer-Verlag, 1998.

AiGP Series and Journals
• Advances in Genetic Programming (AiGP) Series
  - K. E. Kinnear Jr. (Ed.) Advances in Genetic Programming, MIT Press, 1994.
  - P. J. Angeline and K. E. Kinnear Jr. (Eds.) Advances in Genetic Programming 2, MIT Press, 1996.
  - L. Spector, W. B. Langdon, U.-M. O'Reilly, and P. Angeline (Eds.) Advances in Genetic Programming 3, MIT Press, 1999.
• Selected journals for GP and EC in general
  - Genetic Programming and Evolvable Machines, Kluwer (in preparation)
  - Evolutionary Computation, MIT Press.
  - IEEE Transactions on Evolutionary Computation, IEEE Press.
