Grammatical Evolution: Evolving Programs for an Arbitrary Language
Conor Ryan, JJ Collins, Michael O'Neill
This article discusses a variation of GADS (Genetic Algorithm for Developing Software). It
first summarizes Backus Naur Form (BNF), which is used to describe the example problem, and
criticizes several problems that can occur in GADS. To address these problems, the article
proposes GE (Grammatical Evolution), a different way of interpreting the chromosome in a GA.
2. Contents of the paper
Many GA approaches to grammar-related problems use BNF. BNF is a notation for expressing
the grammar of a language in the form of production rules. A BNF grammar consists of a set of
terminals T, a set of non-terminals N, a start symbol S, and a set of production rules P. For
example, a grammar for mathematical expressions can be described in BNF with production
rules P such as
<expr>:=<expr><bin_op><expr> | (<expr>) | <un_op>(<expr>) | <var>
<bin_op>:= + | - | * | /
<un_op>:= sin | cos | tan | log | exp
<var> := Z
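To make the four components concrete, the grammar above can be written down as a plain data structure. This is only a sketch of one possible encoding: the names GRAMMAR and START and the dictionary layout are my own choices, not something taken from the paper. The sets N and T are derived from the rules themselves.

```python
# Hypothetical encoding of the example grammar (layout and names are assumptions).
# Each non-terminal maps to its list of alternative productions;
# each production is a list of symbols.
GRAMMAR = {
    "<expr>": [
        ["<expr>", "<bin_op>", "<expr>"],
        ["(", "<expr>", ")"],
        ["<un_op>", "(", "<expr>", ")"],
        ["<var>"],
    ],
    "<bin_op>": [["+"], ["-"], ["*"], ["/"]],
    "<un_op>": [["sin"], ["cos"], ["tan"], ["log"], ["exp"]],
    "<var>": [["Z"]],
}
START = "<expr>"  # start symbol S

# N: every symbol that has production rules.
non_terminals = set(GRAMMAR)

# T: every symbol that appears on a right-hand side but has no rules of its own.
terminals = {
    symbol
    for productions in GRAMMAR.values()
    for production in productions
    for symbol in production
} - non_terminals

print(sorted(non_terminals))
print(sorted(terminals))
```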
GADS uses fixed-length linear chromosomes in which each gene denotes a production rule to be
applied. If a gene does not make syntactic sense when it is interpreted (for example, it names a
rule for a non-terminal that is not present in the expression), it is ignored.
This approach has at least two serious problems. First, if interpretation reaches the end of the
chromosome, all unresolved non-terminals are replaced with default values, which are defined
alongside the production rules. These default values can harm the evolution [1], because they
are outside the genes' control. Second, a chromosome can contain several ignored genes, and
these can drastically change how the chromosome is interpreted. To deal with these problems,
Paterson, the author of the article that proposed GADS, recommends increasing the population
size or the genome size.
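The two problems can be seen in a small sketch of a GADS-style decoder, written only from the description above. It is not Paterson's implementation: the reduced rule list RULES, the table DEFAULTS, the function gads_map, and the choice to apply a rule to the leftmost matching non-terminal are all assumptions made for illustration.

```python
# Assumed, reduced rule list: a gene is an index into this global list.
RULES = [
    ("<expr>", ["<expr>", "<bin_op>", "<expr>"]),  # rule 0
    ("<expr>", ["<var>"]),                          # rule 1
    ("<bin_op>", ["+"]),                            # rule 2
    ("<bin_op>", ["*"]),                            # rule 3
    ("<var>", ["Z"]),                               # rule 4
]

# Assumed default expansion for each non-terminal left over at the end.
DEFAULTS = {"<expr>": ["Z"], "<bin_op>": ["+"], "<var>": ["Z"]}


def gads_map(genes):
    """Decode a chromosome GADS-style; return (expression, ignored gene count)."""
    symbols = ["<expr>"]
    ignored = 0
    for gene in genes:
        lhs, rhs = RULES[gene % len(RULES)]
        if lhs in symbols:
            # Assumption: apply the rule to the leftmost matching non-terminal.
            i = symbols.index(lhs)
            symbols[i:i + 1] = rhs
        else:
            # The rule makes no syntactic sense here, so the gene is ignored.
            ignored += 1
    # End of chromosome: unresolved non-terminals fall back to their defaults.
    expression = []
    for symbol in symbols:
        expression.extend(DEFAULTS.get(symbol, [symbol]))
    return " ".join(expression), ignored


print(gads_map([0, 3, 2, 4]))
```

In this run the last two genes are ignored, and both operands of the result come from the defaults rather than from any gene, which is the loss of control described above.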
The authors of this article propose another way of interpreting the genes, GE. As in GADS,
each gene in GE selects a production rule, but a gene does not correspond one-to-one to a rule
in the whole rule set. Instead, GE groups the production rules by the non-terminal on their
left-hand side and chooses only among the rules that apply to the current non-terminal. The
gene only indicates which of those rules to choose. For example, if the first unresolved
non-terminal is <expr>, GE considers the four production rules for <expr> and chooses one of
them using the next gene (for example, by taking the gene value modulo 4).
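The mapping just described can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the authors' code: the names GRAMMAR and ge_map are hypothetical, and the sketch reads one gene for every expansion, even when a non-terminal has only one rule.

```python
# Example grammar from this review, encoded as non-terminal -> list of productions.
GRAMMAR = {
    "<expr>": [
        ["<expr>", "<bin_op>", "<expr>"],
        ["(", "<expr>", ")"],
        ["<un_op>", "(", "<expr>", ")"],
        ["<var>"],
    ],
    "<bin_op>": [["+"], ["-"], ["*"], ["/"]],
    "<un_op>": [["sin"], ["cos"], ["tan"], ["log"], ["exp"]],
    "<var>": [["Z"]],
}


def ge_map(genes, grammar=GRAMMAR, start="<expr>"):
    """Expand the leftmost non-terminal once per gene; return (expression, complete)."""
    symbols = [start]
    for gene in genes:
        # Find the first unresolved non-terminal, if any.
        idx = next((i for i, s in enumerate(symbols) if s in grammar), None)
        if idx is None:
            break  # fully resolved; remaining genes are simply unused
        rules = grammar[symbols[idx]]
        # The gene only selects among the rules for this non-terminal.
        symbols[idx:idx + 1] = rules[gene % len(rules)]
    complete = all(s not in grammar for s in symbols)
    return " ".join(symbols), complete


print(ge_map([0, 3, 6, 2, 7, 1]))  # ('Z * Z', True)
```

Here the first gene 0 picks rule 0 of <expr> (0 mod 4), the gene 3 picks <var> (3 mod 4), and so on. Every gene that is read changes the expression, so none is ignored.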
This approach eliminates the 'ignored genes' of GADS: no gene is ignored, and every gene has a
meaning when the grammar is interpreted. This drastically reduces the distance between
meaningful genes and the size of the schemata, which makes the GA search more effective.
Moreover, unresolved symbols can be handled by reusing the genes, that is, by reading the
chromosome again from the beginning.
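Reusing the genes can be sketched by reading the chromosome cyclically. The function name ge_map_wrapped and the max_wraps limit are assumptions of this sketch; some bound is needed here because certain chromosomes would otherwise never finish expanding.

```python
from itertools import cycle, islice

GRAMMAR = {
    "<expr>": [
        ["<expr>", "<bin_op>", "<expr>"],
        ["(", "<expr>", ")"],
        ["<un_op>", "(", "<expr>", ")"],
        ["<var>"],
    ],
    "<bin_op>": [["+"], ["-"], ["*"], ["/"]],
    "<un_op>": [["sin"], ["cos"], ["tan"], ["log"], ["exp"]],
    "<var>": [["Z"]],
}


def ge_map_wrapped(genes, max_wraps=2, grammar=GRAMMAR, start="<expr>"):
    """GE-style mapping that rereads the chromosome up to max_wraps extra times."""
    symbols = [start]
    budget = len(genes) * (max_wraps + 1)  # total number of gene reads allowed
    for gene in islice(cycle(genes), budget):
        idx = next((i for i, s in enumerate(symbols) if s in grammar), None)
        if idx is None:
            break  # nothing left to resolve
        rules = grammar[symbols[idx]]
        symbols[idx:idx + 1] = rules[gene % len(rules)]
    complete = all(s not in grammar for s in symbols)
    return " ".join(symbols), complete


print(ge_map_wrapped([0, 3, 5], max_wraps=0))  # ('Z <bin_op> <expr>', False)
print(ge_map_wrapped([0, 3, 5], max_wraps=1))  # ('Z + Z', True)
```

With no wrapping, the three genes run out while two non-terminals are still unresolved. With one wrap, the same three genes are read again and complete the expression, so no default values are needed.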
3. Evaluation of the paper
The main example given in this article was helpful for understanding the main ideas, such as
how GADS and GE work. The wording was easy to understand, and the approach was easy to
follow for readers who have some background knowledge of GAs.
[1] (My thinking) These default values do not always harm the evolution, but how the default
rules are defined matters. For example, in a mathematical expression, suppose the default for
<expr> is Z and the default for <bin_op> is +. Then, if Z is in the range [-1, +1], the expression
Z^9 generated by the chromosome is "devoured" by the default rule + Z. That is why the authors
mention that the default rules can harm the evolution.
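The footnote's point can be checked with a little arithmetic. The sketch below assumes, as in the footnote, that the evolved part is Z^9 and that the defaults append + Z; the function names are my own.

```python
def evolved(z):
    """The part of the expression produced by the chromosome: Z^9."""
    return z ** 9


def with_default(z):
    """The same expression after the assumed default rule appends '+ Z'."""
    return evolved(z) + z


for z in (0.1, 0.5, 0.9):
    # Share of the final value that actually comes from the evolved part.
    share = abs(evolved(z)) / abs(with_default(z))
    print(f"Z={z}: Z^9={evolved(z):.6f}  Z^9+Z={with_default(z):.6f}  evolved share={share:.2%}")
```

At Z = 0.5 the evolved term is 1/512, so nearly all of the value comes from the default term. The effect weakens as |Z| approaches 1, where Z^9 and Z are of similar size.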
However, several points do not help the reader. First, the test results were not clearly
explained. The article just says 'the GE consistently found the solution', without explaining
the solution itself or how it was evaluated. Also, the article deals with only one problem, a
mathematical expression. Another example would help to demonstrate the power of GE. The way
default rules harm the evolution is also not described well.
4. Relations to other papers
First published in 1998, this article has been cited by many other articles, which demonstrate
the power of GE on many problems. GE has undergone several modifications and is still used
for the knapsack problem, numerical approximation, fractal curves, regular expression
generation, and the generation of Verilog code and even musical scores.