# if_derivation

Document Sample

```					               SHORT-CIRCUIT EVALUATION OF BOOLEAN EXPRESSIONS
Let's consider this in connection with generating code for the IF statement. A suitable grammar for this
statement is:
IF_STATEMENT → SIMPLE_IF | SIMPLE_IF_ELSE_BLOCK

The production for the if statement without ELSE
SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF

Productions for the if statement including ELSE
SIMPLE_IF_ELSE_BLOCK → SIMPLE_IF_ELSE FALSE_BLOCK ENDIF
SIMPLE_IF_ELSE → IF_PREFIX TRUE_BLOCK ELSE

Productions employed for both cases
IF_PREFIX → IF BOOL_EXPRESSION
BOOL_EXPRESSION → BOOL_EXPRESSION OR BOOL_FACTOR
I BOOL_FACTOR
BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY
| BOOL SECONDARY
BOOL_SECONDARY → NOT BOOL_PRIMARY I BOOL_PRIMARY
BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION
I ARITH_EXPRESSION <> ARITH_EXPRESSION
I ARITH_EXPRESSION > ARITH_EXPRESSION
I ARITH_EXPRESSION >= ARITH_EXPRESSION
I ARITH_EXPRESSION < ARITH_EXPRESSION
I ARITH_EXPRESSION <= ARITH_EXPRESSION
I ( BOOL_EXPRESSION )

For the purpose of readibility let's employ the macros:
The code required for the various productions is described below

BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION
For productions such as these we generate code to compare the arithmetic expressions, followed by a
conditional jump instruction which will be executed if the comparison involved is true. In the present case
we would generate in code_s:
JE 0000
coded as
0f 84 0000
and we would set \$\$.head_of_true_chain = code_i - 2
and                \$\$.head_of_false_chain = 0

As we carry out the present algorithm, for each boolean expression that we generate code for, we will
produce an entry on symbol stack whose head_of_true chain points to a chain of (the offset fields within)
conditional branch instructions that branch if the boolean expression as a whole is true, and whose
head_of_false_chain points to a similar chain for when the boolean expression is false. In the code
above, there are no branches for if the comparison is false, so \$\$.head_of_false_chain is 0. There is a
single branch for if the condition is true. Since it's the only one in the chain, it is itself the end of the chain.
So the offset field of the JE instruction is 0. \$\$.head_of_true chain here points to this offset field.
BOOL SECONDARY → NOT BOOL PRIMARY
Clearly all the code needs to do here is to set \$\$.head_of_false_chain to \$2.head_of_true_chain, and

BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY
See steps 3 and 5 in the example below
Consider the conditional branch right at the end of the code for \$1 (ie. BOOL_FACTOR). If this were a
branch for if \$1 is true, then it would have to branch to the start of the code for evaluating \$3. On the
other hand if it were a branch for if \$1 is false, then it would constitute a branch for if \$\$ was false, and
the code would here avoid having to evaluate \$3. So the first thing to do is to ensure that the conditional
branch at the end of the code for \$1 is a branch for if \$1 is false. You can tell which it is by checking
which of \$1.head_of_false_chain or \$1.head_of_true_chain is larger. If the conditional branch instruction
is a branch for if \$1 is true, then you need to change it to a branch for the opposite condition to make it a
branch for if \$1 is false. It so happens that this can be done in all cases by merely reversing the
low_order bit of the opcode involved (i.e. of the byte following the 0f byte). This can be done by means of
an exclusive-or with 1. E.g. if x is the value of \$1.head_of_true_chain, one could use code such as:
code_s[x - 1] = code_s[x-1] ^ 0x'01'
You will now need to fixup the false chain, since it has been given a new head, and
amend the values of \$1.head_of_true_chain and \$1.head_of_false_chain accordingly. After all this is
done, you can make all the members of the true chain for \$1 (conditional) branches to the start of the
code for \$3. If y is the larger value of \$1.head_of_false_chain and \$1.head_of_true_chain, then the code
for \$3 starts at code_s[y+2]. Now join together the false chains for \$3 and \$1, since these all are
branches for if \$\$ is false. Finally set \$\$.head_of_true_chain to \$3.head_of_true_chain, and
\$\$.head_of_false_chain to the head of the combined false chains referred to above. Be careful to allow
for the case where the false chain for \$3 is empty.

BOOL EXPRESSION → BOOL EXPRESSION OR BOOL FACTOR
See steps 7 and 9 in the example below.
The code for this production follows similar principles to that described above for AND. In this case, in
order that the last conditional branch instruction in the code for \$1 be a branch that, if taken, avoids
having to evaluate \$3, it needs to be a branch for if \$1 is true. If it isn't then you should change it by the
means described above, and adjust the true chain for \$1, and the values of \$1.head_of_false_chain and
\$1.head_of_true_chain accordingly. Make,the members of the false chain for \$1 into branches to the
start of the code for \$3. Then join together the true chains for \$3 and \$1, since these all are branches for
if \$\$ is true. Finally set \$\$.head_of_false_chain to \$3.head_of_false_chain, and \$\$.head_of_true_chain
to the head of the combined true chains referred to above. Again be careful to allow for the case where
the true chain for \$3 is empty.

IF_PREFIX → IF BOOLEAN_EXPRESSION
As in the code for AND, ensure that the last branch in the code for \$2 is a branch if \$2 is false, as this is
the case where we do not evaluate the block of statements following. Adjust the false chain for \$2 and
\$2.head_of_false_chain and \$2.head _of_true_chain accordingly. Now make all the members of the true
chain pointed to by \$2.head_of_true_chain into branches to code_i. Set \$\$.head_of_false_chain to

Code for the IF statement without ELSE

SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF
Make all the members of the false chain pointed to by \$1.head_of_false_chain into branches to code i.
Code for the IF statement including ELSE

SIMPLE_IF _ELSE → IF_PREFIX TRUE_BLOCK ELSE
Generate an unconditional branch with offset 0000, and set \$\$.head_of_true_chain to point to this offset
field. Make all the member of the false chain pointed to by \$1.head_of_false_chain into branches to
code_i.

SIMPLE_IF _ELSE_BLOCK → SIMPLE_IF _ELSE FALSE_BLOCK ENDIF
Make the branch pointed to by \$1.head_of_true_chain into a branch to code_i.

NOTE
Yacc reports a shift-reduce conflict for most grammars written for the IF statement.
For example, with the statement:
If A then if B then C else D
the shift-reduce conflict occurs in the state reached when the next input symbol
is “else”. If the shift action is chosen, the statement will be interpreted as:
if A then (if B then C else D)
whereas if the reduce action is taken, it will be interpreted as:
if A then (if B then C) else D
As you can see, if A is false in the first case, then D will not be executed (nor will C), whereas if A is false
in the second case, then D will be executed.
Nearly all programming languages adopt the first interpretation (obtained taking the shift action). Since
the default action that Yacc takes in shift-reduce conflicts is in fact the shift action, the above conflict can
be ignored for such languages, and while it is possible to produce grammars for the If statement that
avoid the conflict, this is not necessary.
EXAMPLE

Let’s consider the problem of evaluating the short-circuit code for:
if (a and b and c) or (d and e and f) or (g and h and i)
go to place1
else
go to place2
where the letters a – i are conditions such as X > Y+2.
Let the letters A – I denote the code for evaluating a – i respectively.
By e.g. Cjt300h we mean the code for evaluating condition c followed by a jump to offset 300h in the
case where c is true. For example, if c is X > Y + 2, then Cjt300h could represent the following code:
mov ax, x
cmp ax, y
je 300h1

By Cjf300h we mean the code for the comparison as above, followed by a jump to offset 300h in the case
where c is false, ie. the last instruction should be changed to jne 300h

Using our productions for the if statement, the following is a derivation of the example’s if statement.

1
By je 300h we mean the conditional jump to offset 300h in the code segment, but je is a relative jump whose operand
indicates how many bytes to jump forward or backward, so the actual operand required to branch to offset 300h has to be
evaluated.
DERIVATION OF THE EXAMPLE’S IF STATEMENT

The “bool_” prefixes in the nonterminals have been omitted for brevity.
The reductions are numbered in the order they occur in a parse using an LR(1) parsing machine.
The derivation of “go to true-place” from true-block, and of “go to false-place” from false-block
are not shown.

if_statement
46
simple_if_else_block
45
simple_if_else false-block endif
44
if_prefix true-block else
43
if expression
42
expression or factor
28                    41
expression or factor                       secondary
14                 27                               40
factor             secondary                            primary
13                     26                                     39
secondary                   primary                                ( expression)
12                            25                                             38
bool_primary                     ( expression )                                       factor
11                                      24                                              37
( expression )                                     factor                                    factor and secondary
10                                             23                                        34          36
factor                                factor and secondary                    factor and secondary primary
9                                   20                 22                    31          33         35
factor and secondary                    factor and secondary primary                 secondary      primary      i
6              8                    17              19            21                  30         32
factor and secondary      primary         secondary            primary           f               primary        h
3              5         7                 16                 18                                 29
secondary     primary        c             primary                 e                                 g
2              4                           15
primary           b                             d
1
a
SNAPSHOTS OF THE PARSE OF THE EXAMPLE IF STATEMENT

Our snapshots show relavant parts of the content of the code segment and symbol stack after selected productions, identified
by the numbering given in the derivation above. After a production such as
bool_factor → bool_factor and bool_secondary
with our original method, in which the entries in symbol_stack where grammar symbols, we replaced the top three members of
the stack by bool_factor. Now we are employing a stack in which each entry is a pair of numbers, that we have named
2
head_of_true_chain and head_of_false_chain, and our algorithm is designed to make the head_of_true_chain point to a back
chain in the code segment of jumps that will occur if the code derived from bool_factor is true, and head_of_false_chain point
to a back chain of such jumps that will occur if the code involved is false. For illustration purposes, our snapshots will also
depict the source code corresponding to the code that has been generated

1. The situation after reduction 1: primary → a

primary
symbol stack (abbrev. ss)                   t=n1 f=0

code segment (abbrev. cs)                  AJt0
n1
source code involved (abbrev. sc)        ( a

Here the code for comparison a, followed by a jump if true with a zero operand has been generated in the code segment and
n1 is the offset of the operand field involved. In symbol stack the “t=n1 f=0” means the head of the true chain for \$\$ has been
set to n1, and the head of its false chain has been set to zero

2. After reduction 5: secondary → primary

factor and      secondary
ss        t =n1 f=0           t=n2 f=0

cs       AJt0                BJt0
n1                  n2
sc     ( a        and        b

3. After reduction 6: factor → factor and secondary

factor
ss               t=n2 f=n1

cs       AJf0                BJt0
n1                  n2
sc     ( a        and        b

4. After reduction 8: secondary → primary

factor            and      secondary
ss               t=n2 f=n1                    t=n3 f=0

cs       AJf0                BJt0            CJt0
n1                  n2              n3
sc     ( a        and        b       and     c

2
While one could use a C pointer for this purpose, by “points to” I mean here “supplies the offset in the code segment of”
5. After reduction 9: factor → factor and secondary
Note that we have changed the BJt to BJf , since we want to avoid evaluated c in the case where b is false. If b is true,
we would need to evaluate c to determine whether a and b and c is true or not. So the code BJt
would have to be a condition branch to C , and C would thus be evaluated whether or not B were true.

factor
ss                           t=n3 f=n2

cs       AJf0                  BJfn1               CJt0
n1                    n2                 n3

sc   ( a           and         b          and      c

6. After reduction 27: factor → secondary

expression                        or                                      factor
ss                           t=n3 f=n2                                                               t=n6 f=n5

cs       AJf0                  BJfn1               CJt0                     DJf0                      EJfn4                     FJt0
n1                    n2                 n3                        n4                        n5                       n6

n1                      n2                   n3

sc   ( a           and         b          and      c )         or       ( d               and          e          and           f )

7. After reduction 28: expresssion → expression or factor

expression
ss                                                                                               t=n6 f=n5

cs       AJfn3+2                BJfn3+2                CJt0                 DJf0                       EJfn4                    FJt0
n1                     n2                    n3                   n4                         n5                       n6

sc   ( a           and         b          and      c )         or       ( d               and          e          and           f )

8. After reduction 41: factor → secondary

expression                                     factor
ss                                                                   t=n6 f=n5                                    t=n9 f=n8

cs   AJfn3+2         BJfn3+2              CJt0     DJf0             EJfn4           FJt0                   GJf0         HJfn1          IJt0
n1              n2                   n3       n4               n5              n6                    n7            n8             n9

sc ( a     and           b      and       c ) or ( d     and        e   and         f )         or     ( g     and      h   and        i )
9. After reduction 42:     expression → expression or factor

if   expression
ss                                                                                                        t=n9 f=n8

cs   AJfn3+2        BJfn3+2        CJt0       DJfn6+2    EJfn6+2       FJt0                   GJf0          HJfn1             IJt0
n1             n2             n3         n4        n5            n6                     n7            n8                 n9

sc ( a   and        b      and     c ) or ( d      and   e     and     f )         or    ( g        and        h     and       i )

10. After reduction 43: if_prefix → if expression

if_prefix
ss                                                                                                        t=0 f=n9

cs   AJfn3+2        BJfn3+2        CJtn9+2    DJfn6+2    EJfn6+2      FJt n9+2                GJf0          HJfn1             IJf0
n1             n2             n3        n4          n5            n6                    n7            n8                 n9     n10

sc ( a   and        b      and     c ) or ( d      and   e     and     f )         or    ( g        and        h     and       i )

11. After reduction 44: simple_if_else → if_prefix tblk else

simple_if_else
ss                                                                                                                             t=n11 f=n9

cs   AJfn3+2        BJfn3+2        CJtn9+2    DJfn6+2    EJfn6+2      FJt n9+2           GJf11+2           HJfn11+2           IJfn11+2 tblk J0
n1             n2             n3        n4          n5            n6               n7                 n8                 n9    n10 n11

sc ( a   and        b      and     c ) or ( d      and   e     and     f )         or    ( g        and        h     and       i )     tblk   else

12. After reduction 45: simple_if_else_block → simple_if_else fblk endif

simple_if_else fblk endif
ss                                                                                                                          t=n11 f=n9

cs   AJfn3+2    BJfn3+2       CJtn9+2   DJfn6+2    EJfn6+2     FJt n9+2       GJf11+2          HJfn11+2            IJfn11+2     tblk Jn13 fblk
n1         n2            n3       n4          n5           n6             n7             n8                n9           n10 n11 n12 n13

sc ( a   and    b    and      c ) or ( d     and    e    and    f )   or     ( g        and     h        and       i )          tblk   else blk

```
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
 views: 2 posted: 9/22/2011 language: English pages: 8