Top Down Operator Precedence
Vaughan R. Pratt
Massachusetts Institute of Technology
1. Survey of the Problem Domain.

There is little agreement on the extent to which syntax should be a consideration in the design and implementation of programming languages. At one extreme, it is considered vital, and one may go to any lengths [Van Wijngaarden 1969, McKeeman 1970] to provide adequate syntactic capabilities. The other extreme is the spartan denial of a need for a rich syntax [Minsky 1970]. In between, we find some language implementers willing to incorporate as much syntax as possible provided they do not have to work hard at it [Wirth 1971].

In this paper we present what should be a satisfactory compromise for a respectably large proportion of language designers and implementers. We have in mind particularly

(i) those who want to write translators and interpreters (soft, firm or hardwired) for new or extant languages without having to acquire a large system to reduce the labor, and

(ii) those who need a convenient yet efficient language extension mechanism accessible to the language user.

The approach described below is very simple to understand, trivial to implement, easy to use, extremely efficient in practice if not in theory, yet flexible enough to meet most reasonable syntactic needs of users in both categories (i) and (ii) above. (What is "reasonable" is addressed in more detail below.) Moreover, it deals nicely with error detection.

One may wonder why such an "obviously" utopian approach has not been generally adopted already. I suspect the root cause of this kind of oversight is our universal preoccupation with BNF grammars and their various offspring: type 1 [Chomsky 1959], indexed [Aho 1968], macro [Fischer 1968], LR(k) [Knuth 1965], and LL(k) [Lewis 1968] grammars, to name a few of the more prominent ones, together with their related automata and a large body of theorems. I am personally enamored of automata theory per se, but I am not impressed with the extent to which it has so far been successfully applied to the writing of compilers or interpreters. Nor do I see a particularly promising future in this direction. Rather, I see automata theory as holding back the development of ideas valuable to language design that are not visibly in the domain of automata theory.

Users of BNF grammars encounter difficulties when trying to reconcile the conflicting goals of practical generality (coping simultaneously with symbol tables, data types and their inter-relations, resolution of ambiguity, unpredictable demands by the BNF user, top-down semantics, etc.) and theoretical efficiency (the guarantee that any translator using a given technique will run in linear time and reasonable space, regardless of the particular grammar used). BNF grammars alone do not deal adequately with either of these issues, and so they are stretched in some directions to increase generality and shrunk in others to improve efficiency. Both of these operations tend to increase the size of the implementation's "life-support" system, that is, the software needed to pre-process grammars and to supervise the execution of the resulting translator. This makes these methods correspondingly less accessible and less pleasant to use. Also, the stretching operation is invariably done gingerly, dealing only with those issues that have
Work reported herein was supported in part at Stanford by the National Science Foundation under grant no. GJ 992, and the Office of Naval Research under grant number N-00014-67-A-0112-0057 NR 044-402; by IBM under a post-doctoral fellowship at Stanford; by the IBM T.J. Watson Research Center, Yorktown Heights, N.Y.; and by Project MAC, an MIT research program sponsored by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Number N00014-70-0362-0006 and the National Science Foundation under contract number GJ00-4327. Reproduction in whole or in part is permitted for any purpose of the United States Government.
been anticipated, leaving no room for unexpected needs.

I am thinking here particularly of the work of Lewis and Stearns and their colleagues on LL(k) grammars, table grammars, and attributed translations. Their approach, while retaining the precision characteristic of the mathematical sciences (which is unusual in what is really a computer-engineering and human-engineering problem), is tempered with a sensitivity to the needs of translator writers that makes it perhaps the most promising of the automata-theoretic approaches. To demonstrate its practicality, they have embodied their theory in an efficient Algol compiler.

A number of down-to-earth issues are not satisfactorily addressed by their system - deficiencies which we propose to make up in the approach below; they are as follows.

(i) From the point of view of the language designer, implementer or extender, writing an LL(k) grammar, and keeping it LL(k) after extending it, seems to be a black art, whose main redeeming feature is that the life-support system can at least localize the problems with a given grammar. It would seem preferable, where possible, to make it easier for the user to write acceptable grammars on the first try, a property of the approach to be presented here.

(ii) There is no "escape clause" for dealing with non-standard syntactic problems (e.g. Fortran format statements). The procedural approach of this paper makes it possible for the user to deal with difficult problems in the same language he uses for routine tasks.

(iii) The life-support system must be up, running and debugged on the user's computer before he can start to take advantage of the technique. This may take more effort than is justifiable for one-shot applications. We suggest an approach that requires only a few lines of code for supporting software.

(iv) Lewis and Stearns consider only translators, in the context of their LL(k) system; it remains to be determined how effectively they can deal with interpreters. The approach below is ideally suited for interpreters, whether written in software, firmware or hardware.

2. Three Syntactic Issues.

To cope with unanticipated syntactic needs, we adopt the simple expedient of allowing the language implementer to write arbitrary programs. By itself, this would represent a long step backwards; instead, we offer in place of the rigid structure of a BNF-oriented meta-language a modicum of supporting software, and a set of guidelines on how to write modular, efficient, compact and comprehensible translators and interpreters while preserving the impression that one is really writing a grammar rather than a program.

The guidelines are based on some elementary assumptions about the primary syntactic needs of the average programmer.

First, the programmer already understands the semantics of both the problem and the solution domains, so that it would seem appropriate to tailor the syntax to fit the semantics. Current practice entails the reverse.

Second, it is convenient if the programmer can avoid having to make up a special name for every object his program computes. The usual way to do this is to let the computation itself name the result - e.g. the object which is the second argument of "+" in the computation "a+b*c" is the result of the computation "b*c". We may regard the relation "is an argument of" as defining a class of trees over computations; the program then contains such trees, which need conventions for expressing linearly.

Third, semantic objects may require varying degrees of annotation at each invocation, depending on how far the particular invocation differs in intent from the norm (e.g. for loops that don't start from 1, or don't step by 1). The programmer needs to be able to formulate these annotations within the programming language.

There are clearly many more issues than these in the design of programming languages. However, these seem to be the ones that have a significant impact on the syntax aspects. Let us now draw inferences from the above assumptions.

2.1 Lexical Semantics versus Syntactic Semantics

The traditional mechanism for assigning meanings to programs is to associate semantic rules with phrase-structure rules, or equivalently, with classes of phrases. This is inconsistent with the following reasonable model of a programmer.

The programmer has in mind a set of semantic objects. His natural inclination is to talk about them by assigning them names, or tokens. He then makes up programs using these tokens, together with other tokens useful for program control, and some purely syntactic tokens. (No clear-cut boundary separates these classes.) This suggests that it is more natural to associate semantics with tokens than with classes of phrases.

This argument is independent of whether we specify program control explicitly, as in Algol-like languages, or implicitly, as in Planner-Conniver-like languages. In either case, the programmer wants to express his instructions or intentions concerning certain objects. When a given class of phrases is characterized unambiguously by the presence of a particular token, the effect is the same, but this is not always the case in a BNF-style semantic specification, and I conjecture that the difficulty of learning and using a given language specified with a BNF grammar increases in proportion to the number of rules not identifiable by a single token. The existence of an operator grammar [Floyd 1963] for Algol 60 provides a plausible account of why people succeed in learning Algol, a process known not to be strongly correlated with whether they have seen the BNF of Algol.

There are two advantages of separating semantics from syntax in this way. First, phrase-structure rules interact more strongly than individual tokens because rules can share non-terminals whereas tokens have nothing to share. So our assignment of semantics to tokens has a much better chance of being modular than an assignment to rules. Thus one can tailor the language to one's needs by selecting from a library, or writing, the semantics of just those objects that one needs for the task in hand, without having to worry about preordained interactions between two semantic objects at the syntactic level. Second, the language designer is free to develop the syntax of his language without concern for how it will affect the semantics; instead, the semantics will affect decisions about the syntax. The next two issues (linearizing trees and annotating tokens) illustrate this point well. Thus syntax is the servant of semantics, an appropriate relationship since the substance of the message is conveyed with the semantics, variations in syntax being an inessential trimming added on human-engineering grounds.

The idea of lexical semantics is implicit in the usual approach to macro generation, although the point usually goes unmentioned. I suspect many people find syntax macros [Leavenworth 1966] appealing for reasons related to the above discussion.

2.2 Conventions for Linearizing Trees.

We argued at the beginning of section 2 that in order to economize on names the programmer resorted to the use of trees. The precedent is a long history of use of the same trick in natural language. Of necessity (for one-dimensional channels) the trees are mapped into strings for transmission and decoded at the other end. We are concerned with both the human and computer engineering aspects of the coding. We may assume the trees look like, e.g.

[Figure: an example expression tree; each node is labelled with a token such as "apply", "+" or ";", with variables (x, y) and constants at the leaves.]

That is, every node is labelled with a token whose arguments, if any, are its subtrees. Without further debate we shall adopt the following conventions for encoding trees as strings.

(i) The string contains every occurrence of the tokens in the tree (which we call the semantic tokens; these include procedural items such as "if", ";") together with some additional syntactic tokens where necessary.

(ii) Subtrees map to contiguous substrings containing no semantic token outside that subtree.

(iii) The order of arguments in the tree is preserved. (Naturally these are oriented trees in general.)

(iv) A given semantic token in the language, together with any related syntactic tokens, always appears in the same place within the arguments; e.g. if we settle for "+a,b", we may not use "a+b" as well. (This convention is not as strongly motivated as (i)-(iii); without it, however, we must be overly restrictive in other areas more important than this one.)

If we insist that every semantic token take a fixed number of arguments, and that it always precede all of its arguments (prefix notation), we may unambiguously recover the tree from the string (and similarly for postfix), as is well known. For a variable number of arguments, the LISP solution of having syntactic tokens (parentheses) at the beginning and end of a subtree's string will suffice.

Many people find neither solution particularly easy to read. They prefer "a(b+2) + c(d+2) = 4 sin(a+b)" to "= + * a + b 2 * c + d 2 * 4 sin + a b" or to "(= (+ (* a (+ b 2)) (* c (+ d 2))) (* 4 (sin (+ a b))))", although they will settle for "a*(b+2) + c*(d+2) = 4*sin(a+b)" in lieu of the first if necessary. (But I have recently encountered some LISP users claiming the reverse, so I may be biased.)

An unambiguous compromise is to require parentheses but move the tokens, as in "(((a*(b+2)) + (c*(d+2))) = (4*(sin(a+b))))". This is actually quite readable, if not very writable, but it is difficult to tell if the parentheses balance, and it nearly doubles the number of symbols. Thus we seem forced inescapably into having to solve the problem that operator precedence was designed for, namely the association problem. Given a substring AEB where A takes a right argument, B a left, and E is an expression, does E associate with A or B?

A simple convention would be to say E always associates to the left. However, in "print a + b", it is clear that a is meant to associate with "+", not "print". The reason is that "(print a) + b" does not make any conventional sense, "print" being a procedure not normally returning an arithmetic value. The choice of "print (a + b)" was made by taking into account the data types of "print"'s right argument, "+"'s left argument, and the types returned by each. Thus the association is a function of these four types (call them aA, rA, aB, rB for the argument and result respectively of A and B) that also takes into account the legal coercions (implicit type conversions).

Of course, sometimes both associations make sense, and sometimes neither. Also aA or rA may depend on the type of E, further complicating matters. One way to resolve the issue is simply to announce the outcome in advance for each pair A and B, basing the choices on some reasonable heuristics. Floyd [1963] suggested this approach, called operator precedence. The outcome was stored in a table. Floyd also suggested a way of encoding this table that would work in a small number of cases, namely that a number should be associated with each argument position by means of precedence functions over tokens; these numbers are sometimes called "binding powers". Then E is associated with the argument position having the higher number. Ties need never occur if the numbers are assigned carefully; alternatively, ties may be broken by associating to the left, say. Floyd showed that Algol 60 could be so treated.

One objection to this approach is that there seems to be little guarantee that one will always be able to find a set of numbers consistent with one's needs. Another objection is that the programmer has to learn as many numbers as there are argument positions, which for a respectable language may be of the order of a hundred. We present an approach to language design which simultaneously solves both these problems, without unduly restricting normal usage, yet allows us to retain the numeric approach to operator precedence.

The idea is to assign data types to classes and then to totally order the classes. An example might be, in ascending order, Outcomes (e.g., the pseudo-result of "print"), Booleans, Graphs (e.g. trees, lists, plexes), Strings, Algebraics (e.g. integers, complex nos., polynomials, real arrays) and References (as on the left side of an assignment). We write Strings < References, etc.

We now insist that the class of the data type at any argument that might participate in an association problem not be less than the class of the data type of the result of the function taking that argument. This rule applies to coercions as well. Thus we may use "<" since its argument types (Algebraic) are each greater than its result type (Boolean). We may not write "length x" (where x is a string or a graph) since the argument type is less than the result type. However, "|x|" would be an acceptable substitute for "length x" as its argument cannot participate in an association problem.

Finally, we adopt the convention that when all four data types in an association are in the same class, the association is to the left.

These restrictions on the language, while slightly irksome, are certainly not as demanding as the LISP restriction that every expression have parentheses around it. Thus the following theorem should be a little surprising, since it implies that the programmer never need learn any associations!

Theorem 1. Given the above restrictions, every association problem has at most one solution consistent with the data types of the associated operators.

Proof. Let ...AEB... be such a problem, and suppose E may associate with both A and B. Because E associates with A, [aA] >= [rA] >= [aB] >= [rB] (type x is in class [x]), since coercion is non-increasing and the type class of the result of "...AE" is not greater than [rA], by an obvious inductive proof. Also, for E with B, [aB] >= [rB] >= [aA] >= [rA] similarly. Thus [aA] = [aB], [rA] = [rB], and [aA] = [rB]; that is, all four are in the same class. But the convention in this case is that E must associate with A, contradicting our assumption that E could associate with B as well. ∎

This theorem implies that the programmer need not even think about association except in the homogeneous case (all four types in the same class), and then he just remembers the left-associativity rule. More simply, the rule is "always associate to the left unless it doesn't make sense".
What he does have to remember is how to write expressions containing a given token (e.g. he must know that one writes "|x|", not "length x") and which coercions are allowed. These sorts of facts are quite modular, being contained in the description of the token itself independently of the properties of any other token, and should certainly be easier to remember than numbers associated with each argument.

Given all of the above, the obvious way to parse strings (i.e. recover their trees) is, for each association problem, to associate to the left unless this yields semantic nonsense. Unfortunately, nonsense testing requires looking up the types rA and aB and verifying the existence of a coercion from rA to aB. For translation this is not serious, but for interpretation it might slow things down significantly. Fortunately, there is an efficient solution that uses operator precedence functions.

Theorem 2. Given the above restrictions on a language, there exists an assignment of integers to the argument positions of each token in the language such that the correct association, if any, is always in the direction of the argument position with the larger number, with ties being broken to the left.

Proof. First assign even integers (to make room for the following interpolations) to the data type classes. Then to each argument position assign an integer lying strictly (where possible) between the integers corresponding to the classes of the argument and result types. To see that this assignment has the desired property, consider the homogeneous and non-homogeneous cases in the problem "...AEB..." as before.

In the homogeneous case all four types are in the same class and so the two numbers must be equal, resulting in left association as desired. If two of the data types are in different classes, then one of the inequalities in [aA] >= [rA] >= [aB] >= [rB] (assuming E associates with A) must be strict. If it is the first or third inequality, then A's number must be strictly greater than B's because of the strictness condition for lying between different argument and result type class numbers. If it is the second inequality then A's number is greater than B's because A's result type class number is greater than B's argument one. A similar argument holds if E associates with B, completing the proof. ∎

Thus Theorem 1 takes care of what the programmer needs to know, and Theorem 2 what the computer needs to know. In the former case we are relying on the programmer's familiarity with the syntax of each of his tokens; in the latter, on the computer's agility with numbers. Theorem 2 establishes that the two methods are equivalent.

Exceptions to the left association rule for the homogeneous case may be made for classes as a whole without upsetting Theorem 2. This can be done by decrementing by 1 the numbers for argument positions to the right of all semantic tokens in that class, that is, the right binding powers. Then the programmer must remember the classes for which the exception holds. Applying this trick to some tokens in a class but not to others gives messy results, and so does not seem worth the extra effort required to remember the affected tokens.

The non-semantically motivated conventions about +, * and ↑ may be implemented by further subdividing the appropriate classes (here the Booleans and Algebraics) into pseudo-classes, e.g. terms < factors < primaries, as in the BNF for Algol 60. Then + is defined over terms, * over factors and ↑ over primaries, with coercions allowed from primaries to factors to terms. To be consistent with Algol, the primaries should be a right associative class.

While these remarks are not essential to the basic approach, they do provide a sense in which operator precedence is more than just an ad hoc solution to the association problem. Even if language designers find these guidelines too restrictive, that would not contradict the fact that operator precedence is in practice a quite satisfactory solution, and we shall use it in the approach below regardless of whether the theoretical justification is reasonable. Nevertheless we would be interested to see a less restrictive set of conventions that offer a degree of modularity comparable with the above while retaining the use of precedence functions. The approach of recomputing the precedence functions for every operator after one change to the grammar is not modular, and does not allow flexible access to individual items in a library of semantic tokens.

An attractive alternative to precedence functions would be to dispose of the ordering and rely purely on the data types and legal coercions to resolve associations. Cases which did not have a unique answer would be referred back to the programmer, which would be acceptable in an on-line environment, but undesirable in batch mode. Our concern about efficiency for interpreters could be dealt with by having the outcome of each association problem marked at its occurrence, to speed things up on subsequent encounters. Pending such developments, operator precedence seems to offer the best overall compromise in terms of modularity, ease of use and memorizing, and efficiency.

The theorems of this section may be interpreted as theorems about BNF grammars, with the non-terminals playing the role of data type classes. However, this is really a drawback of BNF; the non-terminals tempt one to try to say everything with just context-free rules, which brings on the difficulties
mentioned in Section 1. It would seem preferable to refer to the semantic objects directly rather than to their abstraction in an inadequate language.

2.3 Annotation

When a token has more than two arguments, we lose the property of infix notation that the arguments are delimited. This is a nice property to retain, partly for readability, partly because complications arise, e.g., if "-" is to be used as both an infix and a prefix operator. "(" also has this property; as an infix it denotes application, as a prefix, a no-op. Accordingly we require that all arguments be delimited by at least one token: such a grammar Floyd [1963] calls an operator grammar. Provided the number of arguments remains fixed, it should be clear that no violence is done to Theorems 1 and 2 by the extra arguments, since the string of tokens and arguments, excluding the two arguments at each end, plays the same syntactic role as the single semantic token in the two-argument case. We shall call the semantic tokens associated with a delimiter its parents.

An obvious choice of delimiters is commas. However, this is not as valuable as a syntactic token that documents the role of the argument following it. For example, "if a then b else c" is more readable (by a human) than "if a, b, c". Other examples are "print x format f", "for i from s to f by d while c do b", "log x base b", "solve e using m", "x between y and z", etc.

Sometimes arguments may be frequently used constants, e.g., "for i from 1 to n by 1 while true do b". If an argument is uniquely identified by its preceding delimiter, an obvious trick is to permit the omission of that argument and its token to denote that a default value should be used. Thus, we may abbreviate the previous example to "for i to n do b", as in extended Algol 68. Other obvious defaults are "log x" for "log x base 2", "if x then y" for "if x then y else nil", and so on. Note that various arguments now may be involved in associations, depending on which ones are absent.

Another situation is that of the variable length parameter list, e.g., "clear a, b, c, d". Commas are more appropriate here, although again we may need more variety, as in "turn on a on b off g on m off p off t" (in which the unnamed switches or bits are left as they are). All of these examples show that we want to be able to handle quite a variety of situations with default parameters and variable-length parameter lists. No claim is made that the above examples exhaust the possibilities, so our language design should make provision not only for the above, but for the unexpected as well. This is one reason for preferring a procedural embedding of semantics; we can write arbitrary code to find all the arguments when the language designer feels the need to complicate things.

3. Implementation

In the preceding section we argued for lexical semantics, operator precedence and a variety of ways of supplying arguments. In this section we reduce this to practice.

To combine lexical semantics with a procedural approach, we assign to each semantic token a program called its semantic code, which contains almost all the information about the token. To translate or interpret a string of tokens, execute the code of each token in turn from left to right.

Many tokens will expect arguments, which may occur before or after the token. If the argument always comes before, as with unary postfix operators such as "!", we may parse expressions using the following one-state parser.

[Figure: a one-state parser whose single transition performs "left <- run code; advance".]

This parser is initially positioned at the beginning of the input. It runs the code of the current token, stores the result in a variable called 'left', advances the input, and repeats the process. If the input is exhausted, then by default the parser halts and returns the value of 'left'. The variable 'left' may be consulted by the code of the next token, which will use the value of 'left' as either the translation or value of the left-hand argument, depending on whether it is translating or interpreting.

Alternatively, all arguments may appear on the right, as with unary prefix operators such as 'log' and 'sin'. In this case the code of a prefix operator can get its argument by calling the code of the following token. This process will continue recursively until a token is encountered (e.g., a variable or a constant) that does not require an argument. The code of this token returns
the appropriate translation and then so rather than qo. This may appear
does the code of each of the other tokens, ‘A
w steful since we have to repeat the
in the reverse of the order in which they q -ql code on the q -q edge as Well.
were called. H8wever, this chang & allows us to take
advantage of the distinction between
Clearly we want to be able to deal c1 and ql, namely that “left” is unde-
with a mixture of these two types of f!?ned in state q. and defined inq—
tokens, together with tokens having both that is, some expression precedes 4
kinds of arguments (infi operators). token interpreted during the q -q
This is where the problem of association transition but not a token ink LJ rp eted
arises, for which we recommended operator during the q -q transition. We
precedence. We add a state to the parser, will call th~ c~de denoted by a token
%hus : with (without) a Precedinq expression
i,ts left (nuli) d~notatio~ or-led (nud).
‘ c -+ code; advance;
I left + run c or by split-
ting trans- nud
left+run c led
rbp < lbp/ stack instead c
q of variables
(the state = advance;
rbp<,lbp/ the variable run
c+~ed; on the stack) :
Starting in state qo, the parser inter-
prets a token after advancing past that
left+run c rbp<lb /
token, and then enters state ql. If a left
certain condition is satisfied, the parser
returns to qO to process the next token: It now makes sense for a token
otherwise it halts and returns the to denote two different codes. For
value of left by default. example, the nud of ‘-! denotes
unary minus, and its led, binary
We shall also change our strategy minus. We may do the same for ‘/’ (in-
when asking for a right-hand argument, making a recursive call of the parser itself rather than of the code of the next token. In making this call we supply the binding power associated with the desired argument, which we call the rbp (right binding power), whose value remains fixed as this incarnation of the parser runs. The lbp (left binding power) is a property of the current token in the input stream, and in general will change each time state q is entered. The left binding power is the only property of the token not in its semantic code. To return to q we require rbp < lbp. If this test fails, then by default the parser returns the last value of 'left' to whoever called it, which corresponds to 'A' getting 'E' in 'AEB' if 'A' had called the parser that read 'E'. If the test succeeds, the parser enters state q0, in which case 'B' gets 'E' instead.

Because of the possibility of there being several recursive calls of the parser running simultaneously, a stack of return addresses and right binding powers must be used. This stack plays essentially the same role as the stacks described explicitly in other parsing schemes.

We can embellish the parser a little by having the edge leaving q1 return to

teger-to-semaphore conversion as in Algol 68, versus division], '(' (syntactic grouping, as in a×(b×c), versus applications of variables or constants whose value is a function, as in Y(F), (λx.x²)(3), etc.), and 'ε' (the empty string versus the membership relation).

A possibly more important role for nuds and leds is in error detection. If a token only has a nud and is given a left argument, or only has a led and is not given a left argument, or has neither, then non-existent semantic code is invoked, which can be arranged to result in the calling of an error routine.

So far we have assumed that semantic code optionally calls the parser once, and then returns the appropriate translation. One is at liberty to have more elaborate code, however, when the code can read the input (but not backspace it), request and use arbitrary amounts of storage, and carry out arbitrary computations in whatever language is available (for which an ideal choice is the language being defined). These capabilities give the approach the power of a Turing machine, to be used and abused by the language implementer as he sees fit.
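Since the whole algorithm is this one loop, it may help to see it in executable form. The following transliteration into Python is ours, not the paper's; the token classes and their binding powers are invented for illustration:

```python
# A sketch (ours) of the parser described above.  Each token carries
# the paper's three properties: nud(), led(left) and lbp.

class Token:
    lbp = 0                      # a token used only as a nud needs no lbp
    def nud(self):
        raise SyntaxError("token has no null denotation")
    def led(self, left):
        raise SyntaxError("token has no left denotation")

class Literal(Token):
    def __init__(self, value):
        self.value = value
    def nud(self):
        return self.value

class Add(Token):
    lbp = 10
    def led(self, left):
        return left + parse(10)  # rbp equal to lbp: left-associative

class Mul(Token):
    lbp = 20
    def led(self, left):
        return left * parse(20)

class End(Token):
    lbp = 0                      # lbp 0 stops every incarnation

token = None
stream = None

def advance():
    global token
    token = next(stream)

def parse(rbp):
    # One incarnation of the parser: its rbp stays fixed while it runs,
    # and the recursion supplies the stack of return addresses and rbps.
    t = token
    advance()
    left = t.nud()
    while rbp < token.lbp:       # "to return to q we require rbp < lbp"
        t = token
        advance()
        left = t.led(left)
    return left                  # otherwise return 'left' to the caller

def evaluate(tokens):
    global stream
    stream = iter(list(tokens) + [End()])
    advance()
    return parse(0)
```

Evaluating 1+2*3 as evaluate([Literal(1), Add(), Literal(2), Mul(), Literal(3)]) yields 7: Mul's higher lbp keeps the inner incarnation of parse running, which is exactly the rbp < lbp test.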
While one may object to all this power on the ground that obscure language descriptions can then be written, for practical purposes the same objection holds for BNF grammars, of which some quite obscure yet brief examples exist. In fact, the argument really runs the other way; the cooperative language implementer can use the extra power to produce more comprehensible implementations, as we shall see in section 4.

One use for this procedural capability is for the semantic code to read the delimiters and the arguments following them if any. Clearly any delimiter that might come directly after an argument should have a left binding power no greater than the binding power for that argument. For example, the nud of 'if', when encountered in the context 'if a then b else c', may call the parser for 'a', verify that 'then' is present, advance, call the parser for 'b', test if 'else' is present and if so then advance and call the parser a third time. (This resolves the "dangling else" in the usual way.) The nud of '(' will call the parser, and then simply check that ')' is present and advance the input. Delimiters of course may have multiple parents, and even semantic code, such as '|', which might have a nud ('absolute value of' as in '|x|') and two parents, itself and '→' ['a→b|c' is shorthand for 'if a then b else c']. The ease with which mandatory and optional delimiters are dealt with constitutes one of the advantages of the top-down approach over the conventional methods for implementing operator precedence.

The parser's operation may perhaps be better understood graphically. Consider the example 'if 3*a + b↑-3 = 0 then print a + (b-1) else rewind'. We may exhibit the tree recovered by the parser from this expression as in the diagram below. The tokens encountered during one incarnation of the parser are enclosed in a dotted circle, and are connected via down-and-left links, while calls on the parser are connected to their caller by down-and-right links. Delimiters label the links of the expression they precede, if any. The no-op '(' is included, although it is not really a semantic object.

The major difference between the approach described here and the usual operator precedence scheme is that we have modified the Floyd operator precedence parser to work top-down, implementing the stack by means of recursion, a technique known as recursive descent. This would appear to be of no value if it is necessary to implement a stack anyway in order to deal with the recursion. However, the crucial property of recursive descent is that the stack entries are no longer just operators or operands, but the environments of the programs that called the parser recursively. When the programs are very simple, and only call the parser once, this environment gives us no more information than if we had semantic tokens themselves on the stack. When we consider more complicated sorts of constructions such as operators with various default parameters the technique becomes more interesting.

While the above account of the algorithm should be more or less self-explanatory, it may be worth while summarizing the properties of the algorithm a little more precisely.

Definition. An expression is a string S such that there exists a token t and an environment E in which if the parser is started with the input at the beginning of St, it will stop with the input at t, and return the interpretation of S relative to E.

Properties.
(i) When the semantic code of a token t is run, it begins with the input positioned just to the right of that token, and it returns the interpretation of an expression ending just before the final position of the input, and starting either at t if t is a nud, or if t is a led then at the beginning of the expression of which 'left' was the interpretation when the code of t started.
(ii) When the parser returns the interpretation of an expression S relative to environment E, S is immediately followed by a token with lbp ≤ rbp in E.
(iii) The led of a token is called only if it immediately follows an expression whose interpretation the parser has assigned to 'left'.
(iv) The lbp of a token whose led has just been called is greater than the rbp of the current environment.
(v) Every expression is either returned by the parser or given to the following led via 'left'.
(vi) A token used only as a nud does not need a left binding power.
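To make the earlier delimiter discussion concrete, here is one way the nud of 'if' might look. This sketch is ours, in Python rather than the paper's notation, with string tokens and dictionary-valued nud/led/lbp; all names are invented for illustration:

```python
# Our sketch of a delimiter-reading nud: tokens are strings, and
# nud/led/lbp are dictionaries.

lbp = {"then": 0, "else": 0, "end": 0, "+": 10}
nud, led = {}, {}
tokens, token = [], None

def advance():
    global token
    token = tokens.pop(0) if tokens else "end"

def check(t):
    assert token == t, "missing " + t
    advance()

def parse(rbp):
    t = token
    advance()
    left = nud[t]() if t in nud else t   # bare names denote themselves
    while rbp < lbp.get(token, 0):
        t = token
        advance()
        left = led[t](left)
    return left

def if_nud():
    cond = parse(0)                # call the parser for 'a'
    check("then")                  # mandatory delimiter, lbp 0
    then = parse(0)                # call the parser for 'b'
    if token == "else":            # optional delimiter: the nearest
        advance()                  # 'if' claims a dangling 'else'
        return ("if", cond, then, parse(0))
    return ("if", cond, then)

nud["if"] = if_nud
led["+"] = lambda left: ("+", left, parse(10))

def run(ts):
    global tokens
    tokens = list(ts)
    advance()
    return parse(0)
```

Because 'then' and 'else' have lbp 0, no smaller than any argument's binding power, each call parse(0) stops exactly at the next delimiter; and parsing 'if a then if b then x else y' attaches the 'else' to the inner 'if', resolving the dangling else as described.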
These properties are the ones that make the algorithm useful. They are all straightforward to verify. Property (i) says that a semantic token pushes the input pointer off the right end of the expression whose tree it is the root of. Properties (ii), (iv) and (v) together completely account for the two possible fates of the contents of 'left'. Property (iii) guarantees that when the code of a led runs, it has its left hand argument interpreted for it in 'left'. There is no guarantee that a nud is never preceded by an expression; instead, property (v) guards against losing an expression in 'left' by calling a nud which does not know the expression is there. Property (vi) says that binding powers are only relevant when an argument is involved.

4. Examples.

In these examples we shall assume that lbp, nud and led are really the functions lbp(token), nud(token) and led(token). To call the parser and simultaneously establish a value for rbp in the environment of the parser, we write parse(rbp), passing rbp as a parameter. When a led runs, its left hand argument's interpretation is the value of the variable left, which is local to the parser calling that led.

Tokens without an explicit nud are assumed to have for their nud the value of the variable 'nonud', and for their led, 'noled'. Also the variable 'self' will have as value the token whose code is missing when the error occurs.

In the language used for the semantic code, we use a ← b to define the value of expression a to be the value of expression b (not b itself); also, the value of a ← b is that of b. The value of an expression is itself unless it has been defined explicitly by assignment or implicitly by procedure definition; e.g., the value of 3 is 3, of 1+1, 2. We write 'a' to mean the expression a whose value is a itself, as distinct from the value of a; e.g. '1+1' must be evaluated twice to yield 2. A string x is written "x"; this differs from 'x' only in that x is now assumed to be a token, so that the value of "1+1" is the token 1+1, which does not evaluate to 2 in general. To evaluate a, then b, returning the value of b, write a;b. If the value of a is wanted instead, write a & b. (These are for side-effects.) We write (check x) for (if token = x then advance else (print "missing"; print x; halt)). Everything else should be self-explanatory. (Since this language is the one implemented in the second example, it will not hurt to see it defined and used during the first.)

We give specifications, using this approach, of an on-line theorem prover, and a fragment of a small general-purpose programming language. The theorem prover is to demonstrate that this approach is useful for other applications than just programming languages. The translator demonstrates the flexibility of the approach.

For the theorem prover's semantics, we assume that we have the following primitives available:

(i) generate: this returns the bit string 0^k 1^k and also doubles k, assumed 1 initially.

(ii) boole(m,x,y): forms the bitwise boolean combination of strings x and y, where m is a string of four bits that specifies the combination in the obvious way (1000 = and, 1110 = or, 1001 = eqv, etc.). If one string is exhausted before the other, boole continues from the beginning of the exhausted string, cycling until both strings are exhausted simultaneously. Boole is not defined for strings of other than 0's and 1's.

(iii) x isvalid: a predicate that holds only when x is a string of all ones.

We shall use these primitives to write a program which will read a zero-th order proposition, parse it, determine the truth-table column for each subtree in the parse, and print "theorem" or "non-theorem" when "?" is encountered at the end of the proposition, depending on whether the whole tree returns all ones.

The theorem prover is defined by evaluating the following expression.

    nonud ← 'if null led(self) then nud(self) ← generate
             else (print self; print "has no argument")';

    led("?") ← 'if left isvalid then print "theorem"
                else print "non-theorem";
                parse 1';
    lbp("?") ← 1;

    nud("(") ← 'parse 0 & check ")"';
    lbp(")") ← 0;

    led("→") ← 'boole("1101", left, parse 1)';
    lbp("→") ← 2;

    led("∨") ← 'boole("1110", left, parse 3)';
    lbp("∨") ← 3;

    led("∧") ← 'boole("1000", left, parse 4)';
    lbp("∧") ← 4;

    nud("¬") ← 'boole("0101", parse 5, "0")'.

To run the theorem prover, evaluate

    k ← 1; parse 0 .

For example, we might have the following exchange:

    (a→b)∧(b→c)→(a→c)?    theorem
    a?                    non-theorem
    a∨¬a?                 theorem

until we turn the machine off somehow.

The first definition of the program deals with new variables, which is anything without a prior meaning that needs a nud. The first new variable will get the constant 01 for its nud, the next 0011, then 00001111, etc.
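The three primitives are easy to model. The following rendering is ours, in Python; in particular the bit order of m is inferred from the definitions above (we take m to list the connective's value on the argument pairs (1,1), (0,1), (1,0), (0,0), which is what makes "1101" implication and "0101" applied against the constant "0" negation):

```python
from math import lcm

# Our Python model of the theorem prover's primitives; names from the
# paper, bit order of m inferred as stated in the lead-in.

_state = {"k": 1}

def generate():
    # returns 0^k 1^k and doubles k, assumed 1 initially
    k = _state["k"]
    _state["k"] = 2 * k
    return "0" * k + "1" * k

def boole(m, x, y):
    # bitwise combination of x and y; m[i] is the value on the pairs
    # (1,1), (0,1), (1,0), (0,0) respectively.  A string that runs out
    # is cycled from its beginning until both end together, so the
    # result has length lcm(len(x), len(y)).
    assert set(m + x + y) <= {"0", "1"}
    table = {("1", "1"): m[0], ("0", "1"): m[1],
             ("1", "0"): m[2], ("0", "0"): m[3]}
    n = lcm(len(x), len(y))
    return "".join(table[x[i % len(x)], y[i % len(y)]] for i in range(n))

def isvalid(x):
    # holds only when x is a (non-empty) string of all ones
    return x != "" and set(x) == {"1"}
```

With a = "01" and b = "0011", boole("1101", a, b) cycles a to length four and yields "1011", the truth-table column of a→b; and isvalid(boole("1110", a, boole("0101", a, "0"))) confirms that a∨¬a is a theorem.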
Next, "?" is defined to work as a delimiter; it responds to the value of its left argument (the truth-table column for the whole proposition), processes the next proposition by calling the parser, and returns the result to the next level parser. This parser then passes it to the next "?" as its left argument, and the process continues, without building up a stack of "?"'s since "?" is left associative.

Next, "(" is defined to interpret and return an expression, skipping the following ")". The remaining definitions should be self-explanatory. The reader interested in how this approach to theorem provers works is on his own, as we are mainly concerned here with the way in which the definitions specify the syntax and semantics of the language.

The overhead of this approach is almost negligible. The parser spends possibly four machine cycles or so per token (not counting lexical analysis), and the semantics can be seen to do almost nothing; only when the strings get longer than a computer word need we expect any significant time to be spent by the logical operations. For this particular interpreter, this efficiency is irrelevant; however, for a general-purpose interpreter, if we preprocess the program so that the lexical items become pointers into a symbol table, then the efficiency of interpreting the resulting string would be no worse than interpreting a tree using a tree-traversing algorithm as in LISP interpreters.

For the next example we describe a translator from the language used in the above to trees whose format is that of the internal representation of LISP s-expressions, an ideal intermediate language for most compilers.

In this example we focus on the versatility the procedural approach gives us, and the power to improve the descriptive capacity of the metalanguage that we get from bootstrapping. Some of the verbosity of the theorem prover can be done away with in this way.

We present a subset of the definitions of tokens of the language L; all of them are defined in L, although in practice one would begin with a host language H (say the target language, here LISP) and write as many definitions in H as are sufficient to define the rest in L. We do not give the definitions of nilfix, prefix, infix or infixr here; however, they perform assignments to the appropriate objects; e.g. (nilfix a b) performs nud(a) ← 'b', (prefix a b c) sets bp ← b before performing nud(a) ← 'c', (infix a b c) does the same as (prefix a b c) except that the led is defined instead and also lbp(a) ← b is done, and infixr is like infix except that bp ← b-1 replaces bp ← b. The variable bp is available for use for calling the parser when reading c. Also (delim x) does lbp(x) ← 0. The function (a getlist b) parses a list of expressions delimited by a's, parsing each one by calling parse b, and it returns a LISP list of the results.

The object is to translate, for example, a+b into (PLUS a b), a;b into (PROG2 a b), a&b into (PROG2 nil a b), -a into (MINUS a), λx,y,...,z;a into (LAMBDA (x y ... z) a), etc. These target objects are LISP lists, so we will use "[" to build them; [a,b,...,c] translates into (LIST a b ... c).

A fragment of the definition of L:

    nilfix right ["PARSE", bp] $
    infixr ; 1 ["PROG2", left, right] $
    infixr & 1 ["PROG2", nil, left, right] $
    prefix is 1 ["LIST", right, 'left', ["PARSE", bp]] $
    infix $ 1 (print eval left; right) $
    prefix delim 99 ["DELIM", token & advance] $
    prefix ' 0 ["QUOTE", right & check "'"] $
    delim ' $
    prefix [ 0 ("LIST" . "," getlist bp & check "]") $
    delim , $
    delim ] $
    prefix ( 0 (right & check ")") $
    delim ) $
    infix ( 25 (left . if token ≠ ")" then
                ("," getlist 0) & check ")"
                else nil) $
    infix getlist 25 is "GETLIST" $
    prefix if 2 ["COND", [right, check "then"; right]]
               @ (if token = "else" then (advance; [[right]])) $
    delim then $
    delim else $
    nilfix advance ["ADVANCE"] $
    prefix check 25 ["CHECK", right] $
    infix ← 25 ["SETQ", left, parse(1)] $
    prefix λ 0 ["LAMBDA", "," getlist 25 & check ";"; right] $
    prefix + 20 right $
    infix + 20 is "PLUS" $
    prefix - 20 ["MINUS", right] $
    infix - 20 is "DIFFERENCE" $
    infix × 21 is "TIMES" $
    infix ÷ 21 is "QUOTIENT" $
    infixr ↑ 22 is "EXPT" $
    infixr ↓ 22 is "LOG" $
    prefix | 0 ["ABS", right & check "|"] $
    delim | $
    infixr @ 14 is "APPEND" $
    infixr . 14 is "CONS" $
    prefix α 14 ["CAR", right] $
    prefix β 14 ["CDR", right] $
    infix ε 12 is "MEMBER" $
    infix = 10 is "EQUAL" $
    infix ≠ 10 ["NOT", ["EQUAL", left, right]] $
    infix < 10 is "LESSP" $
    infix > 10 is "GREATERP" $

and so on.

The reader may find some of the bootstrapping a little confusing. Let us consider the definitions of 'right' and '+'. The former is equivalent to

    nud(right) ← '["PARSE", bp]' .
The latter is equivalent to

    nud(+) ← 'parse(20)' and
    led(+) ← '["PLUS", left, parse(20)]' ,

because when the nud of right is encountered while reading the definitions of +, it is evaluated by the parser in an environment where bp is 20 (assigned by prefix/infix).

It is worth noting how effectively we made use of the bootstrapping capability in defining "is", which saved a considerable amount of typing. With more work, one could define even more exotic facilities. A useful one would be the ability to describe the argument structure of operators using regular expressions.

The "is" facility is more declarative than imperative in flavor, even though it is a program. This is an instance of the boundary between declaratives and imperatives becoming fuzzy. There do not appear to be any reliable ways of distinguishing the two in general.

5. Conclusions

We argued that BNF-oriented approaches to the writing of translators and interpreters were not enjoying the success one might wish for. We recommended lexical semantics, operator precedence and a flexible approach to dealing with arguments. We presented a trivial parsing algorithm for realizing this approach, and gave examples of an interpretive theorem prover and a translator based on this approach.

It is clear how this approach can be used by translator writers. The modularity of the approach also makes it ideal for implementing extensible languages. The triviality of the parser makes it easy to implement either in software or hardware, and efficient to operate. Attention was paid to some aspects of error detection, and it is clear that type checking and the like, though not exemplified in the above, can be handled in the semantic code. And there is no doubt that the procedural approach will allow us to do anything any other system could do, although conceivably not always as conveniently.

The system has so far found two practical applications. One is as the "front-end" for the SCRATCH-PAD system of Griesmer and Jenks at IBM Yorktown Heights. The implementation was carried out by Fred Blair. The other application is the syntactic component of Project MAC's Mathlab system at MIT, MACSYMA, where this approach added to MACSYMA extension facilities not possible with the previous precedence parser used in MACSYMA. The implementer was Michael

6. Acknowledgments

I am indebted to a large number of people who have discussed some of the ideas in this paper with me. In particular I must thank Michael Fischer for supplying many valuable ideas relevant to the implementation, and for much programming help in defining and implementing CGOL, a pilot language initially used to break in and improve the system, but which we hope to develop further in the future as a desirable programming language for a large number of classes of users.

7. References

Aho, A.V. 1968. Indexed grammars. JACM 15, 4, 647-671.
Chomsky, N. 1959. On certain formal properties of grammars. Information and Control 2, 2, 137-167.
Fischer, M.J. 1968. Macros with Grammar-like Productions. Ph.D. Thesis, Harvard University.
Floyd, R.W. 1963. Syntactic analysis and operator precedence. JACM 10, 3, 316-333.
Knuth, D.E. 1965. On the translation of languages from left to right. Information and Control 8, 6, 607-639.
Leavenworth, B.M. 1966. Syntax macros and extended translation. CACM 9, 11, 790-793.
Lewis, P.M., and R.E. Stearns. 1968. Syntax-directed transduction. JACM 15, 3, 465-488.
McKeeman, W.M., J.J. Horning and D.B. Wortman. 1970. A Compiler Generator. Prentice-Hall, Englewood Cliffs, N.J.
Minsky, M.L. 1970. Form and content in computer science. Turing Lecture, JACM 17, 2, 197-215.
Van Wijngaarden, A., B.J. Mailloux, J.E.L. Peck and C.H.A. Koster. 1969. Report on the Algorithmic Language ALGOL 68. Mathematisch Centrum, Amsterdam, MR 101.
Wirth, N. 1971. The programming language PASCAL. Acta Informatica 1, 35-63.