Turbo Pascal Internals
Do you still remember Borland Turbo Pascal? The most successful Pascal compiler ever? If the
answer is yes, then you probably remember that Turbo Pascal was indeed Turbo: a very fast compiler
that generated quite fast executable code. It featured an integrated development environment (IDE)
where you could edit, run and debug your programs. Originally created as Blue Label Pascal by
Anders Hejlsberg, it was licensed by Borland as Turbo Pascal in the early 1980s. The last version for
DOS was released in 1992 as Turbo Pascal 7.0. Borland also released a similar compiler for Windows
and later a brand new product, Delphi, a Rapid Application Development tool for Windows.
Have you ever wondered what makes Turbo Pascal a fast compiler? If you are interested in
Turbo Pascal internals, here are descriptions of some of the basic units that make up this compiler.
Turbo Pascal's design is not the conventional one taught in compiler design books. It is oriented
toward speed. The parser is tightly connected with the code generator. There is some low-level
intermediate code, but most of the raw code is generated by the parser.
A great example of the data structures and algorithms used in Turbo Pascal is TPC16, a Turbo
Pascal compatible compiler. It generates unit and executable files compatible with the Turbo Pascal
7.0 command line compiler. TPC16 is written in Turbo Pascal. It consists of many units, described
below.
This unit declares all global variables that are used by the compiler. Declarations are grouped into
sections. Some sections declare variables that hold data for the particular module being compiled;
these variables are saved (pushed) while used units are processed.
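The save-and-restore idea can be illustrated with a small Python sketch. It is not taken from TPC16: the names `module_vars`, `saved_stack` and `compile_module`, and the fields being saved, are assumptions made for illustration only, and circular unit references are not handled.

```python
# Hypothetical sketch: names and fields are illustrative, not taken from TPC16.
module_vars = {"name": None, "line": 0}   # variables describing the module being compiled
saved_stack = []                          # saved copies of module_vars


def compile_module(name, uses, order):
    """Compile `name`, compiling every unit it uses first."""
    module_vars["name"] = name
    module_vars["line"] = 1
    for used in uses.get(name, []):
        saved_stack.append(dict(module_vars))   # push the current module's variables
        compile_module(used, uses, order)       # the used unit overwrites module_vars
        module_vars.update(saved_stack.pop())   # pop: restore the interrupted module
    order.append(module_vars["name"])           # this module is now finished


# Main uses Utils, which in turn uses Types.
uses = {"Main": ["Utils"], "Utils": ["Types"], "Types": []}
order = []
compile_module("Main", uses, order)
print(order)
```

Because each used unit is compiled completely before its user continues, the units finish in dependency order, and the stack is empty again when the main module is done.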
This unit declares a symbol table holding all reserved words. It is separated from other units because
the file also includes hash values for the reserved words, which are generated by a separate program.
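A minimal Python sketch of a reserved-word table with precomputed hashes follows. The hash function, the bucket count and the shortened word list are assumptions chosen for illustration; the text does not say which hash Turbo Pascal or TPC16 actually uses.

```python
# Hypothetical sketch: the hash function and bucket count are assumptions.
RESERVED_WORDS = [
    "and", "array", "begin", "case", "const", "div", "do", "downto", "else",
    "end", "for", "function", "if", "procedure", "program", "repeat", "then",
    "to", "until", "var", "while",
]
BUCKETS = 16


def word_hash(word):
    """Case-insensitive hash, since Pascal identifiers ignore case."""
    h = 0
    for ch in word.upper():
        h = (h * 2 + ord(ch)) & 0xFFFF
    return h % BUCKETS


def generate_table(words):
    """Plays the role of the separate generator program: bucket words by hash."""
    table = [[] for _ in range(BUCKETS)]
    for w in words:
        table[word_hash(w)].append(w.upper())
    return table


RESERVED_TABLE = generate_table(RESERVED_WORDS)


def is_reserved(identifier):
    """Only the one bucket selected by the hash has to be searched."""
    return identifier.upper() in RESERVED_TABLE[word_hash(identifier)]
```

With the table built ahead of time, the compiler does not have to compute it at startup, and each lookup compares the identifier against only a handful of words.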
This unit declares the data structures that describe basic Pascal types (integer, real, extended,
Boolean, char, string, file, set, array, object, etc.). The unit also contains functions for testing
type compatibility, storing types in the symbol table, and processing types.
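As a rough illustration of a type descriptor and a compatibility test, here is a heavily simplified Python sketch. The names `BaseType`, `TypeDesc` and `assignment_compatible` are hypothetical, only five types are modeled, and only two of Pascal's assignment rules are covered (an integer may be assigned to a real, and a char to a string).

```python
from dataclasses import dataclass
from enum import Enum, auto


class BaseType(Enum):
    INTEGER = auto()
    REAL = auto()
    BOOLEAN = auto()
    CHAR = auto()
    STRING = auto()


@dataclass(frozen=True)
class TypeDesc:
    """Hypothetical type descriptor: a base kind plus its storage size."""
    base: BaseType
    size: int  # size in bytes


# Sizes follow the Turbo Pascal defaults for these predefined types.
INTEGER = TypeDesc(BaseType.INTEGER, 2)
REAL = TypeDesc(BaseType.REAL, 6)
BOOLEAN = TypeDesc(BaseType.BOOLEAN, 1)
CHAR = TypeDesc(BaseType.CHAR, 1)
STRING = TypeDesc(BaseType.STRING, 256)


def assignment_compatible(target, source):
    """Return True if a value of type `source` may be assigned to `target`."""
    if target.base is source.base:
        return True
    if target.base is BaseType.REAL and source.base is BaseType.INTEGER:
        return True   # integers widen to reals
    if target.base is BaseType.STRING and source.base is BaseType.CHAR:
        return True   # a char is accepted where a string is expected
    return False
```

A real compiler has many more cases (subranges, sets, pointers, objects), but the shape is the same: compatibility is decided by comparing descriptors, not source text.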
Routines for reading and writing files are located in the I/O Utilities unit. This unit also contains
procedures for error handling and error reporting.
Symbol Table Management
Symbol table management is one of the most important internal operations in every compiler. This
unit contains procedures to create a symbol table, insert identifiers into a symbol table, search a
symbol table for an identifier, search a unit, procedure or record scope for a particular identifier,
and perform all other operations on symbol tables.
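The operations listed above can be sketched in Python as a chain of scopes. This is an illustrative model only: the `Scope` class and its methods are hypothetical, and a Python dictionary stands in for whatever table layout TPC16 actually uses.

```python
class Scope:
    """Hypothetical symbol table for one scope (unit, procedure or record)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # enclosing scope, or None for the outermost
        self.symbols = {}

    def insert(self, identifier, info):
        key = identifier.upper()  # Pascal identifiers are case-insensitive
        if key in self.symbols:
            raise KeyError(f"Duplicate identifier: {identifier}")
        self.symbols[key] = info

    def find_local(self, identifier):
        """Search this scope only."""
        return self.symbols.get(identifier.upper())

    def find(self, identifier):
        """Search this scope, then each enclosing scope in turn."""
        scope = self
        while scope is not None:
            info = scope.find_local(identifier)
            if info is not None:
                return info
            scope = scope.parent
        return None


unit_scope = Scope("unit")
unit_scope.insert("Count", "global variable")
unit_scope.insert("Total", "global variable")
proc_scope = Scope("procedure", parent=unit_scope)
proc_scope.insert("count", "local variable")   # hides the global Count
```

Here `proc_scope.find("COUNT")` yields the local declaration, while `proc_scope.find("Total")` falls through to the unit scope.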
This unit contains the procedures that scan the source file and generate a stream of tokens. A token
is the smallest element of the source that is processed by the parser. The scanner processes source
files, skips comments, processes compiler directives, and generates tokens.
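A toy scanner in Python shows the three jobs described above. It is a sketch under simplifying assumptions: the function name `scan` and the token kinds are hypothetical, only identifiers, unsigned integers and symbols are recognized, and directives are merely recorded rather than acted on.

```python
def scan(source):
    """Return (tokens, directives) for a small Pascal-like source string."""
    tokens, directives = [], []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch == "{":                          # { comment } or {$directive}
            end = source.find("}", i)
            if end < 0:
                raise SyntaxError("Unterminated comment")
            body = source[i + 1:end]
            if body.startswith("$"):
                directives.append(body[1:].strip())
            i = end + 1
        elif source.startswith("(*", i):         # (* comment *)
            end = source.find("*)", i + 2)
            if end < 0:
                raise SyntaxError("Unterminated comment")
            i = end + 2
        elif ch.isalpha() or ch == "_":          # identifier or reserved word
            start = i
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("ident", source[start:i].upper()))
        elif ch.isdigit():                       # unsigned integer
            start = i
            while i < n and source[i].isdigit():
                i += 1
            tokens.append(("number", source[start:i]))
        elif source.startswith(":=", i):         # two-character symbol
            tokens.append(("symbol", ":="))
            i += 2
        else:                                    # any other single character
            tokens.append(("symbol", ch))
            i += 1
    return tokens, directives


tokens, directives = scan("x := 10; {comment} (* more *) {$R+} y")
```

For this input the comments disappear, `R+` is recorded as a directive, and the parser would see only the five tokens for `X`, `:=`, `10`, `;` and `Y`.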
This is the brain of the compiler. It reads tokens, checks language syntax and generates intermediate
code. One of the reasons for fast compilation lies in the parser. Since it generates code that is
close to executable code, the next steps of compilation are pretty fast.
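The idea of a parser that emits code while it parses, without first building a syntax tree, can be sketched in Python with a tiny recursive-descent expression compiler. Everything here is an assumption for illustration: the function name `compile_expression`, the grammar, and the stack-machine mnemonics (`PUSH`, `LOAD`, `ADD`, ...) are hypothetical and are not the intermediate code used by Turbo Pascal. Characters the tokenizer does not recognize are silently ignored.

```python
import re


def compile_expression(text):
    """Parse an arithmetic expression and emit code as each construct is recognized."""
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)
    pos = 0
    code = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("Unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expression()
            if advance() != ")":
                raise SyntaxError("')' expected")
        elif tok.isdigit():
            code.append(f"PUSH {tok}")       # constant: emit immediately
        elif tok[0].isalpha() or tok[0] == "_":
            code.append(f"LOAD {tok}")       # variable: emit immediately
        else:
            raise SyntaxError(f"Unexpected token: {tok}")

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append("MUL" if op == "*" else "DIV")

    def expression():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append("ADD" if op == "+" else "SUB")

    expression()
    if peek() is not None:
        raise SyntaxError(f"Unexpected token: {peek()}")
    return code


print(compile_expression("a + 2 * (b - 3)"))
# ['LOAD a', 'PUSH 2', 'LOAD b', 'PUSH 3', 'SUB', 'MUL', 'ADD']
```

The instruction for an operator is appended as soon as its second operand has been parsed, so the source is read once and no tree has to be built or walked afterwards.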
This unit processes Pascal statements (for, while, repeat, etc.), begin-end blocks and
This unit processes system functions like abs, arctan, chr, int, odd, ofs, etc. and generates code for
This unit processes system procedures like exit, fillchar, writeln, inc, val, etc. and generates code for
This is the biggest unit and contains functions and procedures to process expressions. An expression
is anything that needs to be calculated, from a simple identifier to a complex expression in multiple
parentheses. There are many cases that need to be tested and processed. Procedures in this unit
also generate the majority of the code.
This unit contains procedures to process various calculations. A calculation is an operation on one
or two expressions (addition, subtraction, multiplication, etc.).
This is the inline assembler that processes instructions in an asm-end block.
This unit declares data structures that are needed by the inline assembler.
This unit processes intermediate code and generates executable code and the reference data needed by
the linker. The code generator also performs code optimizations.
This unit imports OMF object files and processes OMF records.
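For orientation, an OMF object file is a sequence of records, each made of a one-byte record type, a two-byte little-endian length, the record contents, and a trailing checksum byte that is counted in the length. The Python sketch below splits a byte string into such records. The function name `read_omf_records` is hypothetical, checksums are not verified, and the sample data is hand-built for the example rather than produced by a real assembler.

```python
import struct

THEADR = 0x80   # translator header record (module name)
MODEND = 0x8A   # module end record


def read_omf_records(data):
    """Split raw OMF bytes into (record_type, contents) pairs."""
    records = []
    offset = 0
    while offset < len(data):
        if offset + 3 > len(data):
            raise ValueError("Truncated record header")
        rec_type, length = struct.unpack_from("<BH", data, offset)
        body = data[offset + 3:offset + 3 + length]
        if length == 0 or len(body) != length:
            raise ValueError("Truncated record")
        records.append((rec_type, body[:-1]))   # drop the checksum byte
        offset += 3 + length
    return records


# Hand-built sample: a THEADR naming the module "A.PAS", then a MODEND.
name = b"\x05A.PAS"                              # length-prefixed string
sample = (
    bytes([THEADR]) + struct.pack("<H", len(name) + 1) + name + b"\x00"
    + bytes([MODEND]) + struct.pack("<H", 2) + b"\x00" + b"\x00"
)
records = read_omf_records(sample)
```

A real importer would go on to interpret each record type (segment definitions, public names, code and fixup data); this sketch stops at the record boundaries.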
This unit generates the final executable code and creates output files. Before the code is generated,
each referenced item is recursively processed and marked. Unmarked items are never executed and
are therefore not included in the final executable code.
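The marking step can be sketched in Python as a recursive walk over a reference graph. The function name `mark_used`, the dictionary representation and the item names are assumptions for illustration; the text does not describe how TPC16 stores its reference data.

```python
def mark_used(entry, references):
    """Return the set of items reachable from `entry` by following references."""
    marked = set()

    def mark(item):
        if item in marked:          # already processed; also stops cycles
            return
        marked.add(item)
        for target in references.get(item, ()):
            mark(target)            # recursively process everything it refers to

    mark(entry)
    return marked


# Hypothetical program: each item maps to the items it references.
references = {
    "Main": ["PrintLine", "Add"],
    "PrintLine": ["WriteStr"],
    "Add": [],
    "WriteStr": [],
    "Unused": ["WriteStr"],         # refers to a used item but is never referenced
    "AlsoUnused": ["Unused"],
}
used = mark_used("Main", references)
linked = [item for item in references if item in used]
print(linked)
# ['Main', 'PrintLine', 'Add', 'WriteStr']
```

`Unused` and `AlsoUnused` are left out even though they reference each other or a used routine, because nothing reachable from the entry point refers to them.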