COMPILER CONSTRUCTION
WEEK 4: INTRODUCTION TO COMPILER & INTERPRETER

               An Overview
 The main purpose of a compiler or an interpreter is to
  translate a program written in a high-level
  programming language, such as Pascal, into a form that a
  computer can understand and execute.
 In the context of this translation, the high-level
  language is called the source language.
 A compiler translates a program written in the source
  language into a low-level object language, which can
  be the machine language of a particular computer.
 The program that we write in the source language is
  called the source program, which we edit in one or
  more source files.
 The compiler translates each source file into an object
  file.
 If the object files contain assembly language, we must
  next run an assembler (another type of program
  translator) to convert them into machine language.
 We then run a utility program called a linker to combine
  the object files (along with any needed runtime
  library routines) into the object program.
 Once created, an object program is a separate program
  in its own right.
 We can load it into the computer’s memory and then
  execute it.
   For example, if we are programming in C or C++, we
    would edit the source files and save them using
    names ending in .c (for C) or .cpp (for C++).
   The C compiler from Borland or Microsoft then
    translates each source file into a machine
    language object file, which it saves using a name
    ending in .obj
   The linker combines the separate object files into
    the final object program, which is saved using a
    name ending in .exe or .com (what is the difference?).
   Then we can load and run the object program.
   The following figure summarizes the compiler
    translation process:
   An interpreter, on the other hand, does not produce an
    object program.
   It may translate the source program into an
    internal intermediate code that it can execute
    more efficiently, or it may simply execute the
    source program’s statements directly.
   The net result is that an interpreter translates a
    program into the actions specified by the
    program.
   Interpreters are often used for languages such as BASIC
    and LISP.
   The following figure summarizes the interpreter
    translation process:
   Note: a compiler may also first translate the source program into
    intermediate code, and then translate the intermediate code into object
    language.
    Compiler Vs. Interpreters:
    What an interpreter does with a source program is
    very similar to what we would do with the program if
    we had to figure out what it does without using a
    computer.
   For example, suppose we are handed a C program.
   First we look it over to check for syntax errors.
   We then locate the start of the main program, and
    from there we execute the statements one at a time
    by hand.
   We might use a pencil and scratch pad to keep track
    of the values of the variables (manual debug).
   If we encounter the statement i=j+k; in the program,
    we would look up the current values of j and k on
    our scratch pad, add the values, and write down
    the sum as the new value for i.
   An interpreter essentially does what we just did.
   It is itself a program that runs on the computer.
   A BASIC interpreter reads in a BASIC source
    program, looks it over for syntax errors, and
    executes the source statements one at a time.
   Using some of its own variables as a scratch pad,
    the interpreter keeps track of the values of the
    source program’s variables.
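The scratch-pad idea can be sketched in a few lines of Python. This is only a toy illustration, not a real BASIC interpreter: the function name `interpret` and the dictionary `scratch_pad` are hypothetical, and the sketch assumes that statements are assignments separated by `;` whose right-hand sides use only `+`, non-negative integer constants, and variable names.

```python
def interpret(source, variables=None):
    """Execute simple assignment statements one at a time.

    `scratch_pad` plays the role of the pencil-and-paper scratch pad:
    it maps each source-program variable name to its current value.
    """
    scratch_pad = dict(variables or {})
    for statement in source.split(";"):
        statement = statement.strip()
        if not statement:
            continue                      # skip empty text after the last ';'
        target, expression = (part.strip() for part in statement.split("="))
        total = 0
        for operand in expression.split("+"):
            operand = operand.strip()
            # An operand is either an integer constant or a variable name.
            total += int(operand) if operand.isdigit() else scratch_pad[operand]
        scratch_pad[target] = total       # write down the new value
    return scratch_pad


pad = interpret("j = 2; k = 3; i = j + k;")
print(pad["i"])  # prints 5
```

Note that no object program is produced at any point: the source text is read and acted upon directly.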
   A compiler, on the other hand, is also a program that
    runs on the computer.
   A BASIC compiler reads in a BASIC source
    program and checks it for syntax errors.
   But then, instead of executing the source
    program, it translates the source program into
    the object program.
   The compiler generates a machine-language
    object program, so its output is far more cryptic than
    the source program.
   There is an ongoing discussion about which one is better:
    a compiler or an interpreter?
   To execute a source program with an interpreter,
    we simply feed the source program into the
    interpreter, and it takes over to check and
    execute the program.
Advantages & Disadvantages of
  Compilers and Interpreters:
   A compiler, however, checks the source program and
    then produces an object program.
   After running the compiler, we may need to run the
    linker, and then we have to load the object program
    into memory in order to execute it.
   So, an interpreter definitely has advantages over a
    compiler when it comes to the effort required to
    execute a source program.
   An interpreter can be more versatile than a compiler.
   Remember that interpreters are themselves programs, and like
    any other program, they can be made to run on different
    computers.
     One can write a Pascal or C interpreter that runs on both
      an IBM PC and an Apple Macintosh, so that it will execute
      Pascal or C source programs on either computer.
     A compiler, however, generates object programs for a
      particular computer.
     Therefore, even if we took a Pascal or C compiler
      originally written for the PC and made it run on the Mac, it
      would still generate object programs for the PC, not for the
      Mac.
     To make the compiler generate object programs for the
      Mac, we would have to rewrite substantial portions of the
      compiler.
    What happens if the source program contains a “logical
     error” that doesn’t show up until runtime, such as an
     attempt to divide by a variable whose value is zero?
    Since an interpreter is in control when it is executing the
     source program, it can stop and tell us the line number of
     the offending statement and the name of the variable.
    It can even prompt us for some corrective action (like
     changing the value of the variable) before resuming
     execution.
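As a rough sketch of why this is possible, the toy Python interpreter below still has the source line number and the variable name in hand at the moment the error occurs. The names `run` and `InterpreterError` are hypothetical, and the sketch assumes one statement of the form `x = a / b` per source line.

```python
class InterpreterError(Exception):
    """Runtime error raised by the toy interpreter (hypothetical name)."""


def run(lines, variables):
    """Execute statements of the form `x = a / b`, one per source line."""
    for line_number, line in enumerate(lines, start=1):
        target, expression = (part.strip() for part in line.split("="))
        dividend, divisor = (name.strip() for name in expression.split("/"))
        if variables[divisor] == 0:
            # The interpreter is still in control, so it knows exactly
            # which source line and which variable caused the problem.
            raise InterpreterError(
                f"line {line_number}: division by zero, variable '{divisor}' is 0"
            )
        variables[target] = variables[dividend] / variables[divisor]
    return variables


program = ["x = a / b", "y = x / c"]
try:
    run(program, {"a": 6, "b": 3, "c": 0})
except InterpreterError as error:
    print(error)  # line 2: division by zero, variable 'c' is 0
```

A fuller interpreter could, at the point where the error is raised, prompt for a new value of the variable and resume instead of stopping.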
    The object program generated by a compiler, on the other
     hand, usually runs by itself.
    Information from the source program, such as line numbers
     and variable names, might not be present in the object program.
    When a runtime error occurs, the program may simply abort and
     perhaps print a message containing the address of the bad
     instruction.
    Then it’s up to us to figure out which source statement that
     address corresponds to, and which variable was zero.
    When it comes to debugging, an interpreter is generally the
     way to go.
    However, many modern program development environments
     now give compilers debugging capabilities that are almost as
     good as those of interpreters (a hybrid approach, e.g. VB).
    We compile the program and run it under the control of the
     environment.
    If a runtime error occurs, we are given the information and
     control we need to correct the error.
    Then we can resume the execution of the program, or compile
     and run it again.
    Such compilers usually generate extra information or
     instructions in the object program to keep the environment
     informed of the current state of the program’s execution.
    This often causes the object program to be less efficient than
     it otherwise could be.
    Most people turn off the debugging features when they are about
     to generate the final “production” version of their program.
    Suppose we have successfully debugged our program, and now
     our most important concern is how fast it executes.
    Remember that an interpreter executes the statements of the
     source program pretty much the way we would by hand.
   Each time it executes a statement, it looks it over to figure out
    what operations the statement says to do.
   With a compiler, the computer executes a machine-language
    program, generated either directly by the compiler or indirectly
    with an assembler.
   Since a computer executes a machine language program at top
    speed, such a program can run 10 to 100 times faster than the
    interpreted source program.
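The difference in the amount of work can be illustrated with a small Python sketch. It is an analogy only: `translate_statement` returns a Python function rather than machine language, and both function names are hypothetical. The point it shows is that the interpreter-style routine re-analyzes the statement text on every execution, while the compiler-style routine analyzes it once.

```python
def interpret_statement(statement, variables):
    """Analyze and execute `target = a + b` every time it is called."""
    target, expression = (part.strip() for part in statement.split("="))
    left, right = (name.strip() for name in expression.split("+"))
    variables[target] = variables[left] + variables[right]


def translate_statement(statement):
    """Analyze `target = a + b` once and return a function that runs it."""
    target, expression = (part.strip() for part in statement.split("="))
    left, right = (name.strip() for name in expression.split("+"))

    def run(variables):
        # No string handling here: the analysis above is already done.
        variables[target] = variables[left] + variables[right]

    return run


interpreted = {"i": 0, "j": 1, "k": 2}
for _ in range(1000):
    interpret_statement("i = j + k", interpreted)   # re-analyzed 1000 times

translated = {"i": 0, "j": 1, "k": 2}
run = translate_statement("i = j + k")              # analyzed once
for _ in range(1000):
    run(translated)

print(interpreted == translated)  # prints True
```

Both versions compute the same result; only the amount of repeated analysis differs.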
    A compiler is definitely the winner when it comes to speed.
    This is certainly true in the case of an optimizing compiler that
     knows how to generate especially efficient code.
   So we see that compilers and interpreters have advantages and
    disadvantages.
   It depends on what aspect of program development and execution
     we consider.
   A compromise may be to have both a compiler and an interpreter
    for the same source language.
   Then we have the best of both worlds, easy development and fast
    execution.
             Model of a Compiler:
   A compiler can be described in a modular fashion.
   The task of constructing a compiler for a particular source language is
    complex.
   The complexity and nature of the compilation process depend, to a
    large extent, on the source language.
   Compiler complexity can often be reduced if a programming language
    designer takes various design factors into consideration.

   Since we are dealing with high-level source languages such as Pascal
    and C, a basic model of a compiler can be formulated.
   Such a model is given in the following figure.

   Although this model may vary for the compilation of different high-level
    languages, it is nevertheless representative of the compilation process.

   A compiler must perform two major tasks: the analysis of a source
    program and the synthesis of its corresponding object program.
   The analysis task deals with the decomposition of the source program
    into its basic parts.
   Using these parts, the synthesis task builds their equivalent object
    program modules.
   The performance of these tasks is realized more easily by building
    and maintaining several tables.
   A source program is a string of symbols, each of which is generally a
    letter, a digit, or a special symbol such as +, -, (, or ).
   A source program contains elementary language constructs such as
    variable names, labels, constants, keywords, and operators.
   It is therefore desirable for the compiler to identify these various types
    as classes.
   These language constructs are given in the definition of the language.
   The source program is input to a lexical analyzer or scanner, whose
    purpose is to separate the incoming text into pieces or tokens such as
    constants, variable names, keywords (do, if, for, etc.), and operators
    (+, -, etc.).
   In essence, the lexical analyzer performs low-level syntax analysis.
   For efficiency reasons, each class of tokens is given a unique internal
    representation number.
   For example, a variable name may be given a representation number of 1,
    a constant a value of 2, a label the number 3, the addition operator (+) a
    value of 4 etc.
   For example, the C-like statement (real C would require parentheses
    around the condition)
       TEST: if a > b x = y;
    would be translated by the lexical analyzer into the following sequence of
    tokens:
    Token    Representation number
    TEST     3
    :        26
    if       20
    a        1
    >        15
    b        1
    x        1
    =        10
    y        1
    ;        27
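The token sequence above can be reproduced with a small scanner sketch in Python. The representation numbers are the ones listed above; everything else (the names `scan`, `KEYWORDS`, `SYMBOLS`, and the rule that a name followed by `:` is a label) is an assumption made for this illustration.

```python
import re

# Representation numbers taken from the token listing above.
VARIABLE, LABEL = 1, 3
KEYWORDS = {"if": 20}
SYMBOLS = {"=": 10, ">": 15, ":": 26, ";": 27}

# Names or single-character symbols; blanks (and anything else) are skipped.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|[=>:;]")


def scan(source):
    """Return (lexeme, representation number) pairs, ignoring blanks."""
    lexemes = TOKEN_PATTERN.findall(source)
    tokens = []
    for position, lexeme in enumerate(lexemes):
        if lexeme in SYMBOLS:
            number = SYMBOLS[lexeme]
        elif lexeme in KEYWORDS:
            number = KEYWORDS[lexeme]
        elif position + 1 < len(lexemes) and lexemes[position + 1] == ":":
            number = LABEL        # a name followed by ':' is a label
        else:
            number = VARIABLE
        tokens.append((lexeme, number))
    return tokens


for lexeme, number in scan("TEST: if a > b x=y ;"):
    print(f"{lexeme:<6}{number}")
```

A production scanner would also report characters it does not recognize; this sketch silently skips them.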
   Note that in scanning the source statement and generating the
    representation number of each token we have ignored spaces (or blanks)
    in the statement.
   The lexical analyzer must, in general, process blanks and comments.
   Certain programming languages allow the continuation of statements
    over multiple lines.
   Lexical analyzers must then handle the input processing of such
    multiple-line statements.
   Also, some scanners place constants, labels, and variable names in
    appropriate tables.
   A table entry for a variable, for example, may contain its name, type
    (e.g. int or float), object program address, value, and the line in
    which it is declared.
   The lexical analyzer supplies tokens to the syntax analyzer.
   These tokens may take the form of a pair of items.
   The first item gives the address or location of the token in some
    symbol table.
   The second item is the representation number of the token.
   Such an approach offers a distinct advantage to the syntax analyzer;
    namely, all tokens are represented by fixed-length information: an
    address (or pointer) and an integer.
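A minimal Python sketch of this pairing follows. It assumes a single symbol table held as a list, so that the "address" is simply a list index; the names `make_tokens`, `symbol_table`, and `index_of` are hypothetical, and only variables and integer constants are handled.

```python
VARIABLE, CONSTANT = 1, 2     # representation numbers used in the text


def make_tokens(lexemes):
    """Return (symbol_table, tokens); each token is (table index, number)."""
    symbol_table = []          # one entry per distinct name or constant
    index_of = {}              # lexeme -> position in symbol_table
    tokens = []
    for lexeme in lexemes:
        if lexeme not in index_of:
            index_of[lexeme] = len(symbol_table)
            symbol_table.append({"name": lexeme})
        number = CONSTANT if lexeme.isdigit() else VARIABLE
        tokens.append((index_of[lexeme], number))
    return symbol_table, tokens


table, tokens = make_tokens(["x", "y", "x", "42"])
print(tokens)  # [(0, 1), (1, 1), (0, 1), (2, 2)]
```

Every token is now a pair of two integers regardless of how long the original name was, and both occurrences of `x` point at the same table entry.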
   The syntax analyzer is much more complex than the lexical analyzer.
   Its function is to take the source program (in the form of tokens) from
    the lexical analyzer and determine the manner in which it is to be
    decomposed into its constituent parts.
   In syntax analysis we are concerned with grouping tokens into larger
    syntactic classes such as expressions, statements, and procedures.
   The syntax analyzer (or parser) outputs a syntax tree (or its equivalent)
    whose leaves are the tokens and in which every non-leaf node represents a
    syntactic class type.
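As an illustration, here is a small recursive-descent parser in Python that groups tokens into a tree. It is a sketch under several assumptions: the grammar covers only names, `+`, `*`, and parentheses; interior nodes are labeled with the operator rather than with a syntactic class name; and the function name `parse` is hypothetical.

```python
import re


def parse(source):
    """Parse an expression with +, *, names and parentheses into a tree.

    A leaf is a token (a name); an interior node is a tuple
    (operator, left subtree, right subtree).
    """
    tokens = re.findall(r"[A-Za-z_]\w*|[+*()]", source)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = peek()
        position += 1
        return token

    def expression():              # expression -> term { '+' term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                    # term -> factor { '*' factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                  # factor -> name | '(' expression ')'
        token = advance()
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"unexpected token {token!r}")
        return token

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("(A + B) * (C + D)"))
# ('*', ('+', 'A', 'B'), ('+', 'C', 'D'))
```

Each nested function corresponds to one syntactic class (expression, term, factor), which is how the parser decides how the token stream decomposes.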
   The syntax tree produced by the syntax analyzer is used by the semantic
    analyzer.
   The function of the semantic analyzer is to determine the meaning (or
    semantics) of the source program.
   The semantic analyzer's actions may involve the generation of an
    intermediate form of source code.
   For the expression (A + B) * (C + D), the intermediate source code might
    be the following set of quadruples:
      (+, A, B, T1)
      (+, C, D, T2)
      (*, T1, T2, T3)
   Where (+, A, B, T1) is interpreted to mean “add A and B and place the
    result in temporary T1” and so on.
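One way to obtain these quadruples is a post-order walk over the syntax tree, sketched below in Python. The tree representation (nested tuples of operator, left, right) and the function name `to_quadruples` are assumptions made for this example.

```python
def to_quadruples(tree):
    """Flatten a tree of (operator, left, right) tuples into quadruples."""
    quadruples = []

    def visit(node):
        if isinstance(node, str):
            return node                        # a leaf is already a name
        operator, left, right = node
        left_name = visit(left)                # operands first (post-order)
        right_name = visit(right)
        temporary = f"T{len(quadruples) + 1}"  # next unused temporary
        quadruples.append((operator, left_name, right_name, temporary))
        return temporary

    visit(tree)
    return quadruples


tree = ("*", ("+", "A", "B"), ("+", "C", "D"))   # (A + B) * (C + D)
for quadruple in to_quadruples(tree):
    print(quadruple)
# ('+', 'A', 'B', 'T1')
# ('+', 'C', 'D', 'T2')
# ('*', 'T1', 'T2', 'T3')
```

Because both operands are visited before their operator, each temporary is defined before it is used.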
   An infix expression may be converted to an intermediate form
    called Polish notation (left as an assignment).
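As a starting point, the sketch below converts an expression that has already been parsed into a tree into Polish (prefix) and reverse Polish (postfix) notation. It is written in Python, assumes the tree is given as nested tuples of operator, left, right, and does not parse infix text itself; the function names are hypothetical.

```python
def to_prefix(node):
    """Polish (prefix) notation: operator first, then both operands."""
    if isinstance(node, str):
        return node
    operator, left, right = node
    return f"{operator} {to_prefix(left)} {to_prefix(right)}"


def to_postfix(node):
    """Reverse Polish (postfix) notation: both operands, then operator."""
    if isinstance(node, str):
        return node
    operator, left, right = node
    return f"{to_postfix(left)} {to_postfix(right)} {operator}"


tree = ("*", ("+", "A", "B"), ("+", "C", "D"))   # (A + B) * (C + D)
print(to_prefix(tree))   # * + A B + C D
print(to_postfix(tree))  # A B + C D + *
```

Neither form needs parentheses, because the position of each operator fixes the order of evaluation.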
   The output of the semantic analyzer is passed on to the code
    generator.
   At this point the intermediate form of the source language
    program is usually translated to either assembly language or
    machine language.
   For the expression above, the generated assembly code might look like:
       LDA A
       ADD B
       STO T1
       LDA C
       ADD D
       STO T2
       LDA T1
       MUL T2
       STO T3
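This translation is mechanical enough to sketch in Python. The sketch assumes a single-accumulator machine with the mnemonics used above (LDA loads the accumulator, ADD and MUL combine it with an operand, STO stores it); the names `generate` and `OPCODES` are hypothetical.

```python
OPCODES = {"+": "ADD", "*": "MUL"}


def generate(quadruples):
    """Emit load/operate/store instructions for each quadruple."""
    instructions = []
    for operator, left, right, result in quadruples:
        instructions.append(f"LDA {left}")                   # accumulator := left
        instructions.append(f"{OPCODES[operator]} {right}")  # combine with right
        instructions.append(f"STO {result}")                 # result := accumulator
    return instructions


quadruples = [("+", "A", "B", "T1"), ("+", "C", "D", "T2"), ("*", "T1", "T2", "T3")]
print("\n".join(generate(quadruples)))
```

Each quadruple expands into exactly three instructions, which gives the nine-instruction sequence shown above.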
   The output of the code generator is passed on to a code optimizer.
   This process is present in more sophisticated compilers.
   Its purpose is to produce a more efficient object program.
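One very simple kind of optimization, shown here only as an illustration, is a peephole pass that removes a load of a value that was stored by the immediately preceding instruction. The Python sketch assumes the single-accumulator mnemonics used earlier, that STO leaves the accumulator unchanged, and that no jump lands between the two instructions; the function name `remove_redundant_loads` is hypothetical.

```python
def remove_redundant_loads(instructions):
    """Drop `LDA X` when it directly follows `STO X`."""
    optimized = []
    for instruction in instructions:
        previous = optimized[-1] if optimized else ""
        if instruction.startswith("LDA ") and previous == "STO " + instruction[4:]:
            continue      # the accumulator already holds the value of X
        optimized.append(instruction)
    return optimized


# Naive code for A + B + C, using the mnemonics from the example above.
naive = ["LDA A", "ADD B", "STO T1", "LDA T1", "ADD C", "STO T2"]
print(remove_redundant_loads(naive))
# ['LDA A', 'ADD B', 'STO T1', 'ADD C', 'STO T2']
```

The optimized program computes the same result with one instruction fewer; real optimizers apply many such transformations.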

				