					COMPILER




Prepared By :
What is a Compiler?
• A compiler is a program that reads
  a program written in one language
  (the source language) and translates
  it into an equivalent program in
  another language (the target
  language).
• As an important part of this
  translation process, the compiler
  reports to its user the presence of
  errors in the source program.
Compiling Process

SOURCE PROGRAM  -->  COMPILER  -->  TARGET PROGRAM
                        |
                        v
                  ERROR MESSAGES
Contents of the topic

• Overview and history of
  compiler
• Comparison with interpreter
• Parts of a compiler
• Phases of a compiler
• Compilation of a sample
  program
• Compiler construction tools
Overview and history of
compiler
• One of the first compilers was produced by
  IBM in the 1950s for the FORTRAN
  language.
• It took about 18 staff-years of effort to
  build the first FORTRAN compiler.
• Today we can build a compiler
  in a few months.
• Designing an efficient and
  reliable compiler is still
  challenging.
Comparison with
Interpreter

    program --+
              +-->  Interpreter  -->  output
    data -----+

    program  -->  Compiler  -->  binary  -->  output
                                   ^
                                   |
                                  data
• In the interpreter model, the data and the source program
  are both input to the interpreter. Instead of producing an
  object module as in the compilation model, the
  interpreter produces the results by performing the
  operations of the source program on its data.
• An interpreter is generally less efficient in execution
  than compiled code.
• But it can handle language features that are hard to
  resolve at compile time; e.g. languages like APL are
  normally interpreted.
• Interpreters can be portable, as they don't produce
  machine code.
• An interpreter gives us an improved debugging
  environment because it can check for errors like
  out-of-bounds array indexing at run time.
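The interpreter model described above can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of the program and the "index" operator are assumptions made for the example, not part of the slides.

```python
def interpret(node, data):
    """Evaluate an expression tree directly against the input data."""
    if isinstance(node, (int, float)):
        return node                      # literal constant
    if isinstance(node, str):
        return data[node]                # variable lookup happens at run time
    op, left, right = node
    a, b = interpret(left, data), interpret(right, data)
    if op == "index":
        if not 0 <= b < len(a):          # run-time bounds check
            raise IndexError(f"index {b} out of bounds for length {len(a)}")
        return a[b]
    if op == "*":
        return a * b
    if op == "+":
        return a + b
    raise ValueError(f"unknown operator {op!r}")


# Hypothetical program for y * z + 10, plus its input data.
program = ("+", ("*", "y", "z"), 10)
result = interpret(program, {"y": 2.0, "z": 3.0})
print(result)
```

No object module is produced: the program is walked again on every run, which is where both the slower execution and the easy run-time checking come from.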
Parts of a compiler

• Analysis of the source program
• Synthesis of the target program
Phases of a compiler
                      Source program
                            |
                            v
                     Lexical analyzer
                            |
                          tokens
                            |
                            v
                     Syntax analyzer
                            |
Error handler      Abstract syntax trees        Symbol table
                            |
                            v
                     Semantic analyzer
                            |
                            v
                Intermediate code generator
                            |
                            v
                     Intermediate code

                Analysis phase of the compiler

                     Intermediate code
                            |
                            v
                      Code optimizer
                            |
Error handler         Optimized code            Symbol table
                            |
                            v
                      Code generator
                            |
                            v
                      Target program

                Synthesis phase of the compiler
Lexical Analysis
  Stream of characters  -->  Lexical analyzer  -->  tokens

•     It groups characters into tokens.
•     It eliminates comments and white space.
•     It processes compiler directives.
•     It enters information into the symbol table.
•     The scanner may be hand coded or may be
      generated from regular expressions.

Example:
                 X = y * z + 10
                        |
                 Lexical analyzer
                        |
                        v
       id1 assign-op id2 mult-op id3 add-op num
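A scanner built from regular expressions might look like the sketch below. The token patterns and the `tokenize` helper are assumptions chosen for this one example statement; a real scanner would also handle comments, directives and many more token classes.

```python
import re

# (group name, slide label, pattern). Group names must be valid identifiers,
# so the slide's hyphenated labels are kept in a separate column.
TOKEN_SPEC = [
    ("NUM", "num", r"\d+"),
    ("ID", "id", r"[A-Za-z_]\w*"),
    ("ASSIGN", "assign-op", r"="),
    ("MULT", "mult-op", r"\*"),
    ("ADD", "add-op", r"\+"),
    ("SKIP", None, r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{g}>{p})" for g, _, p in TOKEN_SPEC))
LABEL = {g: label for g, label, _ in TOKEN_SPEC}


def tokenize(source):
    """Return (tokens, symbols) for a one-line assignment statement."""
    tokens, symbols, pos = [], {}, 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "SKIP":
            continue                      # white space is dropped
        if kind == "ID":
            # Enter each new identifier into the symbol table as id1, id2, ...
            index = symbols.setdefault(m.group(), len(symbols) + 1)
            tokens.append(f"id{index}")
        else:
            tokens.append(LABEL[kind])
    return tokens, symbols


tokens, symbols = tokenize("X = y * z + 10")
print(" ".join(tokens))
```

Here `symbols` plays the role of a very small symbol table, mapping each identifier to the number used in its `id` token.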
Syntax Analysis
  tokens  -->  Syntax analyzer  -->  Syntax tree

  • Hierarchical analysis is called parsing or
    syntax analysis.
  • It combines tokens into grammatical phrases.
  • The grammatical phrases of the source
    program are represented by a parse tree.

Example:
       id1 assign-op id2 mult-op id3 add-op num
                        |
                 Syntax analyzer
                        |
                        v
                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    num
                     /     \
                   id2     id3
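One way to combine the tokens into this tree is a small recursive-descent parser. The grammar in the docstring and the nested-tuple tree encoding are assumptions made for this sketch; the slides do not prescribe a parsing method.

```python
def parse(tokens):
    """Parse with the assumed grammar:
         stmt   -> id assign-op expr
         expr   -> term (add-op term)*
         term   -> factor (mult-op factor)*
         factor -> id | num
    Returns a tree of nested tuples (operator, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok}")
        pos += 1
        return tok

    def factor():
        tok = take()
        if tok == "num" or tok.startswith("id"):
            return tok
        raise SyntaxError(f"unexpected token {tok}")

    def term():
        node = factor()
        while peek() == "mult-op":
            take()
            node = ("mult-op", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "add-op":
            take()
            node = ("add-op", node, term())
        return node

    target = take()
    if not target.startswith("id"):
        raise SyntaxError("assignment target must be an identifier")
    take("assign-op")
    tree = ("assign-op", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()}")
    return tree


tree = parse("id1 assign-op id2 mult-op id3 add-op num".split())
print(tree)
```

Because `term` is called from inside `expr`, multiplication binds more tightly than addition, which is why mult-op ends up below add-op in the tree.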
Semantic Analysis
Syntax tree  -->  Semantic analyzer  -->  Annotated parse tree

• Determines the meaning of the source
  string.
• Gathers type information for the
  subsequent code generation phase.
• It uses the hierarchical structure determined
  by the syntax analysis phase to identify
  the operators and operands of expressions
  and statements.
• It checks that each operator has operands that
  are permitted by the source language
  specification.

Example:
                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    num
                     /     \
                   id2     id3
                        |
                Semantic analyzer
                        |
                        v
                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    inttoreal
                     /     \        |
                   id2     id3     num
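The type check that inserts inttoreal can be sketched as a tree walk. The `TYPES` table below is an assumption standing in for declarations the slides do not show: the three identifiers are taken to be real and the literal num to be an integer.

```python
# Assumed declared types, standing in for symbol-table entries.
TYPES = {"id1": "real", "id2": "real", "id3": "real", "num": "int"}


def annotate(node):
    """Return (annotated_node, type), inserting inttoreal where needed."""
    if isinstance(node, str):
        return node, TYPES[node]
    op, left, right = node
    left, ltype = annotate(left)
    right, rtype = annotate(right)
    if op == "assign-op" and ltype == "int" and rtype == "real":
        raise TypeError("cannot assign a real value to an int variable")
    if ltype != rtype:
        # Mixed int/real operands: widen the int side to real.
        if ltype == "int":
            left, ltype = ("inttoreal", left), "real"
        else:
            right, rtype = ("inttoreal", right), "real"
    return (op, left, right), ltype


tree = ("assign-op", "id1", ("add-op", ("mult-op", "id2", "id3"), "num"))
annotated, result_type = annotate(tree)
print(annotated)
```

Only the num operand of add-op is wrapped, since the mult-op subtree is already real under the assumed types.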
Intermediate code
generation
Annotated parse tree  -->  Intermediate code generator  -->  Intermediate code

• Generates an explicit intermediate
  representation of the source program.
• This intermediate representation should have
  two important properties:
       1. easy to produce
       2. easy to translate into the target program.
• We can think of the intermediate representation
  as a program for an abstract machine.

Example:
                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    inttoreal
                     /     \        |
                   id2     id3     num
                        |
           Intermediate code generator
                        |
                        v
              temp1 := id2 * id3
              temp2 := inttoreal(num)
              temp3 := temp1 + temp2
              id1 := temp3
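The three-address statements above can be produced by a post-order walk of the annotated tree that allocates a fresh temporary for every interior node. This is a sketch under that assumption; the tuple tree encoding and the names `gen_three_address` and `SYMBOL` are illustrative only.

```python
SYMBOL = {"mult-op": "*", "add-op": "+"}


def gen_three_address(tree):
    """Flatten an annotated assignment tree into three-address statements."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        if isinstance(node, str):
            return node                   # identifiers and constants are used directly
        if node[0] == "inttoreal":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        l, r = walk(left), walk(right)    # operands first, then the operator
        temp = new_temp()
        code.append(f"{temp} := {l} {SYMBOL[op]} {r}")
        return temp

    _, target, value = tree
    code.append(f"{target} := {walk(value)}")
    return code


annotated = ("assign-op", "id1",
             ("add-op", ("mult-op", "id2", "id3"), ("inttoreal", "num")))
code = gen_three_address(annotated)
print("\n".join(code))
```

Each statement has at most one operator, so the order of evaluation that was implicit in the tree becomes explicit in the statement sequence.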
Code optimization
Intermediate code  -->  Code optimizer  -->  Optimized code

• The optimizer tries to improve the
  intermediate code so that faster-running
  machine code results.
• In the example, the conversion of num is done
  once at compile time, giving the real constant
  rnum, and the copy through temp3 is removed.

Example:
                    temp1 := id2 * id3
                    temp2 := inttoreal(num)
                    temp3 := temp1 + temp2
                    id1 := temp3
                        |
                  Code optimizer
                        |
                        v
                    temp1 := id2 * id3
                    id1 := temp1 + rnum
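The two improvements in this example, converting the constant at compile time and removing the final copy, can be sketched as a single pass over the statements. The `(dest, op, args)` tuple encoding and the "copy" operator name are assumptions of this sketch, and it relies on each temporary being used only once, which holds for code generated from an expression tree.

```python
def optimize(code):
    """code: list of (dest, op, args) tuples. Returns an improved list."""
    constants = {}    # temp -> name of a constant known at compile time
    out = []
    for dest, op, args in code:
        args = tuple(constants.get(a, a) for a in args)
        if op == "inttoreal" and args[0] == "num":
            # The operand is a literal, so convert it once at compile time.
            constants[dest] = "rnum"
            continue
        if (op == "copy" and out and out[-1][0] == args[0]
                and args[0].startswith("temp")):
            # dest := tempN directly after tempN := ...; write into dest instead.
            _, prev_op, prev_args = out.pop()
            out.append((dest, prev_op, prev_args))
            continue
        out.append((dest, op, args))
    return out


code = [
    ("temp1", "*", ("id2", "id3")),
    ("temp2", "inttoreal", ("num",)),
    ("temp3", "+", ("temp1", "temp2")),
    ("id1", "copy", ("temp3",)),
]
optimized = optimize(code)
for dest, op, args in optimized:
    print(f"{dest} := {args[0]} {op} {args[1]}")
```

Production optimizers work on much richer information (data flow, loops, aliasing); this pass only covers the two rewrites the slide illustrates.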
Code generation
Optimized code  -->  Code generator  -->  Target program

 • Generates the target code for the optimized
   code.
 • Storage must be allocated, or a register
   must be assigned, for each variable.
 • The addressing modes to be used for
   accessing the data must be decided before
   generating the code.

Example:
           temp1 := id2 * id3
           id1 := temp1 + rnum
                  |
            Code generator
                  |
                  v
           Movf   id3, r2
           Mulf   id2, r2
           Movf   rnum, r1
           Addf   r2, r1
           Movf   r1, id1
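A very simple code generator for this example can be sketched as follows. The strategy is an assumption chosen to reproduce the instruction sequence above, not a general algorithm: it loads the right operand into a free register, combines the left operand into it, and therefore only works for commutative operators. It also has no register spilling.

```python
OPCODES = {"*": "Mulf", "+": "Addf"}    # both operators are commutative


def generate(code, registers=("r1", "r2")):
    """code: list of (dest, op, (left, right)) tuples from the optimizer."""
    free = list(registers)
    location = {}           # temp name -> register currently holding it
    asm = []
    for dest, op, (left, right) in code:
        # Put the right operand in a register, then combine the left into it.
        if right in location:
            reg = location.pop(right)
        else:
            reg = free.pop()
            asm.append(f"Movf {right}, {reg}")
        src = location.get(left, left)
        asm.append(f"{OPCODES[op]} {src}, {reg}")
        if left in location:
            free.append(location.pop(left))    # the temp is dead after its one use
        if dest.startswith("temp"):
            location[dest] = reg               # keep the temporary in its register
        else:
            asm.append(f"Movf {reg}, {dest}")  # store the result to the variable
            free.append(reg)
    return asm


optimized = [
    ("temp1", "*", ("id2", "id3")),
    ("id1", "+", ("temp1", "rnum")),
]
asm = generate(optimized)
print("\n".join(asm))
```

The register assignment decision mentioned in the bullets shows up here as the `location` map: temp1 never gets a memory address because it lives only in r2.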
Compilation of a sample
program X = y * z + 10

Lexical analysis

    id1 assign-op id2 mult-op id3 add-op num

Syntax analysis

                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    num
                     /     \
                   id2     id3

Semantic analysis

                    assign-op
                    /       \
                  id1      add-op
                           /    \
                     mult-op    inttoreal
                     /     \        |
                   id2     id3     num

Intermediate code generation

    temp1 := id2 * id3
    temp2 := inttoreal(num)
    temp3 := temp1 + temp2
    id1 := temp3

Code optimization

    temp1 := id2 * id3
    id1 := temp1 + rnum

Code generation

    Movf   id3, r2
    Mulf   id2, r2
    Movf   rnum, r1
    Addf   r2, r1
    Movf   r1, id1
Symbol table
Management
• A symbol table is a data structure containing a
  record for each identifier, with fields for the
  attributes of the identifier.
• The data structure allows us to find the
  record for each identifier quickly and to
  store or retrieve data from that record
  quickly.
• When an identifier in the source program is
  detected by the lexical analyzer, the
  identifier is entered into the symbol table.
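A minimal symbol table along these lines could be built on a hash map, which gives the fast insert and lookup the bullets ask for. The class and field names below are hypothetical, and a real table would also track scope, storage location and other attributes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolRecord:
    name: str
    index: int                    # position used to form id1, id2, ...
    type: Optional[str] = None    # filled in by later phases


class SymbolTable:
    def __init__(self):
        self._records = {}        # dict lookup keeps enter/lookup fast

    def enter(self, name):
        """Return the record for name, creating it on first sight."""
        record = self._records.get(name)
        if record is None:
            record = SymbolRecord(name, len(self._records) + 1)
            self._records[name] = record
        return record

    def lookup(self, name):
        """Return the record for name, or None if it was never entered."""
        return self._records.get(name)


table = SymbolTable()
for name in ["X", "y", "z", "y"]:     # the scanner sees y twice
    table.enter(name)
table.lookup("X").type = "real"       # e.g. set during semantic analysis
print(table.lookup("y"))
```

Entering a name twice returns the existing record, so repeated uses of an identifier share one set of attributes.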
Error handler
• Each phase can encounter errors.
• After detecting an error, a phase must
  somehow deal with that error so that
  compilation can proceed, allowing further
  errors in the source program to be
  detected.
• The syntax and semantic analysis phases
  usually handle a large fraction of the errors
  detectable by the compiler.
• Errors where the token stream violates the
  structure rules of the language are
  detected by the syntax analysis phase.
Compiler construction
tools
• Parser generators produce syntax
  analyzers. These parser generators take
  input based on a context-free grammar.
• Yacc is an LALR parser generator and is
  available as a command on UNIX.
• Lexical-analyzer generators produce lexical
  analyzers. These scanner generators take
  a specification based on regular expressions.
• For example, Lex is a lexical-analyzer
  generator. This tool is available on UNIX.
  The basic organization of the resulting
  lexical analyzer is a finite automaton.
• Syntax-directed translation
  engines produce collections of
  routines that traverse the parse
  tree and generate
  intermediate code.
• Automatic code generators take
  a collection of rules defining the
  translation of each operation of
  the intermediate language into
  the machine language of the
  target machine.
THANK YOU

				