Low Level Virtual Machine C# Compiler

					Senior Project Proposal


                   Prabir Shrestha (4915302)
                      Myo Min Zin (4845411)
      Napaporn Wuthongcharernkun (4846824)
   Objective
   Motivation
   Scope
   The Framework
   Gantt Chart
   Questions and Answers



                       A Naive Compiler

C# Source File (.cs)  →  Low Level Virtual Machine Intermediate Representation (.ll)




Front-end




Back-end




   Evolution of Computer Programming
   Managed Code vs Unmanaged Code
   Bulky .NET Framework
   Operating Systems written in managed code




 Why Low Level Virtual Machine?
  – Source Language independent
  – Retargetable code generator
  – Supports various architectures
     • X86, PowerPC, ARM
  – Open source




 It is not
   – a compiler,
   – a virtual machine like the JVM or the .NET Framework
 It is
   – A modular compiler infrastructure
      • a collection of (C++) libraries and tools to help in
        building compilers, debuggers, program analyzers etc.




 Commonly referred to as LLVM
 Started as an academic project at the University of
  Illinois in 2000.
 Current development mainly by Apple Inc.
 Projects related to LLVM
  – Clang: C/C++ front-end; aims to replace gcc
  – OpenGL engine in Mac OS X 10.5
  – used by Adobe Systems Inc., Nvidia, Sun
    Microsystems Laboratories

 Keywords by Category
 Operators and Special Characters
 Source Language Features




 Types
  bool char float int string class   struct   enum   object


 Conditionals
  if      else

 Loops
  for       while     do




 Single Inheritance
  base


 Encapsulation
  private       protected   public


 Overloading Operators
  operator

 Method Overloading / Method Overriding
  override   virtual



 Indexing & Properties (Accessor / Mutator)
  get set value

 Modifiers
  static sealed

 Type Casting
  explicit   implicit




base       enum       int         private     this

bool       explicit   namespace   protected   true

break      extern     new         return      typeof

char       false      null        sealed      using

class      float      operator    set         virtual

const      for        object      sizeof      void

continue   get        public      static      while

do         if         override    string      value

else       implicit               struct      is



       Operators & Special characters supported

x.y        x--       <        !=        *=        ||

f(x)       --x       >       (T) x      /=

a[x]       +        <=         =         *

x++         -       >=        +=         /

++x         !       ==        -=        &&




   Single class Inheritance
   Encapsulation
   Overloadable Operators
   Method Overloading/Overriding
   Properties (Accessors / Mutators)




   Overall Process
   Scanner
   Parser
   Semantic Analyzer
   Code Generator
   Assembling and Linking




 Tokenization: identifying the tokens in the
  input stream.
 Skips meaningless characters such as white space.
 Lexical analysis: checking for lexical errors.
 With the Coco/R tool, the scanner and parser are
  generated at the same time.
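
As a rough illustration of what a scanner does, the sketch below tokenizes a small C#-like fragment by hand. It is not Coco/R output: the token classes, the keyword subset, and the `tokenize` function are all assumptions made for this example, and Python is used only for brevity.

```python
import re

# Hypothetical token classes for a tiny C#-like subset (not the Coco/R token set).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("ident",  r"[A-Za-z_]\w*"),
    ("op",     r"==|!=|<=|>=|\+\+|--|&&|\|\||[-+*/=<>!;(){}\[\].,]"),
    ("skip",   r"\s+"),   # white space carries no meaning and is dropped
]
KEYWORDS = {"class", "int", "if", "else", "while", "return"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (kind, text) pairs; raise SyntaxError on a lexical error."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No token class matches here: this is a lexical error.
            raise SyntaxError(f"lexical error at offset {pos}: {source[pos]!r}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens
```

Calling `tokenize("int x = 42;")` yields a keyword, an identifier, an operator, a number, and a final operator token; a character such as `@` that belongs to no token class raises a lexical error.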




  Syntax analysis is performed in this phase.
  Coco/R generates a recursive descent parser.
      – Top-down parsing method
      – Implemented as ordinary procedures (functions)
      – Generally, one procedure is generated for each
        production rule.

  Accepts grammars in LL(k) form
      – Conflict resolvers may be needed where the grammar is not LL(1)

LL: Left-to-right scan, Leftmost derivation
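
To make "one procedure per production rule" concrete, here is a minimal hand-written recursive descent parser for a three-rule expression grammar. The grammar, the `Parser` class, and the tuple-based tree shape are assumptions chosen for illustration; this is not the code Coco/R generates.

```python
class Parser:
    """Recursive descent parser for the (hypothetical) grammar:
         Expr   = Term { ("+" | "-") Term } .
         Term   = Factor { ("*" | "/") Factor } .
         Factor = number | "(" Expr ")" .
    """

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["<eof>"]
        self.pos = 0

    @property
    def la(self):
        # One-symbol lookahead, which is all an LL(1) grammar needs.
        return self.tokens[self.pos]

    def expect(self, tok):
        if self.la != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.la!r}")
        self.pos += 1

    def expr(self):                      # procedure for production Expr
        node = self.term()
        while self.la in ("+", "-"):
            op = self.la
            self.pos += 1
            node = (op, node, self.term())
        return node

    def term(self):                      # procedure for production Term
        node = self.factor()
        while self.la in ("*", "/"):
            op = self.la
            self.pos += 1
            node = (op, node, self.factor())
        return node

    def factor(self):                    # procedure for production Factor
        if self.la == "(":
            self.pos += 1
            node = self.expr()
            self.expect(")")
            return node
        if self.la.isdigit():
            value = int(self.la)
            self.pos += 1
            return value
        raise SyntaxError(f"unexpected {self.la!r}")


def parse(tokens):
    """Parse a list of token strings into a nested (op, lhs, rhs) tuple."""
    parser = Parser(tokens)
    tree = parser.expr()
    parser.expect("<eof>")
    return tree
```

Each method consumes exactly the input derived from its production and decides what to do next by looking at a single upcoming token, which is the top-down, leftmost-derivation behaviour described above.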
 Parser Error-Recovery Techniques
  – Synchronization
  – Weak Symbols

 Synchronization Technique
  – SYNC symbols are placed at points in the grammar
    where errors are unlikely.

  – Upon error detection, the parser skips input symbols until
    it finds one that is expected at a synchronization
    point.
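
The skipping behaviour can be sketched as follows. This is a simplified model, not Coco/R's implementation: the declaration shape (`keyword ident "{" "}"`), the `SYNC_SET` contents, and both function names are assumptions made for the example.

```python
# Symbols that may legally appear at the (hypothetical) synchronization point.
SYNC_SET = {"class", "struct", "enum", "<eof>"}

def skip_to_sync(tokens, pos):
    """Return the first index >= pos whose token is expected at the sync point."""
    while tokens[pos] not in SYNC_SET:
        pos += 1
    return pos

def parse_type_decls(tokens):
    """Parse declarations of the form: keyword ident "{" "}".

    Returns (declarations, errors). After an error the parser resumes at the
    next token in SYNC_SET instead of giving up.
    """
    tokens = list(tokens) + ["<eof>"]
    pos, decls, errors = 0, [], []
    while tokens[pos] != "<eof>":
        if tokens[pos] not in SYNC_SET:
            errors.append(f"unexpected {tokens[pos]!r} at token {pos}")
            pos = skip_to_sync(tokens, pos)
            continue
        kind, name = tokens[pos], tokens[pos + 1]
        body_ok = tokens[pos + 2 : pos + 4] == ["{", "}"]
        if name.isidentifier() and name not in SYNC_SET and body_ok:
            decls.append((kind, name))
            pos += 4
        else:
            errors.append(f"malformed {kind} declaration at token {pos}")
            pos = skip_to_sync(tokens, pos + 1)   # resynchronize past the keyword
    return decls, errors
```

With the input `class A { } struct { } enum E { }`, the missing struct name produces one error, and parsing resumes at `enum`, so both well-formed declarations are still recognized.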


 Weak Symbols
  - WEAK is placed in front of tokens that are error-prone,
    i.e. often misspelled or missing.

  - When an error is encountered at a weak symbol, the parser
    reports it and can jump to the next synchronization point.




 Synchronization Example
    TypeDecl
    =
    SYNC
     ( "class" ident [ClassBase] ClassBody [";"]
         | "struct" ident [Base] StructBody [";"]
         | "enum" ident [":" IntType] EnumBody [";"]
       )
    .

 Weak Symbols Example
    EnumBody
    =
    "{" EnumMember { WEAK "," EnumMember } "}".


 A phase that follows parsing
 Checks for semantic errors once lexical and
  syntax errors have been checked
 Examples:
  – type checking, variable scoping, constants
    not being modified, no redefinition of classes,
    methods, or member variables
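
A few of these checks can be sketched with a nested symbol table. The `Scope` class, its method names, and the use of plain strings for types are assumptions made for this example; they do not describe the project's actual analyzer.

```python
class SemanticError(Exception):
    """Raised when a semantic rule is violated."""


class Scope:
    """One level of a nested symbol table (hypothetical helper)."""

    def __init__(self, parent=None):
        self.symbols = {}      # name -> (type_name, is_const)
        self.parent = parent

    def declare(self, name, type_name, is_const=False):
        # Redefinition check: a name may be declared once per scope.
        if name in self.symbols:
            raise SemanticError(f"'{name}' is already defined in this scope")
        self.symbols[name] = (type_name, is_const)

    def lookup(self, name):
        # Scoping: search this scope first, then the enclosing ones.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise SemanticError(f"'{name}' is not declared")

    def check_assign(self, name, value_type):
        type_name, is_const = self.lookup(name)
        if is_const:
            raise SemanticError(f"cannot assign to constant '{name}'")
        if type_name != value_type:
            raise SemanticError(
                f"cannot assign {value_type} to {type_name} '{name}'"
            )
```

An inner scope created with `Scope(parent=outer)` sees the outer declarations, may shadow them, and rejects a second declaration of the same name at its own level.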



 Runs after AST construction and semantic analysis
 Generates LLVM Intermediate Representation (IR)




 Language and Target independent
 Designed to support multiple language
  frontends
 Represents the key operations of ordinary
  processors
 Avoids machine specific constraints
  – physical registers, pipelining

 Does not define runtime or operating system
  functions
  – these are defined by runtime libraries
 IR is a typed Virtual Instruction Set
  – unbounded number of registers
  – operations are low level
  – checked for consistency


 Usually three-address code
     %temp2 = add i32 %temp0, %temp1
 Instructions are typed
 Instructions are polymorphic (one opcode covers several operand types)
 Usually in Static Single Assignment (SSA) form
  – a new register for each result
  – uses phi (ɸ) functions
  – the code generator tries to keep these variables in the
    same real registers
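
The following sketch shows how an expression tree can be flattened into three-address instructions with a fresh register per result. It emits text that resembles LLVM IR for `i32` arithmetic only; the `emit` function, the `%t<n>` naming scheme, and the tuple tree shape are assumptions for illustration, and source variables are assumed not to be named `t0`, `t1`, and so on.

```python
# Source operator -> LLVM integer opcode (signed division for "/").
OPCODES = {"+": "add", "-": "sub", "*": "mul", "/": "sdiv"}

def emit(tree):
    """Flatten a nested (op, lhs, rhs) tuple into three-address code.

    Leaves are ints (constants) or strings (variable names).
    Returns (instructions, operand_holding_the_result).
    """
    code = []

    def walk(node):
        if isinstance(node, int):
            return str(node)
        if isinstance(node, str):
            return f"%{node}"
        op, lhs, rhs = node
        left, right = walk(lhs), walk(rhs)
        # Each result gets a brand-new register, so no register is
        # ever assigned twice (the single-assignment property).
        result = f"%t{len(code)}"
        code.append(f"{result} = {OPCODES[op]} i32 {left}, {right}")
        return result

    return code, walk(tree)
```

For the expression `a * b + c * 2` this produces two `mul` instructions followed by one `add`, each with two operands and one result.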

                      Constant Folding
 Simplifies constant expressions at compile time

                          Example
           i = 100 * 20 * 3    →    i = 6000
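
A minimal sketch of constant folding over an expression tree, assuming nested `(op, lhs, rhs)` tuples with ints and variable names as leaves. The `fold` function is hypothetical, and it ignores 32-bit overflow and leaves division alone to keep the example small.

```python
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Replace constant subexpressions with their computed value."""
    if not isinstance(node, tuple):
        return node                      # a constant or a variable name
    op, lhs, rhs = node
    lhs, rhs = fold(lhs), fold(rhs)      # fold the operands first
    if isinstance(lhs, int) and isinstance(rhs, int) and op in FOLD:
        return FOLD[op](lhs, rhs)
    return (op, lhs, rhs)
```

Folding `("*", ("*", 100, 20), 3)` gives `6000`, while an expression containing a variable is only folded in its constant parts.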




                    Constant Propagation
 Substituting the values of known constants in
  expressions at compile time

                       Example
   int x = 7        →   int x = 7        →   int x = 7
   int y = 14 - x       int y = 14 - 7       int y = 7
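
The substitution step can be sketched for straight-line code as below. The statement format (a list of `(target, expression)` pairs) and both function names are assumptions for this example; branches and loops, which need a proper data-flow analysis, are out of scope.

```python
def substitute(expr, known):
    """Replace variables in expr whose constant value is known."""
    if isinstance(expr, str):
        return known.get(expr, expr)
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return (op, substitute(lhs, known), substitute(rhs, known))
    return expr                            # already a constant

def propagate(statements):
    """Propagate constants through a list of (target, expression) pairs."""
    known, out = {}, []
    for target, expr in statements:
        expr = substitute(expr, known)
        if isinstance(expr, int):
            known[target] = expr           # target now holds a known constant
        else:
            known.pop(target, None)        # target is no longer a known constant
        out.append((target, expr))
    return out
```

Applied to `x = 7; y = 14 - x`, the second statement becomes `y = 14 - 7`; reducing that to `y = 7` is the job of constant folding.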




                 Strength Reduction
 A costly operation is replaced with an equivalent but
  less expensive operation
                         Examples
             y = x / 8    →    y = x >> 3   (unsigned or non-negative x)

             y = x * 64   →    y = x << 6

             y = x * 2    →    y = x + x
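
A small sketch of the rewrite rules in the examples, assuming the variable operand is unsigned (for a negative signed value, division rounds toward zero while an arithmetic right shift rounds down, so the two differ). The `reduce_strength` function and its tuple output are hypothetical.

```python
def reduce_strength(op, operand, constant):
    """Rewrite (operand op constant) into a cheaper equivalent when possible.

    Assumes `operand` is an unsigned integer variable and `constant` an int.
    Returns an (op, lhs, rhs) tuple.
    """
    # True for 2, 4, 8, ...: exactly one bit set.
    is_pow2 = constant > 1 and (constant & (constant - 1)) == 0
    if op == "*" and constant == 2:
        return ("+", operand, operand)
    if op == "*" and is_pow2:
        return ("<<", operand, constant.bit_length() - 1)
    if op == "/" and is_pow2:
        return (">>", operand, constant.bit_length() - 1)
    return (op, operand, constant)         # no cheaper form known
```

The shift amount is the base-2 logarithm of the constant, so dividing by 8 becomes a shift by 3 and multiplying by 64 becomes a shift by 6.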




          Elimination of Useless Instructions
 Drops operations that do not change any stored value
                        Examples
            x = y + 0    →    x = y

            y = z * 1    →    y = z
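
The two examples are identity operations, and removing them can be sketched as a bottom-up tree rewrite. The `simplify` function and the tuple tree shape are assumptions for illustration, and only integer arithmetic is considered.

```python
def simplify(node):
    """Remove arithmetic that cannot change the value of an integer expression."""
    if not isinstance(node, tuple):
        return node                        # a constant or a variable name
    op, lhs, rhs = node
    lhs, rhs = simplify(lhs), simplify(rhs)
    if op == "+" and rhs == 0:
        return lhs                         # y + 0  ->  y
    if op == "+" and lhs == 0:
        return rhs                         # 0 + y  ->  y
    if op == "-" and rhs == 0:
        return lhs                         # y - 0  ->  y
    if op == "*" and rhs == 1:
        return lhs                         # z * 1  ->  z
    if op == "*" and lhs == 1:
        return rhs                         # 1 * z  ->  z
    if op == "/" and rhs == 1:
        return lhs                         # z / 1  ->  z
    return (op, lhs, rhs)
```

Because operands are simplified first, nested cases such as `(a * 1) + 0` collapse all the way down to `a`.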




                  User Phases

  LLVM C# Compiler          →  LLVM Intermediate Representation (.ll)
  LLVM Assembler (llvm-as)  →  LLVM Bitcode (.bc)
  LLVM Linker (llvm-ld)     →  Executable
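
The last two phases can be scripted roughly as follows. This sketch only builds the command lines; the `build_commands` helper and the file-naming scheme are assumptions, and it presumes an LLVM installation that still ships `llvm-ld` (newer LLVM releases no longer include it).

```python
def build_commands(stem):
    """Return the assembler and linker command lines for the file <stem>.ll.

    Hypothetical helper: it only constructs argument lists and runs nothing.
    """
    assemble = ["llvm-as", f"{stem}.ll", "-o", f"{stem}.bc"]   # .ll -> .bc
    link = ["llvm-ld", f"{stem}.bc", "-o", stem]               # .bc -> executable
    return [assemble, link]
```

Each list can then be passed in order to `subprocess.run(..., check=True)` so that a failing phase stops the build.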
