Docstoc

Interpreters and VMs for OODLs

Document Sample
Interpreters and VMs for OODLs Powered By Docstoc
					From OODL Interpreters
  to VM’s and Beyond
    Jonathan Bachrach
       MIT AI Lab
                     Outline
•   Material mostly from “Lisp In Small Pieces”
•   Metacircular Interpreters
•   CPS Interpreters
•   Fast Interpreters
•   VM’s and Compilation
•   Speeding up VM’s
•   VVM’s
•   In Language VM’s
•   Assignment #2
         Lisp In Small Pieces
• Written by Christian Queinnec
• Highly recommend looking at this book
• Basis for a fair amount of today’s lecture
Q: Why Interpreters?
        A: Why Interpreters?
• Semantics of language
  – is manifest
  – is measurable
• Allows more
  – Dynamism
  – Extensibility
      Metacircular Interpreters
• Write interpreter in same language using
  own special forms
• Complexity of resulting interpreter is a
  measure of expressivity and simplicity of
  language
• Dream is to get a compiler for free from
  this!
                   Simple Environment
(dv <frame> <assocs>)
                                                 *env*      next          next
(dv <env> (isa <any>))
  (slot <env> (env-next <env>))                            frame         frame
  (slot <env> (env-frame <env>) (isa <frame>))

(dv $empty-env (isa <env>))

(dm env-define-binding!
                                                         keytext val   keytext val
    ((env <env>) (name <sym>) (value <any>))
  (set (elt (env-frame frame) name) value))
                                                         keytextval    keytextval
(dm env-extend!
    ((env <env>) (names <lst>) (values <lst>)
                                                         keytextval
     => <frame>)
  (let ((new-env (isa <env> (set env-next env))))
    (do2 (fun (name value)
           (set (elt (env-frame new-env) name) value))
      names values)
    new-env))

(dm env-binding-value
    ((env <env>) (name <sym>) => <any>)
  (if (== env $empty-env)
      nul
      (let ((value (elt (env-frame env) name)))
        (if (== value nul)
            (env-binding-value (env-next env) name)
            value))))
              Proto in Proto
(dm evol ((e <any>) (env <env>)) e)

(dm evol ((e ‘()) (env <env>)) e)

(dm evol ((e <sym>) (env <env>))
  (env-binding-value env e))

(dm evol ((e <lst>) (env <env>))
  (evol-list (head e) e env))

(dm evol-list ((_ 'quote) (e <lst>) (env <env>))
  (sexpr-text-of-quotation e))
      Q: What’s the Problem?
• Why might a metacircular interpreter not be
  acceptable?
      A: What’s the Problem?
• Creates bootstrapping issues
• Very inefficient
          Tail Position / Calls
• If an expression is in tail position then it is
  not necessary to restore the environment
   – (fun (x) (f x))
   – (fun (x) (if (f x) 5 6))
• Tail calls are calls that are in tail position
  and don’t save/restore environment
   – Bind arguments
   – Goto function
Continuation Passing Style (CPS)
• CPS = Continuation Passing Style
  – Explicitly telling a computation where to send
    the result of that computation
  – Once computation is complete, executor applies
    receiver to result rather than returning it as a
    normal value
• Example
  – (fun (x) (f x)) => (fun (x k) (f’ x k))
            CPS Interpreter
• Control is more explicit by passing
  continuations through interpreter
• Allows one to code complicated control
  forms without built-in support.
  – E.g., call/cc
             CPS Interpreter Example
(dm evol-list ((_ ‘quote) (e <lst>) (env <env>) (k <fun>))
  (k (sexpr-text-of-quotation e)))

(dm evol-list ((_ 'if) (e <lst>) (env <env>) (k <fun>))
  (evol (sexpr-if-test e) env
     (fun (e)
       (if e
           (evol (sexpr-if-then e) env k)
           (evol (sexpr-if-else e) env k)))))

(dm evol-seq+ ((e+ <lst>) (env <env>) (k <fun>))
  (if (empty? (tail e*))
      (evol (head e*) env k)
      (evol (head e*) env
         (fun (e) (evol-seq+ (tail e*) env k)))))

(dm evol-seq ((e* <lst>) (env <env>) (k <env>))
  (if (empty? e*)
      (k #f)
      (evol-seq+ e* env k)))

(dm evol-list ((_ 'seq) (e <lst>) (env <env>) (k <fun>))
  (evol-seq (sexpr-begin-actions e) env k))
                     CPS Wins
(dm evol-list ((_ 'lab) (e <lst>) (env <env>) (k <fun>))
  (let ((exit (make-function '(x) `(,k x) env)))
    (evol-seq (sexpr-block-body e)
      (env-extend! env (lst (sexpr-block-name e)) (lst exit))
      k)))
                CPS Loses
• Problem is that some special forms like fin
  (cf. Lisp’s unwind-protect) are difficult* to
  implement without resorting to built-in
  support
  – *It appears that they are possible with enough
    continuations
• Conses (allocates) lots of closures for
  continuations during interpretation
      Interpreter Pretreatment
• Lighten up environment
• Invent beginnings of instruction set
  – Implemented in terms of thunks (zero
    parameter functions)
• Threaded as tree of calls to thunks
• Reject environment (rep’d as global *env*)
• Keep track of tail position
                    Fast Environment
• Globals stored in vector
• Locals are in list of inlined
  frames                                *env*       next:   next:



• Model environment structure in
  static environment
• Resolve environment accesses          *globals*

  to fixed or relative offsets
    – Globals are fixed offset
    – Locals are two relative offsets
• N.B., Can do even better by
  flattening ribcage environment
  into a stack
              Pretreatment
• Pretreatment of programs is handled by the
  function meaning (compile)
• Converts into thunks that will serve as
  threading mechanism
• Each thunk can be thought of as an address
  to which we simply jump to execute
        Inventing Instructions
• Use combinators which are like instructions
  for an abstract machine:

  (dm CONSTANT (value)
    (fun () value) )

  (dm meaning-list ((_ 'quote) (e <lst>) r t?)
    (fun ()
      (CONSTANT (sexpr-text-of-quotation e)) ))
  Interpreter Pretreatment Output
                                    (FIX-CLOSURE
(fun (n f k)                         (ALTERNATIVE
  (if (= n 0)                         (CALL2 #<=>
                                        (SHALLOW-ARGUMENT-REF 0)
      (k 1)                             (CONSTANT 0))
                                      (TR-REGULAR-CALL
      (f (- n 1) f                      (SHALLOW-ARGUMENT-REF 2)
         (fun (r) (k (* n r))))))       (STORE-ARGUMENT (CONSTANT 1)
                                          (ALLOCATE-FRAME 1) 0)
                                      (TR-REGULAR-CALL
                                        (SHALLOW-ARGUMENT-REF 1)
                                        (STORE-ARGUMENT
                                          (CALL2 #<-> (SHALLOW-ARGUMENT-REF 0)
                                            (CONSTANT 1))
                                          (STORE-ARGUMENT (SHALLOW-ARGUMENT-REF 1)
                                            (STORE-ARGUMENT
                                              (FIX-CLOSURE
                                                 (TR-REGULAR-CALL
                                                    (DEEP-ARGUMENT-REF 1 2)
                                                    (STORE-ARGUMENT
                                                      (CALL2 #<*> (DEEP-ARGUMENT REF 1 0)
                                                        (SHALLOW-ARGUMENT-REF 0))
                                                      (ALLOCATE-FRAME 1) 0))
                                                 1)
                                               (ALLOCATE-FRAME 2) 2)
                                             1)
                                           0)))
                                     3)
              Control Forms
(dm ALTERNATIVE (m1 m2 m3)
  (fun () (if (m1) (m2) (m3))) )

(dm meaning-list ((_ 'if) (e <lst>)   r t?)
  (let ((m1 (meaning (sexpr-if-test   e) r #f))
        (m2 (meaning (sexpr-if-then   e) r t?))
        (m3 (meaning (sexpr-if-else   e) r t?)) )
    (ALTERNATIVE m1 m2 m3) ) )
                        Sequence
(dm SEQUENCE (m m+)
  (fun () (m) (m+)) )

(dm meaning-sequence (e+ r t?)
  (if (empty? e+)
      (static-wrong "Illegal syntax: (seq)")
      (if (empty? (tail e+))
          (meaning (head e+) r t?)
          (meaning*-sequence (head e+) (tail e+) r t?))))

(dm meaning*-sequence (e e+ r t?)
  (let ((m1 (meaning e r #f))
        (m+ (meaning-sequence e+ r t?)) )
    (SEQUENCE m1 m+) ) )
                       Closures
(dm FIX-CLOSURE (m+ arity)
  (let ((arity+1 (+ arity 1)))
    (fun ()
      (loc ((the-function (v* sr)
              (if (= (frame-length v*) arity+1)
                  (seq (set *env* (sr-extend* sr v*))
                       (m+) )
                  (wrong "Incorrect arity") ) )
      (make-closure the-function *env*) ) ) )

(dm meaning-fix-abstraction (n* e+ r t?)
  (let ((arity (len n*))
        (r2 (r-extend* r n*))
        (m+ (meaning-sequence e+ r2 #t)) )
    (FIX-CLOSURE m+ arity) ) )
                          Calls
(dm TR-REGULAR-CALL (m m*)
  (fun ()
    (let ((f (m))) (invoke f (m*)) ) ) )

(dm REGULAR-CALL (m m*)
  (fun ()
    (let ((f (m)) (v* (m*)) (sr *env*)
          (result (invoke f v*)) )
      (set *env* sr)
      result ) ) )

(dm meaning-application (e e* r t?)
  (let ((m (meaning e r #f))
        (m* (meaning* e* r (len e*) #f)) )
    (if t? (TR-REGULAR-CALL m m*) (REGULAR-CALL m m*)) ) )
           Argument Processing
(dm meaning* (e* r size t?)
  (if (empty? e*)
      (meaning-no-argument r size t?)
      (meaning-some-arguments (head e*) (tail e*) r size t?)))

(dm meaning-no-argument (r size t?)
  (ALLOCATE-FRAME size))

(dm meaning-some-arguments (e e* r size t?)
  (let ((m (meaning e r #f))
        (m* (meaning* e* r size t?))
        (rank (- size (+ (len e*) 1))))
    (STORE-ARGUMENT m m* rank)))
Q: Where Do We Go From Here?
          VM Introduction
• Portable
• Small
• Factors Implementation
        VM Design Parameters
• Space
   – Runtime
   – Delivery
• Efficiency
   – Runtime
   – Compilation
• Analyzability
   – Inferred types
   – Control flow info
        From Interpreter to VM
•   Linearize
•   Invent registers
•   Invent stack
•   Code generation through concatenation
•   Byte code instructions
            Byte Code Machine
*env*                            next:            next:




*fun*                                                     code   env



*stack*                                                                *constants*
                                           used
*stack-index*

*pc*

                                           free
                  ... 34 250 41 4 34 ...

                Bytecode
                  “Instructions”
(SHALLOW-ARGUMENT-REF j)       (PREDEFINED I)
(DEEP-ARGUMENT-REF I j)        (SHALLOW-ARGUMENT-SET J M)
(DEEP-ARGUMENT-SET! I j m)     (GLOBAL-REF I)
(CHECKED-GLOBAL-REF I)         (GLOBAL-SET! I M)
(CONSTANT V)
                               (ALTERNATIVE M1 M2 M3)
(SEQUENCE M M+)
                               (TR-FIX-LET M* M+)
(FIX-LET M* M+)
                               (CALL0 ADDRESS)
(CALL1 ADDRESS M1)
(CALL3 ADDRESS M1 M2 M3)       (CALL2 ADDRESS M1 M2)
(REGULAR-CALL M M*)            (FIX-CLOSURE M+ ARITY)
(ALLOCATE-FRAME SIZE)          (TR-REGULAR-CALL M M*)
                               (STORE-ARGUMENT M M* RANK)
M,m1,m2,m3,m+,v are values
Rank,arity,size,I,j are ints
Address is predefined funct
           Introduce *val*
• Break implicit communication between
  instructions by introducing result register

  (dm CONSTANT (value)
    (fun () (set *val* value)))
                Inventing Stack

(dm STORE-ARGUMENT (m m* rank)
  (fun () (m)
     (let ((v *val*)) (m*)
        (set (frame-argument *val* rank) v))))


=>

(dm STORE-ARGUMENT (m m* rank)
  (fun () (m) (stack-push *val*)
     (m*)
     (let ((v (stack-pop)))
        (set (frame-argument *val* rank) v))))
                  Linearization
(dm SHALLOW-ARGUMENT-SET! (j m)
  (fun () (m)
    (set (frame-argument! *env* j) *val*)))


=>

(dm SHALLOW-ARGUMENT-SET! (j m)
  (cat m (SET-SHALLOW-ARGUMENT! J)))
(dm SET-SHALLOW-ARGUMENT! (j)
  (lst (fun () (set (frame-argument! *env* j) *val*)))))
                         Inventing PC
(dm REGULAR-CALL (m m*)                   (dm run ()
  (cat m (PUSH-VALUE)
       m* (POP-FUNCTION) (PRESERVE-ENV)
                                            (let ((inst (head *pc*)))
          (FUNCTION-INVOKE)                   (set *pc* (tail *pc*))
          (RESTORE-ENV)))                     (inst)
(dm PUSH-VALUE ()
                                              (run)))
  (lst (fun () (stack-push *val*))))
(dm POP-FUNCTION ()
  (lst (fun ()
         (set *fun* (stack-pop))))
(dm PRESERVE-ENV ()
  (lst (fun () (stack-push *env*))))
(dm FUNCTION-INVOKE ()
  (lst (fun () (invoke *fun*))))
(dm RESTORE-ENV ()
  (lst (fun ()
         (set *env* (stack-pop)))))
             VM Dispatch Loop
(dm run ()
  (let ((instruction (head *pc*)))
    (set *pc* (tail *pc*))
    (instruction)
    (run)))

• Already removed decode overhead by translating
  incoming bytecodes to direct calls
• Still extra overhead
   – Return from instruction
   – Loop
    Reducing Dispatch Overhead
• Convert back to threaded code
• Each instruction is responsible for calling next one:
   (dm CONSTANT (value)
     (fun () (set *val* value)
       (let ((instruction (head *pc*)))
       (set *pc* (tail *pc*))
       (instruction))

• Example Push3 on PowerPC takes
   – 2 instrs to do real push3
   – 11 total instrs for pc loop version
   – 5 total instrs for threaded version
• Can do same with fast interpreter
   – Use CPS threading
   – Consing not an issue if closures created during pretreatment
        Hand Selected Macro
            Instructions
• Still have overhead especially for simple
  instructions
• Can do better by combining oft-occurring
  sequences of instructions
  – amortizing the decode/dispatch overhead
  – customizing resulting instruction to remove any
    other interpretive overhead
  – shrinks byte-code size
           Automatic Macro
         Instruction Detection
• Can run statistics off-line to determine best
  macros
• Can also run statistics on-line
                       VVM’s
•   Tower of VM’s
•   Record expansions
•   Concatenate
•   Peephole optimize at each level
    – Specialization of instructions
       • Now that actual arguments are available
    – Between instruction optimizations
            Boxing/Unboxing
• Objects all the way down
• Integers are objects and must be self-identifying
• One way to encode them is to wrap integer data in
  a “box” which also points to integer prototype
• Integer ops must then first unbox before running
  machine level integer op and finally box result
• We’ll be talking about alternative representations
  later
                   VVM Example
(fun (x y) (* (+ x 1) y))     (SHALLOW-ARGUMENT-REF 0 R0)
=>                            (CONSTANT 1 R1)
(SHALLOW-ARGUMENT-REF 0 R0)   (UNBOX-INT R0 R2)
(CONSTANT 1 R1)               (UNBOX-INT R1 R3)
(I+ R0 R1 R2)                 (PI+ R2 R3 R4)
(SHALLOW-ARGUMENT-REF 1 R3)   (BOX-INT R4 R5)
(I* R2 R3 R4)                 (SHALLOW-ARGUMENT-REF 1 R6)
(RETURN R4)                   (UNBOX-INT R5 R7)
                              (UNBOX-INT R6 R8)
                              (PI* R7 R8 R9)
                              (BOX-INT R9 R10)
                              (RETURN R9)
                R/VVM’s
• Provides extensible VM infrastructure
• Permits the running of many different
  instruction sets customized to different
  languages
• Examples
  – Greg Sullivan’s DVM
  – Ian Piumarta’s VVM
        Bytecodes can be Faster
         than Direct Execution
• When
  – There is a high price for cache misses
  – VM can fit inside cache
  – The interpretive overhead of the bytecodes is small
    (i.e., they are high-level)
• But
  – There will be cases where interpretive overhead is high
• Suggests hybrid approach
  – Demand driven translation
  – Analysis determines interpretive overhead
                In C VM’s
+   Portable
-   Hard to interoperate with native code
-   Needs own versions of special forms
-   Needs its own FFI to C
         In Language VM’s
+ Write VM in Same Language
- Challenging to get high performance
    type system can be overly restrictive
- Tough Bootstrap
                 IDVM
• In Dylan Virtual Machine was developed at
  Harlequin by Tony Mann and Eliot Miranda
• Consider IPVM
                          Example
(dm is-numeric? ((x <int>)) #t)

=>
(dm IPVM-assign-constant-to-result
    ((vm-code <vec>) (ip <int>) result (vars …))
  (let ((new-result (elt vm-code ip))
        (new-ip     (+ ip 1))
        (next-inst (elt vm-code new-ip)))
    (apply next-inst vm-code (+ new-ip 1) new-result vars)))

(dm IPVM-return ((vm-code <vec>) (ip <int>) result (vars …))
  result)

(fun ((x <int>))
  (let ((ip 0)
        (vm-code (vec IDVM-assign-constant-to-result #f IDVM-return))
        (first-inst (elt vm-code ip)))
    (first-inst vm-code (+ ip 1) #f a)))
                     The Trick
• Considerable effort was expending optimizing the
  following idiom:
   (dm foo (a b c (thingies …))
     …
     (apply somefunction a b c thingies))


• Converted into a tail call with thingies being stack
  allocated
• Each IPVM instruction tail calls the next with
  code, pc, and locals as arguments.
           Open Problems
• Compiler for Free with VVM’s
• Hybrid VM’s
• Type system’s powerful enough to do in
  language VM’s
• Analyzable VM
                             Readings
•   Queinnec: Lisp In Small Pieces
•   Deutsch, Schiffman: Efficient Implementation of the Smalltalk-80 System
•   Piumarta: Optimizing threaded code by selected inlining
•   Doyle, Moss: When are bytecodes faster than direct execution?
•   Withington, McKay, Palter, Robertson: The symbolics virtual lisp machine or
    using the DEC alpha as a programmable micro-engine
•   Mann, Miranda: A Dylan virtual machine in Dylan or son of Dylan Compiler
    in Dylan
•   Folliot, Piumarta, Riccardi: A dynamically configurable multi-language
    execution platform
•   Miranda: BrouHaHa - a portable smalltalk interpreter
•   Ford et al: microkernels meet recursive virtual machines
•   Chambers, Unger: Making pure object-oriented languages practical
•   Ingalls: The Evolution of the Smalltalk Virtual Machine
             Assignment #2
• Write a meta circular interpreter for proto
• Implement all the special forms
• Assume (i.e., use)
  – the object system
  – the reader and parsing functions
  – the special forms themselves

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:9
posted:5/6/2010
language:English
pages:53