Run Time Storage Organization by klutzfu54

									Run Time Storage Organization

           Chapter 7


• We have covered the front-end phases
  – Lexical analysis
  – Parsing
  – Semantic analysis
• The back-end phases are:
  – Optimization (optional)
  – Code generation (we’ll cover some basic issues)
• We’re almost ready for a look at code
  generation. . .
  – but need to consider the run time environment

Run-time environments   (Section 7.1)

• Before discussing code generation, we need to
  understand what kind of memory environment
  we need to provide to support typical program

• This also depends on the language being
  supported and what features are supported
  – e.g. recursion

Run-time Resources

• Execution of a program is initially under the
  control of the operating system

• When a program is invoked:
  – The OS allocates space for the program
  – The code is loaded into part of the space
  – The OS jumps to the entry point (i.e., “main”)

Memory Layout

(Figure: memory drawn as a column from Low Address at the top to
High Address at the bottom, with a region labeled “Other Space”.)

• Our pictures of machine organization have:
  – Low address at the top
  – High address at the bottom
  – Lines delimiting areas for different kinds of data

• These pictures are simplifications
  – E.g., not all memory need be contiguous
     • Unix "text" and "data" segments may be allocated in
       completely different parts of memory

“Other Space”:

• Holds all data for the program
• Other Space = Data Space

• Compiler is responsible for:
  – Generating code
  – Orchestrating use of the data area

Code Generation Goals

• Two goals:
  – Correctness - essential
  – Speed - desirable

• Most complications in code generation come
  from trying to be fast as well as correct
  – There are simple code generation schemes that
    produce correct but sub-optimal code.

Assumptions about Execution

1) Execution is sequential; control moves from
   one point in a program to another in a well-
   defined order
2) When a procedure is called, control
   eventually returns to the point immediately
   after the call
Do these assumptions always hold?
• Not always: setjmp/longjmp in C and exceptions (“try”) in Java

• An invocation of procedure P is an activation
  of P
• The lifetime of an activation of P is
  – All the steps to execute P
  – Including all the steps in functions, procedures or
    methods that P calls
     • since P can call other procedures, P remains “active”
         until they have returned and P itself terminates

Lifetimes of Variables

• The lifetime of a variable x is the portion of
  execution in which x is defined
• The local variables of P are a special case
  – since the same procedure can have multiple
    activations, there can be multiple instances of the
    same variables, associated with each activation
• Note that
  – Lifetime is a dynamic (run-time) concept
  – Scope is a static concept

Stacks and Activations

1. Execution Assumption (2) (control returns to
   P after Q completes) implies that when P calls
   Q, then Q returns before P does
2. Lifetimes of procedure activations are
   properly nested
3. The variables and other data that go with the
   activations can be organized on a stack

Revised Memory Layout

(Figure: the revised memory layout, drawn from Low Address to High Address.)

Activation Records

• The information needed to manage one
  procedure activation is called an activation
  record (AR) or frame

• If procedure F calls G, then G’s activation
  record contains a mix of info about F and G.

What is in G’s AR when F calls G?

• F is “suspended” until G completes, at which point F
  will resume. G’s AR contains information needed to
  correctly resume execution of F.

• G’s AR generally also contains:
   – G’s return value (needed by F)
      • (unless it is returned in a register)
   – Actual parameters to G (supplied by F)
      • (unless they are passed in registers)
   – Space for G’s local variables

The Contents of a Typical AR for G

• Space for G’s return value
• Actual parameters
• Pointer to the previous activation record
  – (The control link); points to AR of caller of G
     • purpose is to restore caller’s environment when G returns

• Machine status prior to calling G
  – Contents of registers & program counter
• Local variables
• Other temporary values

int g(void); int f(int x);   /* declared before use */
int main () { int i; i=f(3); }
int g() { return 1; }
int f(int x) {
  if (x == 0) return g(); else return f(x - 1); }

AR for f:        argument (x)
                 control link
                 return address

Stack After Two Calls to f

       AR for f:   (result)
                   3 (arg)
                   CL
                   (RA)      points to a location in the code for main
       AR for f:   (result)
                   2
                   CL
                   (RA)      points to a location in the code for f()

• main() has no arguments or local variables and
  its result is never used; its AR is not shown
• The (RA) are return addresses of the
  invocations of f
  – The return address is where execution resumes
    after a procedure call finishes

• This is only one of many possible AR designs
  – Would work for C, Pascal, FORTRAN, Java, etc.

Key Point

 The compiler must determine, at compile-time,
 the layout of activation records and generate
 code that correctly accesses locations in the
 activation record

 Thus, the AR layout and the code generator
 must be designed together

 This picture shows the state after the call to
 the 2nd invocation of f returns.

 (Figure: the stack with the two ARs for f; the result slot of
 the second AR holds 1.)

 • The return value is located so that the caller can
   find it at a fixed offset from its own frame.
 • The RA is also easily found, for easy return.

• There is nothing magic about this organization
  – Can rearrange order of frame elements
  – Can divide caller/callee responsibilities differently
  – An organization is better if it improves execution
    speed or simplifies code generation
• Real compilers hold as much of the frame as
  possible in registers
  – Especially the method result and arguments

Memory Layout with Stack (contains AR’s)

(Figure: memory layout with the stack of ARs, drawn from Low Address
to High Address.)

• All references to a “global” (or “static”)
  variable point to the same object
  – Can’t store them in an activation record

• Globals are assigned a fixed address once
  – Variables with fixed address are “statically allocated”
• Depending on the language, there may be
  other statically allocated values

Memory Layout with Static Data

(Figure: memory from Low Address to High Address, with a region
labeled “Static Data”.)

Heap Storage

• A value that outlives the method that creates
  it can’t be kept in the activation record
        method foo() { return new Bar }
  The Bar value must survive deallocation of foo’s AR
• Languages with dynamically allocated data use
  a heap to store dynamic data
  – Heap management quite different from stack
  – Need dynamic list of free blocks, etc.
  – Garbage collector (Java)
  – Fragmentation a problem

Summary

• The code area contains object code
  – For most languages, fixed size and read only
• The static area contains data (not code) with
  fixed addresses (e.g., global data)
  – Fixed size, read-only (constants) or R/W
• The stack contains an AR for each currently
  active procedure
  – Each AR usually fixed size, computed by compiler,
    (contains local variables)
• Heap contains all other data
  – In C, heap is managed by malloc and free
     • depends strongly on run-time behavior of program

Summary (Cont.)

• Both the heap and the stack grow

• Must take care that they don’t grow into each other

• Solution: start heap and stack at opposite
  ends of memory and let them grow towards
  each other

Memory Layout with Heap

(Figure: memory from Low Address to High Address, with regions labeled
“Static Data” and “Heap”; the heap is drawn at the high-address end.)

Other Implementation Issues

• Generally the compiler does not generate all
  the code that will be executed at run time
• Much functionality is provided by run-time
  libraries of support code
  – Memory management, file access, other input-
    output interfaces, thread control, etc.
  – Compiler must generate code to be linked to these
  – Operating systems often define calling conventions
    to support multiple languages
     • which compilers must follow

Case Study: “WABA” Implementation of JVM

• To show one view of a “real” activation record,
  we’ll look at what happens in one version of
  the Java Virtual Machine when:
  – a method starts running
  – arguments are pushed prior to a new invoke
  – invoke is executed setting up a new activation

• Remember this is an interpreted machine, not
  anything like MIPS…

Activation Record Structure for a Running
Method on JVM

     var    local 0
            local 1
            local n
   stack    stack 1
            stack 2
            stack m
            method ptr
            class ptr

• var and stack are “machine” registers.
• Housekeeping links are also on the stack but
  outside any area addressable by JVM instructions

Ready To Invoke Method With 2 Arguments

     var  local 0
           local 1
           local n
           arg 1
           arg 2
   stack  stack m
           method ptr
           class ptr

After Invoke of New Method

              local 0
              local 1
              local n
              stack 1
              stack 2
              stack m
              method ptr
              class ptr
              saved var
              saved stack
      var     arg 1 (local 0)
              arg 2 (local 1)
              local 3
     stack    stack 1
              stack 2
              stack m
              method ptr
              class ptr

• Notice arguments copied into new AR
• Saved stack pointer reflects removal of arguments from
  caller’s AR
• “Control link” (purple) is required info to restore
  caller’s AR upon return
• PC points to bytecode after the invoke
• Only local vars and local stack visible to programmer

JVM Notes

• JVM is a "software" machine
• Formal parameters are rearranged by
  "Invoke" to occupy the first n "local vars"
• Fits the JVM philosophy well:
  – Can take advantage of special "short" instructions
    to access the first m locals
  – similar handling of formals and locals
  – Overheads of moving words around not a relative
    performance problem
• Next: Code generation for MIPS
