Precise and Efficient Garbage Collection in VMKit with MMTk

       LLVM Developer’s Meeting
           Nicolas Geoffray

        nicolas.geoffray@lip6.fr
                   Background

•   VMKit: Java and .NET virtual machines on top of LLVM
    •   Uses LLVM’s JIT for executing code

    •   Uses the Boehm collector for GC

•   Performance bottlenecks
    •   No dynamic optimization

    •   Conservative GC




                   CPU-intensive Benchmarks (JGF)

[Figure: bar chart of execution time normalized to VMKit (y-axis from 0 to 5.00) for series, lufact, heapsort, crypt, fft, sor and matmult, comparing VMKit, Jikes and Sun. Platform: MacOSX 10.5, 8 x X86_32, 2.66GHz, 12 GB.]
                   VM-intensive Benchmarks (Dacapo)

[Figure: bar chart of execution time normalized to VMKit (y-axis from 0 to 1.50) for antlr, fop, luindex, bloat, jython and pmd, comparing VMKit, Jikes and Sun. Platform: MacOSX 10.5, 8 x X86_32, 2.66GHz, 12 GB.]
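         Execution Overheads

[Figure: six pie charts, one per Dacapo benchmark, with the legend Application, Allocations, Collections, System.arraycopy, Interface calls, Others. Slice sizes per chart, largest first: antlr 65%, 11%, 10%, 7%, 7%; fop 75%, 9%, 8%, 6%, 2%; luindex 73%, 12%, 8%, 5%, 2%; bloat 53%, 18%, 17%, 7%, 5%; jython 49%, 24%, 12%, 11%, 4%; pmd 38%, 20%, 20%, 17%, 4%, 1%.]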
         Execution Overheads

[Figure: pie charts for antlr, fop, luindex, bloat, jython and pmd (legend: Application, Allocations, Collections, System.arraycopy, Interface calls, Others), with two adjacent slices of each chart grouped into a combined share: antlr 21%, fop 14%, luindex 20%, bloat 35%, jython 23%, pmd 37%.]
Goal: replace Boehm with MMTk

•   MMTk is Jikes RVM’s GC toolkit
    •   Framework for writing GCs

    •   Multiple GC implementations (copying, mark-and-sweep, Immix)

•   Copying collectors require precise stack scanning

    •   Locate pointers on the stack




              But... it’s in Java?

•   Yes, but nothing to be afraid of:
    •   Use of Magic tricks

    •   No use of runtime features (exceptions, inheritance)

    •   No use of standard library

•   Use VMKit’s AOT compiler

    •   Transform MMTk into a .bc file



                    Outline

•   Introduction

•   Precise garbage collection

•   Compiling MMTk with VMJC

•   Putting it all together

•   What’s left




Precise Garbage Collection

•   Write code that locates pointers in the stack
    •   llvm.gcroot in JIT-generated code

    •   llvm.gcroot in VMKit’s runtime written in C++

•   Use LLVM’s GC framework to generate stack
    maps
    •   Caml stack maps for llvm-g++ generated code

    •   JIT stack maps for JIT-generated code
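
A stack map can be pictured as a table from safe points to the frame slots that hold object pointers. The sketch below is only an illustration of that idea: the class StackMapSketch, its methods and the choice of return addresses as keys are assumptions made here, not the layout that LLVM or VMKit actually emits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a stack map; not LLVM's or VMKit's real data layout.
final class StackMapSketch {
    // For each safe point (keyed by its return address), the frame-relative
    // offsets of the slots that hold object pointers.
    private final Map<Long, int[]> rootOffsetsBySafePoint = new HashMap<>();

    // Records the root slots that are live at one safe point.
    void register(long returnAddress, int... rootOffsets) {
        rootOffsetsBySafePoint.put(returnAddress, rootOffsets.clone());
    }

    // Returns the absolute addresses of the root slots of one frame.
    // A frame without an entry contributes no roots.
    List<Long> rootsOf(long returnAddress, long framePointer) {
        List<Long> roots = new ArrayList<>();
        int[] offsets = rootOffsetsBySafePoint.get(returnAddress);
        if (offsets == null) {
            return roots;
        }
        for (int offset : offsets) {
            roots.add(framePointer + offset);
        }
        return roots;
    }
}
```

With such a table, a collector that knows a frame's return address and frame pointer can enumerate exactly the slots it has to visit, which is what makes the scan precise rather than conservative.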




Precise Garbage Collection

[Diagram: llvm-g++ compiles VMKit.cpp into VMKit.exe with a Caml stack map; App.java becomes the JITted App with a JIT stack map. Both stack maps are used for precise stack scanning.]
                 Stack Scanning

•   Problem: interweaving of different kinds of functions
    •   Application’s managed (Java or C#) functions: trusted

    •   VMKit’s C++ functions: trusted

    •   Application’s JNI functions: untrusted

•   Solution: create a side-stack for frame addresses
    •   Updated on each transition into a different kind of method

    •   VMKit knows the kind of each frame on the thread stack



             Types of methods

•   Trusted
    •   Has a stack map, so can manipulate objects (llvm.gcroot)

    •   Saves frame pointer (llvm::NoFramePointerElim)

•   Untrusted
    •   Has no stack map, so should not manipulate objects

    •   May not save the frame pointer




   Stack Scanning Example (1)
                               VMKit.main (C++)
                     fp
push(Enter Java)
                               App.main (Java)
                     fp
                               App.function (Java)
push(Enter native)   fp
                               VMKit.runtime (C++)
push(Enter Java)     fp
                               App.function2 (Java)


   Stack Scanning Example (2)
                                     VMKit.main (C++)
                          fp
push(Enter Java)
                          fp         App.main (Java)
push(Enter JNI)
                                     App.function (JNI)

                     saved fp        App.function2 (JNI)

push(Enter native)
                                     VMKit.jniRuntime (Java)
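
The side-stack idea from the two examples can be modeled in a few lines. This is a minimal sketch under stated assumptions: SideStackSketch, Kind and Transition are hypothetical names, frames are represented by their names instead of frame addresses, function returns are not modeled, and an entry is also pushed for the very first frame.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the side-stack idea; names and layout are illustrative only.
final class SideStackSketch {
    enum Kind { NATIVE, JAVA, JNI } // NATIVE stands for VMKit's C++ runtime

    // One side-stack entry: the kind being entered and the index of its first frame.
    static final class Transition {
        final Kind kind;
        final int firstFrame;
        Transition(Kind kind, int firstFrame) {
            this.kind = kind;
            this.firstFrame = firstFrame;
        }
    }

    private final List<String> frames = new ArrayList<>();        // stands in for the thread stack
    private final List<Transition> sideStack = new ArrayList<>();

    // Calls a function; a side-stack entry is pushed only when the kind changes.
    void call(String name, Kind kind) {
        if (sideStack.isEmpty() || sideStack.get(sideStack.size() - 1).kind != kind) {
            sideStack.add(new Transition(kind, frames.size()));
        }
        frames.add(name);
    }

    // Walks from the most recent frame to the oldest and returns the frames
    // that may be scanned, skipping every run of untrusted JNI frames.
    List<String> scannableFrames() {
        List<String> result = new ArrayList<>();
        int end = frames.size();
        for (int t = sideStack.size() - 1; t >= 0; t--) {
            Transition tr = sideStack.get(t);
            if (tr.kind != Kind.JNI) {
                for (int f = end - 1; f >= tr.firstFrame; f--) {
                    result.add(frames.get(f));
                }
            }
            end = tr.firstFrame;
        }
        return result;
    }
}
```

Replaying Example (2) with this model (VMKit.main, App.main, two JNI functions, then VMKit.jniRuntime) yields VMKit.jniRuntime, App.main and VMKit.main as the frames to scan: the JNI frames are stepped over without ever being walked.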



       Running the GC


 A precise GC scans the stacks at safe points: points
during execution where the GC knows the type
            of each value on the stack




Single-threaded Application


•   GC always triggered at safe points
     •   gcmalloc instructions

     •   Collector::collect()




Multi-threaded Application

•   When entering a GC, must wait for all threads to
    reach a safe point
    •   Don’t use signals: a signal can stop a thread outside a safe point

    •   Use a thread-local variable to poll on method entry and
        backward branches

    •   Scan stacks of threads blocked in JNI or system calls




Application changes for GC


 public static void runLoop(int a) {
      while (a-- > 0) System.out.println("Hello World");
 }




Application changes for GC
 public static void runLoop(int a) {
      if (getThreadID().doGC) GC();       // poll on method entry
      while (a-- > 0) {
            System.out.println("Hello World");
            if (getThreadID().doGC) GC(); // poll on the backward branch
       }
 }
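
The polling scheme can be tried out on a stock JVM. The following is a minimal sketch, assuming a single shared flag instead of the thread-local variable mentioned in the slides; SafePointSketch, poll, collect and demo are hypothetical names, and the "collection" is only a counter increment.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of cooperative safe points; all names are illustrative.
final class SafePointSketch {
    private final Object lock = new Object();
    private volatile boolean doGC = false; // flag polled by the mutators
    private int parked = 0;                // mutators currently waiting at a safe point
    private int collections = 0;

    // Called by mutators on method entry and on backward branches.
    void poll() throws InterruptedException {
        if (!doGC) {
            return;                        // fast path: one flag check
        }
        synchronized (lock) {
            parked++;
            lock.notifyAll();              // tell the collector one more thread has arrived
            while (doGC) {
                lock.wait();               // stay parked until the collection ends
            }
            parked--;
        }
    }

    // Called by the collecting thread: waits until all mutators are parked.
    void collect(int mutators) throws InterruptedException {
        synchronized (lock) {
            doGC = true;
            while (parked < mutators) {
                lock.wait();
            }
            collections++;                 // every mutator is at a safe point here
            doGC = false;
            lock.notifyAll();
        }
    }

    // Demo: two mutators loop and poll while the caller triggers one collection.
    static int demo() {
        SafePointSketch sp = new SafePointSketch();
        AtomicBoolean stop = new AtomicBoolean(false);
        Runnable mutator = () -> {
            try {
                while (!stop.get()) {
                    sp.poll();             // poll on every loop iteration
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread a = new Thread(mutator);
        Thread b = new Thread(mutator);
        a.start();
        b.start();
        try {
            sp.collect(2);
            stop.set(true);
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sp.collections;
    }
}
```

The collector never interrupts a mutator: it only raises the flag and waits, so every thread stops at a point where it chose to poll.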


                What is VMJC?

•   An Ahead of Time compiler (AOT)
    •   Generates .bc files from .class files

•   Uses LLVM tools to generate platform-dependent
    files
    •   shared library: llc -relocation-model=pic + gcc

    •   executable: llc + ld vmkit + gcc




Goal: compile MMTk with VMJC

•   Generate a .bc file that can be linked with VMKit
    •   Interface MMTk → VMKit (e.g. thread synchronization, stack
        scanning)

    •   Interface VMKit → MMTk (e.g. gcmalloc)




Why doesn’t MMTk need a Java runtime?

•   No use of runtime features
    •   synchronization, exceptions, inheritance

•   No use of standard library
    •   HashMap, LinkedList, ArrayList




How does MMTk manipulate pointers?

•   Definition of Magic classes and methods
    •   Address, Word, Offset

    •   Word Address.loadWord(Offset)

•   Magic classes and methods translated by the
    compiler [VEE’09]
    •   Similar mechanism than Inline ASM for C




   Example (Frampton [VEE’09])

  Inline ASM in C

      void prefetchObjects(OOP *buffer, int size) {
        for (int i = 0; i < size; i++) {
          OOP o = buffer[i];
          asm volatile("prefetchnta (%0)" :: "r" (o));
        }
      }

  Magic in Java

      @NoBoundsCheck
      void prefetchObjects(ObjectReference[] buffer) {
        for (int i = 0; i < buffer.length; i++) {
          ObjectReference current = buffer[i];
          current.prefetch();
        }
      }
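
To get a feel for how code written against magic types reads, the types can be emulated on a stock JVM. The sketch below is an assumption-laden stand-in: in MMTk the compiler recognizes and translates these types, whereas here Address, Word and Offset are ordinary classes backed by a ByteBuffer, and the store method is added only so the demo has something to load.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-ins for magic types, emulated over a ByteBuffer.
// A magic-aware compiler would replace these calls with raw memory accesses.
final class MagicSketch {
    // A small buffer standing in for raw memory.
    static final ByteBuffer MEMORY = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);

    static final class Offset {
        final int value;
        Offset(int value) { this.value = value; }
    }

    static final class Word {
        final long bits;
        Word(long bits) { this.bits = bits; }
    }

    static final class Address {
        final int raw; // index into MEMORY, standing in for a machine address
        Address(int raw) { this.raw = raw; }

        // Mirrors the shape of Word Address.loadWord(Offset) from the slides.
        Word loadWord(Offset offset) {
            return new Word(MEMORY.getLong(raw + offset.value));
        }

        // Store counterpart, assumed here for the sake of the demo.
        void store(Word word, Offset offset) {
            MEMORY.putLong(raw + offset.value, word.bits);
        }
    }
}
```

For example, storing new MagicSketch.Word(0xCAFEL) at offset 16 from new MagicSketch.Address(8) and loading it back through loadWord returns the same bits, without the calling code ever touching a Java array or object field.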



        Option 1: Object File


•   Create a .o file of MMTk
    •   gcc mmtk.o vmkit.o -o vmkit

•   But...
    •   No inlining in application code




Option 2: LLVM Bitcode File

 •   Create a .bc file of MMTk
     •   vmkit (-load mmtk.bc) -java HelloWorld

 •   Late binding of allocations in VMKit code
     •   gcmalloc calls in C++ code are linked at runtime

 •   Inlining in Java code
     •   new in application code is inlined with MMTk’s malloc




Option 3: Everything is Bitcode


  •   Create a .bc file of MMTk

  •   Create a .bc file of VMKit

  •   Link, optimize and run




                    What’s left

•   Implementing the MMTk → VMKit interface
    •   Interactions between the GC and the VM

•   Finish implementation with read/write barriers
    •   In VMKit code, in managed code

•   Run benchmarks!
    •   Benchmark with different GCs from MMTk




http://vmkit.llvm.org




								