High-Level Language Virtual Machine Architecture by wuyunqing


A Framework for Reducing the Cost of Instrumented Code

    2005, 12, 7
    Choonki Jang

             Advanced Compiler Research Lab.   1

   Collecting profiling information
       Incurs execution overhead
       This overhead has prevented many known offline feedback-directed
        optimizations from being used in online systems.

   Framework of this paper
       General instrumentation sampling to reduce the overhead

   JIT compilation
       relied on simple static strategies for choosing compilation targets.
       compiled each method with a fixed set of optimizations.
   Adaptive systems
       dynamically select a subset of all methods for optimization.
       attempt to focus optimization effort on program hot spots.
   Offline profile-based optimizations
       rely on instrumenting the code to collect detailed information
       instrumentation causes performance degradation

In an online system

   Needs both performance and instrumentation.
       Can execute instrumented code for some period.
       After instrumenting, optimize.

   Overhead of instrumentation
       forces profiling to be limited to a short time.
       requires a way to stop instrumented code from executing.

   The framework offers
       Smaller performance degradation, longer period to instrument
       Tunable
           easy to trade accuracy against speed
       Multiple types of instrumentation can be used simultaneously
       Independent of hardware and OS.
       Deterministic

Sampling Framework

   To achieve both performance and expensive instrumentation
       Duplicated code
           contains the instrumented code
           executes for only a small and bounded amount of time
       Checking code
           original code, modified to swap back and forth between the
            checking code and the duplicated code.
           where the majority of execution occurs

Modified Instrumented Method

    Checking code:   low overhead
    Duplicated code: high overhead

    To allow swapping back and forth between the checking code and the
    duplicated code in a fine-grained and controlled manner, the
    checking code is modified slightly.

   Trigger using timer interrupt
       can be a problem when sampling at the level of basic blocks or
       Using counter
   Compiler-inserted counter-based sampling
       Counts a particular event until a threshold value is reached.
       A conditional branch monitors the sample condition.
       For multi-threaded applications,
           Race conditions are a concern.
           Use a thread- or process-specific counter
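The thread-specific counter suggested above can be sketched as follows. This is only an illustration in Python, not the JVM implementation discussed in these slides; the names ThreadLocalSampler, check, reset_value, and run_worker are hypothetical. Each thread keeps its own counter, so no two threads ever update the same counter and no locking is needed for the check itself.

```python
import threading


class ThreadLocalSampler:
    """Counter-based sampling trigger with one counter per thread (hypothetical sketch)."""

    def __init__(self, reset_value):
        self.reset_value = reset_value
        self._local = threading.local()  # each thread sees its own attributes

    def check(self):
        """Return True when the calling thread should take a sample."""
        # A thread that has not run a check yet starts from reset_value.
        counter = getattr(self._local, "counter", self.reset_value)
        take = counter <= 0
        if take:
            counter = self.reset_value
        self._local.counter = counter - 1
        return take


def run_worker(sampler, events, results, name):
    # Record how many samples this thread takes over `events` checks.
    results[name] = sum(sampler.check() for _ in range(events))


sampler = ThreadLocalSampler(reset_value=100)
results = {}
threads = [
    threading.Thread(target=run_worker, args=(sampler, 1000, results, i))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the counters are independent, each thread takes the same number of samples regardless of how the threads are interleaved.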

Sampling code (check)

        if( globalCounter <= 0 ) {
             takeSample(); globalCounter = resetValue;
        }
        globalCounter--;

  The counter variable (globalCounter) will most likely be in the cache.
  The conditional branch (to takeSample) is predictable as not taken.

   The performance overhead should be low.
    Experimental result: 4.9% overhead
     • checks placed at all method entries and all back edges
     • no samples were taken; the overhead is only for executing the checks.
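The behaviour of the check can be illustrated with a small simulation. This is a sketch under assumptions, written in Python rather than as compiler-inserted code: the names SamplingCheck, take_sample, and simulate are hypothetical, and the counter is assumed to be decremented once per executed check.

```python
class SamplingCheck:
    """Models the counter-based check: sample when the counter reaches zero."""

    def __init__(self, reset_value):
        self.reset_value = reset_value
        self.counter = reset_value
        self.checks_executed = 0
        self.samples_taken = 0

    def take_sample(self):
        # Stand-in for jumping into the instrumented (duplicated) code.
        self.samples_taken += 1

    def check(self):
        self.checks_executed += 1
        if self.counter <= 0:       # rarely true, so the branch predicts well
            self.take_sample()
            self.counter = self.reset_value
        self.counter -= 1


def simulate(num_checks, reset_value):
    """Execute `num_checks` checks and report (checks executed, samples taken)."""
    sampler = SamplingCheck(reset_value)
    for _ in range(num_checks):
        sampler.check()
    return sampler.checks_executed, sampler.samples_taken


checks, samples = simulate(10_000, 1000)
```

A larger reset value means fewer samples for the same number of checks, which is the knob that trades accuracy against overhead.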

   All of the code in the method is duplicated.
   Checks are placed at
       method entry
       all back edges
       to ensure
           Only a bounded amount of time is spent in the duplicated code
           All code has the opportunity to be sampled

Property 1

   The number of checks executed in the checking code is
    less than or equal to the number of back edges and
    method entries executed, independent of the
    instrumentation being performed.
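Property 1 can be checked on a toy model of Full-Duplication. The Python sketch below simulates one call of a method containing a single loop, with checks at method entry and on the back edge. All names (make_counter_trigger, run_method, the stats keys) are hypothetical, and the model assumes that a back edge taken from the duplicated code returns to the checking code without executing a check.

```python
def make_counter_trigger(reset_value):
    """Return a counter-based trigger that fires once the counter reaches zero."""
    counter = reset_value

    def trigger():
        nonlocal counter
        fire = counter <= 0
        if fire:
            counter = reset_value
        counter -= 1
        return fire

    return trigger


def run_method(iterations, trigger):
    """Simulate one method call with a single loop under Full-Duplication."""
    stats = {"entries": 1, "back_edges": 0, "checks": 0, "instrumented": 0}

    def check():
        stats["checks"] += 1
        return trigger()

    in_duplicated = check()                 # check at method entry
    for i in range(iterations):
        if in_duplicated:
            stats["instrumented"] += 1      # loop body runs in duplicated code
        if i + 1 < iterations:              # a back edge is taken
            stats["back_edges"] += 1
            if in_duplicated:
                in_duplicated = False       # assumed: return to checking code, no check
            else:
                in_duplicated = check()     # check on the checking-code back edge
    return stats


stats = run_method(100, make_counter_trigger(10))
```

In this model at most one loop iteration runs in the duplicated code per sample, which is the bounded-time guarantee, and the number of checks never exceeds entries plus back edges.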

Drawbacks of Full-Duplication

   Increased compile time
       Measured increase: 34%
       Duplication is performed after most of the compilation has taken place.
   Increased space
   Duplicated code has no effect on locality

Space saving

  It is not always possible to remove a non-instrumented
  node from the duplicated code without violating Property 1

Variation for space saving
1. Partial-Duplication
   Goal
       Remove as many non-instrumented basic blocks from the
        duplicated code as possible without violating Property 1
       Bottom-node
           Any non-instrumented node, n, in the duplicated code DAG such
            that no instrumented nodes are reachable from n.
           Bottom-nodes can be removed without violating Property 1
       Top-node
           Any non-instrumented node, n, in the duplicated code DAG such
            that no path from entry to n contains an instrumented node.
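The two definitions can be turned into a small classification routine. The Python sketch below is illustrative only: the graph encoding (a dict from node to successor list), the function names, and the example DAG are assumptions, and every node is assumed to be reachable from the entry node.

```python
def reachable(graph, starts):
    """Nodes reachable from any node in `starts` by following at least one edge."""
    seen = set()
    stack = list(starts)
    while stack:
        node = stack.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def classify_nodes(dag, instrumented):
    """Return (top_nodes, bottom_nodes) of a duplicated-code DAG."""
    nodes = set(dag) | {s for succs in dag.values() for s in succs}
    reverse = {n: [] for n in nodes}
    for node, succs in dag.items():
        for succ in succs:
            reverse[succ].append(node)

    after_instr = reachable(dag, instrumented)       # some path to them passes instrumentation
    before_instr = reachable(reverse, instrumented)  # an instrumented node is reachable from them
    plain = nodes - set(instrumented)
    top_nodes = plain - after_instr
    bottom_nodes = plain - before_instr
    return top_nodes, bottom_nodes


# Hypothetical DAG: only node "B" carries instrumentation.
dag = {
    "entry": ["A"],
    "A": ["B", "D"],
    "B": ["C"],
    "C": ["exit"],
    "D": ["exit"],
    "exit": [],
}
top, bottom = classify_nodes(dag, {"B"})
```

Note that a node can be both a top-node and a bottom-node, as "D" is here, when it lies on a path that never touches instrumentation.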

   All bottom-nodes can be removed from the duplicated code without
   violating Property 1.

   Top-nodes can also be removed, with two adjustments to the checks:
   1. In the checking code, all checks that branch to a top-node
      should be removed.
   2. For every edge in the duplicated code that previously connected a
      top-node to an instrumented node, the corresponding edge in the
      checking code should have a check added.
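The two check adjustments can be expressed over a set of edges. The Python sketch below is a hypothetical encoding, not the authors' implementation: a check is represented as the checking-code edge (src, dst) it sits on, with (None, entry) standing for the method-entry check, and a check on (src, dst) is assumed to branch to the duplicated copy of dst. The example graph, the back edge from "C" to "A", and the set of top-nodes are made up for illustration.

```python
def adjust_checks(dag, checks, top_nodes, instrumented):
    """Apply the two rules for removing top-nodes from the duplicated code."""
    # Rule 1: drop every check that would branch to a (removed) top-node.
    kept = {(src, dst) for (src, dst) in checks if dst not in top_nodes}
    # Rule 2: add a check on each edge leading from a top-node
    # to an instrumented node.
    added = {
        (src, dst)
        for src in top_nodes
        for dst in dag.get(src, ())
        if dst in instrumented
    }
    return kept | added


# Hypothetical method: a loop with back edge C -> A, only "B" instrumented.
dag = {
    "entry": ["A"],
    "A": ["B", "D"],
    "B": ["C"],
    "C": ["exit"],
    "D": ["exit"],
    "exit": [],
}
original_checks = {(None, "entry"), ("C", "A")}   # method entry and back edge
top_nodes = {"entry", "A", "D"}
new_checks = adjust_checks(dag, original_checks, top_nodes, {"B"})
```

In this example both original checks targeted top-nodes and disappear, and a single check on the edge from "A" to "B" replaces them, so control enters the duplicated code only immediately before the instrumented node.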

Variation for space-saving
2. No-Duplication
   Weakens Property 1
       Allows more than one check to be executed per loop
        iteration or method call.
       By guarding all instrumentation operations with checks,
        duplication is not needed.
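A minimal sketch of No-Duplication, assuming a counter-based trigger: every instrumentation operation is wrapped in its own check, so no second copy of the code is required. The Python names below (make_counter_trigger, run_loop_no_duplication, ops_per_iteration) are hypothetical.

```python
def make_counter_trigger(reset_value):
    """Return a counter-based trigger that fires once the counter reaches zero."""
    counter = reset_value

    def trigger():
        nonlocal counter
        fire = counter <= 0
        if fire:
            counter = reset_value
        counter -= 1
        return fire

    return trigger


def run_loop_no_duplication(iterations, ops_per_iteration, trigger):
    """Run a loop whose body holds several individually guarded instrumentation ops."""
    checks = 0
    instrumentation_executed = 0
    for _ in range(iterations):
        for _ in range(ops_per_iteration):
            checks += 1                         # one check guards each operation
            if trigger():
                instrumentation_executed += 1   # the guarded instrumentation runs
    return checks, instrumentation_executed


checks, executed = run_loop_no_duplication(100, 3, make_counter_trigger(10))
```

With three guarded operations per iteration the loop executes three checks per iteration, which is exactly the weakening of Property 1 described above: space is saved at the price of more checks.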

   [Figure legend: non-instrumented instruction, instrumented instruction]

   Environment
       Jalapeño JVM developed at IBM T.J. Watson Research Center
           Two compilers
               Non-optimizing baseline compiler
               Optimizing compiler
   SPECjvm98 benchmark

Experiment Result
Time overhead

   Measured without using the framework
   Call-edge       : All method entries are instrumented
   Field-access    : A counter is maintained for each field of all classes

Experiment Result
Framework overhead (Full Duplication)

 No loop unrolling
 Only non-aggressive static inlining heuristics
 Naïve checking implementation (Not hardware, OS or JVM specific)

Experiment Result
Framework overhead (No Duplication)

        No samples are taken.
        Only overhead is the cost of executing the checks

Experiment Result
Sample overhead and accuracy

   The authors describe the sampling as fine-grained and controllable

Experiment Result
Sampling Overhead

   Compared to non-instrumented code
       Similar to the total overhead of Full-Duplication

Experiment Result
Accuracy according to Trigger

      Using Full-Duplication with field-access instrumentation
      For approximately the same number of samples:
            * timer interval     : 10ms
            * counter interval   : 3000
