                    HOT METHOD PREDICTION USING SUPPORT VECTOR MACHINES

                                       Sandra Johnson, Dr. S. Valli
         Department of Computer Science and Engineering, Anna University, Chennai – 600 025, India.
                             sandra_johnk@yahoo.com, valli@annauniv.edu


                                                   ABSTRACT
               Runtime hot method detection is an important parameter in dynamic compiler
               optimization, and it has challenged researchers to explore and refine techniques
               that address the expensive profiling overhead incurred in the process. Although
               the recent trend has been toward the application of machine learning heuristics
               in compiler optimization, their role in the identification and prediction of hot
               methods has been ignored. The aim of this work is to develop a model, using the
               machine learning algorithm the Support Vector Machine (SVM), to identify and
               predict the hot methods in a given program, to which the best set of
               optimizations could then be applied. When trained with ten static program
               features, the derived model predicts hot methods with an appreciable 62.57%
               accuracy.

               Keywords: Machine Learning, Support Vector Machines, Hot Methods, Virtual
               Machines.



1   INTRODUCTION

     Optimizers depend on profile information to identify the hot methods of program segments. The major inadequacy associated with dynamic optimization techniques is the high cost of accurate data profiling via program instrumentation. The major challenge is how to minimize this overhead, which includes profile collection, optimization strategy selection and re-optimization.
     While there is a significant amount of work on cost-effective and performance-efficient machine learning (ML) techniques to tune individual optimization heuristics, relatively little has been done on the identification and prediction of frequently executed program hot spots using machine learning algorithms, so as to target the best set of optimizations at them. This study proposes a machine learning based predictive model built with the Support Vector Machine (SVM) classifier. Ten features have been derived from the chosen domain knowledge for training and testing the classifier. The training data set is collected from the SPEC CPU2000 INT and UTDSP benchmark programs. The SVM classifier is trained offline with the training data set and is then used to predict the hot methods of programs on which it was not trained. The system is evaluated on its hot method prediction accuracy.
     This paper is structured as follows. Section 2 discusses related work. Section 3 gives a brief overview of Support Vector Machines. Section 4 describes this approach and Section 5 the evaluation methodology. Section 6 presents the results of the evaluation. Section 7 proposes future work and concludes the paper.

2   RELATED WORK

     Machine learning techniques are currently used to automate the construction of well-designed individual optimization heuristics. In addition, the search is on for automatic detection of program segments for targeted optimization. While, to the best of our knowledge, no previous work has used ML for predicting program hot spots, this section reviews the research papers which use ML for compiler optimization heuristics.
     In a recent review of research on the challenges confronting dynamic compiler optimizers, Arnold et al. [1] give a detailed account of the adaptive optimizations used in virtual machine environments. They conclude that feedback-directed optimization techniques are not well used in production systems.
     Shun Long et al. [3] have used an instance-based learning algorithm to identify the best transformations for each program. For each optimized program, a database stores the transformations selected, the program features and the resulting speedup. The aim is to apply appropriate transformations when a new program is encountered.
     Cavazos et al. [4] have applied an offline ML technique to decide whether or not to inline a method. The adaptive system uses online profile data to identify "hot methods", and method calls in the hot methods are inlined using the ML heuristics.


                     Ubiquitous Computing and Communication Journal                                            1
     Cavazos et al. [5, 12] have also used supervised learning to decide which optimization algorithm to use for register allocation: either graph coloring or linear scan. They have used three categories of method-level features for the ML heuristics: features of the edges of the control flow graph, features related to live intervals and, finally, statistical features about the size of a method.
     Cavazos et al. [11] report that the best compiler optimization is method dependent rather than program dependent. Their paper describes how a logistic regression-based machine learning technique, trained using only the static features of a method, is used to automatically derive a simple predictive model that selects the best set of optimizations for individual methods within a dynamic compiler. They take the structure of a particular method within a program into consideration to develop a sequence of optimization phases. The automatically constructed regression model is shown to out-perform hand-tuned models.
     To identify basic blocks for instruction scheduling, Cavazos et al. [20] have used supervised learning. Monsifrot et al. [2] have used a decision tree learning algorithm to identify loops for unrolling. Most of this work [4, 5, 11, 12, 20] is implemented and evaluated using the Jikes RVM.
     The authors of [8, 19] have used genetic programming to choose an effective priority function which prioritizes the various compiler options available. They have chosen hyper-block formation, register allocation and data pre-fetching for evaluating their optimizations.
     Agakov et al. [9] have applied machine learning to speed up search-based iterative optimization. The statistical technique of Principal Component Analysis (PCA) is used in their work for appropriate program feature selection. The program features collected off-line from a set of training programs are used for learning by the nearest neighbor algorithm. Features are then extracted for a new program and are processed by the PCA before they are classified using the nearest neighbor algorithm. This reduces the search space from the various available source-level transformations to a few good transformations for the new program. However, this model can be applied only to whole programs.
     The authors of [10] present a machine learning-based model to predict the performance of a modified program using static source code features, as well as features such as the execution frequencies of basic blocks, which are extracted from the collected profile data. As proposed in [9], the authors have used the PCA to reduce the feature set. A linear regression model and an artificial neural network model are used for building the prediction model, which is shown to work better than non-feature-based predictors.
     In their work, Fursin et al. [14] have used machine learning to identify the best procedure clone for the current run of the program. M. Stephenson et al. [18] have used two machine learning algorithms, the nearest neighbor (NN) and Support Vector Machines (SVMs), to predict the loop unroll factor. None of these approaches aims at prediction at the method level. However, machine learning has been widely used in work on branch prediction [21, 22, 23, 24].

3   SUPPORT VECTOR MACHINES

     The SVM [15, 16] classification maps training data (xi, yi), i = 1, …, n, where each instance is a set of feature values xi ∈ R^n with a class label yi ∈ {+1, -1}, into a higher-dimensional feature space φ(x) and defines a separating hyperplane. The SVM is a binary classifier, so only two classes of data can be separated. Fig. 1 shows a linear SVM hyperplane separating two classes.
     The linear separation in the feature space is done using the dot product φ(x)·φ(y). Positive definite kernel functions k(x, y) correspond to feature space dot products and are therefore used in the training algorithm in place of the dot product, as in Eq. (1):

     k(x, y) = φ(x) · φ(y)                                  (1)

The decision function given by the SVM is given in Eq. (2):

     f(x) = Σ_{i=1}^{n} vi k(x, xi) + b                     (2)

where b is a bias parameter, the xi are the training examples and the vi are the solution to a quadratic optimization problem. The margin of separation extending from the hyperplane gives the solution of the quadratic optimization problem.

[Figure: a linear SVM hyperplane in feature space, showing the optimal hyperplane and the margin of separation between the two classes]

Figure 1: Optimal hyperplane and margin of separation

4   HOT METHOD PREDICTION

     This section briefly describes how machine
learning could be used in developing a model to predict the hot methods within a program. A discussion of the approach is followed by the scheme of the SVM-based strategy adopted in this study.

[Figure: system architecture diagram]

Figure 2: System architecture of the SVM-based hot method predictive model

4.1    The approach
     Static features of each method in a program are collected by offline program analysis. Each of these method-level feature sets forms a feature vector, which is labeled either hot or cold based on classification by a prior execution of the program. The training data set thus generated is used to train the SVM-based predictive model. Next, the test data set is created by offline program analysis of a newly encountered program. The trained model is used to predict whether a method of the new program is hot or cold. An offline analysis of the Low Level Virtual Machine's (LLVM) [6] bytecode representation of the programs provides the training as well as the test data set. The system architecture of the SVM-based hot method predictive model is shown in Fig. 2; it closely resembles the architecture proposed by C. L. Huang et al. [26]. Fig. 3 outlines the strategy for building the predictive model.

1.    Create training data set.
     a. Collect method level features.
         i.   Calculate the specified features for every
              method in an LLVM bytecode module.
         ii.  Store the feature set in a vector.
     b. Label each method.
         i.   Instrument each method in the program
              with a counter variable [25].
         ii.  Execute the program and collect the
              execution frequency of each method.
         iii. Using the profile information, label each
              method as either hot or cold.
         iv.  Write the label and its corresponding
              feature vector for every method in a file.
     c. Steps (a) & (b) are repeated for as many
        programs as are required for training.
2.    Train the predictive model.
     a. The feature data set is used to train the
        SVM-based model.
     b. The predictive model is generated as output.
3.    Create test data set.
     a. Collect method level features.
         i.   Calculate the specified features for every
              method in a new program.
         ii.  Store the feature set in a vector.
         iii. Assign the label '0' to each feature
              vector in a file.
4.    Predict the label as either hot or cold for the test
      data generated in step 3, using the predictive
      model derived in step 2.

Figure 3: System outline

4.2     Extracting program features
     The 'C' programs used for training are converted into LLVM bytecodes using the LLVM frontend. Every bytecode file is organized into a single module. Each module contains methods which are either user-defined or pre-defined. Only the static features of the user-defined methods are extracted from the bytecode module, for the simple reason that they can be easily collected by an offline program analysis. Table 1 lists the 10 static features that are used to train the classifier. Each feature value of a method is calculated in relation to the identical feature value extracted from the entire bytecode module. The collection of all the feature values for a method constitutes the feature vector xi. This feature vector xi is stored for subsequent labeling. Each feature vector xi is then labeled yi and classified as either hot
(+1) or cold (-1), based on an arbitrary threshold scheme described in the next section.

Table 1: Static features for identifying hot methods.

  1.  Number of loops in a method.
  2.  Average loop depth of all the loops in the method.
  3.  Number of top level loops in a method.
  4.  Number of bytecode level instructions in the method.
  5.  Number of Call instructions in a method.
  6.  Number of Load instructions in a method.
  7.  Number of Store instructions in a method.
  8.  Number of Branch instructions in the method.
  9.  Number of Basic Blocks in the method.
  10. Number of call sites for each method.

4.3     Extracting method execution frequencies
     Hot methods are frequently executing program segments. To identify the hot and cold methods within a training program, profile information is gathered during execution. The training bytecode modules are instrumented with a counter variable in each user-defined method. The instrumented bytecode module is then executed and the execution frequency of each method is collected. Using this profile information, the top 'N' most frequently executed methods are classified as hot. This system keeps the value 'N' as the "hot method threshold". In this scheme of classification, each feature vector (xi) is now labeled yi (+1) for hot methods and yi (-1) for cold methods. The feature vector (xi) along with its label (yi) is then written into a training data set file. Similarly, the training data sets of the different training programs are accumulated in the file. This file is used as the input to train the predictive model.

+1 1:1 2:1 3:1 4:0.880046 5:2.51046 6:0.875912 7:0.634249 8:1.23119 9:1.59314 10:29
-1 1:0 2:0 3:0 4:1.16702 5:1.25523 6:1.0219 7:3.38266 8:1.50479 9:1.83824 10:2
+1 1:2 2:2 3:2 4:1.47312 5:0.83682 6:1.89781 7:1.47992 8:2.59918 9:2.81863 10:3

Figure 4: Sample training data set

     The general format of a feature vector is

          yi 1:xi1 2:xi2 3:xi3 … j:xij

where 1, 2, 3, …, j are the feature numbers and xi1, xi2, …, xij are the corresponding feature values. Fig. 4 shows a sample of three feature vectors from the training data set collected for the user-defined methods found in a SPEC benchmark program. The first feature vector in Fig. 4 corresponds to a hot method and is labeled +1. The values of the ten features are listed serially; for example, '1' is the value of feature 1 and '29' that of feature 10. The value '1' of feature 1 indicates the percent of loops found in the method. The "hot method threshold" used being 50%, 4 out of the 8 most frequently executed methods in a program are designated as hot. The first element in each vector is the label yi (+1 or -1); each subsequent element gives a feature number followed by the feature value.

4.4    Creating test data set
     When a new program is encountered, the test data set is collected in a way similar to the training data set, except that the label is specified as zero.

0 1:1 2:1 3:1 4:1.13098 5:2.91262 6:2.05479 7:1.09091 8:1.34875 9:1.55172 10:34
0 1:0 2:0 3:0 4:0.552341 5:0.970874 6:1.14155 7:0.363636 8:0.385356 9:0.862069 10:4
0 1:1 2:1 3:1 4:1.26249 5:0 6:2.51142 7:2.90909 8:1.15607 9:1.2069 10:40

Figure 5: Sample test data set

4.5    Training and prediction using SVM
     Using the training data set file as input, the SVM machine learning algorithm is trained with default parameters (C-SVM, C=1, radial basis function). Once trained, the predictive model is generated as output. The derived model is used to predict the label for each feature vector in the test data set file. The training and prediction are done offline. Subsequently, the new program used for creating the test data set is instrumented. Executing this instrumented program provides the most frequently executed methods. The prediction accuracy of the system is evaluated by comparing the predicted output with the actual profile values.

5     EVALUATION

5.1    Method
     Prediction accuracy is defined as the ratio of events correctly predicted to all the events encountered. This prediction accuracy is of two types: hot method prediction accuracy and total prediction accuracy. Hot method prediction accuracy is the ratio of correct hot method predictions to the actual number of hot methods in a program, whereas total prediction accuracy is the ratio of correct predictions (either hot or cold) to the total number of methods in a program. Hot method prediction accuracy is evaluated at three hot method threshold levels: 50%, 40% and 30%.
     The leave-one-out cross-validation method is used in evaluating this system. This is a standard machine learning technique where 'n' benchmark programs are used iteratively for evaluation. One of the 'n' programs is used for testing and the remaining 'n-1' programs are used for training the model. This is repeated for all the 'n' programs in the benchmark
suite.
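     The evaluation loop just described (hold out one program, train on the remaining 'n-1', score hot method and total prediction accuracy on the held-out one) can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function name leave_one_out is ours, the feature values in any input are invented, and scikit-learn's SVC is used as a convenient stand-in for the libSVM tool, configured with the same defaults reported in Section 4.5 (C-SVM, C=1, RBF kernel).

```python
from sklearn.svm import SVC

def leave_one_out(programs):
    """Leave-one-out cross-validation over benchmark programs.

    programs maps a program name to (X, y): X is a list of per-method
    feature vectors and y the profile-derived labels (+1 hot, -1 cold).
    Returns {program: (hot_accuracy, total_accuracy)}.
    """
    results = {}
    for held_out in programs:
        # Train on the methods of every program except the held-out one.
        X_train, y_train = [], []
        for name, (X, y) in programs.items():
            if name != held_out:
                X_train += X
                y_train += y
        model = SVC(C=1.0, kernel="rbf")  # C-SVM, C=1, RBF kernel
        model.fit(X_train, y_train)

        X_test, y_test = programs[held_out]
        pred = model.predict(X_test)
        hot = [i for i, lbl in enumerate(y_test) if lbl == +1]
        # Hot method accuracy: correct hot predictions / actual hot methods.
        hot_acc = sum(pred[i] == +1 for i in hot) / len(hot)
        # Total accuracy: correct predictions (hot or cold) / all methods.
        total_acc = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
        results[held_out] = (hot_acc, total_acc)
    return results
```

With 'n' benchmark programs this trains 'n' models, one per held-out program; the per-program accuracies can then be averaged per hot method threshold, as in the results reported below.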
5.2    Benchmarks
     Two benchmark suites, SPEC CPU2000 INT [17] and UTDSP [13], have been used for training and prediction. UTDSP is a C benchmark suite, and SPEC CPU2000 INT has C and C++ benchmarks. Evaluation of the system is based on only the C programs of either benchmark. The model trained on the 'n-1' benchmark programs in the suite is used to predict the hot methods in the left-out benchmark program.

5.3    Tools and platform
     The system is implemented in the Low Level Virtual Machine (LLVM) version 1.6 [6]. LLVM is an open source compiler infrastructure that supports compile time, link time, run time and idle time optimizations. The results are evaluated on an Intel(R) Pentium(R) D at 2.80 GHz with 480 MB of RAM, running Fedora Core 4. This system uses the libSVM tool [7], a simple library for Support Vector Machines written in C.

[Figure: bar chart "Hot Method Prediction Accuracy": prediction accuracy (%) for each SPEC CPU2000 INT benchmark at the 50%, 40% and 30% hot method thresholds]

[Figure: bar chart "Total Method Prediction Accuracy": prediction accuracy (%) for each SPEC CPU2000 INT benchmark at the 50%, 40% and 30% hot method thresholds]

Figure 7: Total prediction accuracy on the SPEC CPU2000 INT benchmark

     The total method prediction accuracy on the SPEC CPU2000 INT and UTDSP benchmark suites is shown in Figs. 7 and 9. The total method prediction accuracy for all C programs of SPEC CPU2000 INT varies from 36% to 100%, with averages of 68.43%, 71.14% and 71.14% for the three hot method thresholds respectively; this averages to 70.24%. The average prediction accuracies obtained on the UTDSP benchmark suite are 69%, 71% and 58% respectively for the 50%, 40% and 30% hot method thresholds; this averages to 66%. Overall, the system predicts both hot and cold methods in a program with 68.15% accuracy.
                             60

                             40                                              7                             CONCLUSION AND FUTURE WORK
                             20

                              0
                                                                                  Optimizers depend on profile information to
                                                                             identify the hot methods of program segments. The
                                                2
                                                p




                                              cf




                                                x
                                              er
                                                c




                                               e
                                                f
                                             ol
                                            ip
                                            gc




                                           rte
                                             zi




                                           ag
                                            m




                                                                             major inadequacy associated with the dynamic
                                           rs




                                         tw
                                         bz
                                 g

                                         6.


                                         1.




                                        vo
                                        pa




                                        er
                              4.




                                      0.
                                      6.
                                     17


                                     18




                                    Av
                            16




                                     5.
                                     7.




                                   30
                                   25




                                                                             optimization technique is the high cost of accurate
                                  25
                                  19




                                             SPEC CPU2000 INT
                                                                             data profiling via program instrumentation. In this
Figure 6: Hot method prediction accuracy on the                              work, a method is worked out to identify hot
SPEC CPU2000 INT benchmark                                                   methods in a program using the machine learning
                                                                             algorithm, the SVM. According to our study, with a
6                            RESULTS                                         set of ten static features used in training the system,
                                                                             the derived model predicts total methods within a
     Fig. 6 shows the prediction accuracy of the                             program with 68.15% accuracy and hot methods
trained model on the SPEC CPU2000 INT                                        with 62.57% accuracy. However, hot method
benchmark program at three different hot method                              prediction is of greater value because optimizations
thresholds: 50%, 40% and 30%. The hot method                                 will be more effective in these methods.
prediction accuracy for all C programs on the                                     Future work in this area is aimed at improving
benchmark is found to vary from 0 % to 100 % with                            the prediction accuracy of the system by identifying
an average of 57.86 %, 51.43% and 39.14% for the                             more effective static and dynamic features of a
three hot method thresholds respectively. This                               program. Further research in this system can be
averages to 49.48% on the SPEC CPU2000 INT                                   extended to enhance it to a dynamic hot method
benchmark suite. Similarly, on the UTDSP                                     prediction system which can be used by dynamic
benchmark suite, in a 0% to 100% range, the hot                              optimizers. Applying this approach, the prediction
method prediction accuracy averages for the three                            accuracy of the other machine learning algorithms
thresholds are 84%, 81% and 62% respectively.                                can be evaluated to build additional models.
This averages to 76% on the UTDSP benchmark
suite. Overall, this new system can obtain 62.57%
hot method prediction accuracy.



                                              Ubiquitous Computing and Communication Journal                                                                 5
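As a quick sanity check, the per-threshold averages reported in Section 6 can be reproduced with a few lines of arithmetic; the per-threshold accuracy figures below are taken directly from the text, ordered as (50%, 40%, 30%) hot method thresholds:

```python
# Per-threshold prediction accuracies (%) reported in Section 6.
hot_specint = [57.86, 51.43, 39.14]    # hot method accuracy, SPEC CPU2000 INT
hot_utdsp = [84, 81, 62]               # hot method accuracy, UTDSP
total_specint = [68.43, 71.14, 71.14]  # total (hot + cold) accuracy, SPEC CPU2000 INT
total_utdsp = [69, 71, 58]             # total accuracy, UTDSP

def mean(values):
    """Arithmetic mean over the three threshold settings."""
    return sum(values) / len(values)

print(round(mean(hot_specint), 2))    # 49.48 (matches the quoted suite average)
print(round(mean(hot_utdsp)))         # 76
print(round(mean(total_specint), 2))  # 70.24
print(round(mean(total_utdsp)))       # 66
```

The overall figures (62.57% hot, 68.15% total) are not the plain mean of the two suite averages, so they are presumably weighted by the number of programs or methods per suite.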
[Figure 8 is a bar chart of hot method prediction accuracy (%) for each UTDSP benchmark program at the 50%, 40% and 30% hot method thresholds.]
Figure 8: Hot method prediction accuracy on the UTDSP benchmark.

[Figure 9 is a bar chart of total prediction accuracy (%) for each UTDSP benchmark program at the 50%, 40% and 30% hot method thresholds.]
Figure 9: Total Prediction Accuracy on the UTDSP benchmark.
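The classifier behind these results was trained with the libSVM tool mentioned in Section 5.3. The paper does not show its training input, but libSVM consumes examples in a well-known sparse text format, one example per line: a label followed by `index:value` pairs. A hypothetical encoding of one method's ten static features (the feature values and their meanings below are illustrative, not taken from the paper) might look like:

```python
# Hypothetical sketch: encoding methods as libSVM training rows.
# Label +1 = hot method, -1 = cold method.

def to_libsvm_row(label, features):
    """Format one example in libSVM's sparse text format:
    '<label> 1:<f1> 2:<f2> ...' with 1-based feature indices."""
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label:+d} {pairs}"

# Two made-up methods with ten static feature values each
# (e.g. counts of branches, loops, calls, instructions, ...).
hot_method = [12, 4, 3, 250, 7, 2, 1, 38, 5, 9]
cold_method = [1, 0, 0, 15, 1, 0, 0, 3, 0, 1]

print(to_libsvm_row(+1, hot_method))
# +1 1:12 2:4 3:3 4:250 5:7 6:2 7:1 8:38 9:5 10:9
print(to_libsvm_row(-1, cold_method))
# -1 1:1 2:0 3:0 4:15 5:1 6:0 7:0 8:3 9:0 10:1
```

A file of such rows can be fed directly to libSVM's `svm-train` command-line tool; the resulting model is then queried per method at prediction time.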

8    REFERENCES

[1] Matthew Arnold, Stephen Fink, David Grove, Michael Hind, and Peter F. Sweeney: A Survey of Adaptive Optimization in Virtual Machines, Proceedings of the IEEE, pp. 449-466, February 2005.
[2] A. Monsifrot, F. Bodin, and R. Quiniou: A machine learning approach to automatic production of compiler heuristics, In Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, Applications, LNCS 2443, pp. 41-50, 2002.
[3] S. Long and M. O'Boyle: Adaptive java optimization using instance-based learning, In ACM International Conference on Supercomputing (ICS'04), pp. 237-246, June 2004.
[4] John Cavazos and Michael F.P. O'Boyle: Automatic Tuning of Inlining Heuristics, 11th International Workshop on Compilers for Parallel Computers (CPC 2006), January 2006.
[5] John Cavazos, J. Eliot B. Moss, and Michael F.P. O'Boyle: Hybrid Optimizations: Which Optimization Algorithm to Use?, 15th International Conference on Compiler Construction (CC 2006), 2006.
[6] C. Lattner and V. Adve: LLVM: A compilation framework for lifelong program analysis & transformation, In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), March 2004.
[7] Chih-Chung Chang and Chih-Jen Lin: LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[8] M. Stephenson, S. Amarasinghe, M. Martin, and U. M. O'Reilly: Meta optimization: Improving compiler heuristics with machine learning, In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'03), pp. 77-90, June 2003.
[9] F. Agakov, E. Bonilla, J. Cavazos, G. Fursin, B. Franke, M.F.P. O'Boyle, M. Toussant, J. Thomson, and C. Williams: Using machine learning to focus iterative optimization, In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 295-305, 2006.
[10] Christophe Dubach, John Cavazos, Björn Franke, Grigori Fursin, Michael O'Boyle, and Olivier Temam: Fast compiler optimization evaluation via code-features based performance predictor, In Proceedings of the ACM International Conference on Computing Frontiers, May 2007.
[11] John Cavazos and Michael O'Boyle: Method-Specific Dynamic Compilation using Logistic Regression, ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), Portland, Oregon, October 22-26, 2006.
[12] John Cavazos: Automatically Constructing Compiler Optimization Heuristics using Supervised Learning, Ph.D. thesis, Dept. of Computer Science, University of Massachusetts, 2004.
[13] C. Lee: UTDSP benchmark suite, http://www.eecg.toronto.edu/~corinna/DSP/infrastructure/UTDSP.html, 1998.
[14] G. Fursin, C. Miranda, S. Pop, A. Cohen, and O. Temam: Practical run-time adaptation with procedure cloning to enable continuous collective compilation, In Proceedings of the 5th GCC Developers' Summit, Ottawa, Canada, July 2007.
[15] V.N. Vapnik: The support vector method of function estimation, In Generalization in Neural Network and Machine Learning, Springer-Verlag, pp. 239-268, 1999.
[16] S. Kotsiantis: Supervised Machine Learning: A Review of Classification Techniques, Informatica Journal 31, pp. 249-268, 2007.
[17] The Standard Performance Evaluation Corporation, http://www.specbench.org.
[18] M. Stephenson and S.P. Amarasinghe: Predicting unroll factors using supervised classification, In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 123-134, 2005.
[19] M. W. Stephenson: Automating the Construction of Compiler Heuristics Using Machine Learning, PhD thesis, MIT, USA, 2006. Available at www.cag.csail.mit.edu/~mstephen/stephenson_phdthesis.pdf.
[20] J. Cavazos and J. Moss: Inducing heuristics to decide whether to schedule, In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2004.
[21] B. Calder, D. Grunwald, Michael Jones, D. Lindsay, J. Martin, M. Mozer, and B. Zorn: Evidence-Based Static Branch Prediction Using Machine Learning, ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 19, 1997.
[22] Daniel A. Jiménez and Calvin Lin: Neural methods for dynamic branch prediction, ACM Transactions on Computer Systems (TOCS), Vol. 20, No. 4, pp. 369-397, November 2002.
[23] Jeremy Singer, Gavin Brown, and Ian Watson: Branch Prediction with Bayesian Networks, In Proceedings of the First Workshop on Statistical and Machine Learning Approaches Applied to Architectures and Compilation, pp. 96-112, January 2007.
[24] B. Culpepper and M. Gondre: SVMs for Improved Branch Prediction, ECS201A Technical Report, University of California, Davis, USA, 2005.
[25] Youfeng Wu and J. R. Larus: Static branch frequency and program profile analysis, In Proceedings of the 27th Annual International Symposium on Microarchitecture (MICRO-27), pp. 1-11, 1994.
[26] C.-L. Huang and C.-J. Wang: A GA-based feature selection and parameters optimization for support vector machines, Expert Systems with Applications, Vol. 31, Issue 2, pp. 231-240, 2006.