					                                  Research Statement
                                         David Ryan Koes

Research Philosophy
My research objective is to study problems where thorough scientific investigation and novel in-
sights may yield a practical real-world solution. I believe that academic research cannot and should
not be solely the province of basic research that seeks only to expand the boundaries of knowledge
and understanding without concern for potential real-world application. Many real-world prob-
lems have at their heart an intellectually challenging research problem. These problems will not
be solved in industry by developers and project managers. Instead, thorough scientific inquiry and
novel insights that advance the field of computer science are needed.
    I believe it is essential that the connection to the real-world context of the problem be main-
tained. Although in the process of finding a successful solution it is useful to solve a problem
abstraction, in order for a solution to be relevant, it must translate to a practical solution. For example,
computer systems research should be performed using a system that is as close to production
quality as possible. The extra effort of system building and maintenance is the cost of ensuring that
the research solution directly maps to a real-world solution that can have an immediate meaningful
impact on everyday life.
    Applied research is substantially different from product development. The goal is not simply to
hack out an acceptable solution, but to fully investigate and understand the nature of the problem
and derive a solution that is quantitatively and qualitatively better. Applied research should provide
solutions to real-world problems and expand the boundaries of scientific knowledge.
    In my thesis work I have tackled fundamentally challenging problems in a production-quality
system. The approach I take is to rethink the basic representation of the problem. I replace ad-hoc
heuristic methods with an expressive and complete model of the problem and then derive effective
solution techniques for this new problem representation. I find that this principled approach is
more intellectually satisfying than ad-hoc approaches and that it generates superior results.

Thesis Work
My research interests have focused on solving the problems confronted when designing the
tools used to build computer systems. Although this is not a high-profile research area, improving
development tools can have a widespread impact and there are many deep and challenging
research problems to pursue. In particular, I find compiler design to be especially interesting.
Compiler design is uniquely positioned at the intersection of many disciplines of computer sci-
ence: theory, algorithms, programming languages, software engineering, computer systems, and
computer architecture are all essential ingredients in the design of a compiler. Compilers are a
ubiquitous and essential tool, and improvements to compiler technology have a broad impact on

Research Statement 1/2009                          1                                     David Ryan Koes
the entire field of computer science. Compiler design has been extensively studied with many
noteworthy successes. For example, compiler research directly led to advances in automata theory
and the development of tools like YACC. My goal is to bring the same level of theoretical rigor
and practical completeness that resulted in such advances in the frontend to the code generating
and optimizing backend of the compiler.
    Backend optimizations are a critical part of any optimizing compiler. These optimizations are
responsible for fully exploiting the complex and varied features of modern architectures. Solving
backend optimization problems is algorithmically challenging as they are typically NP-complete.
Historically, the predominant approach to solving these problems has been an amalgamation of
ad-hoc heuristics. In my thesis I developed principled approaches for understanding, evaluating,
and solving the key backend optimization problems of register allocation and instruction selection.
    Register allocation is a critical part of any optimizing compiler. The register allocator is re-
sponsible for finding a desirable assignment of program variables to hardware registers and mem-
ory locations. The quality of register allocation has a substantial impact upon the resulting code
quality, both when optimizing for performance and when optimizing for code size. When optimiz-
ing for performance, an effective register allocation minimizes memory traffic and decreases the
sensitivity of a program to the processor-memory gap. When optimizing for code size, a metric of
primary importance in resource-constrained embedded computing, an effective register allocation
minimizes the amount of overhead introduced by efficiently using addressing modes and managing
data movement instructions. Register allocation is a complex and challenging problem for which,
despite decades of study, no satisfactory solutions exist.
    In my thesis I developed a principled approach to the register allocation problem [3, 5]. Existing
register allocators are ad-hoc heuristics; they contain no formal notion of the underlying problem
that is being solved. In contrast, I developed an expressive and complete model of the problem.
Using an extended version of multi-commodity network flow, my model is expressive enough
to exactly represent all of the pertinent components of the register allocation problem. For any
statically determinable code quality metric, an optimal solution to this model directly corresponds
to an optimal register allocation. I used my expressive and complete model of register allocation
both as a tool to explore the nature of the problem and as a means to solve the register allocation
problem.
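    To make the notion of an optimal allocation concrete, the following minimal sketch exhaustively
searches a toy version of the problem. It is not the multi-commodity network flow model described
above: the function name, the interference pairs, and the spill costs are all hypothetical, and the
only cost modeled is a fixed per-variable penalty for placing a variable in memory.

```python
from itertools import product


def optimal_allocation(variables, interferences, spill_cost, registers):
    """Exhaustively search a toy register allocation problem (hypothetical model).

    variables:     list of variable names
    interferences: pairs of variables that are live at the same time
    spill_cost:    dict mapping each variable to the cost of keeping it in memory
    registers:     list of available register names
    """
    locations = list(registers) + ["mem"]
    best_cost, best_assignment = None, None
    for choice in product(locations, repeat=len(variables)):
        assignment = dict(zip(variables, choice))
        # Two simultaneously live variables may not share a register.
        if any(assignment[a] == assignment[b] != "mem" for a, b in interferences):
            continue
        cost = sum(spill_cost[v] for v in variables if assignment[v] == "mem")
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Three mutually interfering variables competing for two registers.
cost, assignment = optimal_allocation(
    variables=["a", "b", "c"],
    interferences=[("a", "b"), ("a", "c"), ("b", "c")],
    spill_cost={"a": 5, "b": 1, "c": 3},
    registers=["r0", "r1"],
)
print(cost, assignment)
```

    The search space grows exponentially with the number of variables, which is why an exact
approach of this kind is useful for study but not for production compilation.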
    I created an optimal register allocator by combining my model with a commercial integer linear
programming package. I then used this optimal allocator to perform a comprehensive investigation
into the nature of the register allocation problem [1, 2]. This investigation empirically examined
the importance and separability of several components of register allocation when targeting several
distinct architectures and code quality metrics. The results of this study gave rise to several general
design principles for register allocator design.
    Because the register allocation problem is NP-complete, an optimal register allocator is
not practical for real-world application, so I developed progressive solution techniques that
use my model to quickly find a decent allocation that is competitive with existing allocators and
then, as more time is allowed for compilation, find a progressively better allocation. My solu-
tion techniques have the additional advantage of generating a meaningful optimality bound that
describes how close the current allocation is to the optimal allocation. This approach changes the
nature of the interaction between the programmer and the compiler. The programmer can now
explicitly and interactively manage the trade-off between compile time and code quality.
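    The progressive behavior can be sketched as an anytime search loop. This is an illustrative
assumption, not the thesis solver: progressive_search, its parameters, and the externally supplied
lower_bound are hypothetical, and a real allocator would derive the bound from its model rather
than receive it as an argument.

```python
import time


def progressive_search(candidates, cost, lower_bound, budget_seconds):
    """Keep the best candidate seen until the time budget runs out.

    Returns (best, best_cost, gap), where gap = best_cost - lower_bound
    bounds how far the returned solution can be from optimal.
    """
    deadline = time.monotonic() + budget_seconds
    best, best_cost = None, float("inf")
    for candidate in candidates:
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
        # Stop once the solution is provably optimal or the budget is spent.
        if best_cost <= lower_bound or time.monotonic() >= deadline:
            break
    return best, best_cost, best_cost - lower_bound


# Candidates stand in for successively generated allocations; here each
# candidate is simply its own cost.
quick = progressive_search([7, 5, 3, 4], lambda c: c, lower_bound=3, budget_seconds=0.0)
patient = progressive_search([7, 5, 3, 4], lambda c: c, lower_bound=3, budget_seconds=10.0)
print(quick, patient)
```

    With no time budget the loop returns the first candidate along with a nonzero gap; with a
generous budget it stops as soon as the gap reaches zero.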
    In addition to register allocation, I developed a principled approach to solving the fundamen-
tal backend compiler optimization of instruction selection. The instruction selection problem is
to find an efficient mapping from the target-independent intermediate representation of a program
to a target-specific assembly representation. Instruction selection is particularly important when
targeting architectures with complex instruction sets, such as the Intel x86 architecture. In these
architectures there are several possible implementations of the same operation, each with differ-
ent costs. The instruction selection problem has been successfully solved when the intermediate
representation of the compiler is in a tree form. However, as I prove in my thesis, when the repre-
sentation is in the form of a directed acyclic graph the problem becomes NP-hard. As a result, the
current best instruction selection algorithm for DAG representations is a limited heuristic peephole
matcher.
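    The tree case can be illustrated with the classic dynamic programming approach to tiling an
expression tree. The sketch below is a generic textbook-style example, not the NOLTIS algorithm
and not code from the thesis: the tile set, the costs, and the tuple encoding of trees are
hypothetical. A "_" in a pattern stands for any subtree whose result is already in a register.

```python
from functools import lru_cache

# Hypothetical tiles: (pattern, cost).
TILES = [
    (("reg",), 0),                              # value already in a register
    (("const",), 1),                            # load an immediate
    (("add", "_", "_"), 1),                     # register + register
    (("add", "_", ("const",)), 1),              # register + immediate
    (("load", "_"), 1),                         # load from address in register
    (("load", ("add", "_", ("const",))), 1),    # load with base + offset addressing
]


def match(pattern, tree):
    """Return the subtrees bound to wildcards, or None if the tile does not fit."""
    if pattern == "_":
        return [tree]
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    bound = []
    for sub_pattern, sub_tree in zip(pattern[1:], tree[1:]):
        sub_bound = match(sub_pattern, sub_tree)
        if sub_bound is None:
            return None
        bound.extend(sub_bound)
    return bound


@lru_cache(maxsize=None)
def min_cost(tree):
    """Minimum total tile cost to compute the tree into a register."""
    best = float("inf")
    for pattern, cost in TILES:
        leaves = match(pattern, tree)
        if leaves is not None:
            best = min(best, cost + sum(min_cost(leaf) for leaf in leaves))
    return best


address = ("add", ("reg",), ("const",))
print(min_cost(("load", address)))  # the single base + offset tile covers everything
```

    Each subtree here has exactly one parent, so its cost is counted once. In a DAG a shared node
may be covered by tiles rooted at several parents, and this independent per-subtree minimization
no longer applies directly.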
    As an alternative to a peephole matcher heuristic, I developed the Near Optimal Linear-Time
Instruction Selection (NOLTIS) algorithm [4]. The NOLTIS algorithm simply and elegantly ex-
tends existing tree-based instruction selection algorithms. Unlike the peephole matcher heuristic,
which only maintains a local notion of costs, my NOLTIS algorithm seeks to minimize the total
overall cost of the instruction selection. Although the NOLTIS algorithm is not guaranteed to find
an optimal solution to the NP-hard instruction selection problem, I demonstrated empirically
that the algorithm almost always produces an optimal result, and it does so in worst-case linear
time.

Future Work
Although my thesis concludes with several open problems and new directions for exploration that
I am interested in pursuing, I do not envision a research career spent solely in the area of backend
compiler optimization. I am interested in any computer science problem where thorough scientific
investigation and novel insights may yield a practical real-world solution. There are several areas
of computer science that I am particularly interested in: compilers, embedded systems, operat-
ing systems, networks, and computational biology. However, parallel programming, effectively
harnessing the power of multi-core processors, is an area outside of my thesis that I plan to imme-
diately begin exploring.
    The parallel programming challenge is both deep and broad and will require significant ad-
vances in computer science at all levels of the software stack. The specific aspect of this challenge
that I plan to research is the efficient mapping of parallel work onto real systems. Many techniques
exist for identifying parallelism. Language design may make parallelism explicit, or sophisticated
analysis may extract parallelism from legacy serial code. However, the existence of parallel work
is not sufficient to result in performance gains on parallel hardware. There are complex trade-offs
between resource constraints and communication overheads that have to be properly managed to
achieve parallelization benefit.
    I believe the same general approach I used to tackle the register allocation problem can be
used on this effective parallelization problem. That is, formulate an expressive and complete representation
of the problem and then develop effective solution techniques for this model. As a
starting point, I would work in the context of decoupled software pipelining (DSWP) [6]. DSWP
has been shown to produce substantial performance improvements for legacy serial programs on simulated
parallel hardware with low-overhead communication primitives. However, DSWP is not
successful when applied to existing shared-memory multi-core architectures [7]. I believe that I
can develop an expressive and complete model of the trade-off between communication overhead
and parallelization benefit in the context of DSWP. This model, when coupled with effective solu-
tion techniques, would enable a significant amount of auto-parallelization on existing, unmodified
multi-core architectures.
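    The trade-off between communication overhead and parallelization benefit can be illustrated
with a deliberately simple pipeline cost model. This is an assumption made for illustration only,
not the model proposed above and not taken from the DSWP literature: every name and number is
hypothetical, each stage is assumed to run on its own core, and each value crossing a stage
boundary is assumed to cost both the producer and the consumer a fixed amount of time.

```python
def pipeline_speedup(stage_work, values_crossing, comm_cost):
    """Estimate steady-state speedup of a pipelined loop over its serial form.

    stage_work:      per-iteration compute time of each pipeline stage
    values_crossing: number of values passed between stage i and stage i + 1
    comm_cost:       time charged to each side per communicated value
    """
    if len(values_crossing) != len(stage_work) - 1:
        raise ValueError("need one crossing count per adjacent stage pair")
    stage_times = []
    for i, work in enumerate(stage_work):
        received = values_crossing[i - 1] if i > 0 else 0
        sent = values_crossing[i] if i < len(values_crossing) else 0
        stage_times.append(work + comm_cost * (received + sent))
    # In steady state the slowest stage limits throughput.
    return sum(stage_work) / max(stage_times)


cheap = pipeline_speedup([4, 4], [2], comm_cost=0.5)
costly = pipeline_speedup([4, 4], [2], comm_cost=3)
print(cheap, costly)
```

    Under these assumptions the same two-stage partition yields a speedup when communication is
cheap and a slowdown when it is expensive, which is the kind of decision a fuller model would
need to capture.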

[1] David Koes and Seth Goldstein. Performance metrics for optimal register allocators. PLDI
    Poster:˜dkoes/research/, 2007.

[2] David Koes and Seth Goldstein. In submission, 2009.

[3] David Koes and Seth Copen Goldstein. A progressive register allocator for irregular archi-
    tectures. In CGO ’05: Proceedings of the International Symposium on Code Generation and
    Optimization (CGO’05), pages 269–280, Washington, DC, USA, 2005. IEEE Computer Society.

[4] David Koes and Seth Copen Goldstein. Near-optimal instruction selection on DAGs. In CGO
    ’08: Proceedings of the International Symposium on Code Generation and Optimization
    (CGO’08), Washington, DC, USA, 2008. IEEE Computer Society.

[5] David Ryan Koes and Seth Copen Goldstein. A global progressive register allocator. In PLDI
    ’06: Proceedings of the 2006 ACM SIGPLAN conference on Programming language design
    and implementation, pages 204–215, New York, NY, USA, 2006. ACM Press.

[6] Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August. Automatic thread ex-
    traction with decoupled software pipelining. In MICRO 38: Proceedings of the 38th annual
    IEEE/ACM International Symposium on Microarchitecture, pages 105–118, Washington, DC,
    USA, 2005. IEEE Computer Society.

[7] Ram Rangan, Neil Vachharajani, Guilherme Ottoni, and David I. August. Performance scala-
    bility of decoupled software pipelining. ACM Trans. Archit. Code Optim., 5(2):1–25, 2008.
