JVM Overview
References
Virtual Machine Background
JVM: operational view
JVM: structural view
Concluding remarks
Reading List
Primary:
LY99 Chapter 3. Structure of the Java Virtual Machine
Venners98 Chapter 5. The Java Virtual Machine
GJS96 Chapter 12: Execution
Other references
Survey of VM Research 1974
“Virtual machines have finally arrived. Dismissed for a
number of years as academic curiosities, they are now seen
as cost effective techniques for organizing computer
systems…”
Inferno Virtual Machine
Oak Intermediate Bytecode
What is a virtual machine?
David Gelernter: Truth, beauty, and VMs
“A running program is often referred to as a VM -- a
machine that doesn’t exist as a matter of actual
physical reality. The virtual machine idea is … most
elegant in the history of technology … a crucial step in
the evolution of ideas about software.”
an operating system
a control program to run multiple operating
systems
Design Goals
abstract enough
close enough to the hardware
question: what is the intended use?
Inferno: run OS code
JVM: run application code
What is the JVM?
Key Distinction
what is the specification?
what is the implementation?
object layout is not part of the specification
garbage collection is not part of the spec
JVM: View 1
from the language point of view
trace the lifetime of a virtual machine
invocation, loading-linking, object lifetime, exit
VM in action
invoked “java Test args”
attempts to find class Test
VM uses the class loader
Link
Initialize
Invoke Test.main
Loading
check whether already loaded
if not, invoke the appropriate loader.loadClass
internal table is part of the specification?
class loader flexibility: prefetch, load a bunch
prefetching can be non-transparent!
errors, however, need to be reported separately
class loader hooks: defineClass, findSystemClass,
resolveClass
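The "check whether already loaded, otherwise invoke the loader" steps can be sketched with a small custom loader. This is only an illustration: LoggingLoader, LoaderDemo and the requested list are hypothetical names, and the loader simply delegates to the standard parent-first lookup rather than defining classes itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader that records every class name it is asked for.
class LoggingLoader extends ClassLoader {
    final List<String> requested = new ArrayList<>();

    LoggingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            requested.add(name);
            // Step 1: has this loader already been recorded as loading the class?
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // Step 2: not yet, so fall back to the default delegation.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c); // linking hook
            }
            return c;
        }
    }
}

class LoaderDemo {
    // Loads a class through a fresh LoggingLoader.
    static Class<?> load(String name) {
        try {
            return new LoggingLoader(LoaderDemo.class.getClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the names the loader was asked for while loading one class.
    static List<String> trace(String name) {
        LoggingLoader loader = new LoggingLoader(LoaderDemo.class.getClassLoader());
        try {
            loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.requested;
    }
}
```

Because the sketch delegates to its parent, the class returned for a system class such as java.lang.String is the same Class object the rest of the VM uses.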
Link
Link = verification, preparation, resolution
Verification: semantic checks, proper symbol
table
proper opcodes, good branch targets
conservation of stack depth
Preparation: allocation of storage (method
tables)
Resolution: resolve symbol references, check
access, check concreteness
Resolution: eager vs lazy strategy
Initialization
initialize class variables, static initializers
direct superclass needs to be initialized first
happens on direct use: method invocation,
construction, field access
synchronized initializations: state in Class object
check for recursive initializations
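A minimal sketch of "initialization happens on direct use": the class below is loaded first without being initialized, and its static initializer runs only when a class variable is read. The names InitLog, Lazy and InitDemo are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    static final List<String> events = new ArrayList<>();
}

class Lazy {
    static int counter = 7; // not a compile-time constant, so reading it is a direct use
    static { InitLog.events.add("Lazy initialized"); }
}

class InitDemo {
    static List<String> run() {
        try {
            // Load (and possibly link) only: 'false' suppresses initialization.
            // Lazy.class is a class literal and does not initialize the class either.
            Class.forName(Lazy.class.getName(), false, InitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        InitLog.events.add("after forName");
        int v = Lazy.counter; // first direct use: triggers the static initializer
        InitLog.events.add("read " + v);
        return InitLog.events;
    }
}
```

On the first call, the recorded order is "after forName", then "Lazy initialized", then "read 7".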
Example
class Super {
    static { System.out.print("Super "); }
}
class One {
    static { System.out.print("One "); }
}
class Two extends Super {
    static { System.out.print("Two "); }
}
class Test {
    public static void main(String[] args) {
        One o = null;        // declaring a variable does not initialize One
        Two t = new Two();   // initializes Super, then Two
        System.out.println((Object)o == (Object)t);
    }
}
// prints: Super Two false
Example
class Super { static int taxi = 1729; }
class Sub extends Super {
    static { System.out.print("Sub "); }
}
class Test {
    public static void main(String[] args) {
        // taxi is declared in Super, so only Super is initialized
        System.out.println(Sub.taxi);
    }
}
// prints: 1729
Creation of new instances
instance creation expressions: new
Class.newInstance()
string literals, concatenation operations
order:
default field values
invoke constructor
invoke another constructor of this class
invoke super’s constructors
initialize instance variables
execute rest of the constructor
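The ordering above can be observed with a small trace. This is a sketch with hypothetical names (Trace, Base, Derived, CreationDemo); the overridden peek() call is there only to expose the default field value before any initializer has run.

```java
import java.util.ArrayList;
import java.util.List;

class Trace {
    static final List<String> log = new ArrayList<>();
    static int note(String s, int v) { log.add(s); return v; }
}

class Base {
    Base() {
        // Runs before Derived's field initializers: peek() sees the default value.
        Trace.log.add("Base sees " + peek());
    }
    int peek() { return -1; }
}

class Derived extends Base {
    int a = Trace.note("field initializer", 1);

    Derived() {
        this(42);                          // another constructor of this class
        Trace.log.add("rest of Derived()");
    }

    Derived(int x) {
        super();                           // superclass constructor
        // instance variable initializers run here, after super() returns
        Trace.log.add("rest of Derived(int)");
    }

    @Override
    int peek() { return a; }
}

class CreationDemo {
    static List<String> run() {
        Trace.log.clear();
        new Derived();
        return Trace.log;
    }
}
```

The field initializer runs once, in the constructor that calls super(), not in the one that calls this(...).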
Finalization
invoked just before garbage collection
language does not specify when it is invoked
also does not specify which thread
no automatic invocation of super’s finalizers
very tricky!
protected void finalize() {
classVariable = field; // the object referenced by field is reachable again
}
State Machine
VM Exit
classFinalize similar to object finalization
class can be unloaded when
no instances exist
class object is unreachable
VM exits when:
all its threads terminate
Runtime.exit or System.exit is invoked and the security manager permits it
finalizers can be optionally invoked on all objects
just before exit
JVM: View 2
data types, values
runtime data areas
exceptions
instruction set
object management
support for special libraries
Data types and values
corresponds to Java language types
byte, short, int, long, char, float, double, boolean
returnAddress type is the only exception
references: concrete value of null left to
implementation
integer sizes: is it too constraining?
floating point values: standard and extended
no runtime type information attached to values
instruction specifies the type of operands
iadd as opposed to fadd
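As an illustration of types living in the instruction rather than in the value, the same source-level "+" compiles to different opcodes depending on the static operand type. The bytecode in the comments is what javac typically emits for these static methods (it can be checked with javap -c); TypedOps is a hypothetical class name.

```java
class TypedOps {
    // iload_0, iload_1, iadd, ireturn
    static int addInts(int a, int b) { return a + b; }

    // fload_0, fload_1, fadd, freturn
    static float addFloats(float a, float b) { return a + b; }

    // lload_0, lload_2, ladd, lreturn  (each long occupies two local slots)
    static long addLongs(long a, long b) { return a + b; }
}
```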
Object Representation
left to the implementation
add extra level of indirection
make garbage collection easier
need pointers to instance data and class data
mutex lock
GC state (flags)
Runtime Data Areas
per-thread vs. VM wide
pc register: per thread, undefined while
executing native methods
VM stack (per-thread)
local variables, partial results
method invocation, return
frames can be heap allocated; stack memory need not be contiguous
size can be manipulated by the programmer
StackOverflowError vs OutOfMemoryError
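A small sketch of programmer control over the per-thread stack: the four-argument Thread constructor takes a requested stack size, which the VM is free to treat as a hint. Unbounded recursion on that thread ends in StackOverflowError. StackDepthDemo and its method names are hypothetical.

```java
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // each call pushes a new frame on the thread's VM stack
    }

    // Runs unbounded recursion on a thread with the requested stack size
    // and reports how many frames fit before the stack overflowed.
    static int maxDepth(long stackSizeBytes) {
        final int[] result = new int[1];
        Thread t = new Thread(null, () -> {
            depth = 0;
            try {
                recurse();
            } catch (StackOverflowError e) {
                result[0] = depth;
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

The depth reached depends on the VM, the platform and the frame size, so the number is only meaningful as a relative measure.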
Runtime Data Areas
Heap (VM wide)
for storing objects
assumes no particular GC method
heap size can expand, user control exists
might cause OutOfMemoryError
Method area (VM wide)
runtime constant pool
field and method data
code
logically part of the heap
Native method stacks: how to catch exceptions?
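The heap limits mentioned above can be inspected from inside a program through java.lang.Runtime. A minimal sketch follows (HeapInfo is a hypothetical name); on HotSpot the corresponding user controls are the -Xms and -Xmx command-line options.

```java
class HeapInfo {
    // Returns { free, total, max } heap sizes in bytes as the VM reports them now.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("free=" + s[0] + " total=" + s[1] + " max=" + s[2]);
    }
}
```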
VM Stack Frames
created and destroyed with method invocations
local variable array, own operand stack
each local variable slot holds one int, float, or reference; long and double take two slots
over-specification?
used for parameter passing
instance methods pass “this” as 0th argument
operand stack: depth determined at compile-time
elements can hold any type
reference to the class’s runtime constant pool
symbolic references for dynamic linking
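To make the parameter-passing convention concrete, the sketch below contrasts an instance method with a static one. The bytecode in the comments is what javac typically produces, with the constant pool index left out; Scaler is a hypothetical class.

```java
class Scaler {
    private final int factor;

    Scaler(int factor) { this.factor = factor; }

    // Instance method: local 0 holds "this", local 1 holds x.
    //   aload_0, getfield factor, iload_1, imul, ireturn
    int scale(int x) { return this.factor * x; }

    // Static method: no receiver, so local 0 holds x.
    //   iload_0, iload_0, imul, ireturn
    static int square(int x) { return x * x; }
}
```

The getfield operand is a symbolic reference into the class's runtime constant pool, resolved by dynamic linking.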
Initialization Methods
specially named
for instances
invokespecial instruction
can be invoked only on uninitialized instances
for classes
implicitly invoked
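The special names are visible from Java code through stack traces: a frame executing a constructor reports the method name <init>, and a frame executing a static initializer reports <clinit>. A small sketch, with InitNames as a hypothetical class:

```java
class InitNames {
    static final String classInitName;
    final String instanceInitName;

    static {
        // The top frame here is the class initialization method.
        classInitName = new Throwable().getStackTrace()[0].getMethodName();
    }

    InitNames() {
        // The top frame here is the instance initialization method.
        instanceInitName = new Throwable().getStackTrace()[0].getMethodName();
    }
}
```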
Exceptions
each catch/finally clause is represented as an
exception handler
associated with each handler is the code extent
exception handler table is ordered by the
compiler
JVM does not enforce strict nesting
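Since the VM searches the handler table in order, the compiler lists the handler for an inner try range before the handler for the enclosing one. The observable effect is sketched below; HandlerOrder is a hypothetical name.

```java
class HandlerOrder {
    static String classify(int divisor) {
        StringBuilder trace = new StringBuilder();
        try {
            try {
                trace.append(10 / divisor);   // may throw ArithmeticException
            } catch (ArithmeticException e) {
                trace.append("inner");        // innermost matching handler wins
            } finally {
                trace.append("+finally");     // runs on both paths
            }
        } catch (RuntimeException e) {
            trace.append("outer");            // never reached for division by zero
        }
        return trace.toString();
    }
}
```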
Instruction set
variable size instruction
one-byte opcode followed by arguments
byte aligned except for operands of tableswitch
and lookupswitch
compactness vs. performance
types part of the instruction (iload, fload)
use int operations for boolean, byte, char, short
references have their own instructions (aload, astore)
some are type-independent (pop, swap)
Instructions
load/store
arithmetic
conversion
object creation, access fields, load/store array
elements, get array length, type checks
operand stack management
control transfer, method invocation, throw
monitor entry/exit
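The instruction format (one-byte opcode, then operands) and the frame layout (local variable array plus operand stack) can be sketched as a toy interpreter for a handful of int instructions. The opcode values are the real JVM ones; everything else (ToyFrame, run, sample) is hypothetical and omits verification, other types, and method invocation.

```java
class ToyFrame {
    // Real JVM opcode values for a few int instructions.
    static final int BIPUSH = 0x10, ILOAD = 0x15, ISTORE = 0x36,
                     IADD = 0x60, IMUL = 0x68, IRETURN = 0xAC;

    static int run(byte[] code, int maxLocals, int maxStack) {
        int[] locals = new int[maxLocals]; // local variable array
        int[] stack = new int[maxStack];   // operand stack, depth fixed up front
        int sp = 0;                        // next free stack slot
        int pc = 0;                        // program counter
        while (true) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case BIPUSH:  stack[sp++] = code[pc++]; break;                 // signed byte operand
                case ILOAD:   stack[sp++] = locals[code[pc++] & 0xFF]; break;  // operand = slot index
                case ISTORE:  locals[code[pc++] & 0xFF] = stack[--sp]; break;
                case IADD:    sp--; stack[sp - 1] += stack[sp]; break;         // no operands
                case IMUL:    sp--; stack[sp - 1] *= stack[sp]; break;
                case IRETURN: return stack[--sp];
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
    }

    // bipush 6; istore 0; iload 0; bipush 7; imul; ireturn
    static int sample() {
        byte[] code = {
            BIPUSH, 6, ISTORE, 0, ILOAD, 0, BIPUSH, 7, IMUL, (byte) IRETURN
        };
        return run(code, 1, 2);
    }
}
```

Instructions here are one or two bytes long, which is the variable-size encoding in miniature: the opcode alone decides how many operand bytes follow.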
Threads
notion of priorities
does not specify time-slicing
complex specification of consistency model
volatiles
working memory vs. general store
non-atomic longs and doubles
T.start() is native, invokes T.run()
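A minimal sketch of two points above: start() causes run() to execute on the new thread, and a volatile field makes one thread's write visible to another. Worker and ThreadDemo are hypothetical names.

```java
class Worker extends Thread {
    private volatile boolean stop; // volatile: the worker is guaranteed to see the write
    volatile String ranOn;         // name of the thread that executed run()

    @Override
    public void run() {
        ranOn = Thread.currentThread().getName();
        while (!stop) {
            Thread.yield();        // spin until asked to stop
        }
    }

    void requestStop() { stop = true; }
}

class ThreadDemo {
    static boolean runsOnOwnThread() {
        Worker w = new Worker();
        w.setName("worker-1");
        w.start();                 // the VM invokes run() on the new thread
        w.requestStop();
        try {
            w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return "worker-1".equals(w.ranOn) && !w.isAlive();
    }
}
```

Without volatile on stop, the consistency model would not guarantee that the worker ever observes the update.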
Summary
issues where implementation is not constrained
loading of classes -- bad?
finalization of objects -- bad?
object representation -- good
issues where implementation is over-constrained
integer representations?
implementation of local variables, expression stacks?
clearly, a JIT does not conform to these specifications
what really is the specification of the JVM?
is it the bytecode and class-file format?