Principles of Programming Languages
Lecture 8 - Pointers
Bernhard Reus

Linked list: [4,7,2,3]
[Figure: memory cells of the linked list]

- Programmers want to store data that grows and shrinks dynamically...
- ... like linked lists.
- Other such types are trees or doubly linked lists (queues).

In this lecture we discuss features for storage links:
- Problems with Pointers
- Restricted Pointers (References)
- Garbage Collection

    class C { public int f; C(int x){f = x;} }
    …
    C a = new C(5);
    C b = a;
    b.f = 3;
    boolean x = new Integer(5) == new Integer(5);

What is the value of x?
What is the value of a.f?
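The two questions can be checked with a small runnable sketch. The class name `AliasingSketch` and the method names are hypothetical; `C` is the class from the slide. The sketch compares `new C(5) == new C(5)` rather than two `new Integer(5)` objects, because that constructor is deprecated in recent Java versions; the rule is the same in both cases, since `==` on references compares addresses, not contents.

```java
public class AliasingSketch {
    /** The class from the slide: one public, mutable field. */
    public static class C {
        public int f;
        C(int x) { f = x; }
    }

    /** Returns a.f after the update through the alias b. */
    public static int valueOfAf() {
        C a = new C(5);
        C b = a;      // copies the reference, not the object
        b.f = 3;      // updates the single object that a and b both point to
        return a.f;
    }

    /** == on references compares addresses, not contents. */
    public static boolean valueOfX() {
        return new C(5) == new C(5); // two distinct heap objects
    }

    public static void main(String[] args) {
        System.out.println(valueOfAf()); // prints 3
        System.out.println(valueOfX());  // prints false
    }
}
```

Because `b` is only a second link to the same heap object, the update through `b` is visible through `a`.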

Pointer Type
Data values of type “pointer” are addresses that provide indirect access to elements of a certain type. Pointer types always include an illegal address, nil (null), that points to nothing. PL/I (mid 60s) was the first language with pointer variables.

Why Pointers?
- Indirect addressing (see machine languages): pointers fit in a single machine location (2+2 bytes: SEGMENT+OFFSET), hence efficiency; “pointers as proxies”.
- Dynamic storage management: a pointer points to a cell in the heap (heap-dynamic, anonymous variables).
- Added writability: the ability to implement dynamic and recursive data structures.
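As an illustration of a dynamic, recursive data structure built from links, here is a minimal sketch of the list [4,7,2,3] from the opening slide. The names `LinkedListSketch`, `Node`, `example`, `length` and `prepend` are assumptions made for this example, not part of the lecture.

```java
public class LinkedListSketch {
    /** A list cell: a value plus a link (reference) to the next cell. */
    public static final class Node {
        final int value;
        Node next;           // null plays the role of nil
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    /** Builds the list [4,7,2,3] cell by cell on the heap. */
    public static Node example() {
        return new Node(4, new Node(7, new Node(2, new Node(3, null))));
    }

    /** Follows the links until nil and counts the cells. */
    public static int length(Node head) {
        int n = 0;
        for (Node p = head; p != null; p = p.next) n++;
        return n;
    }

    /** Grows the list dynamically by linking a new cell in front. */
    public static Node prepend(int value, Node head) {
        return new Node(value, head);
    }
}
```

Each `new Node(...)` creates an anonymous heap variable; the list can only be reached through the link held in `head`.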

Pointers – Design Issues
- Lifetime of a heap-dynamic pointer variable?
- Are pointers restricted to one type of value?
- Used for dynamic storage management, indirect addressing, or both?
- Explicit or implicit dereferencing? I.e. pointers or references or both?
- Pointer arithmetic possible?

Operations on Pointers
- Allocation (create a heap variable): “book storage room” on the heap.
- Dereferencing: get the content of the cell pointed to.
- Assignment: update the content of the cell pointed to.
- Equality test.
- Deallocation (destroy a heap variable): “free booked storage” on the heap.

[Figure: memory cells #231 and #230; contents shown: 45, 1]

Dereferencing cont’d
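Syntax for dereferencing:
- *p (C, C++)
- p^ (Pascal)
- p (i.e. implicit) (Fortran, Java)

Dereferencing: compute the r-value of the pointer variable (the address stored in it) and access the cell at that address. Its content may be a pointer again!

[Figure: pointer variable p with l-value = 233 and r-value = 232, i.e. cell #233 holds the address of cell #232, which is labelled “anonymous pointer variable”; contents shown: 55, 232, 12]

Most common programming error: dereferencing a null pointer (core dump).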
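In Java, where dereferencing is implicit, a null dereference raises a `NullPointerException` instead of producing a core dump. The following sketch shows both the failing access and a guarded one; the names `NullDereferenceSketch`, `Cell`, `safeDeref` and `derefFails`, and the content value 12, are assumptions made for illustration.

```java
public class NullDereferenceSketch {
    /** A heap cell with some content. */
    public static final class Cell { int content = 12; }

    /** Reads the content of the cell p points to, or a fallback if p is null. */
    public static int safeDeref(Cell p, int fallback) {
        return (p == null) ? fallback : p.content;
    }

    /** Returns true if dereferencing p fails with a NullPointerException. */
    public static boolean derefFails(Cell p) {
        try {
            int ignored = p.content;   // implicit dereferencing of p
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```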

Anonymous Heap Variables
Created using an explicit allocation operation:
- malloc in C
- new in Pascal, C++, Java

Abstract Syntax
t ∈ type   ::= … | ptr t                 (pointer types)
v ∈ varexp ::= … | deref e               (dereferencing yields an assignable (heap) variable)
s ∈ com    ::= … | alloc e | dispose e   (explicit memory management)

Deallocation (clean-up)
- Explicit (free in C, dispose in Pascal, delete in C++)
- Implicit in languages with automatic memory management (like Haskell, Java)

Example in Abstract Syntax
block x: ptr int, y=1: int     (declare integer pointer variable)
  alloc x;                     (allocate memory)
  deref x = 4;                 (update the anonymous variable whose address is stored in x)
  deref x = deref x + y
end

Evaluation of Pointers
“Their introduction into high-level languages has been a step backward from which we may never recover.”
(Sir Tony Hoare 1973, Turing Award 1980)
- Dangling Pointers
- Memory Leaks
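The block example in abstract syntax can be mimicked in Java, under two assumptions: that the statement `deref x = deref x + y` belongs inside the block, and that a one-element array stands in for the anonymous heap variable (Java has no `ptr int` type). The names `BlockExampleSketch` and `run` are hypothetical.

```java
public class BlockExampleSketch {
    /** Mirrors: block x: ptr int, y=1: int  alloc x; deref x = 4; deref x = deref x + y end */
    public static int run() {
        int[] x;          // x: ptr int (a reference to a one-cell heap variable)
        int y = 1;        // y=1: int
        x = new int[1];   // alloc x
        x[0] = 4;         // deref x = 4
        x[0] = x[0] + y;  // deref x = deref x + y
        return x[0];      // 5
    }
}
```

There is no `dispose` step here: in Java the cell is reclaimed implicitly once `x` goes out of scope and no other reference remains.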

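Memory Leaks…
… are inaccessible (lost) cells.
[Figure: p and q before and after p := q; after the assignment, the cell formerly pointed to by p is a memory leak]

Dangling Pointers …
… point to some storage meanwhile used for other purposes.
[Figure: p and q before and after dispose(p); afterwards q is a dangling pointer]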
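Java itself does not allow dangling pointers, so the two situations are simulated here on a toy heap with explicit `alloc` and `dispose`. Everything in this sketch (the class `ToyHeap`, its first-free-cell allocation policy, the method names) is an assumption made for illustration; addresses are simply array indices.

```java
public class ToyHeap {
    private final int[] cells;
    private final boolean[] inUse;

    public ToyHeap(int size) {
        cells = new int[size];
        inUse = new boolean[size];
    }

    /** alloc: book the first free cell and return its address (index). */
    public int alloc() {
        for (int a = 0; a < cells.length; a++) {
            if (!inUse[a]) { inUse[a] = true; return a; }
        }
        throw new IllegalStateException("out of memory");
    }

    /** dispose: free the cell; other pointers to it are NOT updated. */
    public void dispose(int address) { inUse[address] = false; }

    public int deref(int address) { return cells[address]; }

    public void assign(int address, int value) { cells[address] = value; }

    public int cellsInUse() {
        int n = 0;
        for (boolean used : inUse) if (used) n++;
        return n;
    }

    /** Dangling pointer: q still holds an address after dispose(p). */
    public static int danglingDemo() {
        ToyHeap heap = new ToyHeap(4);
        int p = heap.alloc();
        heap.assign(p, 4);
        int q = p;              // q aliases p
        heap.dispose(p);        // the cell is free again, q is now dangling
        int r = heap.alloc();   // the freed cell is reused for r
        heap.assign(r, 99);
        return heap.deref(q);   // q sees data written for another purpose
    }

    /** Memory leak: p := q loses the only pointer to p's cell. */
    public static int leakDemo() {
        ToyHeap heap = new ToyHeap(4);
        int p = heap.alloc();
        int q = heap.alloc();
        p = q;                     // the cell p pointed to is now unreachable
        heap.dispose(p);           // frees q's cell only
        return heap.cellsInUse();  // the lost cell is still booked
    }
}
```

In this simulation `danglingDemo()` returns 99 rather than the 4 originally stored through `p`, and `leakDemo()` reports one cell that stays booked although no pointer refers to it.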


Pointer Arithmetic
An additional problem in some languages: in C and C++, pointers can be used like machine addresses (adds to danger), e.g. *(p + 2).
[Figure: consecutive cells at p, p+1 (offset: +1) and p+2 (offset: +2)]

Pointer Arithmetic (cont’d)
- Pro: efficient (tricky) coding (arrays etc.)
- Contra: uncontrollable risk of running into the “wrong part of the heap”. Thus not allowed in many languages (e.g. Pascal, Fortran 90, Ada).

In C and C++, pointers can even point into the stack. In Pascal and Ada, pointers can point into the heap only.
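Java has no pointer arithmetic, so the risk can only be simulated. In the sketch below, memory is a flat array and an address is an index; the layout (three array cells followed by an unrelated variable) and all names are assumptions made for illustration.

```java
public class PointerArithmeticSketch {
    // Simulated flat memory: cells 0..2 hold an array, cell 3 holds an unrelated variable.
    static final int[] MEMORY = {10, 20, 30, 777};

    /** Dereferences a simulated address. */
    public static int deref(int address) { return MEMORY[address]; }

    public static void main(String[] args) {
        int p = 0;                          // p points to the first array cell
        System.out.println(deref(p + 2));   // prints 30: the third array element
        System.out.println(deref(p + 3));   // prints 777: one step too far, someone else's data
    }
}
```

Nothing in the address computation itself distinguishes the legitimate offset +2 from the offset +3 that leaves the array.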

Reference Type
Reference types are pointers with implicit dereferencing. Thus pointer arithmetic is not possible, as pointers are dereferenced before an offset can be added.
- In C++, references must even be constant (they cannot be re-bound after initialisation).
- In Java, references are assignable; no explicit pointers are available.

Heap Management
- Reclaim storage that is not needed anymore (garbage); otherwise you run out of memory.
- User-controlled (dispose): easily creates dangling pointers and leaks.
- Automatic (by the runtime system): may be safer.
- Design issue: when is deallocation performed?
  - Incrementally (eager)
  - When space runs out (lazy)


Reference Counters
A counter in every cell keeps track of the number of pointers referring to it. Make the cell available if the counter becomes 0.
[Figure: cells with reference counters (4, 3) that are updated on p := q]

Drawbacks:
- significant waste of space for the reference counters
- waste of execution time to keep the counters up to date
- circular pointers are a problem (it can be done, though): the reference counter of a cell on a cycle is not 0 even when the cycle is unreachable
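The bookkeeping for `p := q` can be sketched as follows. This is a minimal illustration with explicit counters, not how any particular runtime implements it; the names `RefCountSketch`, `Cell`, `acquire`, `release` and `assign` are assumptions.

```java
public class RefCountSketch {
    /** A heap cell with an explicit reference counter. */
    public static final class Cell {
        int refCount = 0;
        boolean freed = false;
    }

    /** A new pointer to target is created: increment its counter. */
    public static Cell acquire(Cell target) {
        if (target != null) target.refCount++;
        return target;
    }

    /** A pointer to target is dropped: decrement, and free the cell at 0. */
    public static void release(Cell target) {
        if (target != null && --target.refCount == 0) target.freed = true;
    }

    /** p := q; returns the new value of p. */
    public static Cell assign(Cell p, Cell q) {
        Cell result = acquire(q); // increment first, so that p := p is safe
        release(p);
        return result;
    }

    /** Returns {a.refCount, b.refCount, a freed? 1 : 0} after p := q. */
    public static int[] demo() {
        Cell a = new Cell();
        Cell b = new Cell();
        Cell p = acquire(a);   // p points to a
        Cell q = acquire(b);   // q points to b
        p = assign(p, q);      // a loses its only pointer, b gains one
        return new int[] { a.refCount, b.refCount, a.freed ? 1 : 0 };
    }
}
```

In `demo()` the cell `a` is made available immediately (eagerly) when its counter drops to 0. Two cells that point at each other would each keep a counter of 1 after all outside pointers are gone, which is the circular-pointer problem.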



Garbage collection (lazy) allows garbage to accumulate. Before running out of memory, gather the garbage and make it reusable.

“Mark & Sweep (& Compact)” technique:
- initially mark all cells as garbage
- traverse the heap, unmarking reachable cells
- make marked cells available for allocation
- compact the collected memory cells into one block

Drawbacks:
- The marking algorithm needs to sweep the entire heap to detect unreachable garbage.
- “When you need it most, it works the worst”: collection time is proportional to the used heap, not to the collected garbage.
- Need to stop other threads while garbage collecting.
- Marking uses up storage itself (stack for recursion). Clever marking algorithms are used, e.g. Schorr-Waite (1967), which uses pointer reversal. In general this is not too bad, as virtual memory is bigger than real memory, which is pretty big anyway these days.
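The mark and sweep steps can be sketched on a toy object graph. This is a simplified illustration under stated assumptions: the heap is a list of `Cell` objects, reachability starts from an explicit list of roots, the unmarking is plain recursion (so it uses stack space, as noted above), and the compaction step is left out. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkSweepSketch {
    /** A heap cell holding pointers to other cells. */
    public static final class Cell {
        final List<Cell> pointers = new ArrayList<>();
        boolean garbage;   // the mark bit
        boolean free;      // true once the cell has been reclaimed
    }

    /** Runs one collection and returns the number of cells reclaimed. */
    public static int collect(List<Cell> heap, List<Cell> roots) {
        for (Cell c : heap) c.garbage = true;   // 1. mark all cells as garbage
        for (Cell r : roots) unmark(r);         // 2. unmark everything reachable
        int reclaimed = 0;
        for (Cell c : heap) {                   // 3. sweep: free what is still marked
            if (c.garbage && !c.free) {
                c.free = true;
                c.pointers.clear();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    /** Recursive traversal; the mark bit also stops it from looping on cycles. */
    private static void unmark(Cell c) {
        if (c == null || !c.garbage) return;
        c.garbage = false;
        for (Cell next : c.pointers) unmark(next);
    }

    /** a -> b is reachable from the root a; c <-> d is an unreachable cycle. */
    public static int demo() {
        Cell a = new Cell(), b = new Cell(), c = new Cell(), d = new Cell();
        a.pointers.add(b);
        c.pointers.add(d);
        d.pointers.add(c);
        return collect(List.of(a, b, c, d), List.of(a));
    }
}
```

`demo()` reclaims the two cells of the unreachable cycle, a case that plain reference counting would miss. Note that the sweep loop visits every cell of the heap, whether or not it turns out to be garbage.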

Garbage collection in practice
- Lisp was the first language with automatic heap management.
- Functional languages must have automatic heap management since the store is hidden from the user.
- Java also uses garbage collection:
  - originally mark and sweep
  - optimized (since Java 1.2) using generational garbage collection (see next slide)
  - parameters can be set in the JVM to tune garbage collection

Generational GC
Heap separated into 2 regions:
- new objects, divided in turn into 3 regions:
  - “eden”, where new objects are allocated efficiently
  - “survivor spaces 1 & 2”, where objects surviving GC are copied to; 1 & 2 are swapped each time such that one is always empty
- old objects (contains objects with long lifetime); incremental garbage collection is available on this region
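The interplay of eden, the two survivor spaces and the old region can be sketched as a toy model. It is not the JVM's actual algorithm: objects are plain strings, liveness is passed in as a set instead of being traced, and the tenuring threshold `TENURE_AGE = 2` is an arbitrary assumption, as are all names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GenerationalSketch {
    static final int TENURE_AGE = 2;                 // assumed threshold for tenuring

    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();           // the occupied survivor space
    List<String> to = new ArrayList<>();             // the survivor space kept empty
    final List<String> old = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>();

    /** New objects are always allocated in eden. */
    void allocate(String obj) {
        eden.add(obj);
        age.put(obj, 0);
    }

    /** Minor cycle: copy live young objects, then swap the survivor spaces. */
    void minorCollect(Set<String> live) {
        copySurvivors(eden, live);
        copySurvivors(from, live);
        eden.clear();
        from.clear();
        List<String> tmp = from;                     // swap: one space is always empty
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<String> region, Set<String> live) {
        for (String obj : region) {
            if (!live.contains(obj)) { age.remove(obj); continue; }  // garbage: not copied
            int a = age.merge(obj, 1, Integer::sum);                  // survived once more
            if (a >= TENURE_AGE) old.add(obj); else to.add(obj);      // tenure or keep young
        }
    }

    public static String demo() {
        GenerationalSketch heap = new GenerationalSketch();
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.minorCollect(Set.of("a", "b"));         // c is garbage
        heap.allocate("d");
        heap.minorCollect(Set.of("a", "d"));         // b is garbage, a becomes tenured
        return "old=" + heap.old + " survivors=" + heap.from + " eden=" + heap.eden;
    }
}
```

In `demo()`, object `a` survives two minor cycles and moves to the old region, while `d` has survived only one and stays in a survivor space.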

Generational GC (cont’d)
- minor cycle:
  - 1st collection: copies live objects to survivor space 1
  - 2nd collection: use space 2
  - objects move to old space when they become tenured
- major cycle

Figures from: Ken Gottry, Pick up performance with generational garbage collection, 01/11/02

Thank you!
© B Reus, University of Sussex 2006.
