Trees
Binary Search, AVL, Splay
Introduction
• Trees are a hierarchical collection (non-linear)
• Mirrors real-world structures
– Books (chapters/paragraphs/sentences/words)
– Organizational charts
– Filing Cabinets/folders/documents
– File systems (drives/folders/files)
Terminology
• Trees are formed from nodes and edges. Nodes are sometimes called
vertices. Edges are sometimes called branches.
– Nodes may have a number of properties including value and label.
• Edges are used to relate nodes to each other. In a tree, this represents a
“parent-child” relationship.
– An edge (a,b) from node a to node b establishes a as the parent of b.
Also, b is called a child of a.
• Although edges are usually drawn as simple lines, they are
really directed from parent to child. In tree drawings, this is
top-to-bottom.
Definition
• This definition is “recursive” and “constructive”.
1) A single node is a tree. That node is the tree’s root.
2) Suppose N is a node and T1, T2, ..., Tk are trees with roots n1, n2,
...,nk, respectively. We can construct a new tree T by making N
the parent of the nodes n1, n2, ..., nk. Then, N is the root of T and
T1, T2, ..., Tk are subtrees.
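The two construction rules can be sketched in a few lines of Python. This is only an illustration: the names Node, make_tree, and size are assumptions made for this example, not part of the slides.

```python
class Node:
    """A tree node. `value` and `children` are illustrative field names."""
    def __init__(self, value):
        self.value = value
        self.children = []  # rule 1: a single node is already a tree


def make_tree(value, subtrees):
    """Rule 2: create node N and make it the parent of each subtree's root."""
    root = Node(value)
    root.children.extend(subtrees)
    return root


def size(tree):
    """Count nodes by following the same recursive structure."""
    return 1 + sum(size(child) for child in tree.children)


t1 = Node("n1")                                  # a one-node tree
t2 = make_tree("n2", [Node("x"), Node("y")])     # a tree with two subtrees
t = make_tree("N", [t1, t2])                     # N becomes the root of T
```

Here t1 and t2 play the roles of T1 and T2, and t is the new tree T with root N.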
More Terminology
• A node is either internal or it is a leaf.
– A leaf is a node that has no children
– An internal node has at least one child
• Every node in a tree (except the root) has exactly one parent.
• The degree of a node is the number of children it has.
• The degree of a tree is the maximum degree of all of its
nodes.
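A small sketch of these terms, assuming the tree is stored as a dictionary that maps each node to its list of children. The sample tree and the function names are made up for illustration.

```python
# Hypothetical sample tree: A has children B, C, D; B has E, F; D has G.
children = {"A": ["B", "C", "D"], "B": ["E", "F"], "D": ["G"]}


def degree(node):
    """Degree of a node: the number of children it has."""
    return len(children.get(node, []))


def is_leaf(node):
    """A leaf has no children; every other node is internal."""
    return degree(node) == 0


def all_nodes(root):
    """Yield every node of the tree rooted at `root`."""
    yield root
    for child in children.get(root, []):
        yield from all_nodes(child)


def tree_degree(root):
    """Degree of a tree: the maximum degree over all of its nodes."""
    return max(degree(n) for n in all_nodes(root))
```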
Definitions (Paths and Properties)
• A path is a sequence of nodes n1, n2, ..., nk such that node ni is the parent of
node ni+1 for all 1 <= i < k.
[...]
delete(v, t)
{
    if t is null
        return null
    if v.compareTo(t.data) < 0
        t.left = delete(v, t.left)
    else if v.compareTo(t.data) > 0
        t.right = delete(v, t.right)
    else if t has 2 children
        t.data = findMin(t.right).data
        t.right = delete(t.data, t.right)
    else if t has a left child
        t = t.left
    else t = t.right
    return t
}
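A runnable Python sketch of the recursive delete, following the same case analysis (two children, left child only, otherwise right child or none). The Node class, the insert helper, and the sample keys are assumptions added so the example is self-contained.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def insert(t, v):
    """Plain BST insertion (helper for the example; duplicates are ignored)."""
    if t is None:
        return Node(v)
    if v < t.data:
        t.left = insert(t.left, v)
    elif v > t.data:
        t.right = insert(t.right, v)
    return t


def find_min(t):
    """Leftmost node of a non-empty subtree."""
    while t.left is not None:
        t = t.left
    return t


def delete(v, t):
    """Remove v from the subtree rooted at t and return the new subtree root."""
    if t is None:
        return None
    if v < t.data:
        t.left = delete(v, t.left)
    elif v > t.data:
        t.right = delete(v, t.right)
    elif t.left is not None and t.right is not None:
        # Two children: copy the in-order successor, then delete it below.
        t.data = find_min(t.right).data
        t.right = delete(t.data, t.right)
    elif t.left is not None:
        t = t.left
    else:
        t = t.right  # right child only, or no children (None)
    return t


def in_order(t):
    return [] if t is None else in_order(t.left) + [t.data] + in_order(t.right)


root = None
for key in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, key)
root = delete(50, root)  # node with two children
root = delete(20, root)  # leaf
```

After the two deletions the in-order listing is still sorted, which is a quick sanity check that the search-tree property survived.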
Performance
Method                                  Worst Case   Average
void insert(Comparable element)         O(N)         O(Log N)
boolean contains(Comparable element)    O(N)         O(Log N)
void delete(Comparable element)         O(N)         O(Log N)
int size()                              O(1)         O(1)
boolean isEmpty()                       O(1)         O(1)
Note: The “average” case assumes the tree is built by inserting keys in random order, with every insertion order equally likely.
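The gap between the worst case and the average case can be seen by building an unbalanced binary search tree from sorted keys and from shuffled keys. This is a sketch under assumptions: keys are distinct, the tree is stored in two dictionaries of child links, and the function name bst_height is made up for the example.

```python
import random


def bst_height(keys):
    """Insert distinct keys into a plain (unbalanced) BST and return its height."""
    left, right = {}, {}       # child links, keyed by the parent's key
    root, height = None, -1    # an empty tree has height -1
    for k in keys:
        if root is None:
            root, height = k, 0
            continue
        node, depth = root, 0
        while True:
            links = left if k < node else right
            depth += 1
            if node not in links:   # free slot found: attach k here
                links[node] = k
                break
            node = links[node]
        height = max(height, depth)
    return height


n = 500
sorted_height = bst_height(range(n))   # degenerates into a chain

keys = list(range(n))
random.Random(0).shuffle(keys)         # fixed seed so runs are repeatable
shuffled_height = bst_height(keys)
```

Sorted input produces a chain of height n - 1, which is where the O(N) worst case comes from; the shuffled input gives a much shallower tree.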
Self-Adjusting Binary Search Trees
• Insertions/removals may “deepen” and
“unbalance” a binary search tree.
• Self-adjusting binary search trees automatically
restore balance after each insertion/removal by
performing a series of rotations.
• Self-adjusting binary search trees ensure good
worst-case performance (amortized, in the case of splay trees).
AVL Trees
• Definition: An AVL tree is a binary search tree with a
balance condition.
For every node of the tree, the height of the left and right
sub-trees can differ by at most 1
Note that the height of an empty tree is defined to be -1
AVL Examples
(node heights shown in parentheses)
Example 1:
        44 (3)
       /      \
   17 (0)    78 (2)
            /      \
        50 (1)    88 (0)
       /      \
   48 (0)    62 (0)
Is this an AVL tree?
No. The heights of the root’s left and right subtrees (0 and 2) differ by more than 1.
Example 2:
          44 (3)
         /      \
    17 (1)      78 (2)
        \      /      \
     32 (0)  50 (1)   88 (0)
            /      \
        41 (0)    62 (0)
Is this an AVL tree?
No. It is not a binary search tree, since node 41 is not greater-than-or-equal-to
node 44 but is in the right subtree of node 44.
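Both answers can be checked mechanically. The sketch below encodes the two example trees as nested tuples (key, left, right); the tuple encoding and the function names are assumptions made for this illustration. Heights are recomputed recursively for brevity, which is simple but not efficient.

```python
def height(t):
    """Height of a tree; the empty tree (None) has height -1."""
    return -1 if t is None else 1 + max(height(t[1]), height(t[2]))


def is_balanced(t):
    """Balance condition: at every node the subtree heights differ by at most 1."""
    if t is None:
        return True
    return (abs(height(t[1]) - height(t[2])) <= 1
            and is_balanced(t[1]) and is_balanced(t[2]))


def is_bst(t, lo=None, hi=None):
    """Search-tree property: every key lies strictly between its ancestors' bounds."""
    if t is None:
        return True
    key = t[0]
    if lo is not None and key <= lo:
        return False
    if hi is not None and key >= hi:
        return False
    return is_bst(t[1], lo, key) and is_bst(t[2], key, hi)


def is_avl(t):
    return is_bst(t) and is_balanced(t)


def leaf(key):
    return (key, None, None)


# Example 1: a valid search tree, but unbalanced at the root.
first = (44, leaf(17), (78, (50, leaf(48), leaf(62)), leaf(88)))
# Example 2: balanced, but 41 sits in the right subtree of 44.
second = (44, (17, None, leaf(32)), (78, (50, leaf(41), leaf(62)), leaf(88)))
```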
AVL Trees
[Figure: minimal AVL trees of height h=0, h=1, h=2, and h=3]
Given an AVL tree of height h, what is the minimal
number of nodes that the tree may contain?
AVL Trees
• Proposition: The height of an AVL tree containing N elements is O(Log N)
– Let n(h) be the minimal number of nodes of an AVL tree of height h
• n(0) = 1
• n(1) = 2
• n(2) = 4
– n(h) = 1 + n(h-1) + n(h-2) for values of h > 2
• n(h) is strictly increasing therefore n(h-1) > n(h-2)
• Replace n(h-1) with n(h-2) in the above formula
– n(h) > 1 + n(h-2) + n(h-2)
– n(h) > 1 + 2 * n(h-2)
– n(h) > 2*n(h-2)
– Note that n(h-2) > 2*n(h-4), so n(h) > 4*n(h-4). In general:
• n(h) > 2^i * n(h-2i) (for any i >= 1 such that h-2i >= 0)
• Choose i = Floor(h/2) to utilize our base-case information
– n(h) > 2^Floor(h/2) * n(h-2*Floor(h/2))
– n(h) > 2^Floor(h/2) * n(0) (h-2*Floor(h/2) is 0 or 1, and n(1) >= n(0))
– n(h) > 2^Floor(h/2) (since n(0) = 1)
• Taking the Log (base 2) of both sides results in
– Log(n(h)) > Floor(h/2) >= (h-1)/2, so h < 2*Log(n(h)) + 1
• An AVL tree of height h with N nodes has N >= n(h), so h < 2*Log(N) + 1. This says that the height of an AVL tree is O(Log(N)) where N is the size of the AVL tree
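The recurrence and the resulting bound are easy to check numerically. A minimal sketch: the function name min_nodes is an assumption, and the check covers heights 2 through 39 only.

```python
import math


def min_nodes(h):
    """n(h): minimal number of nodes in an AVL tree of height h."""
    a, b = 1, 2              # n(0), n(1)
    if h == 0:
        return a
    for _ in range(h - 1):   # apply n(h) = 1 + n(h-1) + n(h-2)
        a, b = b, 1 + b + a
    return b


first_values = [min_nodes(h) for h in range(6)]

# n(h) > 2^Floor(h/2), and therefore h < 2*Log2(n(h)) + 1
bound_holds = all(
    min_nodes(h) > 2 ** (h // 2) and h < 2 * math.log2(min_nodes(h)) + 1
    for h in range(2, 40)
)
```

The first values are 1, 2, 4, 7, 12, 20, matching the base cases n(0), n(1), n(2) given on the slide.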
Insertion into an AVL Tree
algorithm insert(Key K)
    Insert K into the AVL tree using the straightforward binary search tree algorithm
    As you insert, push each node visited onto a stack s. Don’t need to push K’s node onto the stack.
    Let X and Y be NULL valued nodes and let Z be K’s node
    while the stack is not empty
        X=Y
        Y=Z
        Z = s.pop()
        if Z is not balanced then
            rotate(X,Y,Z)
            return
algorithm rotate(Nodes X, Y, Z)
    Let A, B, and C be an IN-ORDER listing of nodes X, Y, Z
    Let T0, T1, T2 and T3 be an IN-ORDER listing of the subtrees of X, Y and Z (don’t include X, Y or Z)
    Replace Z with B
    Make A the left child of B
    Make T0 the left child of A and T1 the right child of A
    Make C the right child of B
    Make T2 the left child of C and T3 the right child of C
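One way to turn the insert and rotate pseudocode into runnable Python is sketched below. Several choices here are assumptions, not something the slides specify: heights are recomputed recursively (simple, but it gives up the O(Log N) bound), duplicate keys go to the right, Z starts at the new node so that X and Y are defined by the time an unbalanced ancestor is reached, and the caller must keep the returned root.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def height(n):
    return -1 if n is None else 1 + max(height(n.left), height(n.right))


def balanced(n):
    return abs(height(n.left) - height(n.right)) <= 1


def rotate(x, y, z):
    """Restructure x, y, z (child, parent, grandparent); return new subtree root B."""
    if z.left is y and y.left is x:        # single right rotation
        a, b, c = x, y, z
        t0, t1, t2, t3 = x.left, x.right, y.right, z.right
    elif z.left is y:                      # left-right double rotation
        a, b, c = y, x, z
        t0, t1, t2, t3 = y.left, x.left, x.right, z.right
    elif y.right is x:                     # single left rotation
        a, b, c = z, y, x
        t0, t1, t2, t3 = z.left, y.left, x.left, x.right
    else:                                  # right-left double rotation
        a, b, c = z, x, y
        t0, t1, t2, t3 = z.left, x.left, x.right, y.right
    b.left, b.right = a, c
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    return b


def insert(root, key):
    """Insert key, rebalance at most once, and return the (possibly new) root."""
    new = Node(key)
    if root is None:
        return new
    stack, node = [], root
    while True:                            # plain BST insertion, recording the path
        stack.append(node)
        side = "left" if key < node.key else "right"
        if getattr(node, side) is None:
            setattr(node, side, new)
            break
        node = getattr(node, side)
    x, y, z = None, None, new              # Z starts at K's node
    while stack:
        x, y, z = y, z, stack.pop()
        if not balanced(z):
            b = rotate(x, y, z)
            if not stack:                  # Z was the root
                return b
            parent = stack[-1]             # "Replace Z with B"
            if parent.left is z:
                parent.left = b
            else:
                parent.right = b
            return root
    return root


def in_order(n):
    return [] if n is None else in_order(n.left) + [n.key] + in_order(n.right)


root = None
for k in [1, 2, 3, 4, 5, 6, 7]:            # sorted input, worst case for a plain BST
    root = insert(root, k)
```

With sorted input 1 through 7, the rotations keep the tree at height 2 with 4 at the root, where a plain binary search tree would degenerate into a chain of height 6.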
Four Possible Rotations
(2 cases + symmetry)
• Single Rotations
– Single Right Rotation
– Single Left Rotation
• Double Rotations
– Right-Left Rotation: Single Right Rotation, and then Single Left Rotation
– Left-Right Rotation: Single Left Rotation, and then Single Right Rotation
[Figures: nodes A, B, C and subtrees T0, T1, T2, T3 before and after each rotation]
AVL Removal
• Removals of a node may unbalance the tree.
– The only nodes that may become unbalanced are those
nodes on the path from root to the removed node
• Multiple rotations may be required to re-balance
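– Let Z be the first unbalanced node on the path from
the deleted node to the root
– Let Y be the child of Z not on the “removal path” (the taller child of Z)
– Let X be the taller child of Y (if both are equally tall, choose the child on the same side as Y)
– Rotate nodes X, Y, and Z, then continue up the path and repeat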
Removal Example
Remove 62!
                62
        31              83
    12      45      71      95
  5    18 39  51              99
      14
(14 is the left child of 18; 99 is the right child of 95)
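The removal example can be traced with a short Python sketch. It is written recursively rather than with an explicit path, so rebalancing happens as the recursion unwinds from the deleted node toward the root. Assumptions made here: the removed key is replaced by its in-order successor, heights are recomputed recursively for brevity, and all names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def height(n):
    return -1 if n is None else 1 + max(height(n.left), height(n.right))


def rotate_right(z):
    y = z.left
    z.left, y.right = y.right, z
    return y


def rotate_left(z):
    y = z.right
    z.right, y.left = y.left, z
    return y


def rebalance(z):
    """If z is unbalanced: Y = taller child of z, X = taller child of Y, rotate."""
    diff = height(z.left) - height(z.right)
    if diff > 1:
        y = z.left
        if height(y.left) < height(y.right):   # X is y.right: double rotation
            z.left = rotate_left(y)
        return rotate_right(z)
    if diff < -1:
        y = z.right
        if height(y.right) < height(y.left):   # X is y.left: double rotation
            z.right = rotate_right(y)
        return rotate_left(z)
    return z


def delete(key, t):
    """Remove key from the subtree t, rebalancing every node on the way back up."""
    if t is None:
        return None
    if key < t.key:
        t.left = delete(key, t.left)
    elif key > t.key:
        t.right = delete(key, t.right)
    elif t.left is not None and t.right is not None:
        m = t.right
        while m.left is not None:              # in-order successor
            m = m.left
        t.key = m.key
        t.right = delete(m.key, t.right)
    else:
        return t.left if t.left is not None else t.right
    return rebalance(t)


def in_order(n):
    return [] if n is None else in_order(n.left) + [n.key] + in_order(n.right)


# The tree from the removal example.
root = Node(62,
            Node(31,
                 Node(12, Node(5), Node(18, Node(14))),
                 Node(45, Node(39), Node(51))),
            Node(83, Node(71), Node(95, None, Node(99))))
root = delete(62, root)
```

Under these assumptions the removal needs two rotations: 71 takes the place of 62, a left rotation at 83 lifts 95, and a right rotation at the old root then makes 31 the new root.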
AVL Performance
Method                                  Runtime
void insert(Comparable element)         O(Log N)
boolean contains(Comparable element)    O(Log N)
void delete(Comparable element)         O(Log N)
int size()                              O(1)
boolean isEmpty()                       O(1)
Splay Tree
Principle of Locality
Which BST node is most efficient to search? The root.
A Splay Tree is a BST in which the
most recently accessed node is at
the root.
Moving a node to the root
Q: How to move node Z to the root?
A: Use rotations
Three Cases
• Zig-Zag Case
• Zig-Zig Case
• Root Case
Zig-zag Case
Use a double rotation.
[Figure: tree before and after the double rotation]
Zig-Zig Case
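Rotate twice in the same direction (for example, two right rotations): first at the grandparent, then at the parent.
[Figure: tree before and after the two rotations]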
Root Case (when Z is a child of the root)
Use a single rotation.
[Figure: tree before and after the single rotation]
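The three cases can be combined into one splay loop. The sketch below is a bottom-up version that assumes each node stores a parent pointer; the names Node, rotate_up, splay, insert, and find are illustrative, and insert is a plain BST insertion used only to build a test tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Single rotation that lifts x above its parent p."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                       # right rotation at p
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                 # left rotation at p
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:                     # reattach x under the old grandparent
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(z):
    """Move z to the root and return it."""
    while z.parent is not None:
        p, g = z.parent, z.parent.parent
        if g is None:                               # root case: single rotation
            rotate_up(z)
        elif (g.left is p) == (p.left is z):        # zig-zig: grandparent edge first
            rotate_up(p)
            rotate_up(z)
        else:                                       # zig-zag: double rotation
            rotate_up(z)
            rotate_up(z)
    return z


def insert(root, key):
    """Plain BST insertion with parent pointers (no splaying)."""
    node = Node(key)
    if root is None:
        return node
    cur = root
    while True:
        side = "left" if key < cur.key else "right"
        if getattr(cur, side) is None:
            setattr(cur, side, node)
            node.parent = cur
            return root
        cur = getattr(cur, side)


def find(root, key):
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root


def in_order(n):
    return [] if n is None else in_order(n.left) + [n.key] + in_order(n.right)


root = None
for k in [1, 2, 3, 4, 5]:                 # builds a right-leaning chain
    root = insert(root, k)
root = splay(find(root, 5))               # two zig-zig steps bring 5 to the root
```

Splaying changes the shape of the tree but never the in-order sequence of keys, so the result is still a binary search tree.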