Posted on: 6/28/2011
Priority Queues
For all the important things in life!

Priority Queue
• Useful for assigning "priorities" to shared resources.
– Operating system: the scheduler can use a priority queue to determine which task to execute next. Tasks need not be executed in the order they were inserted.
– Shared printers: the scheduler can print small jobs before big ones. This decreases the average delay for output.

Priority Queue
• A collection of data that is accessed by "priority"
– The data is "ordered" only by its priority
– Multiple identical priorities are allowed
• Supports the following fundamental methods
– void insert(Comparable priority, Object data)
• Inserts data into the queue using the specified priority
– Object remove()
• Removes and returns the data having the "greatest" priority (the highest-priority item)
• An error occurs if the queue is empty.
– Object minElement()
• Returns the data having the greatest priority but doesn't remove it from the queue.
• An error occurs if the queue is empty.
– Object minKey()
• Returns the key having the greatest priority.

Priority Queue ADT
Method        Output   Queue
insert(5, A)  none     ((5,A))
insert(9, C)  none     ((5,A), (9,C))
insert(4, B)  none     ((5,A), (9,C), (4,B))
insert(5, D)  none     ((5,A), (9,C), (4,B), (5,D))
minElement()  B        ((5,A), (9,C), (4,B), (5,D))
minElement()  B        ((5,A), (9,C), (4,B), (5,D))
remove()      B        ((5,A), (9,C), (5,D))
remove()      A or D   ((9,C), (5,D)) or ((5,A), (9,C))
remove()      A or D   ((9,C))

Simple Priority Queue Implementations
• Use an un-ordered array of values
– Similar to a Vector
– What is the run time of insert/remove?
• Maintain an ordered array of values
– Ordered by priority (or key)
– What is the run time of insert/remove?
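The simple implementations above can be sketched in Java. This is a minimal illustration, not code from the slides: the class name UnsortedPQ and the use of generics are my own. It shows the unsorted variant, where insert is O(1) and remove/minElement/minKey are O(N).

```java
import java.util.ArrayList;

// A sketch of the priority-queue ADT above, backed by an unsorted
// list: O(1) insert, O(N) remove/minElement/minKey.
class UnsortedPQ<K extends Comparable<K>, V> {
    private final ArrayList<K> keys = new ArrayList<>();
    private final ArrayList<V> values = new ArrayList<>();

    public void insert(K priority, V data) {   // O(1): just append
        keys.add(priority);
        values.add(data);
    }

    private int minIndex() {                   // O(N): linear scan for the smallest key
        if (keys.isEmpty()) throw new IllegalStateException("empty queue");
        int min = 0;
        for (int i = 1; i < keys.size(); i++)
            if (keys.get(i).compareTo(keys.get(min)) < 0) min = i;
        return min;
    }

    public V remove() {                        // O(N): find, then delete
        int i = minIndex();
        keys.remove(i);
        return values.remove(i);
    }

    public V minElement() { return values.get(minIndex()); }
    public K minKey()     { return keys.get(minIndex()); }
}
```

Replaying the ADT trace above: after insert(5, "A"), insert(9, "C"), insert(4, "B"), insert(5, "D"), a call to minElement() yields "B" and remove() removes it.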
Heaps
• A binary-tree-based implementation of a priority queue
– A complete binary tree with a heap property: for every node V of the heap, the priority (key) stored at V is greater than or equal to the priority (key) of any of V's children.
– "Greater priority" often means "lower key value".

Heaps
[Figure: three candidate trees; two satisfy the heap property ("Yes indeed!"), one does not ("Nope!" — it is not a complete binary tree).]

Heap Insertion Example
[Figure: a heap with level-order contents 4 5 6 15 9 7 20 16 25 14 12 11 13.]
How do we insert the key "2" into the heap?

Heap Insertion (UpHeapBubble)
The new key 2 is placed in the next available leaf position (below 20) and we restore the heap-ordering property by "bubbling" the new item into its proper location: if the parent is bigger than the bubbling item, swap the parent with the "bubbler".
[Figure: after swapping 2 with its parent 20 — level order 4 5 6 15 9 7 2 16 25 14 12 11 13 20.]

Heap Insertion (UpHeapBubble)
[Figure: after swapping 2 with its parent 6 — level order 4 5 2 15 9 7 6 16 25 14 12 11 13 20.]
The "up-heap bubbling" restores the heap property: a node is only ever replaced by a "smaller" node, therefore all of that node's children are still greater than or equal to the new node value.

Heap Insertion (UpHeapBubble)
[Figure: after swapping 2 with the root 4 — level order 2 5 4 15 9 7 6 16 25 14 12 11 13 20.] A heap!

Heap Removal Example
[Figure: a heap with level-order contents 2 3 4 15 9 7 6 16 25 14 12 11 13 20.]
How do we remove an item from the heap?
The item to remove is always at the root and is found in O(1). How do we restore the heap?
Remove the "last" item in the "last" row (here 20) and place it at the root: level order 20 3 4 15 9 7 6 16 25 14 12 11 13. How do we restore the heap-ordering property?

Heap Removal (DownHeapBubble)
Perform a "down-heap bubble": if either child is smaller than the bubbler, recursively swap the bubbler with its smallest child.
[Figure: successive states — 3 20 4 15 9 7 6 16 25 14 12 11 13, then 3 9 4 15 20 7 6 16 25 14 12 11 13, then 3 9 4 15 12 7 6 16 25 14 20 11 13.] A heap!
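The up-heap and down-heap bubbling just described can be sketched as an array-based min-heap in Java. This is a sketch with illustrative names, not the slides' code: the heap is stored 1-based (the root at index 1, node i's children at 2i and 2i+1), which is what makes parent/child arithmetic simple.

```java
import java.util.Arrays;

// A sketch of up-heap (insert) and down-heap (remove) bubbling on a
// 1-based array. Index 1 is the root; node i has children 2i and 2i+1.
class MinHeap {
    private int[] a = new int[32];   // a[1..size] holds the heap
    private int size = 0;

    public void insert(int key) {
        if (size + 1 == a.length) a = Arrays.copyOf(a, 2 * a.length);
        a[++size] = key;             // place at the next available leaf
        int i = size;                // up-heap bubble:
        while (i > 1 && a[i / 2] > a[i]) {   // while the parent is bigger,
            int t = a[i]; a[i] = a[i / 2]; a[i / 2] = t;  // swap with parent
            i /= 2;
        }
    }

    public int remove() {
        int min = a[1];              // the minimum is always at the root
        a[1] = a[size--];            // move the last leaf to the root
        int i = 1;                   // down-heap bubble:
        while (2 * i <= size) {
            int c = 2 * i;           // pick the smaller of the two children
            if (c < size && a[c + 1] < a[c]) c++;
            if (a[i] <= a[c]) break; // heap property restored
            int t = a[i]; a[i] = a[c]; a[c] = t;
            i = c;
        }
        return min;
    }
}
```

Inserting the slide's thirteen keys followed by 2, then calling remove(), returns 2 first — the newly bubbled-up minimum.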
Example
Perform the following operations on an initially empty heap:
insert(3), insert(12), insert(2), remove(), insert(3), insert(15), insert(6), insert(11), insert(2), insert(2), remove(), remove(), remove(), remove()
[Figure: the resulting heap has 6 at the root, children 12 and 11, and leaf 15.]

Heap Question
• Does a heap guarantee that the first-in-first-out principle is applied to identically-valued keys?
• Consider the following sequence:
– insert(1)
– insert(2)
– insert(2)
– remove()
– remove()
No! When keys tie, a heap makes no guarantee about which entry is removed first.

Heap Question
Given a linked heap representation, find the location of the next insertion point!
algorithm nextLocation(TreeNode t, Content C)
Input: root t of a complete binary tree.
Output: t, with a new node containing C inserted at the next insertion point.
  if t == null
    t = new TreeNode(C)
  else if height(t.left) != height(t.right)
    if t.left is full
      t.right = nextLocation(t.right, C)
    else
      t.left = nextLocation(t.left, C)
  else
    if t.right is full
      t.left = nextLocation(t.left, C)
    else
      t.right = nextLocation(t.right, C)
  return t

Priority Queue Performance
Method      Sorted Array   Unsorted Array   Heap
insert      O(N)           O(1)             O(log N)
remove      O(1)           O(N)             O(log N)
minElement  O(1)           O(N)             O(1)
minKey      O(1)           O(N)             O(1)

Heap Implementation
• Which way to implement heaps?
– a sequential (array-based) binary tree representation?
– a linked binary tree representation?
• The sequential representation is probably the "best"!
– Since a heap is a complete binary tree, all nodes are in contiguous array locations.
– No "wasted" slots. Oooh!

Example
Given the following "heap" (n marks an unused slot; the heap occupies indices 1–9), perform the listed operations:
index: 0  1  2  3  4   5   6  7  8   9   10  11
value: n  3  9  5  12  11  7  9  14  13  n   n
insert(4), insert(1), remove(), remove()

Sorting Digression
• How can we use a Priority Queue to sort?
Algorithm queueSort(Vector v)
Input: Vector v of comparable values to sort.
Output: none; Vector v is sorted.
  Let Q be a Priority Queue
  while v is not empty
    Data = v.remove(0)
    Q.insert(Data, Data)
  while Q is not empty
    v.add(Q.remove())

Sorting Digression
• What is the run-time of queueSort if we use an unsorted-array-based queue?
1) N insertions, each of which is O(1)
2) N removals, each of which is O(N)
Phase 1 is actually N · O(1), or O(N).
Phase 2 is actually O(N) + O(N−1) + … + O(2) + O(1), which is O(N²).
• This is also known as "selection sort"

Sorting Digression
• What is the run-time of queueSort if we use a sorted-array-based queue?
1) N insertions, each of which is O(N)
2) N removals, each of which is O(1)
Phase 1 is actually O(N) + O(N−1) + … + O(2) + O(1), which is O(N²).
Phase 2 is actually N · O(1), or O(N).
• This is also known as "insertion sort"

Sorting Digression
• What is the run-time of queueSort if we use a heap?
1) N insertions, each of which is O(log N)
2) N removals, each of which is O(log N)
Phase 1 is actually O(log 1) + O(log 2) + O(log 3) + … + O(log (N−1)) + O(log N).
Since each log term (except the last) is less than O(log N), we can say that Phase 1 is O(N log N).
Phase 2 is also O(N log N) by similar reasoning.
Therefore this algorithm is O(N log N).
• This is also known as "heap sort"

Building A Heap
Goal: How can we quickly construct a heap given an unordered array?

Heapify an Array
class Heap {
    private int[] array;   // 1-based: the heap lives in array[1..size]
    private int size;

    private void bubbleDown(int index) {
        int child = 2*index;                               // left child
        if (child < size && array[child+1] < array[child])
            ++child;                                       // pick the smaller child
        if (child <= size && array[index] > array[child]) {
            swap(index, child);
            bubbleDown(child);
        }
    }

    public void buildHeap() {
        for (int i = size/2; i > 0; i--)   // from the last internal node up to the root
            bubbleDown(i);
    }
}

Build Heap Trace
index:          1  2  3  4  5  6  7  8  9  10
Original array: 40 12 89 34 16 78 65 21 11 09
i = 5:          40 12 89 34 09 78 65 21 11 16
i = 4:          40 12 89 11 09 78 65 21 34 16
i = 3:          40 12 65 11 09 78 89 21 34 16
i = 2:          40 09 65 11 12 78 89 21 34 16
Final (i = 1):  09 11 65 21 12 78 89 40 34 16

Build Heap Performance
Consider:
• The buildHeap method does exactly n/2 bubbleDowns.
• Each bubbleDown is O(h)
• h is O(log n)
• The buildHeap method is therefore O((n/2) · log n), or O(n log n)
But wait:
• O(n log n) is an upper bound, but not a tight upper bound
• The buildHeap method is actually linear! Gasp! You can't be serious!

Build Heap Performance
Now consider:
• The buildHeap method does exactly n/2 bubbleDowns.
• Each bubbleDown is O(h), where h is the height of that sub-heap!
• h is O(log n), where n is the size of that sub-heap.
• Sum the heights of every internal node of a complete binary tree to determine the build-heap performance: the sum is O(n), so buildHeap is O(n)!

Sorting Digression Revisited
• What is the run-time of queueSort if we use a heap combined with buildHeap?
1) Execute buildHeap on the array, which is O(N)
2) N removals, each of which is O(log N)
Phase 1 is linear.
Phase 2 is O(N log N).
This algorithm is still O(N log N); performance is slightly better than the insert/remove version.

Huffman Code (a cool application of Priority Queues)
• What is the huffman code?
– A data compression scheme
– Used in many compression formats (JPG, MPG, MP3, etc.)
– Uses a variable-length encoding scheme (vs. fixed length)
– Uses a prefix encoding scheme (no code is a prefix of another code)
– Based on an analysis of the symbol frequencies in the input file
Utilizes a priority queue of binary trees to construct the translation table! Ouch! My head is spinning!

Fixed vs.
Variable Length Codes
Symbol frequency table and encoding schemes:
Symbol           a    b    c    d    e     f
frequency        45k  13k  12k  16k  9k    5k
fixed-length     000  001  010  011  100   101
variable-length  0    101  100  111  1101  1100
Total bits:
Fixed: 3 bits × 100,000 symbols = 300,000 bits
Variable: 45k·1 + 13k·3 + 12k·3 + 16k·3 + 9k·4 + 5k·4 = 224,000 bits
Saving 76k bits out of 300k bits is a 25% compression ratio.

Fixed Length Codes
Fixed-length encoding:
Symbol        a    b    c    d    e    f
fixed-length  000  001  010  011  100  101
Decode the following fragment: 001100000011
algorithm fixedLengthDecode(InputStream)
  for every 3-bit byte B in the InputStream
    look up the symbol corresponding to B and print it out
001 100 000 011 → b e a d

Variable Length Codes
Variable-length encoding:
Symbol           a  b    c    d    e     f
variable-length  0  101  100  111  1101  1100
Decode the following fragment: 10111010111
algorithm variableLengthDecode(InputStream)
  W is a vector of "BITS"
  for every bit B in the InputStream
    append B to W
    look up the symbol S corresponding to W
    if found, print S and clear W
101 1101 0 111 → b e a d

Variable Length Codes
Imagine a binary tree where each leaf represents a symbol and the path from root to leaf represents its variable-length binary encoding! No symbol's path would be a prefix of another's!
[Figure: an encoding tree with leaves a, b, c, d, e, f.]
Decode the following fragment: 10111010111
algorithm variableLengthDecode(InputStream)
  N = the root node of the "encoding tree"
  for every bit B in the InputStream
    if B is zero then N = left child of N
    else N = right child of N
    if N is a leaf then
      print the leaf value
      set N = root node of encoding tree

Huffman Tree
• The huffman tree serves as the key to both encoding (compressing) and decoding (uncompressing) a file.
How is the huffman tree constructed?
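The tree-walk decoder above can be sketched in Java. This is an illustration, not the slides' code: the PrefixDecoder class name and its Map-based constructor are my own, while the code table is the slide's variable-length code (a=0, b=101, c=100, d=111, e=1101, f=1100).

```java
import java.util.Map;

// A sketch of the encoding-tree decoder: each code is inserted as a
// root-to-leaf path (0 = left, 1 = right), then decoding walks the
// tree bit by bit, emitting a symbol and restarting at each leaf.
class PrefixDecoder {
    private static class Node { Node left, right; Character sym; }
    private final Node root = new Node();

    public PrefixDecoder(Map<Character, String> codes) {
        for (Map.Entry<Character, String> e : codes.entrySet()) {
            Node n = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') n = (n.left  != null) ? n.left  : (n.left  = new Node());
                else            n = (n.right != null) ? n.right : (n.right = new Node());
            }
            n.sym = e.getKey();          // the leaf holds the symbol
        }
    }

    public String decode(String bits) {
        StringBuilder out = new StringBuilder();
        Node n = root;
        for (char bit : bits.toCharArray()) {
            n = (bit == '0') ? n.left : n.right;
            if (n.sym != null) {         // reached a leaf: emit and restart
                out.append(n.sym);
                n = root;
            }
        }
        return out.toString();
    }
}
```

Decoding the slide's fragment "10111010111" with this table yields "bead", matching the hand-decoded result above.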
algorithm Huffman(V)
Input: a vector V of characters, each with a frequency f
Output: the huffman tree
  Let Q be a priority queue
  while V is not empty
    C = V.remove(0)
    f = frequency of C
    Q.insert(f, binary tree rooted at C with no children)
  while Q contains more than 1 item
    f1 = Q.minKey()
    T1 = Q.remove()
    f2 = Q.minKey()
    T2 = Q.remove()
    Q.insert(f1 + f2, binary tree with left child T2 and right child T1)
  return Q.remove()

Algorithm for Growing a Huffman Tree
Define an augmented Huffman tree as a Huffman tree plus a frequency integer.
ALGORITHM
for (each character) {
  construct an augmented singleton tree, taking the character's frequency from the frequency table;
  priQue.enqueue( augmented tree from previous step );
}
while (priQue.length() >= 2) {
  ht1 = priQue.front(); priQue.dequeue();
  ht2 = priQue.front(); priQue.dequeue();
  construct an augmented tree with ht1 as right subtree and ht2 as left subtree
    (the frequency of this new tree is the sum of the frequencies of ht1 and ht2);
  priQue.enqueue( augmented tree from previous step );
}

Example Huffman Tree Build
Frequency table:
Char  Freq.
A     66
D     35
E     103
I     58
L     32
R     46
V     7
Y     13
[Figures: a sequence of slides shows priQue after enqueueing the singleton trees (lowest frequency at the front of the queue), then repeatedly building a tree from the front two items and continuing until a single huffman tree remains.]
Huffman Compression
Compression
• Scan the input file and construct a frequency table
• Construct a huffman encoding tree from the frequency table
• Save the huffman tree to the output file
• Scan the input file and, for every symbol, output its huffman representation (which will be 1 or more bits) to the output file
Decompression
• Read the huffman tree, then process every bit, outputting the proper symbol when a leaf node is encountered.

Huffman Compression
Most compressed files are structured as three parts:
• A unique identifier, often referred to as a "magic number", that indicates this file was generated by your compression program.
• The huffman tree (or enough information to re-construct the huffman tree).
• The encoded file. Note that since symbols are not necessarily 8 bits, the file may "end" on a non-byte boundary. This means that a spurious "EOF" symbol with frequency 1 should always be included in the output.
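To tie the pieces together, the tree-growing algorithm above can be sketched with Java's java.util.PriorityQueue standing in for priQue. The class and method names here are illustrative, not from the slides; the frequency table is the one from the example build.

```java
import java.util.PriorityQueue;

// A sketch of growing a huffman tree: enqueue singleton trees, then
// repeatedly dequeue the two lowest-frequency trees, join them, and
// re-enqueue, until a single tree remains.
class HuffmanSketch {
    static class Node implements Comparable<Node> {
        final int freq; final Character sym; final Node left, right;
        Node(int f, Character s) { freq = f; sym = s; left = null; right = null; }
        Node(Node l, Node r) { freq = l.freq + r.freq; sym = null; left = l; right = r; }
        public int compareTo(Node o) { return Integer.compare(freq, o.freq); }
    }

    static Node build(char[] syms, int[] freqs) {
        PriorityQueue<Node> q = new PriorityQueue<>();
        for (int i = 0; i < syms.length; i++)
            q.add(new Node(freqs[i], syms[i]));   // singleton trees
        while (q.size() > 1) {
            Node t1 = q.remove();                 // lowest frequency
            Node t2 = q.remove();                 // next lowest
            q.add(new Node(t2, t1));              // t2 left, t1 right, as in the slides
        }
        return q.remove();                        // the huffman tree
    }

    // Sum of (frequency * code length) over all leaves: the total
    // number of bits needed to encode the whole input.
    static int weightedLength(Node n, int depth) {
        if (n.sym != null) return n.freq * depth; // leaf: depth = code length
        return weightedLength(n.left, depth + 1) + weightedLength(n.right, depth + 1);
    }
}
```

Built from the example frequency table (A=66, D=35, E=103, I=58, L=32, R=46, V=7, Y=13; 360 symbols total), the tree encodes the input in 983 bits, versus 3 × 360 = 1080 bits for a fixed 3-bit code.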