					CPSC 221: Data Structures
      Lecture #5
    Branching Out
       Steve Wolfman
          2011W2
              Today’s Outline
•   Binary Trees
•   Dictionary ADT
•   Binary Search Trees
•   Deletion
•   Some troubling questions
                  Binary Trees
• Binary tree is either
   – empty (NULL for us), or                         A
   – a datum, a left subtree, and a
     right subtree                           B           C
• Properties
   – max # of leaves:                    D           E   F
   – max # of nodes:
   – average depth for N nodes:                      G       H
• Representation:
                           Data                  I       J
                        left    right
                       pointer pointer
                   Representation
struct Node {
  KTYPE key;
  DTYPE data;
  Node * left;
  Node * right;
};

Each node stores its data plus a left pointer and a right pointer:

              A

          B       C

      D       E   F
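A minimal sketch of how the blank "Properties" entries could be explored on this representation. It assumes int keys in place of KTYPE and drops DTYPE; the helper names (height, countNodes, countLeaves, checkBounds) are hypothetical, not from the slides. The bounds checked are the standard ones: a binary tree of height h has at most 2^h leaves and at most 2^(h+1) - 1 nodes.

```cpp
#include <algorithm>
#include <cassert>

// Assumption: int stands in for the slide's KTYPE; the DTYPE payload is omitted.
struct Node {
  int key;
  Node* left;
  Node* right;
};

// Height in edges; the empty tree gets height -1 by convention.
int height(const Node* root) {
  if (root == nullptr) return -1;
  return 1 + std::max(height(root->left), height(root->right));
}

// Total number of nodes in the tree.
int countNodes(const Node* root) {
  if (root == nullptr) return 0;
  return 1 + countNodes(root->left) + countNodes(root->right);
}

// Number of nodes with no children.
int countLeaves(const Node* root) {
  if (root == nullptr) return 0;
  if (root->left == nullptr && root->right == nullptr) return 1;
  return countLeaves(root->left) + countLeaves(root->right);
}

// Sanity check: height h allows at most 2^h leaves and 2^(h+1) - 1 nodes.
void checkBounds(const Node* root) {
  int h = height(root);
  if (h < 0) return;  // empty tree: nothing to check
  assert(countLeaves(root) <= (1 << h));
  assert(countNodes(root) <= (1 << (h + 1)) - 1);
}
```

For the six-node tree A..F drawn above, this gives height 2, 6 nodes, and 3 leaves.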
              Today’s Outline
•   Binary Trees
•   Dictionary ADT
•   Binary Search Trees
•   Deletion
•   Some troubling questions
   What We Can Do So Far
• Stack
   – Push
   – Pop
• Queue
   – Enqueue
   – Dequeue
• List
   – Insert
   – Remove
   – Find
• Priority Queue
   – Insert
   – DeleteMin


            What’s wrong with Lists?
                    Dictionary ADT
• Dictionary operations
   – create
   – destroy
   – insert
   – find
   – delete

• Example
   – insert
       • brownies
          - tasty
   – find(wolf)
       • wolf
          - the perfect mix of oomph and Scrabble value
   – stored entries
       • midterm
          - would be tastier with brownies
       • prog-project
          - so painful… who invented templates?
       • wolf
          - the perfect mix of oomph and Scrabble value

• Stores values associated with user-specified keys
   – values may be any (homogeneous) type
   – keys may be any (homogeneous) comparable type
                   Search/Set ADT
• Dictionary operations
   – create
   – destroy
   – insert
   – find
   – delete

• Example
   – insert
       • Min Pin
   – find(Wolf)
       • NOT FOUND
   – stored keys
       • Berner
       • Whippet
       • Alsatian
       • Sarplaninac
       • Beardie
       • Sarloos
       • Malamute
       • Poodle

• Stores keys
   – keys may be any (homogeneous) comparable type
   – quickly tests for membership
           A Modest Few Uses
•   Arrays and “Associative” Arrays
•   Sets
•   Dictionaries
•   Router tables
•   Page tables
•   Symbol tables
•   C++ Structures
                   Desiderata
• Fast insertion
   – runtime:


• Fast searching
   – runtime:


• Fast deletion
   – runtime:
        Naïve Implementations
                   insert   find   delete

• Linked list


• Unsorted array


• Sorted array


    so close!
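One way to see why the sorted array is "so close": a rough sketch, assuming int keys and a keys-only dictionary (SortedArraySet is a hypothetical name, not from the slides). Binary search makes find O(lg n), but insert and delete still shift elements, so they stay O(n).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: keys are ints and no associated values are stored.
struct SortedArraySet {
  std::vector<int> keys;  // always kept in increasing order

  // O(lg n): binary search over the sorted keys.
  bool find(int key) const {
    return std::binary_search(keys.begin(), keys.end(), key);
  }

  // O(n): finding the slot is O(lg n), but shifting later elements is O(n).
  void insert(int key) {
    auto pos = std::lower_bound(keys.begin(), keys.end(), key);
    if (pos == keys.end() || *pos != key) keys.insert(pos, key);
    assert(std::is_sorted(keys.begin(), keys.end()));  // invariant still holds
  }

  // O(n): erasing shifts every later element left by one.
  void remove(int key) {
    auto pos = std::lower_bound(keys.begin(), keys.end(), key);
    if (pos != keys.end() && *pos == key) keys.erase(pos);
  }
};
```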
              Today’s Outline
•   Binary Trees
•   Dictionary ADT
•   Binary Search Trees
•   Deletion
•   Some troubling questions
          Binary Search Tree
       Dictionary Data Structure
• Binary tree property                                    8
   – each node has ≤ 2 children
   – result:
       • storage is small                     5                        11
       • operations are simple
       • average depth is small
                                      2           6               10        12
• Search tree property
   – all keys in left subtree
     smaller than root’s key              4           7       9              14
   – all keys in right subtree
     larger than root’s key                                                  13
   – result:
       • easy to find any given key
        Example and Counter-Example
            5                                8

        4           8                5                11

1               7       11   2       7   6       10        18

    3                            4                15        20


BINARY SEARCH TREE                                          21
                                   NOT A
                             BINARY SEARCH TREE
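A sketch of how the search tree property could be checked mechanically, assuming int keys. The names isBST, isBSTInRange, and exampleFromSlide are hypothetical. Each subtree is checked against the open range of keys allowed by its ancestors; comparing a node only with its own children would not be enough.

```cpp
#include <cassert>
#include <climits>

// Assumption: int keys; the slide's DTYPE payload is omitted.
struct Node {
  int key;
  Node* left;
  Node* right;
};

// True if every key lies strictly between lo and hi and the
// search tree property holds at every node of the subtree.
bool isBSTInRange(const Node* root, long long lo, long long hi) {
  if (root == nullptr) return true;
  if (root->key <= lo || root->key >= hi) return false;
  return isBSTInRange(root->left, lo, root->key) &&
         isBSTInRange(root->right, root->key, hi);
}

bool isBST(const Node* root) {
  return isBSTInRange(root, LLONG_MIN, LLONG_MAX);
}

// Usage sketch: the left-hand tree from the slide (5; 4, 8; 1, 7, 11; 3).
void exampleFromSlide() {
  Node n3{3, nullptr, nullptr};
  Node n1{1, nullptr, &n3};
  Node n4{4, &n1, nullptr};
  Node n7{7, nullptr, nullptr}, n11{11, nullptr, nullptr};
  Node n8{8, &n7, &n11};
  Node n5{5, &n4, &n8};
  assert(isBST(&n5));
}
```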
                      In Order Listing
                                 struct Node {
                                   KTYPE key;
                 10                DTYPE data;
                                   Node * left;
                                   Node * right;
     5                 15        };


 2           9              20

         7              17 30

In order listing:
25791015172030
                      Finding a Node
                10

    5                   15

2           9                20

        7                  17 30

Node *& find(Comparable key,
             Node *& root) {
  if (root == NULL)
    return root;
  else if (key < root->key)
    return find(key, root->left);
  else if (key > root->key)
    return find(key, root->right);
  else
    return root;
}

runtime:
   a. O(1)
   b. O(lg n)
   c. O(n)
   d. O(n lg n)
   e. None of these
WARNING: Much fancy footwork with refs (&) coming. You can do all of this
without refs... just watch out for special cases.
                   Iterative Find
Node * find(Comparable key,
            Node * root) {
                                                       10
  while (root != NULL &&
         root->key != key) {               5                    15
    if (key < root->key)
      root = root->left;
    else                              2            9                 20
      root = root->right;
  }
                                               7                 17 30
    return root;
}
                                               Look familiar?

(It’s trickier to get the ref return to work here.)
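One possible way to get the ref return in an iterative version is to walk a pointer-to-pointer down the tree and dereference it at the end. This is a sketch, not the course's solution; it assumes int keys, and findRef is a hypothetical name.

```cpp
#include <cassert>

// Assumption: int keys; the slide's DTYPE payload is omitted.
struct Node {
  int key;
  Node* left;
  Node* right;
};

// Iterative find that returns a reference to the pointer "slot" holding
// the matching node, or to the NULL slot where the key would be inserted.
Node*& findRef(int key, Node*& root) {
  Node** slot = &root;  // address of the pointer currently being examined
  while (*slot != nullptr && (*slot)->key != key) {
    slot = (key < (*slot)->key) ? &(*slot)->left : &(*slot)->right;
  }
  assert(*slot == nullptr || (*slot)->key == key);
  return *slot;
}
```

Because the result refers to the actual pointer stored in the parent (or to root itself), assigning to it links a new node into the tree.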
                           Insert
                                // Precondition: key is not
                10              // already in the tree!
                                void insert(Comparable key,
    5                15                     Node *& root) {
                                  Node *& target(find(key,
                                                      root));
2           9             20      assert(target == NULL);

                                    target = new Node(key);
        7             17 30     }



    runtime:
                     Funky game we can play with the *& version.
 Digression: Value vs. Reference
           Parameters
• Value parameters (Object foo)
   – copies parameter
   – no side effects
• Reference parameters (Object & foo)
   – shares parameter
   – can affect actual value
   – use when the value needs to be changed
• Const reference parameters (const Object & foo)
   – shares parameter
   – cannot affect actual value
   – use when the value is too big for copying in pass-by-value
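A small illustration of the three parameter kinds, using std::string as the Object. The function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Value parameter: works on a copy, so the caller's string is unchanged.
void appendByValue(std::string s) { s += "!"; }

// Reference parameter: shares the caller's string and can change it.
void appendByRef(std::string& s) { s += "!"; }

// Const reference parameter: shares the caller's string without copying,
// but the compiler rejects any attempt to modify it.
std::size_t lengthOf(const std::string& s) { return s.size(); }

void example() {
  std::string word = "wolf";
  appendByValue(word);
  assert(word == "wolf");   // the copy was changed, not word
  appendByRef(word);
  assert(word == "wolf!");  // word itself was changed
  assert(lengthOf(word) == 5);
}
```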
            BuildTree for BSTs
• Suppose the data 1, 2, 3, 4, 5, 6, 7, 8, 9 is inserted
  into an initially empty BST:
   – in order



   – in reverse order



   – median first, then left median, right median, etc.
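A sketch for trying the three insertion orders, assuming int keys and the no-duplicates precondition. All helper names here are hypothetical. medianFirst emits the median of each range before recursing left and then right, which differs from a level-by-level order but builds the same tree, because every median is still inserted before the keys on either side of it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: int keys; the slide's DTYPE payload is omitted.
struct Node {
  int key;
  Node* left = nullptr;
  Node* right = nullptr;
  explicit Node(int k) : key(k) {}
};

void insert(int key, Node*& root) {
  if (root == nullptr) { root = new Node(key); return; }
  assert(key != root->key);  // precondition: key not already in the tree
  insert(key, key < root->key ? root->left : root->right);
}

// Height in edges; the empty tree gets height -1.
int height(const Node* root) {
  if (root == nullptr) return -1;
  return 1 + std::max(height(root->left), height(root->right));
}

void destroy(Node* root) {
  if (root == nullptr) return;
  destroy(root->left);
  destroy(root->right);
  delete root;
}

// Appends lo..hi to order with the median of each range first.
void medianFirst(int lo, int hi, std::vector<int>& order) {
  if (lo > hi) return;
  int mid = lo + (hi - lo) / 2;
  order.push_back(mid);
  medianFirst(lo, mid - 1, order);
  medianFirst(mid + 1, hi, order);
}

// Builds a BST by inserting keys in the given order and returns its height.
int heightAfterInserting(const std::vector<int>& order) {
  Node* root = nullptr;
  for (int key : order) insert(key, root);
  int h = height(root);
  destroy(root);
  return h;
}
```

For 1 through 9, the in-order and reverse-order insertions both give a chain of height 8, while the median-first order gives height 3.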
         Analysis of BuildTree
• Worst case: O(n^2) as we’ve seen
• Average case assuming all orderings equally likely
  turns out to be O(n lg n).
     Bonus: FindMin/FindMax
• Find minimum                   10

                     5                15

• Find maximum   2           9             20

                         7             17 30
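A minimal iterative sketch, assuming int keys and a non-empty tree; findMin and findMax are hypothetical names. The minimum is reached by following left pointers until there are none, and the maximum by following right pointers.

```cpp
#include <cassert>

// Assumption: int keys; the slide's DTYPE payload is omitted.
struct Node {
  int key;
  Node* left;
  Node* right;
};

// Leftmost node of the tree. Precondition: root is not NULL.
Node* findMin(Node* root) {
  assert(root != nullptr);
  while (root->left != nullptr) root = root->left;
  return root;
}

// Rightmost node of the tree. Precondition: root is not NULL.
Node* findMax(Node* root) {
  assert(root != nullptr);
  while (root->right != nullptr) root = root->right;
  return root;
}
```

On the tree above, findMin returns the node holding 2 and findMax the node holding 30. Both take time proportional to the depth of the node they return.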
           Double Bonus: Successor
Find the next larger node
                                                     10
in this node’s subtree.

Node *& succ(Node *& root) {             5                15
  if (root->right == NULL)
    return root->right;
  else
    return min(root->right);         2           9             20
}

Node *& min(Node *& root) {                  7             17 30
  if (root->left == NULL) return root;
  else return min(root->left);
}
    More Double Bonus: Predecessor
Find the next smaller node
in this node’s subtree.                                   10

Node *& pred(Node *& root) {
  if (root->left == NULL)                     5                15
    return root->left;
  else
    return max(root->left);               2           9             20
}

Node *& max(Node *& root) {
  if (root->right == NULL) return root;           7             17 30
  else return max(root->right);
}
              Today’s Outline
• Some Tree Review
  (here for reference, not discussed)
• Binary Trees
• Dictionary ADT
• Binary Search Trees
• Deletion
• Some troubling questions
             Deletion
                        10

            5                15

        2           9             20

                7             17 30


Why might deletion be harder than insertion?
                       Lazy Deletion
• Instead of physically deleting
  nodes, just mark them as                                10
  deleted
   +   simpler
                                              5                15
   +   physical deletions done in batches
   +   some adds just flip deleted flag
                                          2           9             20
   –   extra memory for deleted flag
   –   many lazy deletions slow finds
                                                  7             17 30
   –   some operations may have to be
       modified (e.g., min and max)
             Lazy Deletion
Delete(17)
                             10
Delete(15)
                 5                15
Delete(5)

Find(9)      2           9             20

Find(16)             7             17 30

Insert(5)

Find(17)
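The sequence above can be replayed with a sketch like the following, which assumes int keys and a per-node deleted flag. The names locate, lazyFind, lazyInsert, lazyDelete, and replaySlide are hypothetical. Deleted nodes stay in the tree and still guide searches; find just treats them as absent.

```cpp
#include <cassert>

// Assumption: int keys plus a "deleted" flag; DTYPE is omitted.
struct Node {
  int key;
  bool deleted;
  Node* left;
  Node* right;
};

// Slot holding the node with this key (deleted or not),
// or the NULL slot where it would be inserted.
Node*& locate(int key, Node*& root) {
  if (root == nullptr || root->key == key) return root;
  return locate(key, key < root->key ? root->left : root->right);
}

bool lazyFind(int key, Node*& root) {
  Node* n = locate(key, root);
  return n != nullptr && !n->deleted;
}

void lazyInsert(int key, Node*& root) {
  Node*& slot = locate(key, root);
  if (slot == nullptr) slot = new Node{key, false, nullptr, nullptr};
  else slot->deleted = false;  // some adds just flip the deleted flag
}

void lazyDelete(int key, Node*& root) {
  Node* n = locate(key, root);
  if (n != nullptr) n->deleted = true;  // mark only; no physical removal
}

// Replays the operations listed on the slide. Nodes are never freed here:
// physical deletion is left to a later batch pass, as the slide suggests.
void replaySlide() {
  Node* root = nullptr;
  int keys[] = {10, 5, 15, 2, 9, 20, 7, 17, 30};
  for (int k : keys) lazyInsert(k, root);
  lazyDelete(17, root);
  lazyDelete(15, root);
  lazyDelete(5, root);
  assert(lazyFind(9, root));    // search passes through deleted node 5
  assert(!lazyFind(16, root));  // never present
  lazyInsert(5, root);          // flips the flag back
  assert(lazyFind(5, root));
  assert(!lazyFind(17, root));  // present but marked deleted
}
```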
         Deletion - Leaf Case
Delete(17)                   10

                 5                15

             2           9             20

                     7             17 30
    Deletion - One Child Case
Delete(15)                   10

                 5                15

             2           9             20

                     7                  30
    Deletion - Two Child Case
Delete(5)                   10

                5                20

            2           9             30

                    7
    Finally…
            10

    7            20

2       9             30
                     Delete Code
// Named deleteKey because "delete" is a reserved word in C++.
void deleteKey(Comparable key, Node *& root) {
  Node *& handle(find(key, root));
  Node * toDelete = handle;
  if (handle != NULL) {
    if (handle->left == NULL) {          // Leaf or one child
      handle = handle->right;
    } else if (handle->right == NULL) {  // One child
      handle = handle->left;
    } else {                             // Two child case
      Node *& successor(succ(handle));
      handle->key = successor->key;      // Key must move along with the data
      handle->data = successor->data;
      toDelete = successor;
      successor = successor->right;      // Succ has <= 1 child
    }
  }
  delete toDelete;
}

Refs make this short and “elegant”… but could be done without them with a bit more work.
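For comparison, a sketch of one reference-free approach: each call returns the (possibly new) root of its subtree and the caller stores it back. This assumes int keys with no separate data field; removeKey and insertKey are hypothetical names, not the course's code.

```cpp
#include <cassert>

// Assumption: int keys; the slide's DTYPE payload is omitted.
struct Node {
  int key;
  Node* left;
  Node* right;
};

// Inserts key and returns the root of the updated subtree.
Node* insertKey(int key, Node* root) {
  if (root == nullptr) return new Node{key, nullptr, nullptr};
  assert(key != root->key);  // precondition: key not already in the tree
  if (key < root->key) root->left = insertKey(key, root->left);
  else root->right = insertKey(key, root->right);
  return root;
}

// Removes key (if present) and returns the root of the updated subtree.
Node* removeKey(int key, Node* root) {
  if (root == nullptr) return nullptr;  // key not found
  if (key < root->key) {
    root->left = removeKey(key, root->left);
  } else if (key > root->key) {
    root->right = removeKey(key, root->right);
  } else if (root->left == nullptr || root->right == nullptr) {
    // Leaf or one child: splice the node out.
    Node* child = (root->left != nullptr) ? root->left : root->right;
    delete root;
    return child;
  } else {
    // Two children: copy the successor's key up, then remove the successor.
    Node* succ = root->right;
    while (succ->left != nullptr) succ = succ->left;
    root->key = succ->key;
    root->right = removeKey(succ->key, root->right);
  }
  return root;
}
```

Callers write root = removeKey(key, root); the assignment back into the parent is the extra work that the *& version hides.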
              Today’s Outline
• Some Tree Review
  (here for reference, not discussed)
• Binary Trees
• Dictionary ADT
• Binary Search Trees
• Deletion
• Some troubling questions
             Thinking about
           Binary Search Trees
• Observations
   – Each operation views two new elements at a time
   – Elements (even siblings) may be scattered in memory
   – Binary search trees are fast if they’re shallow
• Realities
   – For large data sets, disk accesses dominate runtime
   – Some deep and some shallow BSTs exist for any data


                 One more piece of bad news: what happens to a
                 balanced tree after many insertions/deletions?
                 Solutions?
• Reduce disk accesses?



• Keep BSTs shallow?
                   To Do
• MAYBE: Read Epp 11.5 and KW 8.1-8.4
• TODO (Steve): announce actual reading!
                   Coming Up
• cilk_spawn Parallelism and Concurrency
• Huge Search Tree Data Structure
• cilk_join
• Self-balancing Binary Search Trees

Note on cilk_spawn: Spawns parallel task. Since we have only one classroom, one or the other goes first.

				