					CSC 362 Program #6
Due Date: Thursday, April 26

In this lab you will experiment with an array of linked lists to test out an idea called hashing. You
may have implemented a form of hashing in CSC 364. The idea behind hashing is to make list
access more efficient. In an array or a linked list, to find an item, you have to search for it. If the
array is ordered, you can use binary search and limit the search to a reasonable number of steps
(about log n), but if the array is unordered, or if the data is in a linked list, it could take as many
as n steps, where n is the number of items in the list. Hashing uses a function to compute where to
look, thus reducing the search to a single step (the time it takes to evaluate array[location]).

Consider the following list of values stored in an array. You place the element in the array location
as determined by the hashing function. In this case, our function is f(value) = value % 11 where 11
is the size of the array (we often use a prime number for the size).

Values: 14, 13, 92, 8, 22, 6, 23

Index:  0     1     2     3     4     5     6     7     8     9     10
Array:  22    23    13    14    92          6           8
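
The placement above can be reproduced with a few lines of C. This is only a sketch: the names hash and fill_table and the EMPTY marker (-1) are made up for illustration, and it assumes no two values land in the same location, which is true for the seven values listed.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 11      /* table size from the example */
#define EMPTY (-1)   /* assumed marker for an unused slot; example values are positive */

/* The hashing function from the text: f(value) = value % 11. */
int hash(int value) {
    assert(value >= 0);   /* % on a negative int would yield a negative index */
    return value % SIZE;
}

/* Mark every slot empty, then store each value at its hashed index.
   Assumes no collisions among the values (true for the example). */
void fill_table(int table[SIZE], const int *values, size_t count) {
    for (int i = 0; i < SIZE; i++)
        table[i] = EMPTY;
    for (size_t i = 0; i < count; i++)
        table[hash(values[i])] = values[i];
}
```

Calling fill_table with the values 14, 13, 92, 8, 22, 6, 23 leaves 22 at index 0, 23 at index 1, 13 at index 2, and so on, with indices 5, 7, 9 and 10 still EMPTY.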

If you want to store a new value, compute value % 11 and place it in that location. If you are
searching for a value, you should find it at array location value % 11. Unfortunately, this simple
form of hashing leads to a problem: collisions. For instance, if you now want to store 24, it would
map to array location 2, but we already have a value there, so we might choose to put it in the next
available location, which is array location 5 (locations 3 and 4 are also taken). This leads to poor
performance because a search must look at location value % 11 and then scan sequentially down the
array until the value is found. Another solution is to use a linked list for each array element. Now,
array element 0 is a pointer to a linked list which currently stores only 1 node (that node has the
value 22). If we want to insert 24, we would map it to the linked list at array position 2, so that
array position would point to a node storing 13 and a node storing 24. This is more efficient
because a search only examines the values that hash to the same location.
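
The "next available location" strategy described above is usually called linear probing. A minimal sketch of it follows, to show why a search may have to walk down the array. The function names and the EMPTY marker are assumptions for illustration, and the search assumes values are never removed from the table.

```c
#include <assert.h>

#define SIZE 11
#define EMPTY (-1)   /* assumed marker for an unused slot */

/* Insert with linear probing: start at value % SIZE and step forward
   (wrapping around) until an empty slot is found.
   Returns the index used, or -1 if the table is full. */
int probe_insert(int table[SIZE], int value) {
    assert(value >= 0);
    int start = value % SIZE;
    for (int step = 0; step < SIZE; step++) {
        int i = (start + step) % SIZE;
        if (table[i] == EMPTY) {
            table[i] = value;
            return i;
        }
    }
    return -1;
}

/* Search with linear probing: returns the index of value, or -1 if absent.
   Every slot between value % SIZE and the answer has to be examined. */
int probe_search(const int table[SIZE], int value) {
    assert(value >= 0);
    int start = value % SIZE;
    for (int step = 0; step < SIZE; step++) {
        int i = (start + step) % SIZE;
        if (table[i] == EMPTY)
            return -1;        /* reached a gap, so the value was never stored */
        if (table[i] == value)
            return i;
    }
    return -1;
}
```

With the example array, probe_insert for 24 starts at location 2, skips the occupied locations 2, 3 and 4, and stores the value at location 5. A later search for 24 repeats those same four probes.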

For this assignment, you will create an array of structs. Each struct will store two items: a pointer
to a linked list holding all of the items that hash to that array location (i.e., value % SIZE), and
the number of items currently stored in that linked list. You will implement the following three
functions for your linked lists:
             Ordered insert – given a new number, store that number in a new node in the
                appropriate linked list using the hashing function value % SIZE.
             Destroy – destroy an entire linked list
             Traverse – traverse an entire linked list, printing each element, then output the
                number of nodes in that list and return that number
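
One possible shape for the struct and the three functions is sketched below. It is not the required design: the type names (struct node, struct bucket), the function signatures, the output format, and the choice to keep each list in ascending order are all assumptions, and SIZE is fixed at 11 only for the sake of the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE 11   /* assumed table size; the assignment also uses 10 and 12 */

struct node {
    int value;
    struct node *next;
};

struct bucket {          /* one array element */
    struct node *head;   /* linked list of values that hash here */
    int count;           /* number of nodes in that list */
};

/* Ordered insert: place value in list value % SIZE, keeping the list in
   ascending order. Returns 1 on success, 0 if malloc fails. */
int ordered_insert(struct bucket table[], int value) {
    assert(value >= 0);
    struct bucket *b = &table[value % SIZE];
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->value = value;
    struct node **link = &b->head;   /* pointer to the link we may rewrite */
    while (*link != NULL && (*link)->value < value)
        link = &(*link)->next;
    n->next = *link;
    *link = n;
    b->count++;
    return 1;
}

/* Destroy: free every node in one list and reset the bucket. */
void destroy(struct bucket *b) {
    struct node *cur = b->head;
    while (cur != NULL) {
        struct node *next = cur->next;
        free(cur);
        cur = next;
    }
    b->head = NULL;
    b->count = 0;
}

/* Traverse: print each element, then the node count; return the count.
   An empty list prints the word NULL and a length of 0. */
int traverse(const struct bucket *b) {
    int length = 0;
    for (const struct node *cur = b->head; cur != NULL; cur = cur->next) {
        printf("%d ", cur->value);
        length++;
    }
    if (length == 0)
        printf("NULL ");
    printf("(length %d)\n", length);
    return length;
}
```

The pointer-to-pointer walk in ordered_insert avoids a special case for inserting at the front of the list. A version that tracks a separate "previous" pointer works equally well.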

Your program will work as follows:
          o Declare an array of structs, initializing each struct so that its entries are a NULL
             pointer and 0 for the number of elements in the list
          o Randomly generate 100 int values (each number will be between 1 and 100,000) and
             insert each number into the appropriate linked list, updating the number of elements
             stored in that list
          o Print each linked list and obtain its length, printing the word NULL and a length
            of 0 if a given list is empty, and then output the length of the longest list
          o Destroy each linked list
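
The four steps above could be organized roughly as follows. Treat this as a sketch under several assumptions: run_once, insert and random_value are invented names, the array size is passed as a parameter so the same code can be tried with 10, 11 and 12, and the list code is written inline to keep the example self-contained. Because rand() is only guaranteed to reach 32767, two calls are combined to cover 1 to 100,000; the small bias this introduces is ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node { int value; struct node *next; };
struct bucket { struct node *head; int count; };

/* Random value in 1..100000, built from two rand() calls because
   RAND_MAX may be as small as 32767. */
static int random_value(void) {
    unsigned long r = ((unsigned long)rand() << 15) ^ (unsigned long)rand();
    return (int)(r % 100000UL) + 1;
}

/* Ordered insert into the list selected by value % size. */
static void insert(struct bucket *table, int size, int value) {
    struct bucket *b = &table[value % size];
    struct node **link = &b->head;
    struct node *n = malloc(sizeof *n);
    if (n == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    while (*link != NULL && (*link)->value < value)
        link = &(*link)->next;
    n->value = value;
    n->next = *link;
    *link = n;
    b->count++;
}

/* One full run: build, print and destroy the table.
   Returns the length of the longest list. */
int run_once(int size, int how_many) {
    assert(size > 0 && how_many >= 0);
    struct bucket *table = malloc((size_t)size * sizeof *table);
    if (table == NULL) { perror("malloc"); exit(EXIT_FAILURE); }

    for (int i = 0; i < size; i++) {       /* initialize: NULL pointer, count 0 */
        table[i].head = NULL;
        table[i].count = 0;
    }
    for (int i = 0; i < how_many; i++)     /* generate and insert */
        insert(table, size, random_value());

    int longest = 0;
    for (int i = 0; i < size; i++) {       /* print each list, track the longest */
        int length = 0;
        printf("%2d: ", i);
        for (struct node *p = table[i].head; p != NULL; p = p->next) {
            printf("%d ", p->value);
            length++;
        }
        if (length == 0)
            printf("NULL ");
        printf("(length %d)\n", length);
        if (length > longest)
            longest = length;
    }
    printf("longest list: %d\n", longest);

    for (int i = 0; i < size; i++) {       /* destroy each list */
        struct node *p = table[i].head;
        while (p != NULL) {
            struct node *next = p->next;
            free(p);
            p = next;
        }
    }
    free(table);
    return longest;
}
```

A main function would typically call srand once (for example with the current time) so that each run produces different numbers, and then call run_once(SIZE, 100).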

You will experiment to see how long the longest linked list gets. Run your program with an array of
size 10, an array of size 11, and an array of size 12. Obtain the output for these runs. However, also
experiment with your program by running it several times on each array size. Determine, over those
runs, which array size gives you the smallest maximum list length (this will be the most efficient
array size). Hand in your program, the output for each run (one run for each of 10, 11, 12), and a
brief summary of what you discovered in running the program (you can put your summary in your
program comments).
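
If you want to compare many runs quickly, the list lengths alone are enough, so a side experiment can skip building nodes and just count how many values fall in each location. The sketch below does that. It does not replace the linked-list program you hand in, and the names longest_chain, average_longest and random_value are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Random value in 1..100000, built from two rand() calls because
   RAND_MAX may be as small as 32767 (slight bias ignored). */
static int random_value(void) {
    unsigned long r = ((unsigned long)rand() << 15) ^ (unsigned long)rand();
    return (int)(r % 100000UL) + 1;
}

/* Length of the longest list that would result from hashing how_many
   random values with value % size. Only per-location counts are kept.
   Returns -1 if memory cannot be allocated. */
int longest_chain(int size, int how_many) {
    assert(size > 0);
    int *count = calloc((size_t)size, sizeof *count);
    if (count == NULL)
        return -1;
    int longest = 0;
    for (int i = 0; i < how_many; i++) {
        int slot = random_value() % size;
        if (++count[slot] > longest)
            longest = count[slot];
    }
    free(count);
    return longest;
}

/* Average of longest_chain over several runs, or -1.0 on allocation failure. */
double average_longest(int size, int how_many, int runs) {
    assert(runs > 0);
    long total = 0;
    for (int r = 0; r < runs; r++) {
        int longest = longest_chain(size, how_many);
        if (longest < 0)
            return -1.0;
        total += longest;
    }
    return (double)total / runs;
}
```

Calling average_longest(10, 100, runs), average_longest(11, 100, runs) and average_longest(12, 100, runs) after seeding with srand gives three numbers to compare in your summary. Which size comes out best is for you to observe.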