# 2-D Arrays and an Introduction to Hashing

```
2-D Arrays and an Introduction to Hashing
22C:021 Comp Sci II – Data Structures
3rd Discussion Section
Week of 09/08/2008
Today's topics

• Using 2-D arrays – TwoDArray.java
• Indexing using keys – an intro to hashing
using a 2-D array
• A brief overview of JavaDoc
• First quiz during last 20 minutes
2-D Arrays – An overview
Some things to keep in mind about 2-D arrays:
• Can be thought of as an array of arrays
• Often used to store simple, associative data. For example:

Sally        8 hrs        7.5 hrs      0 hrs        4 hrs        4 hrs
John         5 hrs        5 hrs        5 hrs        0 hrs        0 hrs
Bob          0 hrs        0 hrs        0 hrs        8 hrs        7 hrs

1st Dimension - Employee name used as a key to reference an array of
their hours. How can we do this programmatically?

Associated data stored in 2nd Dimension. We will eventually see that it
can be referenced like:
Hours["John"][3]
2-D Arrays: Our Example Problem
• In this example problem, we have a potential
space of 1,000,000 keys which can be used to
index our desired data, but our data are sparse.
We will probably never use all 1 million keys.
• In the real world, we don't want (or maybe don't
have the memory) to use an array of size
1,000,000. Is there any way we can use a
smaller array?
One big point to keep in mind is that we know the
upper bound on the number of possible keys: 1
million. So, how can we design our algorithm to
take advantage of this?
TwoDArray.java - Initialization
1000 rows, each containing an array of size 10. The first set of brackets
will be where we use our key to reference the associated array of data.

The array of size 10 can be changed as needed by our code, using what we
learned about dynamic array resizing.
public static void initialize() {
    data = new int[1000][10];
    sizes = new int[1000];
    for (int i = 0; i < 1000; i++)
        sizes[i] = 0;
}
The sizes array is used to keep track
of how many elements are currently
stored in each of our 1000 slots in the
data array.

We initialize these all to 0 using the
for-loop.
TwoDArray.java - Insertion
The variable location is our computed key. It is found using a simple
formula to keep it in the bounds of the array. Note that if we don't do
this, then we require an array of size 1,000,000.

Here, we compare the length of the sub-array with the number of elements
already stored here. If they are equal, then the array is full, and we
must resize it.

public static void insert(int number) {
    int location = (number - 1) / 1000;
    if (data[location].length == sizes[location]) {
        int[] temp;
        temp = new int[sizes[location] * 2 + 1];

The manner in which we resize the
sub-array is the same as before. We
double the size to minimize the
amortized running time.
TwoDArray.java – Insertion (cont'd)
Here, we have the remainder of the insertion function. The first
few lines are just for the purpose of resizing the array if it gets
filled up.
We keep track of which number we just inserted. In the case of this
program, it is both our data, and the basis of our computed key.

        for (int i = 0; i < sizes[location]; i++) {
            temp[i] = data[location][i];
        }
        data[location] = temp;
    }
    data[location][sizes[location]] = number;
    sizes[location]++;
}

The integer sizes[location]
contains the earliest unfilled
slot in the sub-array, since we
haven't incremented it yet.
TwoDArray.java – Searching
Our key, location, is
computed exactly as it is
in the insert function.
public static int search(int[][] data, int[] sizes, int number)
{
    int location = (number - 1) / 1000;
    for (int i = 0; i < sizes[location]; i++)
        if (number == data[location][i])
            return i;
    return -1;
}

A big idea to take away from this is that, because our key is computed, we
don't have to iterate over a large number of rows to find what we're looking
for. In fact, the search can be performed in approximately constant time!
TwoDArray.java – The Big Idea
Basically, the search has two parts:
i.    First, you must find the correct row. Since our key is
computed from the input, this can be done in
constant time, due to the “random access” nature of
arrays.
ii.   Then, that row must be scanned for the correct
element. The worst-case running time of this is
linear in the number of elements in the subarray.

In the case of our TwoDArray algorithm, there should be at most 1000
distinct items in any given row. For example, since the row is computed as
(n - 1) / 1000, the key will only land in row 5 when n = 5001 to 6000.
Thus, in the worst case, it would only take 1,000 “steps” to find n, as
opposed to 1,000,000 steps if these were stored in an unordered fashion.
In short, this gives us an improvement in time, at the cost of space.
Real-world Hashing
Revisiting our table of hours that employees have worked, we
are now better prepared to answer the question of how this
might be implemented.

Sally        8 hrs        7.5 hrs      0 hrs        4 hrs        4 hrs
John         5 hrs        5 hrs        5 hrs        0 hrs        0 hrs
Bob          0 hrs        0 hrs        0 hrs        8 hrs        7 hrs

Internally, “Bob” is represented by three characters, which are nothing more than
three small integers: 0x42, 0x6f, and 0x62, in this case. Using certain mathematical
functions, we can turn a list of these integers into one new number called a “hash.”
Real-world Hashing (cont'd)
Continuing with the example of our key string, “Bob,” we can imagine that
there is a function which will take Bob's integers (0x42, 0x6f, and 0x62)
and turn them into a number called a hash.

Here, we see that the search function is very similar to the one in
TwoDArray.java. The first big difference is that we use a black-box
function, getInRange, to compute our index.

public static HoursWorked search(String keystring) {
    int location = getInRange(keystring);
    for (int i = 0; i < sizes[location]; i++)
        // Compare string contents with equals(); == only compares references.
        if (keystring.equals(data[location][i].key))
            return data[location][i].hoursWorked;
    return null;
}
Second, unlike the TwoDArray version, we
return data (hours worked) rather than the
position at which the key was found.
Real-world Hashing (cont'd)
In many programming languages, hashing is either built-in, or
part of a library. This allows you to write code like this:

// Illustrative pseudocode: Java itself does not allow [] indexing on hash tables.
HashTable hoursWorked = new HashTable();
hoursWorked["Bob"] = new float[] { 5, 5, 5, 0, 0 };
System.out.println("On Wednesday, Bob worked "
    + hoursWorked["Bob"][2] + " hours.");

Later on in the semester, we will get into Hashing in much
greater detail.
JavaDoc – A Brief Overview
• A tool which automatically creates
documentation from specially-formatted
comments in Java code.
• It also offers benefits to the programmer, such as:
– a quick hover-text reminder of function specifications
– a uniform, standardized way of commenting code,
allowing other programmers to read and understand it
JavaDoc – Overall look
JavaDoc comments open with a slash and two asterisks, as opposed to the
usual multiline comments, which have just one asterisk after the slash.

/**
 * Searches a 2D array for a given number by computing a key from
 * the number, and searching the corresponding sub-array at that
 * index.
 *
 * @param data - the 2D array to be searched
 * @param sizes - the supplemental "number of inserts" array
 * @param number - The number which may have been previously
 * inserted
 * @return An integer representing the order in which the number
 * was originally inserted, or -1, if the number was not found.
 */
public static int search(int[][] data, int[] sizes, int number) {

The first bunch of text, before any @parameters, is taken as a description
of the function. Note: must be placed right before a function definition
for this to work.
There are certain keywords, such as @param
and @return, which are almost universally-
recognized, and will get special treatment by
some IDEs and by the JavaDoc documentation
program.
JavaDoc – Eclipse Hovertext Example
JavaDoc – Final Notes
• JavaDoc comments can even be formatted
using HTML tags:
/**
 * Here is a <i>description</i>
 * @author Chris <b>"The Man"</b> Dibbern
 */
• Some useful '@' tags are: @author, @param,
@return, and @throws.
Time for the Quiz!

This quiz is open-book and
open-notes.

Thank you for your cooperation.


```
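The pieces walked through on the slides (initialization, insertion with resizing, and search) can be put together as one small program. The following is only a minimal sketch under stated assumptions, not the course's actual TwoDArray.java: the class name `BucketedLookupSketch` and the helper `rowFor` are made up for illustration, `Arrays.copyOf` stands in for the manual copy loop, and the body of `getInRange` is a hypothetical stand-in, since the slides treat that function as a black box.

```java
import java.util.Arrays;

// Sketch only: names other than insert, search, data, sizes, and getInRange
// are assumptions made for this example.
class BucketedLookupSketch {
    static final int ROWS = 1000;            // 1000 rows, as on the slides
    static int[][] data = new int[ROWS][10]; // each row starts with room for 10
    static int[] sizes = new int[ROWS];      // Java zero-fills new int arrays

    // Maps a number in 1..1,000,000 to a row in 0..999 (formula from the slides).
    static int rowFor(int number) {
        return (number - 1) / 1000;
    }

    static void insert(int number) {
        int location = rowFor(number);
        if (data[location].length == sizes[location]) {
            // Row is full: grow it to 2n + 1 slots, copying the old contents.
            data[location] = Arrays.copyOf(data[location], sizes[location] * 2 + 1);
        }
        data[location][sizes[location]] = number;
        sizes[location]++;
    }

    // Returns the position of number within its row, or -1 if it is absent.
    static int search(int number) {
        int location = rowFor(number);
        for (int i = 0; i < sizes[location]; i++) {
            if (data[location][i] == number) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical stand-in for the black-box getInRange: folds the
    // characters of the key into a row index in 0..ROWS-1.
    static int getInRange(String key) {
        int hash = 0;
        for (int i = 0; i < key.length(); i++) {
            hash = (hash * 31 + key.charAt(i)) % ROWS;
        }
        return hash;
    }

    public static void main(String[] args) {
        insert(5001);
        insert(6000);
        insert(6001);
        // prints "5 5 6"
        System.out.println(rowFor(5001) + " " + rowFor(6000) + " " + rowFor(6001));
        System.out.println(search(6000));      // prints 1 (second item in row 5)
        System.out.println(search(42));        // prints -1 (never inserted)
        System.out.println(getInRange("Bob")); // prints 965
    }
}
```

Keeping the row computation in one helper means `insert` and `search` cannot drift apart, which is the property the slides rely on when they say the key is computed exactly the same way in both functions.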