Chapter 4 Linked Lists
The problem with arrays
An ordinary (statically allocated) array has a fixed size
Its size is determined at compile time
It is cumbersome to resize an array
dynamically
Having wasted memory that we aren’t
using is not an efficient solution to a
problem
Linked Lists
We could alter the way that we look at an array
to make it more efficient
Rather than having a static block of memory
allocated to an array, what if we only locked off
what we needed?
In a manner of speaking, we could make an array of
one element and then link it somehow to other
elements as they are added
We would link it to the other elements via a pointer
How to do a linked list
We start with the primitive of our list; for
the sake of argument, we’ll say that we
are dealing with integers
We create a new data type that has both
an integer and a pointer contained within it
The integer is the data we want to keep
The pointer points to the next item in the
list
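As a minimal sketch of the data type just described, a node for a list of integers might look like the following. The names Node, item and next match the later slides; makeNode is a hypothetical helper that is not part of the slides, and it uses the new operator that is introduced further on.

```cpp
#include <cassert>
#include <cstddef>

// One element of the list: the data plus a link to the next element.
struct Node {
    int   item;  // the data we want to keep
    Node *next;  // address of the next node, or NULL at the end of the list
};

// Hypothetical helper (not from the slides): allocate a node at run time.
Node *makeNode(int value, Node *next = NULL) {
    Node *n = new Node;
    n->item = value;
    n->next = next;
    return n;
}
```

Calling makeNode(10, makeNode(20)) would build a two-node list whose first node holds 10 and points to the node holding 20.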
Inserting items into the
linked list
If we want to insert an item, we simply
instantiate our data type
If the list is unsorted, we set a pointer already in the list (for example, that of
the last element) to point to the newly created element
If the list is sorted, we find where it goes and
change two pointers
The neophyte element copies the pointer value of
the element before it
The element before our neophyte has its pointer
changed to our new element
Deleting items from the
linked list
To delete an item, we look at the element
before the one we want to delete
Change that element’s pointer to the pointer value
of the one we’re deleting
We then de-allocate the removed item to
ensure that we do not create a memory leak!
We can leave the removed element around for a bit if
we think we’re going to use it again, but be sure that
you remember to de-allocate it at some point (C++ will not do it for you)
A holistic look at the
linked list
Why we call it a linked list
The items in the data structure are linked to
one another, i.e. the first item points to
the second which in turn points to the
third which in turn points to the fourth…
A few pointers
A pointer variable, more commonly
referred to as simply a pointer, contains
the location or address in memory where
the item in question resides
By using pointers, we can quickly locate
data that the operating system moves
around
The Pointer sisters
Let’s say that Jennifer Pointer moves about
quite a bit
She shops at all the trendiest places
She has many friends that she likes to visit
If I want to find Jennifer Pointer, I need to know
some constant about her that will allow me to
locate her
Unfortunately, Jennifer is a bit of a technophobe and
does not like cellular telephones
However, I do know where she lives
Where in the world is
Jennifer Pointer?
Since I know where she lives and that she is not given
to moving her bivouac willy-nilly, I can drive to her
abode and talk to her more homebody sibling, Melody
Pointer
Melody always knows where I can find Jennifer
Melody tells me where Jennifer is and then I drive to
that location to see her
In this case, Melody becomes a pointer to Jennifer;
she is a static (non-moving) reference to a dynamic
(moving) resource; I can always find where Jennifer is
because I know where to find Melody
A picture of what I mean
All this sounds great, but...
How do I get a pointer variable p to point
to a memory cell (location)?
How do you use p to get to the content of
the memory cell to which p points?
Whoa there, partner, you’re a bit ahead of
yourself! First, let us declare p as a
pointer:
int *p;
Be careful with your
syntax!
int *p, q; and int* p,q;
are the same as saying:
int *p;
int q;
To declare both as pointers, we do the
following:
int *p, *q;
When is this memory
allocated?
This memory is allocated statically
which means that it is done at compile-
time
We therefore refer to these variables as
statically allocated variables.
Working with pointers
In addition to being careful with our syntax, we
also need to be a bit careful with how we
handle pointers
If we declare an integer x, we cannot just set
p=x; for this statement will be rejected by the
compiler because p and x have different
types (pointer to int versus int)
We can, however, use the address or
address-of operator, &
p = &x;
which places the address of x into p.
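A short sketch of the address-of operator at work, together with the dereference operator * that answers the earlier question of how to reach the cell to which p points. The function name addressOfDemo is made up for this illustration.

```cpp
#include <cassert>

// Changes a local variable through a pointer and returns its new value.
int addressOfDemo() {
    int x = 5;
    int *p = &x;   // p now holds the address of x
    *p = 7;        // writing through p changes x itself
    return x;      // x is now 7
}
```

Because p holds the address of x, the names *p and x refer to the same memory cell for as long as x exists.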
Dynamic memory
allocation
We can make a variable to be allocated at run-time;
this is called dynamic memory allocation.
A variable of this type is said to be a dynamically
allocated variable (real shocker, eh?)
We accomplish this by use of the C++ keyword new
p = new int;
It is important to note that after executing this
statement, p holds a valid address but the value of *p is indeterminate; it is not
initialized to some particular value
You therefore need to initialize *p to some value to
prevent weird things from happening to your code!
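The allocate-then-initialize pattern could be sketched as follows. The function name makeCounter and its parameter are assumptions made for the example only.

```cpp
#include <cassert>

// Allocates an int at run time, gives it a value, and returns the pointer.
// The caller owns the memory and must eventually delete it.
int *makeCounter(int start) {
    int *p = new int;  // p is a valid address, but *p holds no defined value yet
    *p = start;        // initialize the cell before anyone reads it
    return p;
}
```

C++ also allows the two steps to be combined, as in new int(start), which allocates the cell and initializes it in one expression.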
Unused memory
Suppose that we no longer require the services
of a pointer
We could assign NULL to the pointer, which
makes it point to the language-default
never-land, i.e. a pointer that is NULL
should never be dereferenced until it is
reassigned the address of a real memory cell such as
0xfcde0895 (just an example).
But what if we know we’re not going to use that
pointer again?
Deleting a variable to
recover memory
In this case, we want to delete the variable
so that the memory is recovered by the
operating system
If we do not do this, the memory cell remains
allocated to the program thus producing the
much-feared memory leak
We could delete p in the following manner:
delete p;
A few caveats on delete
When we delete p, we do not de-
allocate the pointer variable p itself
We de-allocate the memory cell to which p points
and leave the contents of p undefined
The memory cell is no longer protected by the program or the OS
The cell no longer belongs to the
program, yet p still holds its old address;
referencing *p after we have done a delete
p; can be disastrous!
We stave off this problem by assigning
p = NULL;
Why delete doesn’t do this
So why doesn’t delete automatically set
the value of the pointer to NULL?
The system cannot always clearly
determine who should be set to NULL.
You may have more than one pointer that
points to that location
It will therefore remain the responsibility of
the developer to set that value to NULL.
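One way to make the delete-then-NULL habit hard to forget is to wrap both steps in a small helper. This is only a sketch; release is a hypothetical name, and it can clear only the one pointer it is given, which is exactly the limitation described above.

```cpp
#include <cassert>
#include <cstddef>

// Frees the int that p points to and clears p so it cannot be misused.
// p is passed by reference so the caller's own pointer is set to NULL.
void release(int *&p) {
    delete p;   // return the memory cell to the system (harmless if p is NULL)
    p = NULL;   // only this one pointer is cleared
}
```

If a second pointer had been set to the same cell before the call, it would still hold the stale address afterwards, and the developer would have to reset it by hand.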
An incorrect pointer to a
non-protected node
An example
Example continued
End of the example
Dynamic array allocation
We can allocate arrays dynamically with a bit of chicanery
int arraySize = 50;
int *anArray = new int[arraySize];
The pointer variable anArray will point to the first item in the
array.
Since arraySize is a variable, we can choose the size of
the array at run time
To change the size later, we create a new array
Copy the old array to the new
De-allocate the old array with delete [] and point anArray at the new one
This can be inefficient if anArray is sufficiently large!
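The create, copy and de-allocate steps could be sketched like this. The function name resize and its parameters are assumptions for illustration; zero-filling the extra cells is a choice made here, not something the slides require.

```cpp
#include <algorithm>
#include <cassert>

// Replaces anArray (holding oldSize ints) with a new array of newSize ints,
// keeping as many of the old values as fit. Any added cells are set to 0.
void resize(int *&anArray, int oldSize, int newSize) {
    int *bigger = new int[newSize]();            // new block, zero-initialized
    int keep = std::min(oldSize, newSize);
    std::copy(anArray, anArray + keep, bigger);  // copy the old values across
    delete [] anArray;                           // release the old block
    anArray = bigger;                            // caller now sees the new block
}
```

Every call copies up to oldSize elements, which is where the inefficiency for large arrays comes from.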
Pointer arithmetic
C++ treats the name of an array as a pointer to the first
element in the array, e.g.
*anArray is equivalent to anArray[0]
*(anArray+1) is the same as anArray[1]
This is called pointer arithmetic.
Be careful! Pointer arithmetic counts in elements, not bytes:
anArray+1 is sizeof(int) bytes past anArray
The C++ language itself defines this scaling, so every conforming compiler handles it for you
Do not multiply by sizeof(int) yourself, or you will skip elements
and run past the end of the array!
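A small sketch may make the element-wise scaling concrete. In standard C++, adding 1 to an int pointer advances it by one int. The function names sum and strideInBytes are invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Sums n ints using pointer arithmetic instead of subscripts.
int sum(const int *anArray, int n) {
    int total = 0;
    for (const int *p = anArray; p != anArray + n; ++p)
        total += *p;   // ++p moves to the next int, not the next byte
    return total;
}

// Distance in bytes between element 0 and element 1 of an int array.
std::size_t strideInBytes(const int *anArray) {
    const char *first  = reinterpret_cast<const char *>(anArray);
    const char *second = reinterpret_cast<const char *>(anArray + 1);
    return static_cast<std::size_t>(second - first);
}
```

For any int array, strideInBytes returns sizeof(int), which shows that the scaling has already been applied by the time anArray + 1 is evaluated.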
Deleting the array
To effectively de-allocate the array, use
the following notation:
delete [] anArray;
Remember that this memory is returned
to the system for future use
The values it contains may still be there, but they are no longer yours to read
Set the pointer to NULL so that others (and
you) will not be tempted to use it!
Pointer-based linked lists
Each component is called a node
Each node has two components
The data itself
A pointer to the next item in the list
Since each node in the linked list
contains just two plain data members, it
is natural to conclude that the node
should be a struct instead of a class.
Some pointers on pointers
A pointer can point to almost anything:
Integers
Chars
Arrays
Floats
Structs
A pointer cannot, however, point to a file (there
are special file pointers to do this)
Therefore a pointer in our node structure can
point to another node structure
More on linked lists
We have all the elements pointing to one
another, but what about the beginning and end
of the list?
We have an additional pointer that points to the
beginning of the list – the head pointer or
head of the list
The head is usually set to NULL when the list is
initialized
The head is also set to NULL if the list becomes
empty
The last item in the linked list can also point to
NULL to indicate that it is indeed the last thing
on the list.
A picture of the linked list
Displaying the contents of
a linked list
If we want to show all the elements in a
linked list, we can simply employ a loop
to iterate through the entire list displaying
each one of the elements
This solution requires that we keep
around an additional pointer cur which
keeps track of the current node to which
we are pointing
Some code
//Display the data in a linked list
//Loop invariant: cur points to the next
//node to be displayed.
for (Node *cur = head; cur != NULL; cur = cur->next)
    cout << cur->item << endl;
Some N.B.s on the code
A common error is to compare cur->next with
NULL instead of cur.
When cur points to the last node of a non-empty
linked list, cur->next == NULL.
This means you won’t display the last item in the list!
Displaying a linked list is an example of a
common operation called list traversal which
sequentially visits each node in the list until it
reaches the end of the list.
Deleting a node from the
linked list
We simply take the next pointer of the
previous entry in the list and set it equal to
the next pointer of the item we wish to delete
The node we have disconnected still remains in
existence! It must be de-allocated with delete.
prev->next = cur->next;
Does this work for every
node in the list?
Unfortunately, the answer is no.
If we try to delete the first element in the list,
there is no previous node: prev is NULL, so prev->next cannot be used.
Fortunately, there is a simple solution:
head = cur->next;
Avoiding a memory leak
After we disconnect a node from the list, the
node still exists
We must return the node to the system
with delete
cur->next = NULL;
delete cur;
cur = NULL;
To delete, we perform 3
tasks
Locate the node we wish to delete by list
traversal
We can delete the ith node
We can delete a node that has a particular data item
Disconnect the node from the list by changing
pointers
Return the disconnected node to the system
with delete
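The three tasks can be put together in one function. This is a sketch under assumptions: the Node struct mirrors the item/next layout used in these slides, and removeValue is a hypothetical name for the "delete a node that has a particular data item" variant.

```cpp
#include <cassert>
#include <cstddef>

struct Node { int item; Node *next; };

// Removes the first node whose item equals value.
// Returns true if a node was removed, false if the value was not found.
bool removeValue(Node *&head, int value) {
    Node *prev = NULL;
    Node *cur  = head;
    // 1. Locate the node by traversal, keeping a trailing pointer.
    while (cur != NULL && cur->item != value) {
        prev = cur;
        cur  = cur->next;
    }
    if (cur == NULL) return false;           // not in the list
    // 2. Disconnect it (the first node is the special case).
    if (prev == NULL) head = cur->next;
    else              prev->next = cur->next;
    // 3. Return the node's memory to the system.
    cur->next = NULL;
    delete cur;
    return true;
}
```

The head pointer is passed by reference because removing the first node has to change it.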
Inserting a node in the
linked list
We do just the opposite of the deletion code
We create our new node
newPtr = new Node;
We initialize our new node with our data
We traverse the list until we find where we wish to insert the node
newPtr->next = cur;
prev->next = newPtr;
Inserting at the beginning
of the linked list
As you might have suspected, this is a special case
We just point head to the newly minted node and let
the neophyte’s next item be the old head
newPtr->next = head;
head = newPtr;
To insert, we perform 3
tasks
Traverse the list to determine the point of
insertion
Create a new node and store the new
data in it
Connect the new node to the linked list
by changing pointers
More on inserting
In order to insert an item in our list, we
need to keep a trailing pointer prev that
points to the previous item in the list
In this manner, we can look at the value
of the current node and see if the new
node has to go before it
We can then back up a node and do the
insertion
Determining the point of
insertion/deletion
//Determine the point of insertion/deletion
//for a sorted linked list
for (prev = NULL, cur = head;
     (cur != NULL) && (newValue > cur->item);
     prev = cur, cur = cur->next)
    ; // empty body: the loop header does all the work
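The search loop above can be combined with the two insertion cases (beginning of the list, and after prev) into one routine. The following is a sketch assuming an int item and a hypothetical function name sortedInsert.

```cpp
#include <cassert>
#include <cstddef>

struct Node { int item; Node *next; };

// Inserts newValue into a list kept in ascending order.
void sortedInsert(Node *&head, int newValue) {
    Node *prev = NULL;
    Node *cur  = head;
    // Stop at the first node whose item is not smaller than newValue.
    for (; cur != NULL && newValue > cur->item; prev = cur, cur = cur->next)
        ;  // the loop header does all the work
    Node *newPtr = new Node;
    newPtr->item = newValue;
    newPtr->next = cur;                     // new node precedes cur
    if (prev == NULL) head = newPtr;        // insertion at the beginning
    else              prev->next = newPtr;  // insertion after prev
}
```

When the loop ends with prev equal to NULL, the new value belongs at the front, which is why that case assigns to head instead of prev->next.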
Pointer-based ADT List
Unlike the array-based implementation, there is
no shifting of items necessary during
insertion/deletion
Once the position is found, an insertion or deletion
changes only a couple of pointers, however large the list
The list also allocates only as many nodes as it currently
holds, rather than a block sized for the maximum
It also does not impose a strict
minimum/maximum for the size of the list other
than the amount of memory available to the
system
ADT List redefined
Suppose that we define a function find(i)
that finds the ith node of the list
To insert/delete at this node, we also need to
know the location of the previous node
Since we’ve taken 2315, we think smartly and
realize that we could call find(i-1) instead, which
leaves cur pointing to the (i-1)th node
and cur->next pointing to the ith
node
More on find()
It isn’t a specified ADT operation
Moreover, find() returns a pointer
Recall that pointers are powerful and we
don’t want just anyone to have them
We would therefore not want any client of
the class to call it
General observation on
ADTs
In general, it is perfectly reasonable for
an ADT to define variables and functions
that the rest of the program should not
access.
Many ADTs require a special constructor
called a copy constructor so that your
code can correctly handle
List yourList = myList;
Shallow or deep copy?
If we only need a shallow copy of a data
structure, we do not need to provide a
copy constructor as the compiler’s
rendition will suffice
If we need a deep copy (as we do for the
ADT List), we must provide our own copy
constructor
What’s the difference?
Shallow copy
Deep copy
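The difference can be sketched with bare nodes rather than the full List class. The function names shallowCopy and deepCopy are made up for this illustration; the compiler-generated copy constructor behaves like the shallow version because it copies only the head pointer.

```cpp
#include <cassert>
#include <cstddef>

struct Node { int item; Node *next; };

// Shallow copy: only the head pointer is copied, so both "lists" share nodes.
Node *shallowCopy(Node *head) {
    return head;
}

// Deep copy: every node is duplicated, so the two lists are independent.
Node *deepCopy(const Node *head) {
    Node *newHead = NULL;
    Node **link = &newHead;      // where the next new node gets attached
    for (const Node *cur = head; cur != NULL; cur = cur->next) {
        *link = new Node;
        (*link)->item = cur->item;
        (*link)->next = NULL;
        link = &(*link)->next;
    }
    return newHead;
}
```

After a shallow copy, changing an item through one pointer is visible through the other, and deleting the nodes through both would free them twice. After a deep copy, neither problem arises.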
Destructors
Classes that only use statically allocated
memory can rely upon the compiler-
generated destructor to free up memory
However, classes that use dynamically
allocated memory need to have their own
custom written destructor that returns all
used resources to the system
A destructor for List would be ~List()
Header file of ADT List
// *********************************************************
// Header file ListP.h for the ADT list.
// Pointer-based implementation.
// Note: throw(...) specifications are pre-C++17 syntax.
// *********************************************************
#include "ListException.h"
#include "ListIndexOutOfRangeException.h"
typedef desired-type-of-list-item ListItemType;
class List {
public:
    // constructors and destructor:
    List(); // default constructor
    List(const List& aList); // copy constructor
    ~List(); // destructor
    // list operations:
    bool isEmpty() const;
    int getLength() const;
    void insert(int index, ListItemType newItem)
        throw(ListIndexOutOfRangeException, ListException);
    void remove(int index)
        throw(ListIndexOutOfRangeException);
    void retrieve(int index, ListItemType& dataItem) const
        throw(ListIndexOutOfRangeException);
private:
    struct ListNode // a node on the list
    {
        ListItemType item; // a data item on the list
        ListNode *next; // pointer to next node
    }; // end struct
    int size; // number of items in list
    ListNode *head; // pointer to linked list of items
    // Returns a pointer to the index-th node
    // in the linked list.
    ListNode *find(int index) const;
}; //end class
// End of header file.
The implementation file
Since we can’t put everything on a single page,
we will do the implementation file piecemeal
The default constructor is simple:
List::List(): size(0), head(NULL)
{
//Nothing needed here
}//end default constructor
Copy constructor
List::List(const List& aList): size(aList.size)
{
    if (aList.head == NULL)
        head = NULL; //original list empty
    else
    {   //Copy first element
        head = new ListNode;
        assert(head != NULL); //check allocation (needs <cassert>)
        head->item = aList.head->item;
        //Copy rest of list
        ListNode *newPtr = head; //Last node in new list
        for (ListNode *origPtr = aList.head->next;
             origPtr != NULL;
             origPtr = origPtr->next)
        {
            newPtr->next = new ListNode;
            assert(newPtr->next != NULL);
            newPtr = newPtr->next;
            newPtr->item = origPtr->item;
        } //end for
        newPtr->next = NULL;
    } //end if
} //end copy constructor
Destructor
We can de-allocate the entire list by
continually removing an element until the
list is empty
List::~List()
{
    while (!isEmpty())
        remove(1);
} //end destructor
List operations
bool List::isEmpty() const
{
    return bool(size == 0);
}
int List::getLength() const
{
    return size;
}
More list operations
Because the list doesn’t allow direct
access to elements the retrieval, insertion
and deletion operations must all traverse
the list from the beginning until the
specified point is reached
Because of this, we define the operation
find(i).
find(i)
List::ListNode *List::find(int index) const
//Locates a node in the list
//Precondition: index is number of node desired
//Postcondition: Returns pointer to desired node. If node not
// located, NULL returned.
{
    if ((index < 1) || (index > getLength()))
        return NULL;
    else
    {
        ListNode *cur = head;
        for (int skip = 1; skip < index; ++skip)
            cur = cur->next;
        return cur;
    }
}
retrieve(i)
void List::retrieve(int index, ListItemType& dataItem) const
{
    if ((index < 1) || (index > getLength()))
        throw ListIndexOutOfRangeException("Index out of range.");
    else
    {
        ListNode *cur = find(index);
        dataItem = cur->item;
    }
}
insert(i, newItem)
void List::insert(int index, ListItemType newItem)
{
    int newLength = getLength() + 1;
    if ((index < 1) || (index > newLength))
        throw ListIndexOutOfRangeException(
            "ListIndexOutOfRangeException: insert index out of range");
    else
    {   // create new node and place newItem in it
        ListNode *newPtr = new ListNode;
        if (newPtr == NULL)
            throw ListException(
                "ListException: insert cannot allocate memory");
        else
        {
            size = newLength;
            newPtr->item = newItem;
            // attach new node to list
            if (index == 1)
            {   // insert new node at beginning of list
                newPtr->next = head;
                head = newPtr;
            }
            else
            {   // insert new node after node to which prev points
                ListNode *prev = find(index-1);
                newPtr->next = prev->next;
                prev->next = newPtr;
            } //end if
        } //end if
    } //end if
} //end insert
remove(i)
void List::remove(int index)
{
    ListNode *cur;
    if ((index < 1) || (index > getLength()))
        throw ListIndexOutOfRangeException(
            "ListIndexOutOfRangeException: remove index out of range");
    else
    {
        --size;
        if (index == 1)
        {   // delete the first node from the list
            cur = head; // save pointer to node
            head = head->next;
        }
        else
        {
            ListNode *prev = find(index-1);
            cur = prev->next; //Save pointer to node
            prev->next = cur->next;
        } //end if
        // return the node to the system
        cur->next = NULL;
        delete cur;
        cur = NULL;
    } //end if
} //end remove
Comparing the array-based
and pointer-based
implementations
As usual, there are pros and cons to each
implementation strategy
You should carefully weigh these pros and cons before
selecting a strategy
Arrays have a fixed size
Arrays have direct access because their elements are stored
one after the other
This is called an implicit address
A pointer-based implementation has to explicitly store the next address
Because an array-based implementation doesn’t need address
information for the next element, it requires a smaller
memory footprint per item
Arrays don’t require you to traverse the list to reach a given
element
Arrays require you to shift the data anytime you insert or delete
elements
Saving and restoring a
linked list from a file
The algorithm that restores a linked list also
demonstrates how you can build a linked list
from scratch
Writing the pointers to a file serves no purpose
because those pointers become invalid once
the program terminates
Therefore, writing out the entire node to a file is
not an elegant solution
All you really need to save in the file is the data
portion of each node (easy to do if each item
has a fixed size, but a bit trickier if you’re
storing strings or other variable length data)
More restoring from a file
You can use the native insert() code to
keep adding items to your list
However, each time you insert something to
the end requires a traversal to the end of the
list
We could save the file in reverse order of the list so
we always insert at the head of the list
We could make a tail pointer that points to the
end of the list
tail could be local and destroyed after the list is created
Or it could exist as long as head exists – it’s up to you!
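A sketch of the save and restore idea using a local tail pointer follows. It assumes int items written one per line, and it works on generic streams so that a std::ofstream or std::ifstream could be passed in; the function names save and restore are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

struct Node { int item; Node *next; };

// Writes only the data portion of each node, one item per line.
void save(const Node *head, std::ostream &out) {
    for (const Node *cur = head; cur != NULL; cur = cur->next)
        out << cur->item << '\n';
}

// Rebuilds a list from the saved items, preserving their order.
// A local tail pointer lets each append skip the traversal to the end.
Node *restore(std::istream &in) {
    Node *head = NULL;
    Node *tail = NULL;
    int value;
    while (in >> value) {
        Node *newPtr = new Node;
        newPtr->item = value;
        newPtr->next = NULL;
        if (head == NULL) head = newPtr;       // first node read
        else              tail->next = newPtr; // append after the old tail
        tail = newPtr;
    }
    return head;
}
```

Here tail is local to restore and disappears once the list is built, which is the first of the two options mentioned above.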
Passing a linked list to a
function
It is sufficient to pass the head pointer to the function
This should not be the case if the function is outside of
the class’ scope (remember, pointers are powerful and
this would violate the wall of the ADT!)
Recursive functions might need the head pointer as an
argument
These must not be in the public section of the class
This keeps our ADT safe from others
Pass the head pointer by reference when the function must change it
A head pointer passed to a function by value is shallow
copied, not deep copied: only the pointer is copied, never the nodes
Passing a whole List object by value, by contrast, causes a deep copy by the copy
constructor
Recursively processing
linked lists
If the recursive functions are members of a
class they should not be public because they
require the linked list’s head pointer as an
argument
One such recursive function would be list
traversal for writeBackward()
Another example would be repeated insertion
which eliminates the need for both a trailing
pointer and a special case of inserting at the
beginning of a list
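The writeBackward() traversal mentioned above might look like the following sketch. The output stream parameter is an assumption added so that the result can be captured; the Node layout mirrors the one used throughout these slides.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>

struct Node { int item; Node *next; };

// Writes the items of the list in reverse order, each followed by a space.
void writeBackward(const Node *headPtr, std::ostream &out) {
    if (headPtr != NULL) {
        writeBackward(headPtr->next, out);  // write the rest of the list first
        out << headPtr->item << ' ';        // then write this node's item
    }
}
```

The empty list is the base case, and each recursive call receives the next pointer as the head of a shorter list, which is why such a function needs the head pointer as an argument.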
Repeated insertion
Suppose we want to insert into a sorted linked
list with a recursive function
The linked list is sorted if any of the following holds:
head == NULL (the list is empty)
head->next == NULL (the list has one node)
head->item < head->next->item and the
pointer head->next points to a sorted linked list
The first two of those cases become our base
cases
Some code to do just so!
void linkedListInsert(Node *& headPtr, ItemType newItem)
{
    if ((headPtr == NULL) || (newItem < headPtr->item))
    {   //base case: insert newItem at beginning
        Node *newPtr = new Node;
        if (newPtr == NULL)
            throw ListException(
                "ListException: insert cannot allocate memory");
        else
        {
            newPtr->item = newItem;
            newPtr->next = headPtr;
            headPtr = newPtr;
        } //end if
    }
    else
        linkedListInsert(headPtr->next, newItem);
} //end linkedListInsert
Some N.B.s
The function inserts at one of the base
cases
Either when the list is empty
Or when the data item is smaller than all the
other data items in the list
In either one of these cases, you need to
insert the item at the front of the list
General Insert (yes, sir!)
The general case in which the item is
inserted somewhere in the innards of the
list is very similar
When the base case is reached, the
next pointer of the node is the argument
that corresponds to the headPtr in our
recursive definition
The general insert case
Variations on a theme
There are different flavors of a linked list
Circular(ly) linked lists
Dummy head nodes
Doubly linked lists
Circular(ly) doubly linked lists
Which one you should use depends on
what you are trying to do within your
design
Circular(ly) linked lists
If we make the next pointer of the last element
point to the first element in the list, we have
created a circularly linked list
No node contains NULL in its pointer
We must be careful when traversing to the end of
the list or we will create an infinite loop
We can save the current pointer and keep
traversing until we hit that pointer again
We don’t have to keep track of the head
pointer, just the current pointer
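The "save the current pointer and stop when we meet it again" idea can be sketched as a node count. The function name countCircular is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Node { int item; Node *next; };

// Counts the nodes of a circularly linked list, starting from any node in it.
int countCircular(const Node *start) {
    if (start == NULL) return 0;   // empty list
    int count = 0;
    const Node *cur = start;
    do {
        ++count;
        cur = cur->next;
    } while (cur != start);        // stop when we are back where we began
    return count;
}
```

A do-while loop fits here because the starting node must be visited once before the "back at the start" test can be meaningful.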
Dummy Head Nodes
Both the insertion and deletion algorithms
require a special case for the head node
The Dummy Head Node method eliminates the
need for this special case
The Dummy Head Node is present even when
the list is empty
In this implementation, the first item in the list is
actually the second item in the linked list
The insertion and deletion algorithms initialize
prev to point to the dummy head node instead
of to NULL.
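With a dummy head node, a sorted insertion might be sketched as follows. The function names makeEmptyList and sortedInsertAfterDummy are assumptions for this example, and the item stored in the dummy node is never examined.

```cpp
#include <cassert>
#include <cstddef>

struct Node { int item; Node *next; };

// Creates an empty list: just the dummy head node, whose item is never used.
Node *makeEmptyList() {
    Node *dummy = new Node;
    dummy->item = 0;
    dummy->next = NULL;
    return dummy;
}

// Sorted insertion with no special case for the front of the list.
void sortedInsertAfterDummy(Node *dummy, int newValue) {
    Node *prev = dummy;            // prev starts at the dummy, never NULL
    while (prev->next != NULL && newValue > prev->next->item)
        prev = prev->next;
    Node *newPtr = new Node;
    newPtr->item = newValue;
    newPtr->next = prev->next;
    prev->next = newPtr;
}
```

Because prev always points to a real node, the same two pointer assignments handle insertion at the front, in the middle and at the end.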
Doubly linked lists
When deleting a node from a list, it would be handy to
not have to remember a prev pointer or to have to re-
traverse the list to find the previous node
With a doubly linked list, we have two pointers
packaged with the data item
A next pointer which points to the next node
A prev pointer which points to the previous item
Because there are more pointers, the mechanics of
doing an insert or delete are a bit more involved,
especially at the head or tail of the list
It is common to use a dummy head node with a doubly linked
list to eliminate some of these special cases
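The extra pointer bookkeeping can be sketched for a non-circular doubly linked list with a dummy head node. The names DNode, insertAfter and detach are assumptions for this example; neither function allocates or frees memory.

```cpp
#include <cassert>
#include <cstddef>

struct DNode { int item; DNode *prev; DNode *next; };

// Inserts newNode immediately after node pos (pos may be the dummy head).
void insertAfter(DNode *pos, DNode *newNode) {
    newNode->prev = pos;
    newNode->next = pos->next;
    if (pos->next != NULL) pos->next->prev = newNode;
    pos->next = newNode;
}

// Disconnects cur from the list without any traversal.
// Assumes a dummy head node, so cur->prev is never NULL.
void detach(DNode *cur) {
    cur->prev->next = cur->next;
    if (cur->next != NULL) cur->next->prev = cur->prev;
    cur->prev = NULL;
    cur->next = NULL;
}
```

Given only a pointer to the node, detach finds the neighbour on each side through prev and next, so no trailing pointer is needed.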
Circular doubly linked list
You can take a doubly linked list and
change the next pointer for the last item
(and the prev pointer for the first item)
to make it a circular doubly linked list
The next pointer will now point to the head
node/dummy head node of the linked list