
Dynamic Memory Allocation and Dynamic Structures





Dynamic allocation is a fairly distinctive feature of C (amongst high level languages). It
enables us to create data types and structures of any size and length to suit our
program's needs while the program is running.

We will look at two common applications of this:

      •   dynamic arrays
      •   dynamic data structure e.g. linked lists

Malloc, Sizeof, and Free

The function malloc is most commonly used to attempt to ``grab'' a contiguous
portion of memory. It is defined by:

  void *malloc(size_t number_of_bytes)

That is to say it returns a pointer of type void * that is the start in memory of the
reserved portion of size number_of_bytes. If memory cannot be allocated a NULL
pointer is returned.

Since a void * is returned, the C standard states that this pointer can be converted to
any other object pointer type. The size_t argument type is defined in stdlib.h and is an
unsigned integer type.

So:


  char *cp;
  cp = malloc(100);


attempts to get 100 bytes and assigns the start address to cp.
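
Because malloc may return NULL, the result should be checked before it is used. A
minimal sketch of that check follows; the helper name make_buffer is hypothetical
and not part of the original notes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: try to reserve n bytes and report failure.
   Returns the start address of the block, or NULL if malloc failed. */
char *make_buffer(size_t n)
{
    char *cp = malloc(n);          /* contents are uninitialised */

    if (cp == NULL)
        fprintf(stderr, "could not allocate %lu bytes\n", (unsigned long) n);
    return cp;
}
```

A caller would write cp = make_buffer(100); and use cp only if it is not NULL.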

Also it is usual to use the sizeof operator to specify the number of bytes:


  int *ip;
  ip = (int *) malloc(100*sizeof(int));

Standard C converts the void * result to the target pointer type automatically, so the
cast is optional; some older C compilers, and all C++ compilers, require it. The (int *)
means coercion to an integer pointer. Pointer arithmetic is governed by the declared
type of the pointer variable (here int *), so it is that declaration which must be
correct. I personally use the cast all the time as a means of making the intended type
explicit in my coding.

It is good practice to use sizeof() even if you know the actual size you want -- it
makes for device independent (portable) code.



sizeof can be used to find the size of any data type, variable or structure. Simply
supply one of these as the operand of sizeof (it is an operator, not a function).

So:


     int i;
     struct COORD { float x, y, z; };
     typedef struct COORD PT;

     sizeof(int), sizeof(i),
     sizeof(struct COORD) and
     sizeof(PT) are all ACCEPTABLE
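
As a small illustration of sizeof applied to a structure type, the sketch below
reserves room for n points. The helper name make_points is hypothetical; the
expression sizeof *p is one common idiom, assumed here rather than taken from the
notes.

```c
#include <stdlib.h>

struct COORD { float x, y, z; };
typedef struct COORD PT;

/* Hypothetical helper: room for n points, or NULL on failure.
   sizeof *p is the size of whatever p points at, so the call stays
   correct even if the type of p is changed later. */
PT *make_points(size_t n)
{
    PT *p = malloc(n * sizeof *p);
    return p;
}
```

Here sizeof *p, sizeof(PT) and sizeof(struct COORD) all give the same value.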


With ip allocated as above, we can use the link between pointers and arrays to treat
the reserved memory like an array, i.e. we can do things like:

     ip[0] = 100;

or

     /* index with i rather than ip++, so that ip still points at the start of the block */
     for(i=0;i<100;++i) scanf("%d",&ip[i]);




When you have finished using a portion of memory you should always free() it. This
allows the memory freed to be available again, possibly for further malloc() calls.

The function free() takes a pointer as an argument and frees the memory to which
the pointer refers.
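
The whole life cycle (allocate, check, use as an array, free) can be sketched as
follows. The function name sum_of_squares and its contents are hypothetical,
chosen only to give the block something to do.

```c
#include <stdlib.h>

/* Hypothetical example: build an n-element array holding 0, 1, 4, 9, ...,
   add the elements up, then hand the memory back.
   Returns the sum, or -1 if the allocation failed. */
long sum_of_squares(size_t n)
{
    int   *ip;
    long   sum = 0;
    size_t i;

    if (n == 0)
        return 0;                       /* nothing to allocate */
    ip = malloc(n * sizeof *ip);
    if (ip == NULL)
        return -1;
    for (i = 0; i < n; ++i)
        ip[i] = (int) (i * i);          /* treat the block like an array */
    for (i = 0; i < n; ++i)
        sum += ip[i];
    free(ip);                           /* finished: return the block */
    return sum;
}
```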

Calloc and Realloc

There are two additional memory allocation functions, calloc() and realloc(). Their
prototypes are given below:

void *calloc(size_t num_elements, size_t element_size);

void *realloc( void *ptr, size_t new_size);

malloc does not initialise memory (to zero) in any way. If you wish to initialise
memory then use calloc. calloc is therefore slightly more computationally expensive but,
occasionally, more convenient than malloc. Also note the different syntax between
calloc and malloc: calloc takes the number of desired elements, num_elements, and
the size of each element, element_size, as two individual arguments.

Thus to allocate 100 integer elements that are all initially zero you would do:


     int *ip;
     ip = (int *) calloc(100, sizeof(int));
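
A short sketch that confirms the zero initialisation. Both function names are
hypothetical helpers introduced for this illustration.

```c
#include <stdlib.h>

/* Returns 1 if every one of the n ints at a is zero, otherwise 0. */
int all_zero(const int *a, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        if (a[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical helper: n zero-initialised ints, or NULL on failure. */
int *zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Calling all_zero on a block fresh from zeroed_ints returns 1; the same check on a
block fresh from malloc could return either result, since its contents are unspecified.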

realloc is a function which attempts to change the size of a previously allocated block
of memory. The new size can be larger or smaller. If the block is made larger then
the old contents remain unchanged and memory is added to the end of the block; the
added memory is not initialised. If the size is made smaller then the remaining
contents are unchanged.

If the original block cannot be resized in place then realloc will attempt to assign a
new block of memory and will copy the old block contents. Note a new pointer (of
different value) will consequently be returned. You must use this new value. If new
memory cannot be allocated then realloc returns NULL and the original block is left
untouched (it is still allocated and must still be freed).

Thus to change the size of memory allocated to the ip pointer above to an array
block of 50 integers instead of 100, simply do:

  ip = (int *) realloc(ip, 50 * sizeof(int));
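
Assigning the result of realloc straight back to the only copy of the pointer loses
the original block if realloc fails. One common way round this, sketched here with a
hypothetical helper name resize_ints, is to keep the result in a temporary until it is
known to be valid. The sketch assumes new_count is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: resize the int array *pp to new_count elements.
   Returns 1 on success and updates *pp; returns 0 on failure and
   leaves *pp pointing at the original, still valid, block. */
int resize_ints(int **pp, size_t new_count)
{
    int *tmp = realloc(*pp, new_count * sizeof **pp);

    if (tmp == NULL)
        return 0;
    *pp = tmp;                  /* may or may not equal the old address */
    return 1;
}
```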

Linked Lists

 Let us now return to our linked list example:

 typedef struct element { int value;
                          struct element *next;
                        } ELEMENT;


We can now try to grow the list dynamically:

 link = (ELEMENT *) malloc(sizeof(ELEMENT));

This will allocate memory for a new link.

If we want to release the memory a pointer refers to, use the free() function:

 free(link);

See Example programs (queue.c) below and try exercises for further practice.

Full Program: queue.c

A queue is basically a special case of a linked list where data elements join the
list at one end and leave, in the order in which they arrived, at the other end.

The full listing for queue.c is as follows:

/*                                                                    */
/* queue.c                                                            */
/* Demo of dynamic data structures in C                      */

#include <stdio.h>
#include <stdlib.h>                   /* malloc, free, NULL */
#include <ctype.h>                    /* isdigit */

#define FALSE 0

typedef struct listelement {
   int   dataitem;
   struct listelement *link;
}             listelement;

void Menu (int *choice);
listelement * AddItem (listelement * listpointer, int data);
listelement * RemoveItem (listelement * listpointer);
void PrintQueue (listelement * listpointer);
void ClearQueue (listelement * listpointer);

int main (void) {
  listelement *listpointer;
  int    data,
         choice;

    listpointer = NULL;
    do {
           Menu (&choice);
           switch (choice) {
              case 1:
                    printf ("Enter data item value to add ");
                    scanf ("%d", &data);
                    listpointer = AddItem (listpointer, data);
                    break;
              case 2:
                    if (listpointer == NULL)
                        printf ("Queue empty!\n");
                    else
                        listpointer = RemoveItem (listpointer);
                    break;
              case 3:
                    PrintQueue (listpointer);
                    break;

             case 4:
                   break;

             default:
                   printf ("Invalid menu choice - try again\n");
                   break;
          }
    } while (choice != 4);
    ClearQueue (listpointer);
    return 0;
}                                     /* main */

void Menu (int *choice) {

    int    local;                     /* int, so that EOF from getchar can be detected */

   printf ("\nEnter\t1 to add item,\n\t2 to remove item\n\
\t3 to print queue\n\t4 to quit\n");
   do {
          local = getchar ();
          if (local == EOF) {         /* end of input: behave as "quit" */
              *choice = 4;
              return;
          }
          if ((isdigit (local) == FALSE) && (local != '\n')) {
              printf ("\nyou must enter an integer.\n");
              printf ("Enter 1 to add, 2 to remove, 3 to print, 4 to quit\n");
          }
   } while (isdigit (local) == FALSE);
   *choice = local - '0';
}

listelement * AddItem (listelement * listpointer, int data) {

    listelement * lp = listpointer;
    listelement * newitem = malloc (sizeof (listelement));

    if (newitem == NULL) {              /* allocation failed: queue unchanged */
            printf ("Out of memory - item not added\n");
            return lp;
    }
    newitem -> link = NULL;
    newitem -> dataitem = data;

    if (listpointer == NULL)            /* empty queue: new item is the head */
            return newitem;

    while (listpointer -> link != NULL) /* walk to the last element */
            listpointer = listpointer -> link;
    listpointer -> link = newitem;
    return lp;
}

listelement * RemoveItem (listelement * listpointer) {

    listelement * tempp;
    printf ("Element removed is %d\n", listpointer -> dataitem);
    tempp = listpointer -> link;
    free (listpointer);
    return tempp;
}

void PrintQueue (listelement * listpointer) {

    if (listpointer == NULL)
            printf ("queue is empty!\n");
    else
            while (listpointer != NULL) {
               printf ("%d\t", listpointer -> dataitem);
               listpointer = listpointer -> link;
            }
    printf ("\n");


}

void ClearQueue (listelement * listpointer) {

    while (listpointer != NULL) {
          listpointer = RemoveItem (listpointer);
    }
}

Exercises

Exercise 12456

Write a program that reads a number that says how many integer numbers are to be
stored in an array, creates an array to fit the exact size of the data and then reads in
that many numbers into the array.

Exercise 12457

Write a program to implement the linked list as described in the notes above.

Exercise 12458

Write a program to sort a sequence of numbers using a binary tree (Using Pointers).
A binary tree is a tree structure with only two (possible) branches from each node
(Fig. 10.1). Each branch then represents a false or true decision. To sort numbers
simply assign the left branch to take numbers less than the node number and the
right branch any other number (greater than or equal to). To obtain a sorted list
simply traverse the tree depth first, visiting each node's left branch, then the node
itself, then its right branch (an in-order traversal).




Fig. 10.1 Example of a binary tree sort

Your program should:

      •   Create a binary tree structure.
      •   Create routines for loading the tree appropriately.
      •   Read in integer numbers terminated by a zero.
      •   Sort numbers into numeric ascending order.
      •   Print out the resulting ordered values, printing ten numbers per line as
          far as possible.

Typical output should be




  The sorted values are:
   2 4 6 6 7 9 10 11 11 11
  15 16 17 18 20 20 21 21 23 24
  27 28 29 30

1. Dynamic Memory Allocation

1.1. Static vs Dynamic Memory Allocation




The issue we address in this lecture is the efficient use of memory. The issue arises
because of inefficiencies inherent in the way memory is allocated for arrays. When
you declare an array of size 1000, all 1000 memory locations are reserved for the
exclusive use of that array. No matter how many values you actually store in the
array, you will always use 1000 memory locations. The same memory allocation
strategy is used for most implementations of strings. I will use the term static
allocation to refer to this memory allocation strategy, in which all the memory that a
data structure might possibly need (as specified by the user) is allocated all at once
without regard for the actual amount needed at execution time. The opposite
strategy, dynamic allocation, involves allocating memory on an as-needed basis.

There always is an absolute maximum of memory that can be allocated: this is
simply the amount of memory that is physically available on your computer (more
precisely, the amount of memory that is addressable by your computer). No
allocation strategy can get around this. The difference between static and dynamic
allocation is illustrated by the following example.

Example: Recording Names of Individuals
Suppose I have a total of 3000 memory locations available with which to store the
first and last names of everyone in this class. If I use an array of strings to store this
data, storage will be allocated statically, and consequently I must set a maximum on
how long a name can be, and a maximum number of people in the class. It is quite
hard to set these limits, especially when, as in this case, they interact: in order to
allow longer names I have to reduce the possible size of the class. To be perfectly
`safe' I would have to allow for those rare cases of people with extraordinarily long
names, say 100 characters. Given that there are only 3000 memory locations in
total, this name-size imposes a limit of 30 people per class. Now, this is clearly
wrong. Even if there are 30-plus government employees with 100-character names,
it is unthinkable that 30 of them will simultaneously register for the same course.

With dynamic allocation, I will not even have to think about how many people might
join the class, or how long their names might be. I can just as easily accommodate
30 people with 100-character names and 300 people with 10-character names.




1.2. How Does Dynamic Allocation of Memory Work?




The key idea is to regard all the available memory as a large global pool that can be
used for any purpose whatsoever. When you need memory, you ask for just the
amount you need; it is given to you from the global pool and is marked as being no
longer `free' so that a subsequent request for memory does not allocate this same
block again for a different purpose. Memory allocation is done by some sort of
procedure call; in C, malloc(n) allocates a block of memory of size n bytes.

It is your responsibility to return memory to the global pool when you are finished
with it; when you do so it will be marked `free' again and be available for other uses
(you might get it again the next time you request some memory).

To continue our class-names example, we imagine that all the available memory is
one large array to be used as a global pool:




We use gray shading to indicate the regions of memory that belong to the global
pool, i.e. which are free. At present, all of it is free.

Now, if the user needs memory to store the name JOE, he must request enough
space to store the string "JOE". As it happens, in C, all strings must be terminated by
a special character called `NUL' whose code is 0 and which is written \0. Therefore,
the user needs 3 bytes to store J, O, and E, plus an additional byte for NUL. Thus he
requests 4 bytes by calling malloc(4).

Suppose he is given the first 4 bytes from the pool; then memory would now look
like this:




The memory allocator keeps track of what memory is free and what memory has
been allocated. It now knows that the 4 bytes which it gave to the user are no longer
free; they are no longer in the global pool. This we represent visually by the fact that
these bytes are no longer shaded gray.

The question marks now visible indicate that we do not know in what state the bits of
this block of memory really are.

Note: Why is the block not initialized? (1) It is not clear what a valid initialization
would be. (2) It is inefficient to initialize large blocks, especially when the next thing
the user does is overwrite this initialization with the useful values s/he really wanted
to put in.
A memory allocator typically makes no guarantee about this issue. The block of
memory which is allocated and returned to the user must be presumed to contain
garbage. It is the responsibility of the user to then initialize this block properly so
that its contents become meaningful.

In the present case, the user simply copies the string "JOE" into it:




You can see that it takes up exactly the space that is needed for it. The rest of
memory can be used to store the other names.
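
In C, the request-then-copy step described above might be sketched like this. The
helper name copy_name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask for exactly enough bytes to hold name plus
   its terminating NUL, then copy the string into the new block.
   Returns NULL if the memory could not be obtained. */
char *copy_name(const char *name)
{
    char *block = malloc(strlen(name) + 1);   /* "JOE" needs 3 + 1 bytes */

    if (block != NULL)
        strcpy(block, name);                  /* overwrite the garbage */
    return block;
}
```

The caller owns the returned block and is responsible for passing it to free.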

Suppose the next name is MARYJANE: 8 characters + 1 NUL. The user asks for 9
bytes and copies the string into them:




You may wonder why this new block is not contiguous with the previous one. In
general it won't be, mainly for two reasons:

   •   The memory allocator might very well decide to allocate the next block from
       some other part of memory.
   •   Often, the memory allocator needs to allocate a little bit more than the user
       requested so that it can store, just before the user's block, some information
       about it, such as its size. Thus, when the user wants to `free' the block, the
       memory allocator can find out how big it is and return the correct number of
       memory locations to the global pool.

For example, the real situation might look like this:




where the arrows point to the start of the blocks allocated to the user. In the
following, we are going to ignore these details.
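
To make the hidden bookkeeping concrete, here is a toy sketch of an allocator that
stores each block's size just before the block. It is only an illustration under
stated assumptions: the names toy_alloc, toy_block_size and POOL_BYTES are
hypothetical, the pool is tiny, freed memory is never reused, and blocks are aligned
only for size_t. Real allocators are considerably more elaborate.

```c
#include <stddef.h>

#define POOL_BYTES 256

/* The pool is an array of size_t so that every header, and every
   block that follows one, is aligned for size_t. */
static size_t pool[POOL_BYTES / sizeof(size_t)];
static size_t next_free = 0;            /* index of the first unused word */

/* Hypothetical toy allocator: hands out blocks from the pool and
   records each block's size in the word just before it. */
void *toy_alloc(size_t nbytes)
{
    size_t words = (nbytes + sizeof(size_t) - 1) / sizeof(size_t);
    size_t *header;

    if (words + 1 > POOL_BYTES / sizeof(size_t) - next_free)
        return NULL;                    /* pool exhausted */
    header = &pool[next_free];
    *header = nbytes;                   /* bookkeeping the user never sees */
    next_free += words + 1;
    return header + 1;                  /* user's block starts after header */
}

/* Recover the size that was stored just before the user's block. */
size_t toy_block_size(const void *block)
{
    return ((const size_t *) block)[-1];
}
```

Two successive requests for 4 and 9 bytes return addresses that are more than 4 bytes
apart, because a header and some padding sit between the two blocks.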




  What to remember

   •   The memory allocator keeps track of what is allocated and what is free.
   •   When we obtain memory from the global pool, we must assume that it
       contains garbage, and properly initialize it.
   •   We must not assume that successive requests will allocate contiguous
       memory blocks.




If the user now says he is finished with the name JOE, the corresponding block of
memory is returned to the global pool and becomes free:




1.3. Problems With Dynamic Allocation of Memory
I have stressed so far the advantage of dynamic memory allocation. Now let me
mention its two main disadvantages or dangers.

Freeing Memory
The user is responsible for freeing up memory when he is finished with it. This is a
serious responsibility, a potential source of bugs that are very hard to find.

For example, suppose you free up a location before you're actually finished with it.
Then further access to the location will either cause a run-time error (memory
violation) or, what is worse (but more common), you will get access to a location
that is being used for some completely unrelated purpose.

Trouble also occurs if you forget to free up some space (a memory leak). If you do
this, the space is never returned to the pool and you are losing the advantages of
dynamic allocation. And in some circumstances - e.g. if you were writing the program
that manages the space on a disk drive - this could be disastrous.

There are no surefire safeguards against these problems; you must be very careful
when writing programs to return memory when you are finished with it, but not
before.
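
One defensive habit, offered here as a partial aid rather than a safeguard, is to
overwrite a pointer with NULL as soon as its block has been freed, so that a later
mistaken use is easier to detect. The helper name release is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block *pp refers to and then clear
   the caller's pointer so it no longer holds a stale address. */
void release(char **pp)
{
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;
}
```

Calling release twice on the same pointer is harmless, whereas calling free twice
on the same non-NULL address is an error. Other copies of the old address elsewhere
in the program are not cleared by this.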

Fragmentation of Memory
As the preceding example demonstrated, with dynamic allocation the `free' parts of
memory are not all together in one contiguous block. When we returned JOE's
memory, we had 4 free cells on one side of MARYJANE and all the rest on the other
side. This is called fragmentation of memory, and it can grow into a very serious
problem: it is possible to have a large amount of free memory but for it to be broken
up into such tiny fragments that there is not enough contiguous free space to store
another name.

Suppose that after using dynamic allocation for a while memory becomes
fragmented thus:




and that we need to obtain a block this big:


If only the remaining free blocks were contiguous, we'd have enough room, but, in
the present configuration, it looks like we are doomed. There are several possible
solutions to this problem. The one we will study is based on the insight that a large
data structure does not necessarily have to be allocated in one single contiguous
chunk of memory. Instead it might be decomposed into smaller components that are
somehow linked together in a chain.

Each component in this chain can very well be allocated in a different region of the
global pool. Thus, when using linked data structures, fragmentation becomes much
less of a problem.

2. Linked Data Structures




In our example application, we need to record the names of all the people that enroll
in a course. We have already established that it is more efficient to use dynamic
allocation to record their names. Therefore, for each name we know starting at what
memory location we have copied it. We call this memory location the address of the
name.

Now, in order to record the names of all the students in the course, what should we
do? We could use an array that's big enough to accommodate the largest number of
students we can reasonably expect. For each student, we'd record his/her name as
illustrated earlier, and store the corresponding address in the next available cell of
the array.

This array might have to be quite big. And what if we wanted to do the same for all
courses offered at the university? We'd have to allocate very large arrays for each
course, even though most of them would not be at full capacity; therefore, large
portions of each array would remain unused, yet would consume memory.

Let's take a look at what our example might look like. For concreteness we suppose
that the name JOE starts at memory location 1 and MARYJANE at memory location 7.




Note the order in which names (or rather the indices of their memory location) are
entered into the array. The first student is in the first position; the next student is in
the next position, etc. When we consult this array, we rely on the expectation that
the next student can be found in the next location. We depend on the fact that these
locations are contiguous. In other words, the logical relation next is represented by
the physical order of the locations in the array.




With a linked representation, this logical relation is represented explicitly. Along with
each student entry, we store the address of the next student entry:




The first student entry, that of JOE, begins at location 1. It starts with the address of
the next student entry, which happens to be MARYJANE at location 8. The entry for
MARYJANE is linked to the next entry (not shown in the diagram) that begins, say, at
location 23 for example.

Each student entry is linked to the next one. What happens when there is no next
one? How do we recognize the end of a chain? We use the invalid address 0 for this
purpose. In C, this invalid address is represented by the constant NULL (not to be
confused with the NUL character '\0', although they both happen to be defined as 0).
Thus, if there were no further entries after MARYJANE, we would have:




You can see that this method is much more economical than to statically allocate a
large array. Each student entry can be stored anywhere in memory, and we can
examine in turn all the students enrolled in a course by following the links until we
find one whose value is NULL.
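
A minimal sketch of such a student entry and of following the links until NULL is
reached. The type name struct entry and the function count_entries are
hypothetical; the link is placed first to match the description above.

```c
#include <stddef.h>

/* Hypothetical student entry: the address of the next entry comes
   first, followed by the name. */
struct entry {
    struct entry *next;     /* next student entry, or NULL at the end */
    const char   *name;
};

/* Follow the links from first until NULL, counting the entries. */
size_t count_entries(const struct entry *first)
{
    size_t n = 0;

    while (first != NULL) {
        n++;
        first = first->next;
    }
    return n;
}
```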

Furthermore, we can similarly envision a linked data structure of all courses offered
at the university. Each course would be represented by a pair of positions
(dynamically allocated): the first would point to the first student in that course
(see above), and the second would point to the next course. For example, we might
have something like this:




2.1. Conclusion




The combination of dynamic allocation and linked data structures is extremely
attractive because it utilizes memory with very high efficiency. There is a price to be
paid however.

   •   As mentioned above, the user is responsible for correctly returning memory
       when it is no longer needed.
   •   Linked structures do require extra memory: space is needed for storing the
       links. This overhead is usually small compared to the increased utilization of
       memory gained by dynamic allocation, but there are circumstances where the
       space for links is a very high cost. We need to be wary of this.
   •   The code for manipulating linked structures is sometimes quite a lot more
       complex than the code for contiguous structures. But this is not of great
       concern because when you use abstract data types, the user does not know
       how the structure is implemented. If the implementation is complex, that is
       the implementer's concern; the complexity will be confined to the few core
       operations that define the data type. The vast majority of the code will be
       independent of the implementation details and unaffected by its complexity.

3. Operations Needed To Support Dynamic Allocation And Linked Structures

Abstractly, in order for the user to define and use dynamically allocated linked data
structures, the following are necessary:




Getting Memory
We require a procedure for getting one block of memory - of a particular type, or
appropriate for storing an object of a particular type. Typically, this procedure will
return the address of the cell that has been allocated to the user. This procedure has
different names in different systems; I will call it by the generic name GET_MEMORY.
In C it is called malloc. In the textbook it is called new.




Pointer Variables
We require a type of variable that can store an address. For example, when
GET_MEMORY is called it returns an address and the user needs to be able to store
that address in a variable in order to make use of it later on.

In our example, each student entry contains the address of the next student entry;
e.g. JOE's entry contains the address of MARYJANE's entry. Therefore, a student
entry must be a structure/record whose first component is capable of storing an
address. Variables that hold addresses are called pointers.

In C, to declare a variable that can hold a real value, you say:

     float X ; (in the textbook: var X: real)
To declare a variable that can hold the address of a real value, say:
     float *P ; (in the textbook: var P: PointerTo real)

Invalid Address
We require a special value that can be stored in a pointer, that means `not a valid
address'. The textbook calls it NIL; in C it is called NULL.

P = NULL; means that P is defined, but contains no valid address. We can test this: if (P
== NULL) ... Typically, we use NULL to indicate either the end of a chain of links or
a pointer that has not yet been given a valid address.

Returning Memory




We require a procedure for returning/freeing one block of memory. The user will
pass to this procedure a pointer (variable containing an address) and the procedure
will return the corresponding block of memory to the global pool.


  This procedure works on one block at a time. To free up all the space consumed by
the student entries for one course, the user would have to call the procedure once
for each student entry. Calling it just on the entry for JOE would not also free up the
following entries. The entry for MARYJANE would still occupy memory.

Before freeing JOE:




After freeing JOE:




  Unfortunately, now we'd have no way to free MARYJANE because we forgot to
retrieve the address of her entry from JOE's entry before we freed it. This is a very
common mistake! Look out for it in your own code.
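
The usual way to avoid this mistake is to read the link out of an entry before
freeing that entry. A sketch, using a hypothetical struct entry and a hypothetical
function name free_chain:

```c
#include <stdlib.h>

/* Hypothetical student entry, each one obtained from malloc. */
struct entry {
    struct entry *next;
    char          name[16];
};

/* Free every entry in the chain; returns how many were freed. */
size_t free_chain(struct entry *first)
{
    size_t freed = 0;

    while (first != NULL) {
        struct entry *next = first->next;   /* retrieve BEFORE freeing */
        free(first);
        first = next;
        freed++;
    }
    return freed;
}
```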

The procedure for returning memory to the global pool has different names in
different systems; I will call it by the generic name RETURN_MEMORY. In C it is
called free. In the textbook it is called dispose.

Accessing Objects Through Pointers




We require methods for accessing, i.e. looking at and changing, the value (or more
generally, the object) stored at a particular address. This is done by dereferencing a
pointer, or rather its value.

For example, in C, if P is a pointer and contains a valid address (i.e. not NULL), we
can obtain the value stored at this address by means of the * prefix operator.

X = *P;
        Looks up the value P is pointing at and stores it in X.
*P = 13.1;
        Stores the value 13.1 at the location P points at (assuming P is of type float *,
        that is `pointer to real').
For example, suppose P is a pointer to a real value: this means P itself contains an
address, and at that address is stored a real value. Suppose memory consists of the
following sequence of cells:




X, an ordinary variable, is memory cell 7. It contains a real value, 3.9. P, a pointer
variable, is memory cell 2. It contains an address, 5. (*P) is the memory cell whose
address is stored in P, i.e. memory cell 5. It contains a real value, 4.8.
*P = X;
       Copies the value in cell named X (at address 7) into the cell whose address is
       in P (cell 5).




Note that P itself has not changed.

The textbook uses a different symbol for the dereferencing operator. The symbol is ^
and it is placed after the pointer variable, not before it. So the textbook writes (P^)
instead of (*P).

Obtaining The Address Of An Object




Finally, it is sometimes useful to have a procedure that takes a symbolic variable
name, like X, and returns the address corresponding to the name. In the above
example, X is a symbolic name for the address 7. In C, this operation is done using
the & operator: (&X) is the address corresponding to X. The text has no notation for
this.
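
The two operators work together: & produces an address and * follows it. A small
sketch, in which the helper name store_at is hypothetical:

```c
/* Hypothetical helper: store value in the cell whose address is p. */
void store_at(float *p, float value)
{
    *p = value;                 /* dereference: change the cell, not p */
}
```

With float X = 3.9f; and float *P = &X; the call store_at(P, 13.1f) changes X,
while P itself still holds the address of X.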

3.1. Summary




get_memory (malloc,new)
        Procedure for dynamically allocating memory.
return_memory (free,dispose)
        Procedure for returning memory that was dynamically allocated.
pointer
        A variable or expression whose value is an address.
NIL, NULL
        A special value for pointers, meaning `not a valid address'.
dereferencing (*,^)
        Following a pointer; accessing the value in the address represented by the
        pointer.
&
        Obtain the address of the object denoted by an expression. E.g. the address
        of a variable.
In many languages all these things are ``built-in'', i.e. provided as primitives in the
language. But in some languages, FORTRAN and COBOL for example, they are not.
But that does not mean you cannot use linked structures in these languages. What it
means is that in order to use linked structures you will have to build your own
implementations of these basic memory-management primitives. This is not hard at
all; the textbook discusses how to do it using arrays in section 4.2.3.


