									               DEPARTMENT OF
ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

              L.C. Smith College of
        Engineering and Computer Science

          SYRACUSE UNIVERSITY




               CPS 196
    INTRODUCTION TO
   PROGRAMMING IN C


    LECTURE NOTES




                   RJ Irwin
            Table of Contents
UNIT 1: COURSE SET-UP / PROGRAMMABILITY
UNIT 2: BASIC COMPUTER ORGANIZATION
UNIT 3: ELEMENTS OF C PROGRAM STRUCTURE
UNIT 4: MORE ON EXPRESSIONS, FUNCTIONS AND I/O
UNIT 5: MORE ON DECLARATIONS AND FUNCTIONS
UNIT 6: BOOLEAN EXPRESSIONS AND CONDITIONAL STATEMENTS
UNIT 7: ITERATION
UNIT 8: FILTERS AND FUNCTION PROTOTYPES
UNIT 9: INTRODUCTION TO ARRAYS
UNIT 10: MORE ON ARRAYS
UNIT 11: ARRAYS AND POINTERS
UNIT 12: MULTI-DIMENSIONAL ARRAYS
UNIT 13: INTRO. TO STRUCTURES AND DATA ABSTRACTION
UNIT 14: MORE ON STRUCTURES AND DATA ABSTRACTION
UNIT 15: SWITCH STATEMENTS AND ENUM TYPES
UNIT 16: STRING PROCESSING
UNIT 17: MORE ON STRING PROCESSING
UNIT 18: CHARACTER PROCESSING FACILITIES AND FILE I/O
UNIT 19: MORE ON FILE I/O AND COMMAND LINE ARGUMENTS
UNIT 20: ABSTRACT DATA TYPES: AN EXTENDED EXAMPLE
UNIT 21: STRUCTURE POINTERS AND DYNAMIC ALLOCATION
UNIT 22: CONSTRUCTORS, ACCESSORS AND DESTRUCTORS
UNIT 23: MODULAR CODE AND MULTI-FILE PROGRAMS I
UNIT 24: MODULAR CODE AND MULTI-FILE PROGRAMS II
UNIT 1                                           CPS 196                                             Page 3




UNIT 1: COURSE SET-UP / PROGRAMMABILITY
This session sets up the ground rules for the rest of the course. Once these administrative preliminaries are
over, the history of the C language and the value of learning C are briefly discussed.



1. EVERYONE PRESENT FOR CPS 196: INTRO. TO PROGRAMMING IN C?

2. SIGN-UP SHEET & SYLLABUS REVIEW
2.1 ON SIGN-UP SHEET, WRITE DOWN YOUR SU (@syr.edu) EMAIL ADDRESS
2.2 OFFICE HOURS, OFFICE PHONE, AND EMAIL ADDRESS.
2.3 COURSE PREREQUISITES—FOR NOVICE PROGRAMMERS, NOT COMPUTER TYROS
2.4 COURSE STRUCTURE
2.4.1 Students are liable for knowledge of Syllabus—read it carefully
2.4.2 CPS 196 Website
2.4.3 Students must check Email and Announcements directory often
2.5 GRADING POLICY

3. COURSE OBJECTIVE: LEARN THE ESSENTIALS OF C PROGRAMMING
3.1 ENOUGH TO WRITE MODEST, BUT USEFUL WHOLE PROGRAMS
3.2 ENOUGH TO WRITE REUSABLE MODULES FOR USE IN LARGER PROGRAMS
3.3 ENOUGH TO READ OTHERS' CODE, IF NOT TOO SOPHISTICATED
3.4 ENOUGH TO PROCEED TO MORE ADVANCED BOOKS & COURSES

4. WHAT IS C?
4.1 DEVELOPED AT BELL LABS BETWEEN 1969 AND 1973
a)   Genesis of C: CPL → BCPL → B → C
b)   Originally created for systems programming, i.e., writing software that controls hardware and other
     software. UNIX and other operating systems are written in C (+ machine code); so are many compilers,
     GUIs, device drivers, etc.

4.2 C IS A HIGH-LEVEL LANGUAGE (EASE OF USE)
The programmer is not restricted to the data types and operations available at the machine language level,
which are usually pretty primitive. C is similar to the older high-level "imperative" languages like Pascal, PL/1
and Algol (and, to a lesser extent, FORTRAN and BASIC), and it is the basis of the modern “object-oriented”
programming languages C++, Java and Objective-C.

4.3 C HAS SOME LOW-LEVEL LANGUAGE FEATURES (FOR EFFICIENCY)
C has a rich set of operators, including some that mimic machine language instructions, e.g., bit manipulation
and increment/decrement operators. Direct manipulation of addresses, commonly called pointers, is also
supported; this greatly facilitates systems programming (see 5.2 below).




5. WHY LEARN C?
5.1 SUPPORTS WRITING PORTABLE PROGRAMS
A program's source code is portable if, when it is compiled in different environments, the resulting executable
programs behave indistinguishably. By environment, we mean a particular machine/operating system/compiler
combination. Portability is not automatic—it requires conscious, diligent work.

5.2 ONE OF THE FEW HIGH-LEVEL LANGUAGES USED FOR SYSTEMS PROGRAMMING
Systems programs include Operating Systems (e.g., Windows, Mac OS X, Linux), compilers and linkers,
graphics and other utility libraries (e.g. OpenGL). The other commonly used systems programming languages
are C-based, chiefly C++ (esp. on Windows) and Objective-C (esp. on Mac OS X).

5.3 USED TO WRITE SOPHISTICATED APPLICATIONS
Examples: most popular word processors, spreadsheets, data base management systems and graphics
packages are written at least partially in C. Several commonly used applications programming languages are
C-based, chiefly C++, Java and Objective-C.

5.4 C SUPPORTS WRITING FAST, TIGHT CODE
The fact that C is a "low-level" high-level language helps knowledgeable programmers code for maximal speed in
minimal space.

5.5 MANY COMMERCIAL TOOLS AND CODE LIBRARIES SUPPORT C PROGRAMMING
5.6 MUCH PUBLIC DOMAIN C CODE AVAILABLE THROUGH INTERNET
5.7 YOU MAY HAVE TO READ C PROGRAMS, EVEN IF YOU DON'T WRITE THEM

6. PROGRAMMABILITY (COMPUTABILITY) [SLIDES: PROGRAMMABILITY]
6.1 WHAT CAN BE COMPUTED? ARE THERE LIMITS?
6.2 BASIC VOCABULARY




UNIT 2: BASIC COMPUTER ORGANIZATION
In order to fully understand C, basic knowledge of the structure of computers is needed. These notes provide
the necessary background. Not all architectures can be addressed, so the discussion is limited to a generic
structure shared by most personal computers and workstations.


1. PHYSICAL STRUCTURE
1.1 SYSTEM UNIT, DISPLAY, KEYBOARD, MOUSE, ETC.
Should be familiar to all. Most of the important stuff is inside the system unit. What's "under the hood?"

1.2 INSIDE SYSTEM UNIT [SLIDES: COMPUTER GROSS ANATOMY]
PCB = Printed Circuit Board; plastic board holding chips and other electrical circuit elements. Chips that are
parts of a functional unit (e.g., disk drive controller board, memory board, modem board, video driver board) are
interconnected on PCBs. The edge connectors of PCBs in turn are typically plugged into sockets on a
motherboard, a large PCB on which the Central Processing Unit (CPU)—the "brains of the organization"—
typically resides. The functional units on the PCBs communicate via a set of parallel wires that connects the
motherboard sockets with each other and the CPU. A set of parallel wires is called a bus.


2. LOGICAL STRUCTURE
A simplified, schematic diagram of how the various functional units are interconnected.

2.1 MEMORY [ILLUS.: N-BYTE MEMORY WITH 2-BYTE WORDS]
a)   In order to run an executable program (= machine instructions + data), it must first be read into memory
     (say from a hard disk).
b)   Memory is unsuitable for permanent storage -- its capacity is limited, and its contents are typically lost when
     the computer is powered off.

2.1.1 Bits
a)   The basic, indivisible unit of memory is the binary digit, or bit.
b)   A bit may contain a 0 or a 1; sequences of bits can represent numbers in base two, e.g.,
     $101_2 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 5_{10}$.
c)   The number of different configurations of n bits is $2^n$, which is always an even number (for $n \ge 1$).

2.1.2 Bytes
a)   Memory is usually organized into 8-bit bytes.
b)   With each byte is associated a unique number: its address.
c)   Machine instructions refer to bytes by their addresses: an n-byte memory has addresses 0 to n–1
     (addresses are always zero-relative).
d)   Bytes are typically the smallest addressable memory units. (A single bit cannot be individually transferred
     to/from memory -- can't buy less than a can of soda from a vending machine.)
e)   $2^8 = 256$ possible bit configurations in a byte; thus, the byte is convenient for holding a single character,
     e.g., a letter, digit, punctuation mark or other common symbol.
f)   A fixed correspondence between such bit patterns and letters, digits, etc., is called a character code, e.g.,
     ASCII (most computers) and EBCDIC (IBM mainframes). Unicode is the new standard (subsumes ASCII).
g)   Can also be used to hold small integers (0 to 255, if viewed as unsigned)

2.1.3 Words
a)   Bytes are grouped into convenient fixed-size units called words.
b)   The most common word sizes are 4 and 8 bytes (32 and 64 bits, respectively).
c)   A machine instruction refers to a word by specifying the lowest memory address of a byte contained in that
     word (and similarly for any multi-byte region of memory that can be examined or stored into):

     [Illus.: a 4-byte word at address α, made up of the bytes at addresses α, α+1, α+2 and α+3]



2.1.4 Interpreting the Contents of Memory [illus.: WHAT'S IN A WORD?]
Depends on what one is looking for -- the same bit configuration can be interpreted in different ways. This sort
of ambiguity shouldn't be uncomfortable, since we deal with it all the time. Consider the letter configuration
"port":

             if we expect a noun, it means a type of red wine
             if we expect a verb, it means to carry
             if we are computer geeks expecting a noun, it means a point of access to a computer system, e.g.,
              a "USB" port or a "parallel" port
             if we are computer geeks expecting a verb, it means to adapt a program written for one type of
              computer (e.g., a PC-compatible) so that it runs on another type of computer (e.g., a Macintosh)

Similarly, a 2-byte chunk of memory could represent:

             2 consecutive 8-bit characters
             two 8-bit integers
             a 16-bit integer
             part of a 32-bit integer
             part of a 64-bit floating point number
             a machine instruction (recall both instructions and data must be in memory for a pgm to execute)
             part of a larger machine instruction
             a 16-bit address
             part of a 32-bit address, etc., etc.

2.2 CENTRAL PROCESSING UNIT
In personal computers and workstations, it is common for the whole CPU to reside on one chip. A CPU
comprises a Control Unit, an Arithmetic-Logical Unit, and a collection of Registers.

2.2.1 Control Unit (CU)
a)   Responsible for fetching machine instructions from memory and directing their execution. Such
     instructions may call for moving a few bytes of data between memory and a register, adding two numbers held in
     registers, and other low-level stuff. Note that both instructions and data reside in memory.

2.2.2 Registers
a) Where operands must be for operations to be performed upon them. That is, to change a data value held
in memory, it must first be read into a CPU register, altered there, and written back to the memory area whence
it came.
b) Register-to-register transfers, and other movements of data inside the CPU (i.e., on-chip), are extremely
fast; transfers between memory and registers less so.
c) Typically, register size = word size (32 or 64 bits on most computers)
d) Most machine instructions operate on data in word-size chunks. A computer with a 32-bit architecture will
have 32-bit registers and instructions that add, subtract, compare and otherwise operate on 32-bit words.
Similarly, a 64-bit computer will have 64-bit registers, etc.
e) The number and size of registers vary from computer to computer. Most CPUs have far fewer than 100
accessible registers. Common register sizes: 32, 64 and 128 bits (the first two sizes predominate in PCs).

2.2.3 Arithmetic-Logical Unit (ALU)
a)   Performs arithmetic operations, like integer (whole number) addition, subtraction, multiplication and
     division; also performs floating point addition, subtraction, etc. A floating point number has both a
     (possibly zero) whole part and a (possibly zero) fractional part, e.g., 12.345 has whole part 12 and
     fractional part .345 (.345 = 345/1000). The Intel Pentium brouhaha of 1994 was due to floating point errors.
b)   Performs other simple operations, e.g., comparison of values held in different registers, check if a register
     holds zero, etc.


3. C VS. MACHINE LANGUAGE (HIGH- VS. LOW-LEVEL LANGUAGE)
3.1 MACHINE LANGUAGE [SLI.: Program Execution, ILL.: Programming Process]
An "Executable File" is composed of (binary) machine instructions and (binary) data derived from programmer-
written source file(s) by a translation program called a compiler (that outputs object file(s)) and another
program called a linker that combines these object file(s) with “canned” object files. Object files comprise
machine instructions and data, together with information needed by the linker to produce the executable file.



3.1.1 Machine Instructions: primitive, must be directly supported by H/W
3.1.2 Data Types: limited, unstructured, must be directly supported by H/W
3.1.2.1 Integer Types
a)   Byte-sized integers represent characters (larger integers can be used in the Unicode representation)
b)   Integer sizes are limited to those handled directly by the machine instruction set (e.g., there may be
     instructions to add two 4-byte integers, but not two 8-byte integers).
c)   Some integers may represent memory addresses (i.e., they may be pointers)
3.1.2.2 Floating Point Types
a)   Some really old CPU chips cannot perform floating point operations directly in H/W.
b)   Without H/W support, elaborate (and slow) S/W emulation of floating point values is required. Prior to the
     introduction of the Intel 80486 (pre-Pentium) processor, most PCs used S/W emulation. Computers used
     for scientific calculation incorporate high-performance floating point H/W.

3.1.3 Machine Language Programs
a)   Binary instruction codes + binary data (looks like garbage: 1000101101011011, 10111110, etc., could be
     a sequence of machine instructions).
b)   Execution unfolds as a sequence of primitive operations on small amounts of data.
c)   Cannot run the same machine language program on two computers with different architectures.

3.2 THE C LANGUAGE
3.2.1 Statements: high-level equivalent of machine instructions
Usually, one C statement does the work of many machine instructions. In the following, we use symbols rather
than 1's and 0's to make the machine code more readable. We assume that α, β, γ, and δ are the addresses of
(pointers to) x, a, b and c, respectively, and that x, a, b and c are word-sized objects.

                               a              b               c             x
                       …             …               …             …               …

                  byte         β              γ               δ             α
                  addr's

                  C                                  Compiled Machine Instructions
                  x = a*b + c;                       move     reg1, β        register 1 = a
                                                     move     reg2, γ        register 2 = b
                                                     mult     reg1, reg2     reg 1 = reg 1 * reg 2
                                                     move     reg2, δ        register 2 = c
                                                     add      reg1, reg2     reg 1 = reg 1 + reg 2
                                                     move     α, reg1        x = register 1


3.2.2 Data Types: a rich variety of types is available to C programmer
a)   Supports all data types commonly supported by machine languages.
b)   Aggregate types like arrays and structures supported.
c)   Data variables are introduced via declarations, which associate variable names with types.
d)   User-defined types (essential for Object-Oriented Programming (OOP)) are supported.

3.2.3 C Programs [illus.: GROSS STRUCTURE OF A C PROGRAM]
a)   Consist of one or more source files, containing C statements and declarations.
b)   Statements are executed, changing data values.
c)   All statements (as opposed to mere declarations) are placed inside function definitions.
d)   Both data variables and functions must be declared before being used.
e)   Declarations may be coded inside function definitions (local declarations).
f)   Declarations may be coded outside all function definitions (file-level declarations).
g)   Thus, a C program is a collection of function definitions and file-level declarations.
h)   A complete C program must include a function named main, which always executes first.
i)   The same C program can run on computers with quite different architectures, with precisely the same
     results (if one is careful about the details). The programmer need not be concerned with the number of
     registers, availability of floating point hardware, or other low-level, machine-specific details.



j)   Here's the smallest politically correct C program; it does nothing but return 0 to the caller:

                             int                       int (integer) is the return type of the main function
          function header
                             main( void )              void parameter type means main takes no arguments

                             {                         function bodies always begin with a left brace
          function body          return 0;             returns the int value 0 to caller
                             }                         function bodies always end with a right brace



4. C SOURCE FILES
a)   C source files usually contain a mix of two distinct languages: C and the C Pre-Processor language.
b)   The pre-processor language consists of commands, all of which begin with a pound sign, #.
c)   The pre-processor scans through a source file looking for, and acting upon, pre-processor commands.
d)   Pre-processor commands indicate how to modify the source file, yielding a "pre-processed" source file.
e)   The pre-processed source file is pure C code, suitable for compilation.

4.1 #include           COMMAND
a)   The #include <filename> command causes the physical substitution of the named file for the #include
     command in the pre-processed source file.
b)   The named file is often a standard header file, supplied by the C compiler manufacturer. Standard
     headers mostly contain function and data declarations and type definitions.
c)   Programmers may create their own private header files.
d)   Header file names conventionally end with a .h suffix, though this is not enforced.
e)   Examples:

         #include <stdio.h>             /* causes preprocessor to substitute the standard header file stdio.h */

         int
         main( void )
         {
            printf( "Hello, World!\n" );                 /* printf function declared in stdio.h */
            return 0;
         }

         ---------------------------------------------------------------------------

                  file: a                     file: b
             int                         {
             main( void )                    return 0;
                                         }


                 file: x.c                                        pre-processed source file (to compiler)

             #include <a>                                              int
             #include <b>        --->  PRE-PROCESSOR  --->             main( void )
                                                                       {
                                                                           return 0;
                                                                       }




UNIT 3: ELEMENTS OF C PROGRAM STRUCTURE
In this Unit, we look beyond the gross anatomy of a C program. Coverage is bottom-up, with emphasis on the
basic integer and floating point data types, data declarations and expressions.


1. BUILDING BLOCKS OF STANDARD C
1.1 CHARACTER SET
a)   A C source file is a sequence of characters taken from a fixed character set.
b)   A C character set must include the following:
              all lower- and upper-case letters
              all decimal digits
              the blank (or space) character
              29 "graphic" (i.e., visible) characters (all special characters but $, @ and `)
              the following non-graphic characters: backspace, carriage return, form feed, horizontal tab,
               vertical tab (these are all treated as spaces)
c)   The characters space, horiz. and vert. tabs, newline and formfeed are collectively referred to as
     whitespace characters.
d)   Different character codes may assign different numbers to any given C character, but the following must be
     observed by all character codes:
              codes for decimal digits are contiguous, in increasing order (i.e., the code for the character '1'
               must be 1 greater than the code for the character '0', etc.)
e)   The ASCII (American Standard Code for Information Interchange) character code is commonly used; it is
     consistent with the first 128 codes of the newer Unicode standard, which facilitates software internationalization.

1.2 COMMENTS
a)   A sequence of characters beginning with /* and ending with */ is called a comment -- multiple lines OK.
b)   A sequence of characters beginning with // and extending to end of the same line is also a comment.
c)   Comments are ignored by the compiler, but explain your code to human readers. They are utterly crucial.

1.3 TOKENS
a)   The characters of a C program are grouped into tokens, the smallest meaningful program units.
b)   Five classes of tokens:
             Operators (+ - * / % ++ -- = etc.)
             Separators (, ; : ( ) [ ] { })
             Identifiers
             Reserved Words
             Constants

1.3.1 Identifiers
a)   These are names -- of variables, functions, types, etc.
b)   An identifier starts with a letter or an underscore, then continues with any combination of letters,
     underscores and digits.
c)   Examples:
            fahrenheit_to_celsius
            _p1
            year_1995_total
            tax_owed

1.3.2 Reserved Words [illus.: "C RESERVED WORDS"]
a)   Identifiers reserved for specific uses in C, e.g., names of built-in types (char, int, …), statement keywords
     (if, else, …), etc.
b)   May not be used by the programmer to name variables or functions, etc.
c)   It is advisable to avoid the use of C++-specific reserved words, to ease migration to C++ ("clean C").

1.3.3 Constants
a)   A constant is a fixed value of some type: int, double, etc.
b)   Most constants are literals, i.e., their very spelling denotes their values (e.g., 123 is an int constant)
c)   As we cover each type, we'll present the appropriate syntax for constants of that type and several examples



2. DATA TYPES [ILLUS.: "C/C++ TYPES"]
C is a typed language—every variable is given a type. What is a type? It is a set of values together with a set
of operations on those values -- e.g., the values of an integer type are integers in some fixed range, and the
operations on values of this type include +, -, <, >=, etc. A given operation is applicable only to values of
particular types.

2.1 BASIC SCALAR (SINGLE-VALUED, ATOMIC) TYPES
2.1.1 Integer Types
a)   Approximate properties of the infinite set of integers (whole numbers).
b)   Here’s a list of integer types:
               [signed|unsigned] char (whether char is signed/unsigned is implementation-defined)
               [unsigned] short
               [unsigned] int (unsigned abbreviates unsigned int and is more commonly used)
               [unsigned] long
               [unsigned] long long
c)   sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long),
     where sizeof(X) is the number of bytes (chars) occupied by an object of type X. (sizeof is a C operator
     to be discussed in more detail later.)
d)   sizeof(char) = sizeof(unsigned char), sizeof(short) = sizeof(unsigned short), etc.
e)   range(char) ⊆ range(short) ⊆ range(int) ⊆ range(long) ⊆ range(long long), where range(X) is the set
     of all values of type X. (range is not a C operator, we use it for shorthand only.)
f)   range(unsigned char) ⊆ range(unsigned short) ⊆ range(unsigned) ⊆ range(unsigned long), etc.
g)   chars hold small integers, normally treated as character codes (e.g., ASCII codes); sizeof(char) is 1.
h)   Multiple integer types are provided to allow space efficiency (use the narrowest type that holds the required
     values), and to accommodate the commonest integer widths supported at the machine language level
i)   The usual arithmetic (+,-,*,/) and relational operations (==,!=,<,>,<=,>=) apply to values of any integer
     type; note that integer division implies truncation (i.e., the value of 14/5 is 2, not 2.8)
j)   The remainder operator (%) also applies to operands of any integer type (however, the original C Standard,
     C89, only dictates the results for non-negative operand values; since C99, the result takes the sign of the
     left operand)
k)   There are infinitely many integers, but only a finite number of them can be represented by any integer type
l)   The finite precision of integer arithmetic has many unhappy repercussions

2.1.2 Floating Point Types
a)   Approximate properties of the infinite set of real numbers.
b)   Here’s a complete list of floating point types:
              float                    ("single precision")
              double                   ("double precision")
              long double              ("extended [double] precision")
c)   sizeof(float) ≤ sizeof(double) ≤ sizeof(long double)
d)   range(float) ⊆ range(double) ⊆ range(long double)
e)   precision(float) ≤ precision(double) ≤ precision(long double), where precision(X) is the number of
     digits of precision in a value of type X. (precision is not a C operator, we use it for shorthand only.)
f)   All the usual arithmetic and relational operations apply to floating point numbers, however:
              the modulus operator, %, cannot be applied to floating point numbers
              the finite precision of floating point arithmetic makes the use of arithmetic and relational operators
               tricky, because there are "holes" in the range of each floating point type (e.g., your environment
               may not be able to distinguish .123456789101112131415 and .123456789101112131416)

2.1.3 Pointer Types
a)   A pointer value represents the address of an object (i.e., a region of memory that can be examined or
     stored into).
b)   Pointers to different types of objects are considered to have different types, e.g., a "pointer to int" is a
     distinct type from "pointer to double". More on pointers later.

2.2 DECLARATIONS
a)   A variable name is associated with a type and a region of memory via a declaration.
b)   Declarations obviate the need for using numeric memory addresses, as in machine languages.



c)   Format of a declaration (not most general):

         <type> <variable-name>;                       variable not given an initial value

         <type> <variable-name> = <value>;             variable given an initial value

d)   Examples:

                       float x;       declares x to be a floating point variable
                       int   k;       declares k to be an integer variable
                       char c;        declares c to be a character variable
                       char *p;       declares p a pointer-to-character (string) variable
                       int *q;        declares q a pointer-to-integer variable

e)   The type of a variable dictates the size of the memory area that is allocated to hold its value.
f)   The type of a variable also determines the way in which the content of the memory allocated to that variable
     is interpreted.
g)   A variable's declaration must precede its use in a source file.
h)   Variable names are normally chosen by the programmer for their mnemonic (memory-aiding) value, e.g.,

         double     adjusted_income;
         int        head_count;
         char       letter_grade;

i)   Several variables of the same type can be declared in a single declaration using comma separators, e.g.,

         int i, j, k;

j)   The same variable name can be treated in two different ways in a program (depending on context):
           as the memory location of its value, i.e., as a symbolic address (lvalue of variable name)
           as the value held in the memory area allocated to that variable (rvalue of variable name)

                                   int x;                    x = x + 1;

                       address α:  [ 13 ]                    left-hand x refers to α (lvalue of x)
                                                             right-hand x refers to 13 (rvalue of x)


3. EXPRESSIONS
Expressions are meaningful combinations of operators and operands. Think of the ordinary algebraic
expressions you studied in high school, like x+2, a·x+b, y·sin(x) and log(x+1). In contrast, the
combinations x2+ and )ysinx( are not meaningful. Every expression has a type and a value (unless its type is
void).

3.1 OPERATORS
a)   Operators take one or more values and combine them to form a new value.
b)   C provides an unusually rich assortment of operators, including most of the common ones, like the
     arithmetic operators (+, -, *, /, %) and relational operators (==, !=, <, >, >=, <=).
c)   Some uncommon C operators: % (remainder), ++ (increment), -- (decrement), & (address-of)
d)   Here’s a C operator not usually thought of as such: = (assignment, as in y = x + 1).
e)   Another C operator not usually thought of as such: () (function call/application, as in y = f(x)).
f)   An operator C lacks: exponentiation.
g)   Examples:
             in the expression 1+1, the operator is + (addition); the value of this expression is 2.
             we may view the expression cos(0.0) as the application of the cosine function to the value 0.0;
              the value of this expression is 1.0.
             if k is an int with value 3, then ++k has value 4; starting again from 3, --k has value 2.
             if x is a variable at memory location 65535, then the value of &x is 65535.

3.2 OPERANDS
a)   An operand is a value represented by a constant, variable or expression.
b)   A constant is a fixed value of some type, e.g., 1, 187 and -23 are int constants, and 3.14 and -298.4
     are double constants.
c)   Format of integer type constants (not most general):
             int:                   <decimal-digit> <decimal-digit>*
             unsigned int:          <decimal-digit> <decimal-digit>*U
             long:                  <decimal-digit> <decimal-digit>*L
             unsigned long:         <decimal-digit> <decimal-digit>*UL
d)   Examples of integer constants:
             123      (signed int, or just int)
             123U (unsigned int, or just unsigned)
             123L (long int)
             123UL (unsigned long int)
e)   There are also "character constants" that represent character codes, but the type of such constants is int.
f)   Format of a character constant (not most general):
             '<C-source-character>'           (where <C-source-character> is a graphic character)
             Constants for non-graphic (and some graphic) characters require the escape character (\), e.g.,
                    '\n' represents newline
                    '\t' represents horizontal tab
                    '\\' represents backslash
                    '\'' represents single quote
g)   Thus, for example, 'a' is an int (not a char) whose value is the character code for lower case a.
h)   Examples of character constants:
             'A', 'B', 'a', 'b'
             '+', '/', '"'
             '0', '1', '2'
i)   Format of floating point type constants (not most general; "opt" marks an optional part):
             float:                 <mantissa> <exponent>opt F
             double:                <mantissa> <exponent>opt
             long double:           <mantissa> <exponent>opt L
     where <mantissa> is a sequence of decimal digits that normally includes a decimal point (the point may
     be omitted only when an <exponent> is present), <exponent> is E <sign>opt <decimal-digit> <decimal-digit>*,
     and <sign> is + or -
j)   Examples of floating point constants:
                       0.F          .0F                1.23F            1E23F
                       1E-23F       1.23E4F            0.               .0
                       1.23         1E23               1E-23            1.23E+4
                       0.L          .0L                1.23L            1E23L
                       1E-23L       1.23E4L            0.0


3.3 COMBINING OPERATORS AND OPERANDS
a)   A simple expression (i.e., an identifier or a constant) by itself is a well-formed expression.
b)   Expressions combined by appropriate operators are also well-formed expressions.
c)   The order of evaluation of the operands of most binary (two-operand) operators is unspecified.
d)   The value of an assignment expression is the value of the right-hand operand of the = operator.
e)   Examples (we assume all identifiers have already been declared):
            1, 2, adjusted_income and head_count are valid (simple) expressions
            in the expression (1+2)-(3+4), the operands of the subtraction operator are (1+2) and
             (3+4)—both expressions in their own right. Further, 1, 2, 3 and 4 are all valid expressions.
            in the expression tax_owed                =     tax_rate*adjusted_income, tax_rate and
             adjusted_income are operands of the * (multiplication) operator; tax_owed and the value of
             tax_rate*adjusted_income are the operands of the = (assignment) operator. Further,
             tax_owed, tax_rate and adjusted_income are all valid expressions.
            adjusted_income = 28500.00 has value 28500.00
            if k is an int with value 5, then ++k has value 6
            if k is an int with value 5, then the value of ++k + k could be 11 or 12, depending on the order
             of evaluation of the two addition operands; formally, Standard C leaves the behavior of such an
             expression undefined (so don't use such expressions!)

3.4 PRECEDENCE OF OPERATORS [ILLUS.: "PRECEDENCE OF C OPERATORS"]
a)   Needed to establish the order of operator application in ambiguous expressions like a+b*c (here the
     question is whether the operand b is bound by the + operator or the * operator).
b)   The higher precedence operator binds the operand over the lower precedence operator.
c)   Note the assignment operator has, appropriately, a very low precedence level; only one rather unusual
     operator (the comma operator) has lower precedence.
d)   If in doubt about precedence when coding complex expressions, use parentheses.
e)   Examples:
              a+b*c is treated as a+(b*c)
              in (a+b)*c, the addition operator binds b
              a*b+c is treated as (a*b)+c
              a/b-c*d is treated as (a/b)-(c*d)
              a*-b is treated as a*(-b), as unary arithmetic operators have higher precedence than binary ones
              a = a/b-c*d is treated as a =((a/b)-(c*d))
              y*sin(x) is treated as y*(sin(x))

3.5 ASSOCIATIVITY OF OPERATORS
a)   Needed to establish order of operator application when operators have same precedence level, e.g., 3-1-2
     (could be (3-1)-2, or 3-(1-2), which have different values)
b)   Since all operators having the same precedence level have the same associativity, the following rule
     suffices:
              an operand is grouped with the left or right operator if the competing operators are left-associative
               or right-associative, respectively.
c)   The usual associative laws obeyed by addition and multiplication do not hold, e.g., (a+b)+c may not
     equal a+(b+c)—e.g., if overflow occurs due to the finite precision of operands.
d)   Examples:
              3-1-2 is treated as (3-1)-2
              a*b/c%d is treated as ((a*b)/c)%d                 (all operators have same precedence level)
              x = y = 9 is treated as x = (y = 9)               (because = is right associative)


4. STATEMENTS
We must be able to store the value of an expression in a variable, compare the values of different expressions,
choose among alternative actions based on data values, etc., in order to solve real problems. Statements are
the C language "action" units that allow us to perform these tasks.

4.1 EXPRESSION STATEMENTS
a)   The most commonly used type of C statement is the expression statement, which is simply an expression
     followed by a semi-colon. (Most statements end with a semi-colon. There is even an empty statement
     consisting of just a semi-colon which does nothing at all; its usefulness will be seen later.)

4.1.1 Assignment Statements
a)   Since assignment expressions form a sub-class of all expressions, this means that assignment
     statements form a sub-class of all expression statements.
b)   The simplest form an assignment statement may take is:

                  <variable> = <expression>;

c)   Examples:

                  int k;              declares k to be of int type
                  int m;              declares m to be of int type

                  k   =   1;          assigns value 1 to variable k
                  m   =   2;          assigns value 2 to variable m
                  k   =   m;          assigns value 2 to variable k
                  k   =   m+3;        assigns value 5 to variable k
                  k   =   m*k+1;      assigns value 11 to variable k

4.1.2 Function Call Statements
a)   A function call is a kind of expression, too, so function call statements are just another kind of expression
     statement.
b)   Example:

                  double x = 2.0;             declares x to be of double type & initializes x to 2.0
                  double y = 3.0;             declares y to be of double type & initializes y to 3.0

                  x = pow( x, y );            assigns value 8.0 (i.e., 2.0 raised to the power 3.0) to variable x; pow is a
                                              standard library function declared in standard header math.h


5. HOW TO WRITE OUT VALUES OF BASIC TYPES USING printf
C has no built-in operators that perform I/O (Input/Output). Instead, every Standard C implementation provides
a collection of standard library functions that programmers can call to accomplish various common programming
tasks, including I/O. While there are many standard I/O functions, we'll start out using just the scanf and
printf functions for input and output, respectively.

5.1 GENERAL FORMAT:
                  printf( <control-string>, <value-list> );


5.2 WRITING OUT THE VALUE OF AN EXPRESSION OF int TYPE (%i CONVERSION)
a)   Examples:
                      printf( "%i", k );
                      printf( "%i %i", k, m );
                      printf( "%i %i", k*m, 2*m+1 );

5.3 WRITING OUT THE VALUE OF AN EXPRESSION OF double TYPE (%f CONVERSION)
a)   Examples (N.B.: printf and scanf conversion specifications are asymmetric for doubles):
                      printf( "%f", x );
                      printf( "%f %f", x, y );
                      printf( "%f %f", x-y, x+y );
b)   Using %lf instead of %f has the same effect in a printf control string:
                      printf( "%lf", x );
                      printf( "%lf %lf", x, y );
                      printf( "%lf %lf", x-y, x+y );

5.4 CAN INTERLEAVE EXPRESSION VALUES WITH "AS IS" TEXT:
a)   Examples (assuming that the values of k, x and y are 65, 12.3 and 5.04, respectively):
                   printf( "And the lucky number is %i", k );
                    writes out "And the lucky number is 65" (without the quotes)
                   printf( "Adding %f to %f yields %f", x, y, x+y );
                    writes out "Adding 12.300000 to 5.040000 yields 17.340000" (without the quotes)

UNIT 4: MORE ON EXPRESSIONS, FUNCTIONS AND I/O
This Unit covers mixed mode expressions, implicit and explicit casts, printf and scanf conversion
specifications, the meaning of the & operator (in scanf arguments), and how to define functions.


1. TYPE CONVERSIONS [ILLUS.: "THE USUAL CONVERSIONS"]
1.1 IMPLICIT CONVERSIONS
1.1.1 In Mixed Arithmetic Expressions
a)   An arithmetic expression containing operands of different types is called a mixed expression.
b)   Operands of "narrower" types are converted to "wider" types before performing operations, so as not to lose
     information.
c)   The value of a mixed expression usually has the same type as its widest operand.
d)   Examples:
          int i = 5;
          double d = 3.2;

         d+i                 i converted to double before addition; result is double
         i/d + i             i converted to double before division and again before addition;
                              result is double

1.1.2 Due to Assignment
a)   The type of an assignment expression is always the type of the LHS (but the value is always RHS's value).
b)   If the type of the LHS is narrower than the RHS, then information may be lost.
c)   Examples:
           d = i              value of i converted to double and stored in d; typically, no information lost
           i = d              value of d converted to int (truncated) and stored in i; information may be lost

1.1.3 Of Function Return Values
a)   The declared return type of a function controls the type of return value, implicitly converting the return value,
     if necessary.
b)   Example:

         int                                                      int
         truncfunc( void )                                        truncfunc( void )
         {                              is the same as            {
             return 10.4;                                             return 10;
         }                                                        }

1.2 EXPLICIT CONVERSIONS (CASTS)
a)   Most compilers issue warning messages when implicit conversions are made that may lose information.
b)   To avoid such messages, use an explicit type conversion operator, or cast.
c)   General format of a cast operator:

         (<type>)

d)   Examples:

         i = (int)d          no warning message issued as intention is signaled
         (double)i/j         forces floating point division (N.B.: integer j is implicitly converted to double)



2. HOW TO INPUT VALUES OF BASIC TYPES USING scanf
The first argument to the powerful scanf function is a control string which specifies the types of values to be
read in. The remaining arguments are the addresses of memory areas into which the input values, after
conversion to the appropriate internal format, will be stored. The unary address-of operator, &, which returns
the address of its operand, may be used to supply the address values needed in scanf calls.

2.1 GENERAL FORMAT:
                  scanf( <control-string>, <address-list> );

2.2 READING VALUES INTO int VARIABLES (%i CONVERSION)
a)   General format:

                  scanf( <control-string>, <address-list> );

b)   Examples:
                  scanf( "%i", &k );
                  scanf( "%i%i", &k, &m );

2.3 READING VALUES INTO double VARIABLES (%lf CONVERSION, %f FOR float)
a)   Examples (N.B.: unlike in a printf control string, using %f instead of %lf here does not have the same effect):
                 scanf( "%lf", &x );
                 scanf( "%lf%lf", &x, &y );

2.4 THE MEANING OF A scanf RETURN VALUE
a)   Each call to the scanf function returns an int, which represents the number of successful conversions
     performed in that call—this number could be zero if the first conversion specified in the control string isn't
     matched by compatible input.
b)   scanf returns upon encountering a mis-match of any conversion specification with the data in the input
     stream; thus, only some of the arguments following the control string may receive values.
c)   If the end of the input stream is encountered before any conversions can be made, scanf returns the
     special value EOF (normally #defined as -1 in stdio.h). N.B.: Ctrl-z signals end-of-file from the
     keyboard in Windows (Ctrl-d serves similarly in UNIX).


3. FUNCTIONS
3.1 RECALL THAT ALL EXECUTABLE CODE RESIDES INSIDE FUNCTION DEFINITIONS
3.2 DEFINING FUNCTIONS
a)   A function definition consists of a function header immediately followed by a function body
b)   The function header specifies the return type, name and parameters of the function
c)   The function body specifies local declarations and statements within curly braces, {}
d)   The syntactic structure { [<declarations>] [<statements>] } is called a compound statement; thus, a function
     body is a compound statement
e)   At least 1 return statement must be coded inside each function body
f)   General format of a function definition:

         <return-type>                          return type specified on a separate line
         <function-name>( <parameter-list> )    parameter list specifies parameter names & types
         {                                      function body begins with a { on a separate line
             <declaration>
             …                                  declarations are optional; always precede statements
             <declaration>
                                                leave a blank line between declarations and statements
             <statement>
             …                                  must include at least 1 return statement
             <statement>
         }                                      function body ends with a } on a separate line

3.3 FUNCTION RETURN VALUES
a)   A function need not return a value (like a Pascal procedure)
b)   The void return type indicates that the function being defined does not return a value
c)   All return statements coded in a function with a void return type have form: return;
d)   All return statements coded in a function with a non-void return type have form: return <expression>;
     (the type of <expression> should be the same as, or convert without loss of data to, the return type declared
     in the function header).

3.4 FUNCTION PARAMETERS
a)   To signal that a function takes no parameters, specify void for the parameter list.

3.5 EXAMPLES
3.5.1 Calculating the Volume of a Sphere
         #define PI 3.141592653                         note parameterless macro

         double
         volume_of_sphere( double r )
         {
            return 4.0/3.0 * PI * r * r * r;
         }

    How can this be made more efficient? (fold 4.0/3.0 to 1.333333333333; can go further)

3.5.2 Writing a Form Letter
         void
         bug_letter( void )
         {
            printf( "Dear Sir or Madam,\n" );
            printf( "We were shocked to hear that you found a bug\n" );
            printf( "on your plate at our cafeteria. Please accept\n" );
            printf( "our apologies and our blue-plate special, gratis.\n\n" );
            printf( "\t\tSincerely,\n\nTom Maine\n" );

             return;
         }

UNIT 5: MORE ON DECLARATIONS AND FUNCTIONS
The following introduces various classes of data declarations and covers the scope, visibility and extent of
declarations. The rationale behind dividing code into distinct functions is discussed and C's call-by-value
parameter-passing method is covered.


1. FILE-LEVEL VS. LOCAL DECLARATIONS
1.1 FILE-LEVEL DECLARATIONS
a)   File-level declarations are declarations that occur outside of any function definition.
b)   There are two kinds of file-level declarations: global and static.

1.1.1 Global Declarations
a)   Any file-level declaration of the type already covered is global, i.e., the variable declared is potentially
     accessible from other source files in the same program.

1.1.2 Static Declarations
a)   If the reserved word static is prefixed to a file-level declaration, the variable declared will not be
     accessible from other source files in the same program—this reduces name conflicts between source files.

1.2 LOCAL DECLARATIONS
a)   Local declarations are declarations that occur inside function definitions (including main)
b)   More generally, any declaration occurring in a compound statement is considered a local declaration.


2. DEFINING VS. REFERENCING DECLARATIONS
a)   Not all declarations actually cause storage to be reserved for the declared variable; those that do are called
     defining declarations.
b)   If the reserved word extern is prefixed to a declaration, no storage is allocated for that variable and the
     declaration is styled a referencing declaration.
c)   May want to declare a variable without allocating storage for it because a defining declaration was made in
     another source file, but you need to declare the type of that variable for use in the current source file.
d)   Examples:

         /* File: A.C */                          /* File: B.C */
         extern int b;                            extern int a;
         int a;                                   int b;
         . . .                                    . . .
         a = a + b;                               b = a + b;
         . . .                                    . . .

         A.OBJ:                                   B.OBJ:
         "b is external"                          "a is external"
         a                                        b
         <A's machine instructions>               <B's machine instructions>

3. SCOPE, VISIBILITY & EXTENT
3.1 SCOPE OF A DECLARATION
a)   The scope of a declaration of a variable is that part of the program where an occurrence of the variable
     name could possibly refer to that declaration.
b)   The scope of a file-level declaration is from that declaration to the end of the source file containing it.
c)   The scope of a local declaration is the function body (or compound statement) containing it.

3.2 VISIBILITY OF A DECLARATION
a)   The visibility of a declaration of a variable is that part of the program where an occurrence of the variable
     actually refers to that declaration.
b)   A local declaration of a variable with the same name as a previously occurring file-level variable is said to
     shadow the file-level declaration, e.g.,

         int k;                      file-level (global) declaration

         int
         shade( int m )
         {
             int k;                  local declaration shadows the file-level declaration

             k = k*2 + m;            inside the shade function, all uses of k refer to the local declaration
             return k;
         }

c)   Function parameters may also shadow file-level declarations [e.g., a parameter named k in place of m above].

3.3 EXTENT OF A DECLARATION
a)   The extent of a declaration of a variable is the period of time when storage is allocated to that variable.
b)   There are three kinds of extent:
              local extent - storage allocated on entry to function or compound statement, freed on exit
              static extent - storage allocated before program execution begins, stays alloc'd until termination
              dynamic extent - storage allocated/freed explicitly using standard library functions
c)   Local variables and formal parameters have local extent; thus, their values are not preserved between
     function calls.
d)   File-level variables have static extent (so does any local variable whose declaration is prefixed with the
     static reserved word).
e)   Dynamic allocation of storage will be covered later in the semester.

3.4 EXAMPLE:
         /*
         ** scopext.c - examine scope, extent and visibility of
         **             declarations.
         */

         int y = 20;

         int
         func_a( int x )
         {
             return ( x + y );
         }

         int x = 10;

         int
         func_e( int y )
         {
             return ( x + y );
         }

         int
         func_k( void )
         {
             int x = 1, y = 2;

             x = func_a( x );
             y = func_e( y );

             return ( x + y );
         }

         int
         main( void )
         {
             int x, y;

             x = y = func_k();

             return 0;
         }

    N.B.: there are four distinct declarations for both x and y. More specifically, each of x and y is declared
     once as a global variable, twice as a local variable and once as a function parameter.
    Which declarations are file-level? global? static? local?
    Which declarations are defining? referencing?
    What is the scope of each variable declaration?
    What is the extent of each variable declaration?
    What is the visibility of each variable declaration?


4. MORE ON FUNCTIONS
4.1 RATIONALE FOR FUNCTIONS
a)   Functions act as very high-level statements, making code easier to read.
b)   Functions compartmentalize code, making it easier to debug (crucial in large programs).
c)   The separation of programming tasks into single-purpose units like functions eases code modification.
d)   Once a function is written and debugged, it can be re-used in other programs.
e)   Writing modular code through the judicious division of programs into functions greatly increases
     programmer productivity and effectiveness.

4.2 ARGUMENTS AND PARAMETERS
a)   The values passed to a function in a function call expression are called arguments (alternatively, actual
     parameters).
b)   The objects declared between the parentheses in a function header are that function's parameters
     (alternatively, formal parameters).
c)   The type of each argument must agree with its corresponding parameter.
d)   When a function is called, its parameters receive copies of the corresponding argument values, so
     changing parameter values inside a function has no effect on the corresponding arguments. [Show the
     classic "swap" function.] This method of parameter passing is styled call-by-value—it is the only parameter
     passing method available in C. (By contrast, Pascal has two methods: call-by-value for so-called "value"
     parameters and call-by-reference for so-called "var" parameters.)
e)   Complex expressions (not just variables and constants) may be coded as function arguments; in all cases,
     the types of the argument expressions must agree with the types of the corresponding parameters.

4.3 NESTING
a)   Unlike Pascal and some other languages, function definitions cannot be nested.
b)   Nested function calls are permitted, however, and are common: add( a, mult( b, c ));

UNIT 6: BOOLEAN EXPRESSIONS AND CONDITIONAL STATEMENTS
This Unit covers "boolean" expressions and conditional expressions and statements. C operators that produce
boolean values, including the relational operators and the logical operators are discussed, noting the short-
circuit evaluation of expressions involving the logical operators || and &&.


1. BOOLEAN EXPRESSIONS
A boolean expression is any expression with a boolean value, i.e., true or false. Boolean expressions are
sometimes called predicates (a function returning a boolean value is also called a predicate).

1.1 NO "BOOLEAN" TYPE IN C (SCALAR TYPES USED INSTEAD)
a)   However, any scalar value can be interpreted as denoting a boolean value, as follows:
             any zero value denotes false
             any non-zero value denotes true
b)   Typically, programmers use 0 and non-0 values of int type to represent boolean values.
c)   The canonical scalar value representing true is 1.
d)   Examples:

         0, 0U, 0.0, 0.0F                                all denote false
         1, 2, -3, 32768U, 6.02E+23, -98.6F              all denote true


1.2 C OPERATORS THAT PRODUCE BOOLEAN VALUES
The result of applying any of the following operators is always 0 (false) or 1 (true).

1.2.1 Relational Operators
a)   Equality: <expr1> == <expr2> has value 1 if <expr1> and <expr2> have the same value; 0 otherwise.
b)   Inequality: <expr1> != <expr2> has value 0 if <expr1> and <expr2> have the same value; 1 otherwise.
c)   Greater than: <expr1> > <expr2> has value 1 if <expr1> greater than <expr2>; 0 otherwise.
d)   Less than: <expr1> < <expr2> has value 1 if <expr1> less than <expr2>; 0 otherwise.
e)   Greater than/equal to: <expr1> >= <expr2> has value 1 if <expr1> greater than or equal to <expr2>; 0 otherwise.
f)   Less than/equal to: <expr1> <= <expr2> has value 1 if <expr1> less than or equal to <expr2>; 0 otherwise.
g)   All the above operators work on operands of any arithmetic type.
h)   Relational operators have lower precedence than arithmetic operators (see last example below).
i)   Examples:



         Expression                         Value
         3 == 2                             0
         3 == 3                             1
         1.5 != 1.5                         0
         1.50…01 <= 1.50…02                 1
         1.50…01 < 1.50…02                  ? (if too many 0's, expr's will be considered equal)
         i+(j+k) == (i+j)+k                 ? (associative law may not hold due to overflow)
         [show int i=INT_MAX, j=1, k=-1; vs. int i=1, j=2, k=3;]

1.2.2 Logical Operators
a)   C provides three logical operators, which combine with boolean expressions and yield boolean values.
b)   Logical not:          !<expr> is true iff <expr> is false:

                                                <expr>       ! <expr>
                                               non-zero         0
                                                 zero           1

c)   Logical or:            <expr1> || <expr2> is false iff both <expr1> and <expr2> are false:

                             <expr1>                <expr2>            <expr1>   || <expr2>
                            zero value            zero value                     0
                            zero value          non-zero value                   1
                          non-zero value          zero value*                    1
                          non-zero value        non-zero value*                  1
                                       * <expr2> not evaluated in this case

d)   Logical and:           <expr1> && <expr2> is true iff both <expr1> and <expr2> are true:

                             <expr1>                <expr2>            <expr1>   && <expr2>
                            zero value            zero value*                    0
                            zero value          non-zero value*                  0
                          non-zero value          zero value                     0
                          non-zero value        non-zero value                   1
                                       * <expr2> not evaluated in this case

e)   Application of logical operators always yields a 0 or 1 value, though operands can have any scalar value.
f)   Unlike the case with arithmetic operators, the left operand of the || or && operator is always evaluated first,
     and the right operand may not be evaluated, depending on the value of the left operand. This is called
     short-circuit evaluation.
g)   || and && have lower precedence than the relational operators, ! has higher precedence (see examples).
h)   Examples:

         Expression                                    Value
         lo <= k && k <= hi                            ? (depends on values of lo, k and hi [try a few])
         2>1 || 1/0                                    1 (division not executed due to short-circuit evaluation)
         1>2 && 1/0                                    0 (division not executed due to short-circuit evaluation)
         b != 0 && a/b > 3                             ? (b != 0 is a guard expression, prevents div-by-0)
         x != y == !(x == y)                           1 (for any values of x and y; RHS parentheses required)
         x < y == !(x >= y)                            1 (for any values of x and y; RHS parentheses required)
         x > y == !(x <= y)                            1 (for any values of x and y; RHS parentheses required)
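
The range test and the guard expression from the examples above can be tried out with a small sketch like the following. The function names in_range and ratio_exceeds are made up for illustration and are not part of the course code.

```c
/* Hypothetical helpers illustrating && and short-circuit evaluation. */

/* Returns 1 if lo <= k and k <= hi, 0 otherwise. */
int
in_range( int lo, int k, int hi )
{
    return lo <= k && k <= hi;
}

/* Returns 1 if a/b > limit, 0 otherwise.  The guard b != 0 is evaluated
** first; when b is 0 the division is never executed and the result is 0. */
int
ratio_exceeds( int a, int b, int limit )
{
    return b != 0 && a / b > limit;
}
```

Note that writing lo <= k <= hi also compiles, but it means (lo <= k) <= hi, which compares a 0 or 1 value with hi and is almost never what is wanted.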


2. CONDITIONAL STATEMENTS AND EXPRESSIONS
2.1 if (OR CONDITIONAL) STATEMENTS
a)   Similar to if statements in other high-level languages, like Pascal.
b)   General Format of if statements:

              if ( <boolean-expr> )            <boolean-expr> called "condition" or "test"
                 <statement1>                  executed if <boolean-expr> is true
              else
                 <statement2>                  executed if <boolean-expr> is false

c)   The entire else <statement2> clause may be absent.
d)   <statement1> and/or <statement2> can be compound statements.
e)   <statement1> and/or <statement2> can be, or include, other if statements; in such cases we speak of nested
     if statements.
f)   When if statements are nested, a given else clause belongs to the nearest preceding if at the same
     nesting level that does not already have an else clause.
g)   To avoid confusion with respect to the preceding rule, always use fully bracketed syntax: if either the if
     clause or the else clause of an if statement is a compound statement, enclose both clauses in braces.
h)   Often, programmers need to code long series of nested if-else's. In this case only, do not apply the
     usual indentation rules; instead, all the if-else's should line up in the same column. The if conditions
     are typically mutually exclusive in such cases.
i)   Examples:

         if ( grade >= 95 )
         {
            printf( "Dear Folks,\n" );
            printf( "Remember the new car you promised me if…\n" );
            . . .
         }

         ---------------------------------------------------------------------------

         if ( grade >= 70 )
         {                           what happens if this { and the corresponding } are missing?
            if ( grade >= 95 )
            {
               printf( "Dear Folks,\n" );
               printf( "Remember the new car you promised me if…\n" );
               . . .
            }
         }
         else
         {
            printf( "Dear Warden,\n" );
            printf( "…\n" );
            . . .
         }

         ---------------------------------------------------------------------------

         if ( is_democrat )
         {
            . . .
         }
         else if ( is_republican )
         {
            . . .
         }
         else if ( is_independent )
         {
            . . .
         }
         else /* not registered */
         {
            . . .
         }
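
The question posed in the second example (what happens if the outer braces are missing?) can be explored with the sketch below. It is only an illustration: the return codes and function names are invented here, and the second function spells out, with explicit braces, how C pairs the else when the outer braces are dropped.

```c
/* Hypothetical return codes naming which letter (if any) gets written. */
#define NO_LETTER   0
#define DEAR_FOLKS  1
#define DEAR_WARDEN 2

/* Braces as in the notes: the else pairs with the outer if. */
int
letter_with_outer_braces( int grade )
{
    if ( grade >= 70 )
    {
        if ( grade >= 95 )
        {
            return DEAR_FOLKS;
        }
    }
    else
    {
        return DEAR_WARDEN;
    }
    return NO_LETTER;
}

/* How C reads the code once the outer braces are removed:
** the else now pairs with the inner if. */
int
letter_without_outer_braces( int grade )
{
    if ( grade >= 70 )
    {
        if ( grade >= 95 )
        {
            return DEAR_FOLKS;
        }
        else
        {
            return DEAR_WARDEN;
        }
    }
    return NO_LETTER;
}
```

With a grade of 80 the first version writes no letter, while the second writes to the warden; with a grade of 50 it is the other way around.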

2.2 CONDITIONAL EXPRESSIONS
a)   The conditional, or ternary, operator ?: combines three operands (no Pascal counterpart).
b)   General Format of a Conditional Expression:

         <boolean-expr> ? <expr1> : <expr2>

c)   If <boolean-expr> is true, then the value of the conditional expression is the value of <expr1>; otherwise the
     value of the conditional expression is the value of <expr2>.
d)   <expr1> and <expr2> must have compatible types (see p. 219 in Harbison & Steele for the gory details); you
     can't go wrong if (and it is recommended that) you ensure both expressions have the same type.
e)   Examples:

         int i, j, k; double x, y;
         . . .
         k = i > j ? i : j;                             k is assigned the max of i and j
         k = i < j ? i : j;                             k is assigned the min of i and j
         k = x > (double)i ? y : i;                     may provoke msg due to diff types of y and i
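
As a sketch, the max and min examples can be wrapped in functions, and the mixed-type example can be rewritten so that both branches have the same type, as recommended in (d). The names max_int, min_int and pick are assumptions made for this illustration.

```c
/* Hypothetical helpers built on the ?: operator. */

int
max_int( int i, int j )
{
    return i > j ? i : j;
}

int
min_int( int i, int j )
{
    return i < j ? i : j;
}

/* Both branches are made double explicitly, so no conversion
** is left for the compiler to work out. */
double
pick( double x, int i, double y )
{
    return x > (double)i ? y : (double)i;
}
```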


UNIT 7: ITERATION
We cover iteration in C using the while, for and do statements. Along the way, we introduce the pre-
increment (++<expr>), pre-decrement (--<expr>) and assignment operators +=, -=, *=, /= and %=.


1. ITERATION STATEMENTS
[Students are assumed to be familiar with iteration statements in other languages. As motivation, you might ask
the class how they'd print out a form with fifty ruled lines on a page.] C provides three looping constructs: the
while, for and do statements.

1.1 while STATEMENTS
a)   Similar to the while statements of Pascal, Fortran and PL/1.
b)   General Format of the while statement:

              while ( <boolean-expr> )        <boolean-expr> called "loop condition" or "continuation test"
                 <statement>                  the loop body

c)   The loop condition is evaluated before entering the loop body. If it's false, the loop body is not executed;
     otherwise, the loop body is executed repeatedly until the condition is false. Note that the loop body may not
     be executed at all.
d)   The loop body must therefore have the potential to falsify the loop condition—this is the case for all loop
     statements (unless an "infinite loop" is desired).
e)   To effect an infinite loop with a while statement, use the standard idiom while ( 1 ) <statement>.
f)   Here's a classic C looping idiom:

              while ( (<variable> = <func-name>( <arguments> )) != <terminating-value> )
                 <statement>

g)   Examples:


         /*
         ** cat.c - copy standard input to standard output.
         */

         #include <stdio.h>         /* for decl. of getchar and putchar, defn. of EOF */

         int
         main( void )
         {
             int c;

             while ( (c = getchar()) != EOF )                to terminate at keyboard: enter Ctrl-Z (WINDOWS)
                putchar( c );                               (getchar returns code of next std. input character
                                                            or EOF macro value at End-Of-File; putchar writes
             return 0;                                      character with code c to std. output and returns c
         }                                                  unless error occurs, in which case it returns EOF)


         ---------------------------------------------------------------------------

         printf( "Enter age: " );
         while ( scanf( "%i", &k ) == 1 )     to terminate at keyboard: enter Ctrl-Z (WINDOWS)
         {
            printf( "You don't look a day over %i!\n", k/2 );
            printf( "Enter age: " );          note repeated prompt (is it avoidable?)
         }
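
One possible answer to the "is it avoidable?" question is sketched below. It is not the only way: it relies on the while ( 1 ) idiom plus a break statement, which has not been covered yet. The function name age_dialog and its FILE * parameters are assumptions made so the loop can be run on any pair of streams.

```c
#include <stdio.h>

/* Reads ages from in until input ends or is not a number; writes prompts
** and replies to out.  Returns the number of ages processed. */
int
age_dialog( FILE *in, FILE *out )
{
    int k;
    int count = 0;

    while ( 1 )
    {
        fprintf( out, "Enter age: " );              /* the only prompt in the source */
        if ( fscanf( in, "%i", &k ) != 1 )
        {
            break;                                  /* end of input or non-numeric input */
        }
        fprintf( out, "You don't look a day over %i!\n", k/2 );
        count = count + 1;
    }

    return count;
}
```

Calling age_dialog( stdin, stdout ) gives the same dialog as the example above, with the prompt written only once in the source.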


1.2 for STATEMENTS
a)   C's most powerful looping construct (more flexible than the Pascal/Basic for or FORTRAN do statement).
b)   General Format of the for statement:

              for ( <entry-expr>; <boolean-expr>; <continuation-expr> )            the loop header
                 <statement>                                                       the loop body

c)   None of the three expressions in the loop header are required (but both semi-colons are required).
d)   <entry-expr> is evaluated first; a typical use is to initialize a counter with an assignment expression.
e)   <boolean-expr> is then evaluated; it controls entry to the loop body as in a while loop.
f)   After each iteration of the loop body, and before <boolean-expr> is re-evaluated, <continuation-expr> is
     evaluated; a typical use is to update a counter.
g)   The loop header for (;;) is commonly used for infinite loops.
h)   Example:

         /* print out first 16 powers of 2 */
         for ( k = 0; k < 16; k = k+1 )
            printf( "2^%i = %i\n", k, (int)pow( 2.0, (double)k ) );
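
To make points (d) through (f) concrete, here is a sketch of the same loop written twice, once with for and once with while. The task (summing 1 through n) and the function names are chosen only for illustration.

```c
/* Sum of 1..n using a for statement. */
long
sum_for( int n )
{
    long sum = 0;
    int  k;

    for ( k = 1; k <= n; k = k+1 )      /* entry; test; continuation */
        sum = sum + k;

    return sum;
}

/* The same loop with the three header expressions moved into a while. */
long
sum_while( int n )
{
    long sum = 0;
    int  k;

    k = 1;                              /* <entry-expr>        */
    while ( k <= n )                    /* <boolean-expr>      */
    {
        sum = sum + k;
        k = k+1;                        /* <continuation-expr> */
    }

    return sum;
}
```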


1.3 do STATEMENTS
a)   Least commonly used looping statement.
b)   Similar to the while statement, but the loop condition is tested after each iteration of the loop body; thus,
     the loop body is always executed at least once. If the loop condition is true, the loop body is re-executed.
c)   The C do statement is not directly equivalent to the Pascal repeat...until, though it is easy to see that
     do...while( !<cond> ); is equivalent to repeat...until <cond>.
d)   General Format of the do statement:

              do
                 <statement>             make this a compound statement, even if not required
              while ( <boolean-expr> );

e)   Example (how would you do the same thing using a for or while statement?):

         /* read & process characters up to (and including) first newline */
         do
         {
            process( c = getchar() );
         }
         while ( c != '\n' );
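
A different task that also suits the do statement is counting decimal digits, since even 0 has one digit and the body must therefore run at least once. The sketch below uses invented function names and shows a while version for comparison.

```c
/* Number of decimal digits in n, using do: the body always runs once. */
int
digits_do( unsigned n )
{
    int count = 0;

    do
    {
        count = count + 1;
        n = n / 10;
    }
    while ( n != 0 );

    return count;
}

/* The same result with while: the first digit is counted up front. */
int
digits_while( unsigned n )
{
    int count = 1;

    while ( n >= 10 )
    {
        count = count + 1;
        n = n / 10;
    }

    return count;
}
```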


1.4 WHEN TO USE WHICH LOOP STATEMENT?
Usually, the task at hand will suggest the most appropriate statement. Use the examples given in Units and
labs as a guide. Since any one of the loop statements can simulate the others (with more or less bother), it
pays to use the statement that makes the mechanism by which the loop terminates clearest. Here are some
more specific guidelines.
a)   For loops without uniformly maintained counters or the like (i.e., that can't take advantage of the for
     statement's header structure), use the while statement.
b)   If a variable (e.g., a counter variable) is to be uniformly updated with each iteration, then the for statement
     is preferred (as counter maintenance can be conveniently placed in the for loop header, which cleans up
     the loop body).
c)   In cases where the loop body must first be executed in order to evaluate the loop condition, use the do
     statement.


2. PRE-INCREMENT, PRE-DECREMENT AND ASSIGNMENT OPERATORS
   [ILLUS.: PRECEDENCE OF C OPERATORS]
The operations of adding 1 to a variable and subtracting 1 from a variable are so often required that many
machines have built-in instructions that perform these operations. Most higher-level languages do not have
such "increment" or "decrement" operators, but C does. It is also quite common for the RHS of an assignment
expression to combine the LHS of the assignment expression and some other expression with an arithmetic
operator. C's assignment operators provide a means of writing such assignment expressions in a more concise
and readable, and less error-prone, way.

2.1 THE PRE-INCREMENT OPERATOR
a)   The pre-increment operator ++ is a unary operator, written to the left of its operand, which may be of any
     scalar type.
b)   The simplest form of a pre-increment expression is: ++<variable>.
c)   The value of the pre-increment expression ++<variable> is the original value of <variable> plus 1.
d)   Evaluation of ++<variable> has the "side effect" of actually changing the value of <variable> to its original
     value plus 1 (an expression is said to have a side effect if the value of any variable is changed as a result
     of its evaluation; e.g., all assignment expressions have side effects).
e)   Thus, evaluating ++<variable> is similar to evaluating (<variable> = <variable> + 1); the former
     expression is not only more concise, but may result in faster code (in some environments with crummy,
     non-optimizing compilers).
f)   There is, of course, a post-increment operator. It is a more complex operator and will be covered later.
g)   Examples:

         k = -3;      ++k;            after both statements executed, k has value -2

         j = 10;      i = 2 * ++j;             after both statements executed, i has value 22

         /* print out first 16 powers of 2, using standard counter incr. method */
         for ( k = 0; k < 16; ++k )
            printf( "2^%i = %i\n", k, (int)pow( 2.0, (double)k ) );

2.2 THE PRE-DECREMENT OPERATOR
a)   As per the pre-increment operator, mutatis mutandis.
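
For completeness, the pre-increment examples can be mirrored for --. This is only a sketch, and the function names are invented for the illustration.

```c
/* Returns the value of the expression --k, i.e., the original k minus 1. */
int
value_of_pre_decrement( int k )
{
    return --k;
}

/* With j = 10 this computes 2 * 9, since j is decremented before use. */
int
twice_pre_decremented( int j )
{
    return 2 * --j;
}

/* Sums n + (n-1) + ... + 1 by counting down with --k. */
long
sum_down( int n )
{
    long sum = 0;
    int  k;

    for ( k = n; k > 0; --k )
        sum = sum + k;

    return sum;
}
```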

2.3 ASSIGNMENT OPERATORS
a)   Code like k = k <operator> <expression> appears often in programs (e.g., k = k+i, k = k-2, etc.).
b)   Assignment operators provide a shorthand for such code (N.B.: the parentheses below are significant—they
     prevent operator precedence problems):

         <lvalue> <operator>= <expression> is equivalent to <lvalue> = <lvalue> <operator> (<expression>)

c)   All arithmetic operators can be joined with = to form assignment operators (so can some other operators).
d)   Simple Examples:

         k += i;                               same as k = k+(i);   (here, parentheses are superfluous)
         k *= i+j;                             same as k = k*(i+j); (here, parentheses are required)
         k -= i/j;                             same as k = k-(i/j); (here, parentheses are superfluous)
         a_very_long_name /= 2;                same as a_very_long_name = a_very_long_name/(2);
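
The significance of the parentheses in (b) shows up as soon as the right-hand side contains an operator of lower precedence than the one joined with =. A small sketch, with invented function names:

```c
/* k *= i+j multiplies k by the whole sum: k = k*(i+j). */
int
times_sum( int k, int i, int j )
{
    k *= i+j;
    return k;
}

/* Expanding by hand and forgetting the parentheses gives (k*i)+j instead. */
int
without_parentheses( int k, int i, int j )
{
    k = k*i+j;
    return k;
}
```

With k = 2, i = 3 and j = 4, the first function yields 14 and the second yields 10.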

e)   Examples:

         #include <stdio.h>    /* for printf function */
         #include <math.h>     /* for pow function */
         ...
         /* print out first 16 powers of 2 */
         for ( k = 0; k < 16; k += 1 )
             printf( "2^%i = %i\n", k, (int)pow( 2.0, (double)k ) );

         ---------------------------------------------------------------------------

         /* "fast exponentiation" */
         long
         fast_pow( int b, unsigned e )
         {
            long r = 1;
            long a = b;

            while ( e > 0 )
            {
               if ( e % 2 )
                  r *= a;
               a *= a;
               e /= 2;
            }

            return r;
         }


UNIT 8: FILTERS AND FUNCTION PROTOTYPES
This time out we discuss how and why to construct programs as filters, and we introduce function prototypes.


1. FILTERS
1.1 WHAT ARE FILTERS?
a)   A filter is a program that reads [only] from standard input and writes [only] to standard output. Output could
     be a copy of the input, a modified form of the input, some information gleaned from the input (e.g., the line,
     word and character counts of a text file), or whatever.
b)   At the heart of most filters is a read-and-write loop, in which case some authors speak of iterative filters.
c)   The term filter comes from the UNIX world, where many utilities are written as filters.
d)   Normally, filters are non-interactive. Thus, standard input should not be associated with the keyboard.

1.2 WHY HAVE FILTERS?
a)   Commonly used operating systems (e.g., Windows, Mac OS X, Linux, etc.) support command line I/O re-
     direction, which allows the association of standard input and/or output with files or devices other than the
     keyboard and screen at execution time.
b)   Via command line I/O re-direction, a filter can read from any file and write to any file—without modification
     of the source code—by specifying command line arguments in a special way (see example 1.3.1 below).
     This greatly increases the flexibility and utility of filters.
c)   Such operating systems usually also support pipes, which allow sending ("piping") the output of one filter to
     the input of another filter. This allows filters to be mixed and matched conveniently using I/O re-direction
     and pipes, which adds greatly to their utility.

1.3 EXAMPLES
1.3.1 The cat.c Copy Filter of UNIT 7, section 1.1
a)   The command "cat" copies keyboard input to screen output.
b)   The command "cat <a >b" copies file a to file b, via command line I/O re-direction.

1.3.2 A Double-Spacing Filter
         #include <stdio.h>

         int
         main( void )
         {
             int c;

             while ( (c = getchar()) != EOF )
             {
                putchar( c );
                if ( c == '\n' )
                   putchar( c );
             }

             return 0;
         }


1.3.3 A Spanish-to-Macedonian Translation Filter
a)   We cannot present such a program—it's way too hard. Just pretend we wrote one in source file spam.c.
b)   The command "spam" reads Spanish input from the keyboard and writes the equivalent in Macedonian to
     the screen.
c)   The command "spam <sp_doc >mac_doc" writes the Macedonian equivalent of the Spanish file sp_doc
     to the file mac_doc.

1.3.4 Combining Filters with Pipes
a)   pgma | pgmb connects, or pipes, the standard output stream of pgma to the standard input stream of
     pgmb; the | is called the pipe symbol.
b)   The command "cat <sp_doc | spam | dblspace >dmac_doc" writes a double-spaced Macedonian
     translation of the Spanish file sp_doc. The cat step isn't really necessary; the following command is
     equivalent: "spam <sp_doc | dblspace >dmac_doc".
c)   Piping obviates the need for temporary files when many processing steps are required.
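
Section 1.1(a) mentions filters that glean information from their input, such as word counts. The core of such a filter might be sketched as follows. The function name count_words and its FILE * parameter are assumptions made for illustration; a real filter would call it from main with stdin and print the result with printf.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts words (maximal runs of non-whitespace characters) read from in. */
long
count_words( FILE *in )
{
    long words = 0;
    int  in_word = 0;                   /* 1 while inside a word */
    int  c;

    while ( (c = getc( in )) != EOF )
    {
        if ( isspace( c ) )
        {
            in_word = 0;
        }
        else if ( !in_word )
        {
            in_word = 1;                /* first character of a new word */
            ++words;
        }
    }

    return words;
}
```

Because the loop only ever reads from the stream it is given, a main built around count_words( stdin ) can be used with I/O re-direction or at the end of a pipeline without any change to the source.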


2. FUNCTION PROTOTYPES
2.1 MOTIVATION
a)   Just as data items must be declared before their first use, functions must be declared before their first call.
b)   There are two ways to declare a function:
             place the whole definition of the function before the first call to it.
             place a function declaration, or prototype, before the first call to it.
c)   When a prototype for a function is specified before the first call to the function, the actual function definition
     may appear anywhere after the prototype, even following calls to that function.
d)   Just as data declarations associate a type with a data variable name, function prototypes associate the
     return value type, and the number and types of parameters, with a function name.
e)   The information provided by a prototype is used by the compiler to validate function calls (is the number of
     arguments correct? are the types of the arguments compatible with the parameter types? is the return
     value type compatible with the use of the return value?)
f)   Many standard headers consist largely of prototypes for standard library functions.

2.2 HOW TO CODE FUNCTION PROTOTYPES
a)   A function prototype is syntactically the same as a function header, with the following exceptions:
              a prototype is always terminated by a semicolon
              the names of the parameters are optional
b)   The layout convention is to code the prototype on one line, if possible.
c)   In source files, prototypes should be placed just before the first function definition.
d)   Many programmers find it convenient to place the main function first for quick access (others code it last,
     for the same reason, like the main block of a Pascal program).
e)   Examples (user-defined functions):

         double future_value( double principal, double rate, int nperiods );
         double future_value( double, double, int );
         double area_of_circle( double radius );
         long powint( int base, unsigned exponent );
         long powint( int, unsigned );
         long fast_pow( int, unsigned );

f)   Examples (standard library functions; such prototypes would be specified in stdio.h; note that ellipsis
     indicates a variable number of arguments):

         int getchar( void );                     void parameter means function takes no arguments
         int putchar( int );
         int printf( const char *format, ... );   return value is number of chars written, or negative on error
         int scanf( const char *format, ... );

g)   Example (this is circ.c with a prototype allowing main and area_of_circle functions to be reversed):

         #include <stdio.h>                                  includes prototypes for printf and scanf

         double area_of_circle( double radius );  prototype for area_of_circle

         int
         main( void )
         {
             double radius;
             double area;

             printf( "Enter radius: " );
             scanf( "%lf", &radius );
             area = area_of_circle( radius );      call to area_of_circle before definition
             printf( "Area of circle with radius %g: %g\n", radius, area );

             return 0;
         }

         double
         area_of_circle( double radius )
         {
            return ( 3.141592653 * radius * radius );
         }
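
A second sketch along the same lines uses the future_value prototype listed in 2.2(e). The body given here (compound growth over nperiods periods) is an assumption, since the notes show only the prototype; value_after_two_periods is likewise a made-up caller.

```c
/* Prototype: lets the compiler check calls that precede the definition. */
double future_value( double principal, double rate, int nperiods );

/* Calls future_value before its definition appears in the file. */
double
value_after_two_periods( double principal, double rate )
{
    return future_value( principal, rate, 2 );
}

/* Assumed meaning: principal grows by a factor (1 + rate) each period. */
double
future_value( double principal, double rate, int nperiods )
{
    double value = principal;
    int    k;

    for ( k = 0; k < nperiods; ++k )
        value *= 1.0 + rate;

    return value;
}
```

Thanks to the prototype, a call such as future_value( 100, 0.5, 1 ) is also handled correctly: the int argument 100 is converted to double before the call.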


UNIT 9: INTRODUCTION TO ARRAYS
One-dimensional arrays are introduced in this Unit. We discuss why they are needed, how to declare them,
how storage for them is allocated in memory, how to set and retrieve array element values, how to read/write
array element values using scanf/printf. C's use of 0-relative indices is motivated (without introducing
pointers). We also cover using loops to process one element of an array per iteration, how to declare an array
as a function parameter, and how to pass an array as an argument in a function call. Finally, the size_t type is
introduced and related to the sizeof operator.


1. ONE-DIMENSIONAL ARRAYS [ILLUS.: C TYPES]
1.1 MOTIVATION
a)   Consider how to write a program that reads in n integers and writes them out in reverse entry order:
              all n values would have to be kept in memory
              using only scalar variables, n distinct variables must be declared to keep track of all n values
              the same program can't easily be used for different values of n [show pgms for n = 2 and 3.]
b)   To avoid this sort of mess, most languages (including C, of course) provide an array type, which is an
     aggregate type—an object of aggregate type may comprise many values.
c)   All the values stored in an array variable must have the same type.

1.2 ARRAY DECLARATIONS
a)   An array declaration specifies the array name, the type of its elements (the base type), and the number of
     elements the array can hold.
b)   General Format of an Array Declaration (without initialization):

              <base-type> <array-name>[<#elements>];

c)   Examples:
               int ra[3];                                 ra is an array of 3 ints
               double price[15];                          price is an array of 15 doubles
d)   Arrays may be partially or fully initialized when declared. Partial initialization always initializes an initial
     segment of the array (from the 0th element to the nth element, where n < <#elements>).
e)   If array is initialized, then <#elements> need not be specified—it will be deduced from the initializer.
f)   General Format of an Array Declaration (with initialization):

              <base-type> <array-name>[<#elements>] = { <value 0>, ..., <value n-1> };

g)   Examples:
              int ra[3] = { -23, 12, 0 };                            full initialization
              double price[15] = { 1.2, -.5 };                       partial initialization—first 2 elements initialized
              double xval[] = { 3.14, 2.718 };                       full initialization—xval only has 2 elements
h)   It's usually best to avoid the temptation to make array names plural.
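i)   Elements of a locally declared array that has no initializer have indeterminate values. If an initializer is
     given but covers only part of the array, the remaining elements get the zero value of the base type. For
     file-level arrays, the value of any element that is not explicitly initialized is the zero value of the base
     type (0, 0.0, etc.).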
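
The initialization rules can be checked with a short sketch. In Standard C, elements not covered by a partial initializer are set to zero, and the size of an array declared with empty brackets is deduced from its initializer. The function and array names below are invented for the illustration.

```c
static int file_level[3];               /* file-level, no initializer: all zero */

/* Partial initialization: ra[0] and ra[1] are given, the rest become 0. */
int
last_of_partial( void )
{
    int ra[5] = { 7, 8 };

    return ra[4];
}

/* Size deduced from the initializer: xval has exactly 2 elements. */
int
deduced_count( void )
{
    double xval[] = { 3.14, 2.718 };

    return (int)( sizeof( xval ) / sizeof( xval[0] ) );
}

int
first_file_level( void )
{
    return file_level[0];
}
```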

1.3 ACCESSING ARRAY ELEMENTS
a)   The elements of an array are allocated contiguous (no gaps) storage.
b)   Each individual element of an array is numbered consecutively, starting from 0. The integer associated with
     a given element is called its index or subscript.
c)   To access an individual array element, use the subscript operator, [], which takes two operands: the array
     name and the index of the desired array element. The order of the operands is immaterial, though it is
     customary to write the array name first.
d)   General Format of a Subscript Expression (<index-expression> is an expression of any integer type):

              <array-name>[<index-expression>]             this is the standard (and recommended) way to do it

e)   Alternate Format for a Subscript Expression:

              <index-expression>[<array-name>]              oddball way, but equivalent to the above

f)   Examples:
           elements of ra are ra[0], ra[1] and ra[2]
           elements of price are price[0], price[1],…, price[14]
           x = price[i];         sets x to ith element of price (i should be >= 0 and <=14)
           ra[i] = k;            sets ith element of ra to k (i should be >= 0 and <=2)

g)   The following figure illustrates the naturality of 0-relative array indices:


                                                      ra
           indices               0                     1                   2
                              ra[0]               ra[1]                ra[2]

              byte       +0*sizeof(int)    +1*sizeof(int)    +2*sizeof(int)
              addr's     (offsets from the address of ra[0])
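
As a quick check that the two subscript formats really do name the same element, a sketch such as this can be used (the function name is made up):

```c
/* Returns 1 if ra[i] and i[ra] agree for every valid index, 0 otherwise. */
int
both_forms_agree( void )
{
    int ra[3] = { -23, 12, 0 };
    int i;

    for ( i = 0; i < 3; ++i )
    {
        if ( ra[i] != i[ra] )           /* standard form vs. oddball form */
        {
            return 0;
        }
    }

    return 1;
}
```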


1.4 AN IMPORTANT NOTE OF CAUTION!!!
Unlike Pascal and some other languages, C does not perform range-checking of indices (i.e., index values are
not implicitly tested for validity with respect to the array size before applying the [] operator). Hence, an array access
like ra[567], where ra is declared as above, will neither be flagged at compile time (by most compilers), nor
caught by implicitly generated range-checking code at the moment the bogus array access is made during
execution. While such code may cause a run-time error, it may not be detected immediately and it may be
difficult to determine the cause of whatever error may ensue. Moral: if range-checking is desired, the
programmer must write such checks into the C source code him/herself.
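
A hand-written range check might look like the sketch below. The names index_ok and element_or are hypothetical, and the array parameter syntax used here is the one described in section 2.2.

```c
/* Returns 1 if i is a valid index for an array of neles elements. */
int
index_ok( int i, int neles )
{
    return 0 <= i && i < neles;
}

/* Returns a[i] if i is in range; otherwise returns fallback and
** never touches the array. */
double
element_or( double a[], int neles, int i, double fallback )
{
    return index_ok( i, neles ) ? a[i] : fallback;
}
```

What to do with an out-of-range index (return a fallback, print a message, stop the program) is a design decision left to the programmer; the fallback value is just one option.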


2. TYPICAL ARRAY PROCESSING PATTERNS
2.1 PROCESSING ARRAYS IN LOOPS
a)   It is common to process the elements of an array in a loop, with one element processed per loop iteration.
b)   The most convenient iteration statement for this is the for statement, since the loop header cleanly
     encapsulates the initialization and incrementation of an index variable.
c)   Example:

         #define MAXELES 15                easier to maintain code if a symbolic constant is used
         . . .
         double a[MAXELES], avg, sum = 0.0;
         int     i;                                for array indexing
         . . .
         for ( i = 0; i < MAXELES; ++i )  this loop reads in values for array elements
             scanf( "%lf", &a[i] );
         . . .
         for ( i = 0; i < MAXELES; ++i )  this loop sums up the elements of array a
             sum += a[i];
         avg = sum / MAXELES;                      avg = average of the elements of array a


2.2 ARRAYS AS FUNCTION PARAMETERS AND ARGUMENTS
2.2.1 Declaring an Array as a Function Parameter
a)   C allows arrays to be passed to functions as arguments, so that a function may operate on different arrays
     with each call.
b)   There are two ways to declare an array as a function parameter in a function header or prototype. We'll
     only cover one way, for now.
c)   General Format of an Array Parameter Declaration (<array-name> may be omitted in a prototype):

            <base-type> <array-name>[]                   array size can be specified, but is ignored (so don't)
d)   Examples:

         double average( double a[], int neles );
         . . .
         double
         average( double a[], int neles )
         {
            double sum = 0.0;
            int    i;

             for ( i = 0; i < neles; ++i )
                sum += a[i];
             return sum / neles;
         }

2.2.2 Passing an Array to a Function
a)   To pass an array to an array parameter in a function call, just specify the array name as argument.
b)   Examples:

         #define MAXELES 15
         . . .
         double a[MAXELES], avg;
         . . .
         avg = average( a, 10 );                                  avg = average of first 10 elements of array a
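
Another function in the same style, offered only as a sketch: largest is an invented name, and it assumes neles is at least 1.

```c
/* Returns the largest of the first neles elements of a (neles >= 1 assumed). */
double
largest( double a[], int neles )
{
    double max = a[0];
    int    i;

    for ( i = 1; i < neles; ++i )
    {
        if ( a[i] > max )
        {
            max = a[i];
        }
    }

    return max;
}
```

As with average, the second argument decides how much of the array is examined, so largest( a, 10 ) looks only at the first 10 elements of a.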


3. ON THE SIZE OF ARRAYS
How big an array can be and how many elements it may contain depend on the environment. Though the int
type is commonly used to declare index variables, Standard C does not require that the int range
be large enough for indexing all elements of any array in any environment.

3.1 ARRAYS AND THE size_t TYPE
a)   There is a type, size_t, which is useful for declaring index variables since variables of this type are
     guaranteed to have a large enough range to accommodate any possible array index.
b)   This type is not built-in; it's defined via C's typedef facility (to be covered later). The definition of size_t
     is in the standard header stddef.h, which must be #included in order to use this type.
c)   N.B.: size_t is always defined to be some unsigned integer type, so never use size_t as the type of a
     variable used for indexing where the index could go negative (e.g., the insertion sort example in LAB 8).

3.2 ARRAYS AND THE sizeof OPERATOR
a)   Recall that the unary sizeof operator yields the size of its operand, in bytes. The operand may be a type
     name or a unary expression (see Harbison and Steele for definition), typically a variable name.
b)   The result of a sizeof operation is always of type size_t.
c)   When applied to an array name, the sizeof operator yields the size of the whole array. The number of
     elements in an array can therefore be expressed as sizeof(<array-name>)/sizeof(<array-name>[0]).
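
The element-count expression in (c) is often wrapped in a macro. The sketch below uses a hypothetical macro name, NELES, together with a size_t index variable; it applies only to names declared as arrays in the current scope.

```c
#include <stddef.h>                     /* for size_t */

/* Hypothetical macro: number of elements in the array named a. */
#define NELES( a ) ( sizeof( a ) / sizeof( (a)[0] ) )

size_t
price_count( void )
{
    double price[15];

    return NELES( price );              /* size of whole array / size of one element */
}

double
sum_xval( void )
{
    double xval[] = { 3.14, 2.718, 1.0 };
    double sum = 0.0;
    size_t i;                           /* unsigned, so only count upward with it */

    for ( i = 0; i < NELES( xval ); ++i )
        sum += xval[i];

    return sum;
}
```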


UNIT 10: MORE ON ARRAYS
Coverage of one-dimensional arrays continues. We discuss details of passing arrays to function parameters,
introduce the relationship between arrays and pointers and dissect array declarations to determine the type and
value of an array name.


1. ARRAY ARGUMENTS AND THE CALL-BY-VALUE RULE
a)   Recall that the values of function call arguments are copied to the corresponding function parameters ("call-
     by-value"), and that an array argument is specified by writing just the name of the array to be passed.
b)   Consider the array declarations double x[5], y[1000]; and recall the average function of UNIT 9,
     section 2.2.1(d). What happens when the calls average( x, 5 ) and average( y, 1000 ) are made?
     The call-by-value rule suggests that these calls cause x and y, respectively, to be copied to average's a
     parameter. This implies that the amount of storage allocated to parameter a either fluctuates (how can
     average know what arguments will be passed to it at run-time?) or is so tremendous as to accommodate
     any array of doubles, however large (this would make deeply nested calls to functions with array
     parameters impractical). [Draw pictures showing the differently-sized regions of memory, for array
     arguments of different sizes, that would be required under this interpretation.]
c)   Because of the difficulties involved in actually copying a whole array to a function parameter, whole array
     arguments are not copied in function calls. Instead, a pointer to (i.e., the address of) the first element of an
     array argument is copied to the corresponding array parameter. Using this method, the storage reserved
     for any array parameter is of the same small fixed size regardless of the corresponding argument size. This
     technique of argument passing is more efficient for (potentially) large objects like arrays.
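
The fixed-size claim in (c) can be checked directly. The following is a minimal sketch, not part of the course code: param_size is a hypothetical helper that receives an array argument and reports how much storage its parameter occupies.

```c
#include <stddef.h>

/* Hypothetical helper: report the storage occupied by the parameter a itself.
   The parameter could equally be declared as double a[]; either way it holds
   only the address of the first element of the array argument. */
size_t
param_size( double *a )
{
   return sizeof a;      /* size of a pointer to double, whatever array was passed */
}
```

With double x[5], y[1000]; the calls param_size( x ) and param_size( y ) return the same value, even though sizeof y in the caller is 200 times sizeof x.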


2. THE TYPE AND VALUE OF AN ARRAY NAME
a)   Since, e.g., the call average( x, 5 ) results in passing the address of the first element of array x to the
     average function, the type and value of argument x are brought into question. Were the type of x "array of
     5 doubles", requiring that the value of x be the 5 double values x comprises, then the call-by-value rule
     would require that all 5 doubles be copied. We now know this isn't what is actually done.
b)   C manages to preserve the uniformity of the call-by-value rule by giving array names the appropriate
     pointer types and values wherever they are used as values, e.g., as function call arguments (the operand
     of sizeof, see UNIT 9, is an exception). In general, given the declaration:

                    <base-type> <array-name>[<#elements>];

     C considers the type of <array-name> to be <base-type> * (“pointer to <base-type>”), and the value of
     <array-name> to be &<array-name>[0] (i.e., the address of the first element).
c)   Examples:

          Declaration            Type                                 Value
          double z[100];         double * ("pointer to double")       &z[0]
          int    ia[5];          int *    ("pointer to int")          &ia[0]
          char   s[30];          char *   ("pointer to char")         &s[0]

d)   A more concrete example (assuming the declaration int ra[3]; and writing α for the value of ra, i.e.,
     the address of ra[0]):

          indices               0                   1                   2

                              ra[0]               ra[1]               ra[2]

             byte        α+0*sizeof(int)     α+1*sizeof(int)     α+2*sizeof(int)
             addr's


3. DECLARING ARRAYS AS FUNCTION PARAMETERS—REVISITED
a)   The fact that C array names have pointer types is reflected in the alternative (and preferred) way of
     declaring one-dimensional array parameters in function headers and prototypes.
b)   Preferred General Format of an Array Parameter Declaration (<array-name> may be omitted in a prototype):

              <base-type> *<array-name>

c)   Examples (compare with examples in UNIT 9, [show same-size parameter regions for different arrays]):

          double average( double *a, int neles );
          . . .
          double
          average( double *a, int neles )
          {
             double sum = 0.0;
             int    i;

              for ( i = 0; i < neles; ++i )
                 sum += a[i];
              return sum / neles;
          }

          ---------------------------------------------------------------------------

          double dotprod( double *, double *, int );      /* array names omitted in prototype */
          . . .
          /* Compute the inner (dot) product of two real vectors of dimension dim */
          double
          dotprod( double *u, double *v, int dim )
          {
             double result = 0.0;
             int    i;

              for ( i = 0; i < dim; ++i )
                 result += u[i] * v[i];

              return result;
          }
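
Because an array parameter holds the address of the caller's first element, subscripting the parameter reaches the caller's own elements rather than copies. A minimal sketch, assuming a hypothetical function scale that is not part of the course code:

```c
/* Hypothetical example: multiply each of the neles elements of a by factor. */
void
scale( double *a, int neles, double factor )
{
   int i;

   for ( i = 0; i < neles; ++i )
      a[i] *= factor;      /* a[i] is the caller's element, not a copy */
   return;
}
```

After double v[3] = { 1.0, 2.0, 3.0 }; and the call scale( v, 3, 2.0 ); the caller's v holds 2.0, 4.0 and 6.0.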

UNIT 11: ARRAYS AND POINTERS
Here we provide more detail on pointer variables: how to declare and initialize them, and how to use them as
function arguments and parameters. "Scaled" pointer arithmetic is introduced, and the relationship between
pointers and arrays is examined in further detail. Lastly, we show how to "circumvent" C’s call-by-value
parameter passing mechanism by passing a (copy of a) pointer to an object to a function rather than (a copy of)
the object itself.


1. POINTERS
a)   Recall that the value of a pointer represents a machine memory address.
b)   In most environments, all data pointers are of equal size. On many older systems this is sizeof( int ),
     but on typical 64-bit systems pointers occupy 8 bytes while ints occupy 4. (There are also function
     pointers, typically all of equal size, though this may be different from the data pointer size.)
c)   Pointers are useful since they are small, fixed size objects which can be efficiently used to access other,
     possibly much larger, objects.
d)   Format of a pointer declaration (not most general; note similarity to array parameter declaration format):

          <base-type> *<variable-name>;                  variable not given an initial value
          <base-type> *<variable-name> = <value>;        initializing declaration (value must have type <base-type> *)

e)   Examples:

          Declaration                     Type                 Value
          double z[100];                  double *             &z[0]
          double *dp = z;                 double *             &z[0]
          int    *p, *q;                  int *, int *         ?, ? (both uninitialized)
          int    *ip = (int *)100;        int *                100
          size_t *lp;                     size_t *             ? (uninitialized)

          Note 1: if the * were missing before q above, then q would be of int type, not int * type.
          Note 2: the last example shows that pointers to user-defined types may be declared.

f)   It is important to remember that two pointers are of different types if they are declared to have different base
     types. In general, pointers of one type may be explicitly cast to another pointer type.
g)   In diagrams, we use arrows to represent pointer values, since the actual addresses don't usually matter:

                                        int k = 26;
                                        int *p = &k;     p points to k (the value of p is α, the address of k,
                                                         but who cares?)

                                    k                                      p

                                   26                . . .          α (arrow to k)

                 byte              α
                 addr's

2. THE DEREFERENCE (OR INDIRECTION) OPERATOR
a)   The unary dereference (or indirection) operator * provides the means of extracting the value to which a
     pointer points from the value of the pointer itself. Obviously, the operand of the dereference operator must
     be of some pointer type.
b)   If <variable-name> is a variable of type <base-type> *, then the value of the expression *<variable-name> is
     the value of the object of type <base-type> to which <variable-name> points.
c)   Note that the form of a pointer declaration suggests how to interpret expressions like *<variable-name>. For
     example, the declaration int *p; is usually read "p is a pointer to an int", but it can also be interpreted
     as saying that "*p is an int".
d)   The * and & operators are inverse to one another. For example, w.r.t. the preceding figure, *&k has the
     same value as k, and &*p has the same value as p.
e)   Examples (refer to preceding figure):

          *p = 21;                     assigns 21 to k (not to p)
          *p == 21                     this expression is true
          *p = *p + 9;                 assigns 30 to k
          *&k == k                     this expression is true (regardless of value of k)
          &*p == p                     this expression is true (regardless of value of p)

f)   While an array name may be assigned to a pointer variable of appropriate type, an array name may not
     itself be assigned to:

          int   ra[3] = { 12, -40, 1 }, *p, *q;

          p = ra;            this is OK (type of ra is int *)
          q = &ra[1];        this is OK (type of &ra[1] is int *)
          ra = p;            ILLEGAL! NEVER ASSIGN TO AN ARRAY NAME!!
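
The assignments in the examples of (e) can be traced in a small function. This is only a sketch; deref_demo is a hypothetical name, and k and p are set up as in the figure of section 1.

```c
/* Hypothetical trace of the examples in (e); returns the final value of k. */
int
deref_demo( void )
{
   int k = 26;
   int *p = &k;      /* p points to k */

   *p = 21;          /* assigns 21 to k (not to p) */
   *p = *p + 9;      /* assigns 30 to k */
   return k;
}
```

The function returns 30, although k itself never appears on the left of an assignment after its declaration.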


3. POINTER ARITHMETIC
3.1 GENERALITIES
a)   Pointers can be used as operands for the binary + and - operators, provided the other operand is an
     expression of some integer type, i.e., expressions of form <pointer-expression> ± <integer-expression> are
     legitimate, and the value of such an expression has the same pointer type as <pointer-expression>.
b)   The <pointer-expression> is typically a pointer variable, but since <pointer-variable> ± <integer-expression> is a
     pointer-valued expression of the same type as <pointer-variable>, which may in turn be combined with
     an integer value, the specification <pointer-expression> ± <integer-expression> is more general.
c)   Any pointer-valued expression (not necessarily a pointer variable) can be dereferenced with the * operator.
d)   Somewhat confusing is the fact that C "scales" pointer arithmetic. This means that before the
     <integer-expression> is added to (or subtracted from) the address represented by <pointer-expression>, the
     <integer-expression> is multiplied by the size of the base type of the <pointer-expression>. This is done to
     facilitate array processing via pointers, as we shall see.
e)   Examples (refer to 2(f) for related declarations; α denotes the address of ra[0]):


                             ra[0]               ra[1]              ra[2]
                               12                 -40                  1

             byte        α+0*sizeof(int)    α+1*sizeof(int)    α+2*sizeof(int)
             addr's

                               p                  p+1                p+2

              Expression              Type           Value (assuming sizeof(int) is 2; α is the address of ra[0])
              p                       int *          &ra[0] (= α)
              ra                      int *          &ra[0] (= α)
              p+1                     int *          &ra[1] (= α+2)
              ra+1                    int *          &ra[1] (= α+2)
              p+2                     int *          &ra[2] (= α+4)
              ra+2                    int *          &ra[2] (= α+4)
              p-1                     int *          &ra[-1] (= α-2)
              p+10                    int *          &ra[10] (= α+20)
              *p                      int            ra[0] (= 12)
              *ra                     int            ra[0] (= 12)
              *(p+1)                  int            ra[1] (= -40)
              *(ra+1)                 int            ra[1] (= -40)
              *(p+2)                  int            ra[2] (= 1)
              *(ra+2)                 int            ra[2] (= 1)
              *(p-1)                  int            ? (not an element of ra)
              *(p+10)                 int            ? (not an element of ra)
              *(ra+10)                int            ? (not an element of ra)
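
The scaling described in 3.1(d) can be observed by converting pointers to char * (so that differences are counted in bytes) before subtracting them. A minimal sketch; byte_step is a hypothetical helper, and it assumes n is not negative and that p + n stays within, or one past the end of, the array p points into.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes by which the address in p advances
   when the integer n is added to it. */
size_t
byte_step( int *p, int n )
{
   return (size_t)( (char *)( p + n ) - (char *)p );
}
```

With int ra[3]; the call byte_step( ra, 1 ) yields sizeof( int ) and byte_step( ra, 2 ) yields 2 * sizeof( int ), whatever the size of an int happens to be on the machine at hand.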

3.2 ACCESSING ARRAY ELEMENTS VIA A POINTER
a)   The preceding examples illustrate an important general equivalence:

          <pointer-expression>[<index-expression>]  ≡  *(<pointer-expression> + <index-expression>)

b)   Example (compare with UNIT 10, 3(c)):

          double average( double *a, int neles );
          . . .
          double
          average( double *a, int neles )
          {
             double sum = 0.0;
             int    i;

              for ( i = 0; i < neles; ++i )
                 sum += *(a+i);              /* this is the only different line */
              return sum / neles;
          }
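
The equivalence in (a) can also be tested mechanically. The following sketch uses a hypothetical function, subscript_matches_pointer, which compares the two notations element by element.

```c
/* Hypothetical check: returns 1 if a[i] and *(a + i) agree, in both address
   and value, for every index i below neles; returns 0 otherwise. */
int
subscript_matches_pointer( int *a, int neles )
{
   int i;

   for ( i = 0; i < neles; ++i )
      if ( &a[i] != a + i || a[i] != *(a + i) )
         return 0;
   return 1;
}
```

For any array of ints and any valid element count, the function is expected to return 1, since the compiler treats the two forms identically.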

4. HOW TO ZAP FUNCTION ARGUMENTS
a)   The call-by-value protocol prevents functions from affecting the values of the arguments passed to them.
b)   If a function parameter receives a copy of a pointer to an object, however, the function can use that pointer
     value to change the value of the object to which that pointer refers.
c)   Examples:


       m      n             a      b                          m      n             p      q

       5     100    . . .   5     100                         5     100    . . .  &m     &n

     a and b hold copies of m and n                         p and q hold the addresses of m and n


     int m, n;                                              int m, n;
     . . .                                                  . . .
     swap( m, n );     /* no effect on m or n */            swap2( &m, &n );     /* really swaps m and n */
     . . .                                                  . . .
     void                                                   void
     swap( int a, int b )                                   swap2( int *p, int *q )
     {                                                      {
        int temp = a;                                          int temp = *p;

        a = b;                                                 *p = *q;
        b = temp;                                              *q = temp;

        return;                                                return;
     }                                                      }
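
The same technique lets a function deliver more than one result. A minimal sketch, assuming a hypothetical function divide that is not part of the course code; the caller passes the addresses of two ints, and the function stores its results through them. It assumes den is not zero.

```c
/* Hypothetical example: store num / den in *quot and num % den in *rem.
   Assumes den != 0. */
void
divide( int num, int den, int *quot, int *rem )
{
   *quot = num / den;      /* changes the caller's variable, not a copy */
   *rem  = num % den;
   return;
}
```

After int q, r; and the call divide( 17, 5, &q, &r ); the caller's q is 3 and r is 2.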

UNIT 12: MULTI-DIMENSIONAL ARRAYS
We conclude our coverage of arrays with an introduction to multi-dimensional (mostly two-dimensional) arrays.


1. MULTI-DIMENSIONAL ARRAY GENERALITIES
a)   Just as one-dimensional (or linear) arrays use a single index value for accessing elements, multi-
     dimensional arrays use two (or more) index values for accessing elements.
b)   The shape of a particular multi-dimensional array is the number of indexes required to access any of its
     elements, together with the size of each dimension.
c)   Two-dimensional (or double, or rectangular) arrays require two index values for accessing elements,
     three-dimensional arrays require three index values, and, in general, n-dimensional arrays require n index
     values.
d)   Abstractly, an n-dimensional array is an object that associates a value of the base type with an ordered
     sequence of n integers i_0, i_1, ..., i_{n-1}, where 0 ≤ i_k < size of kth dimension for each k = 0, 1, ..., n-1.
e)   Multi-dimensional arrays are useful in applications because one often needs to associate data values with
     combinations of indices, e.g., in graphics applications, color values are associated with two-dimensional
     screen coordinates; in a heating/ventilation application, temperatures may be associated with three-
     dimensional room coordinates.


2. BASICS OF MULTI-DIMENSIONAL ARRAYS IN C
2.1 DECLARING MULTI-DIMENSIONAL ARRAYS
a)   Unlike some other languages, e.g., FORTRAN and Pascal, C does not actually provide a multi-dimensional array
     type, in the sense that an expression such as <array-name>[i_0, i_1, ..., i_{n-1}] can be used to access an
     n-dimensional array element.
b)   Instead, C allows the elements of an array to be arrays themselves. As always, all the elements of an array
     must have the same type, so all the elements of an array of arrays must be arrays of the same base type
     and shape.
c)   General Format of a Multi-Dimensional Array Variable Declaration:

              <base-type> <array-name>[<size-of-dim_0>][<size-of-dim_1>]...[<size-of-dim_{n-1}>];

d)   Examples:

             int screen[480][640];                  screen is an array of 480 arrays of 640 ints
             double temperature[10][12][8];         temperature is an array of 10 arrays of
                                                    12 arrays of 8 doubles
e)   Format of a Two-Dimensional Array Decl. with Initialization (see Harbison & Steele for general format):

              <base-type> <array-name>[n][m] = { <value_0>, ..., <value_{n-1}> };

              where each <value_k> has form { <value_{k,0}>, ..., <value_{k,m-1}> } and each <value_{k,j}> is a constant
              expression of type <base-type>.

f)   Example:

             int ia[3][2] = { {-23, 12}, {0, 1}, {5,6} };
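
The sizeof rule from UNIT 9 extends naturally to arrays of arrays: dividing the size of the whole array by the size of one row gives the number of rows, and dividing the size of a row by the size of one element gives the number of columns. A sketch using the array from (f); ia_rows and ia_cols are hypothetical helper names.

```c
#include <stddef.h>

/* The example array from (f). */
int ia[3][2] = { {-23, 12}, {0, 1}, {5, 6} };

/* Number of rows: size of the whole array over the size of one row. */
size_t
ia_rows( void )
{
   return sizeof ia / sizeof ia[0];
}

/* Number of columns: size of one row over the size of one int. */
size_t
ia_cols( void )
{
   return sizeof ia[0] / sizeof ia[0][0];
}
```

Here ia_rows() yields 3 and ia_cols() yields 2, matching the declared dimensions.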

2.2 ACCESSING MULTI-DIMENSIONAL ARRAY ELEMENTS
a)   The elements of a multi-dimensional array are allocated contiguous (no gaps) storage.
b)   Along each dimension, positions are numbered consecutively, starting from 0. The ordered combination of
     integers associated with a given element is called its index or subscript (the individual integers are also
     referred to as indices or subscripts).
c)   The subscript operator, [], is used for accessing multi-dimensional arrays. However, for an n-dimensional
     array, the subscript operator must be applied n times to access any particular <base-type> item.
d)   General Format of a Subscript Expression (<index-expression> is an expression of any integer type):

              <array-name>[<index-expression_0>]...[<index-expression_{n-1}>]

e)   Examples:

          •  screen[239][319] is the middle pixel (picture element) of a (640×480) graphics screen
          •  screen[0] is an array of 640 ints, the first row of pixels of a (640×480) graphics screen
          •  temperature[9][11][7] is the temperature at one corner of a (10×12×8) room
          •  temperature[9] is a 12×8 array of doubles, representing the temperatures of the "last" 12×8
             cross-section of a (10×12×8) room
          •  elements of ia are ia[0][0], ia[0][1], ia[1][0], ia[1][1], ia[2][0] and ia[2][1]
             with values -23, 12, 0, 1, 5 and 6, respectively
          •  screen[10][23] == BLUE (assume BLUE is an appropriately #defined constant)
          •  temperature[2][5][1] = 80;

f)   The following shows the layout of a two-dimensional array in memory (assumes sizeof(int) is 2; refer to
     example 2.1.f; α denotes the address of ia[0][0]):

                                                    ia

                        ia[0][0]  ia[0][1]  ia[1][0]  ia[1][1]  ia[2][0]  ia[2][1]

              byte         α        α+2       α+4       α+6       α+8       α+10     (end of ia: α+12)
              addr's
                        |----- ia[0] -----| |----- ia[1] -----| |----- ia[2] -----|
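
The layout above implies that element [i][j] of an array with NCOLS columns begins (i * NCOLS + j) * sizeof(int) bytes past the start of the array. The sketch below measures that distance; elem_offset and NCOLS are hypothetical names, and the parameter form int a[][NCOLS] is the one used for two-dimensional array parameters in section 3.

```c
#include <stddef.h>

#define NCOLS 2      /* number of columns, as in the ia example */

/* Hypothetical helper: byte distance from the start of the array a to the
   element a[i][j]. Assumes i and j are valid indices for the array passed. */
size_t
elem_offset( int a[][NCOLS], size_t i, size_t j )
{
   return (size_t)( (char *)&a[i][j] - (char *)a );
}
```

For the array ia of example 2.1.f, elem_offset( ia, 1, 0 ) is 2 * sizeof( int ) and elem_offset( ia, 2, 1 ) is 5 * sizeof( int ), i.e., the rows follow one another with no gaps.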


2.3 THE TYPE AND VALUE OF A MULTI-DIMENSIONAL ARRAY NAME
a)   The type of a multi-dimensional array name is also a pointer type, but since each element is itself an array,
     the particular pointer type is of form "pointer to array of...<base-type>".
b)   In general, given the declaration:

              <base-type> <array-name>[s_0][s_1]...[s_{n-1}];

     C considers the type of <array-name> to be <base-type> (*)[s_1]...[s_{n-1}] (“pointer to array of s_1 arrays of s_2
     arrays...of s_{n-1} elements of <base-type>”), and the value of <array-name> to be &<array-name>[0] (i.e., the
     address of the first element).
c)   Examples:

          Declaration                                    Type
          int screen[480][640];                          int (*)[640] ("pointer to array of 640 ints")
          int screen2[200][400];                         int (*)[400] ("pointer to array of 400 ints")
          int screen3[200][640];                         int (*)[640] ("pointer to array of 640 ints")
          double temperature[10][12][8];                 double (*)[12][8] ("pointer to array of 12 arrays
                                                         of 8 doubles")
d)   Note that the sizes of all but the first dimension of a multi-dimensional array are part of the type of the array
     name. Thus, in the above examples, screen and screen2 have different types, while screen and
     screen3 have the same type.
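
Since the row size is part of the pointer type, pointer arithmetic on such a pointer is scaled by the size of a whole row. A minimal sketch; row_step is a hypothetical helper whose parameter has the type shared by screen and screen3 in the examples above.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes skipped when a pointer to an array
   of 640 ints is advanced by one, i.e., the size of one row. */
size_t
row_step( int (*rows)[640] )
{
   return (size_t)( (char *)( rows + 1 ) - (char *)rows );
}
```

Both screen and screen3 may be passed to row_step, and in each case the result is 640 * sizeof( int ). Passing screen2 would draw a diagnostic from the compiler, since its type is int (*)[400].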

3. USING MULTI-DIMENSIONAL ARRAYS WITH FUNCTIONS—EXAMPLES




          #include <stdio.h>
          #include <stddef.h>         /* for size_t type definition */

          #define MAXROWS 3
          #define MAXCOLS 2

          void display2d( int a[][MAXCOLS], size_t nrows, size_t ncols );
          void vflip( int a[][MAXCOLS], size_t nrows, size_t ncols );
            .
            .
            .
          void
          display2d( int a[][MAXCOLS], size_t nrows, size_t ncols )
          {
             size_t i, j; /* loop counters */

              for ( i = 0; i < nrows; ++i )
              {
                 for ( j = 0; j < ncols; ++j )
                    printf( "%i ", a[i][j] );      /* the space keeps adjacent values apart */
                 printf( "\n" );
              }
              return;
          }


          /* Note that the size of the first dimension is not needed (not part of type) */
          void
          vflip( int a[][MAXCOLS], size_t nrows, size_t ncols )
          {
             size_t i, j;                  /* loop counters */
             size_t midrow = nrows / 2;    /* "middle" row */
             int    t;                     /* for element swapping */

              for ( i = 0; i < midrow; ++i )
                 for ( j = 0; j < ncols; ++j )
                 {
                    t = a[i][j];
                    a[i][j] = a[nrows-(i+1)][j];
                    a[nrows-(i+1)][j] = t;
                 }
              return;
          }
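
The same parameter pattern serves any function that visits every element. As a further sketch, a hypothetical function sum2d (not part of the course code) adds up all elements of a two-dimensional array whose rows have MAXCOLS columns.

```c
#include <stddef.h>

#define MAXCOLS 2

/* Hypothetical example: sum of the elements in the first nrows rows and
   first ncols columns of a. Assumes ncols does not exceed MAXCOLS. */
long
sum2d( int a[][MAXCOLS], size_t nrows, size_t ncols )
{
   size_t i, j;         /* loop counters */
   long   total = 0;

   for ( i = 0; i < nrows; ++i )
      for ( j = 0; j < ncols; ++j )
         total += a[i][j];
   return total;
}
```

For the array ia of example 2.1.f, sum2d( ia, 3, 2 ) is 1, and sum2d( ia, 1, 2 ), which covers only the first row, is -11.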

          /* another way to display a 2D array */

          #include <stdio.h>
          #include <stddef.h>

          #define MAXROWS   3
          #define MAXCOLS   2
            .
            .
            .
          void
          display_array( int *a, size_t neles )      /* int a[] is OK instead of int *a */
          {
             size_t   j;

              for ( j = 0; j < neles; ++j )
                 printf( "%i ", a[j] );              /* *(a+j) is OK instead of a[j] */

              return;
          }


          void
          display_2D( int (*s)[MAXCOLS] )     /* int s[][MAXCOLS] OK instead of (*s)[MAXCOLS] */
          {
             size_t   i;

              for ( i = 0; i < MAXROWS; ++i )
              {
                 display_array( s[i], MAXCOLS );     /* *(s+i) is OK instead of s[i] */
                 printf( "\n" );
              }
              return;
          }

UNIT 13: INTRO. TO STRUCTURES AND DATA ABSTRACTION
Here, structures are introduced as the principal data abstraction facility available in C. Structure type definitions
and structure object declarations are covered, and the typedef facility is introduced.


1. STRUCTURE TYPES
1.1 MOTIVATION [ILLUS.: "C TYPES"]
a)   Structure types form one of the two kinds of aggregate types in C (array types being the other kind).
b)   Recall the following about arrays:
          •  all elements must have the same type
          •  elements are accessed by combining an array type expression and an integer expression with the
             [] operator (different array elements do not have individual names, just distinct index numbers)
          •  an array declaration declares a variable (the array declared)
c)   A single array is therefore not very helpful for representing a group of related data items that do not all
     share the same type. For example, an individual's personnel record would contain an employee ID (an
     int, say), a name (an array of chars) and a pay rate (a double), among many other items—an array isn't
     suitable for holding data of such disparate types.
d)   Enter structures. A given structure definition comprises declarations of one or more (usually more, of
     course) named data items, called members. The members of a structure may have different types.
e)   A structure definition defines a type, not a variable. Member declarations look like variable declarations, but
     no storage is allocated for members by a structure definition. Following the definition of a structure type,
     however, variables may be declared to have that structure type.
f)   A variable of some structure type is allocated sufficient storage to hold all the members declared in the
     definition of that type. We may then speak of the members of that structure variable.
g)   A member of a structure variable is accessed by combining the variable name and a member name with the
     . (direct selection) operator.

1.2 DEFINING A STRUCTURE TYPE & DECLARING STRUCTURE VARIABLES
a)   Format of a Structure Type Definition:

              struct <tag>            <tag> is an identifier that names a particular structure type (can be the
              {                        same as a variable/function name, but keep distinct for C++ compatibility)
                 <declaration>
                 <declaration>        N.B.: initializing declarations are not allowed inside a structure definition
                 ...                        (this is sensible since we're defining a type, not declaring a variable)
                 <declaration>
              };

b)   Examples:

          /* N.B.: this is a cross-section of the "parallel" arrays holding billing data in LAB 7 */
          struct item
          {
             int     id;          /* id number */
             int     quantity;    /* number of units */
             double  price;       /* price per unit */
          };

          struct frac
          {
             int nu; /* numerator */
             int de; /* denominator */
          };

c)   Format of a Structure Variable Declaration:

              struct <tag> <variable-name>;              assumes struct <tag> type is already defined

d)   Examples:

                struct item it, it_ra[10];                       N.B.: arrays of structs may be declared
                struct frac q, r, frac_ra[50];

1.3 ACCESSING STRUCTURE MEMBERS [ILLUS.: PRECEDENCE OF C OPERATORS]
a)   Members of a structure variable are accessed using the . (direct selection) operator, which takes two
     operands: the left operand must be an expression of some structure type (e.g., a structure variable) and
     the right operand must be the name of a member of that structure type. Note that the order of the operands
     of the direct selection operator is important.
b)   General Format of a Direct Selection Expression:

                 <struct-expression>.<member-name>

c)   Examples:

                it.id = 123;                  set id member of it to 123
                it.price /= 2.0;              item it now at half-price
                it.price * it.quantity        total cost ("extended price") of item it
                it_ra[9].price                price of the item at index 9 (precedence of [] is higher than .)
                q.nu = 22;                    set numerator of q to 22
                q.de = 7;                     set denominator of q to 7
                r.id                          ERROR! id is not a member of the frac structure type
                frac_ra[5].nu = -11;          set numerator of the frac at index 5 in frac_ra to -11
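
Direct selection combines with subscripting in the obvious way when working through an array of structures. The sketch below is illustrative only: total_cost is a hypothetical function, and the struct item definition is repeated so that the fragment stands alone.

```c
struct item
{
   int     id;          /* id number */
   int     quantity;    /* number of units */
   double  price;       /* price per unit */
};

/* Hypothetical example: sum of the extended prices of the first n items. */
double
total_cost( struct item *items, int n )
{
   double total = 0.0;
   int    i;

   for ( i = 0; i < n; ++i )
      total += items[i].price * items[i].quantity;
   return total;
}
```

If it_ra[0] holds 3 units at 2.5 each and it_ra[1] holds 2 units at 4.0 each, then total_cost( it_ra, 2 ) is 15.5.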


1.4 HOW STRUCTURES ARE STORED IN MEMORY
a)   The members of a structure object occupy storage in the order they appear in the structure definition.
     However, members may not be strictly contiguous in all environments (i.e., the first byte of a member may
     not immediately follow the last byte of the preceding member).
b)   The following illustrates a typical way the item structure may be stored in memory (we assume 2-byte ints
     and 8-byte doubles, and that there are no gaps between members; α denotes the address of it):


                                                           it


                               it.id      it.quantity      it.price

               byte              α           α+2             α+4          (end of it: α+12)
               addr's


c)   Here's how the array of items declared above may be laid out in memory, with α denoting the address of
     it_ra[0] [show how to access the members of various elements of it_ra]:

                                                         it_ra


                             it_ra[0]          . . . . .          it_ra[9]

              byte          α to α+12                          α+108 to α+120
              addr's
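
Actual member positions vary between environments, but they can be inspected with the standard offsetof macro from <stddef.h>. The following sketch repeats the struct item definition so that it stands alone; item_padding is a hypothetical helper that reports how many bytes of a struct item are not occupied by its members.

```c
#include <stddef.h>      /* offsetof, size_t */

struct item
{
   int     id;          /* id number */
   int     quantity;    /* number of units */
   double  price;       /* price per unit */
};

/* Hypothetical helper: bytes in a struct item beyond those its members need,
   i.e., the total size of any gaps the compiler inserted. */
size_t
item_padding( void )
{
   return sizeof( struct item ) - ( 2 * sizeof( int ) + sizeof( double ) );
}
```

On any system, offsetof( struct item, id ) is 0 and the offsets of quantity and price increase in declaration order. Under the assumptions of (b), item_padding() would be 0 and element it_ra[k] would begin 12 * k bytes past the start of it_ra.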


2. DATA ABSTRACTION
a)   The importance of structures is far greater than the mere convenience afforded by them may suggest, since
     structures facilitate data abstraction in C.
b)   By data abstraction, we mean techniques of keeping details about the internal structure of data objects
     hidden, except on a "need-to-know" basis (information hiding).
c)   Hiding details is accomplished by limiting access to the internal structure of data objects of a given type to a
     corresponding set of type management functions, in which all operations on objects of that type are
     performed—all other code (user, or client, code) must call these functions in order to manipulate data
     objects of that type. User code knows what operations the management routines perform, but not how they
     are implemented.
d)   Data abstraction thus makes it possible to change data representations without changing user code.
e)   The idea of data abstraction requires some getting used to, and will be best absorbed by paying careful
     attention to illustrative examples in class and in lab as we progress through the semester. It will help to
     bear in mind that conceptual objects are typically defined by their attributes. Remember the following
     correspondences between our mental constructs and C structures:
          •  type of conceptual object  ↔  struct type
          •  instance of a type of conceptual object  ↔  variable or other data object of struct type
          •  attributes of a particular instance  ↔  members of a particular data object of struct type
f)    [Ask students for conceptual objects and turn them into C structs, e.g., poker hand, personnel record, etc.]
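
To make the idea of type management functions concrete, here is a minimal sketch for the frac type of section 1.2. The function names frac_make, frac_mul and frac_equal are assumptions made for illustration, not the course's actual routines, and the sketch passes and returns structures by value, which is covered in UNIT 14. Only these functions touch the nu and de members.

```c
struct frac
{
   int nu;      /* numerator */
   int de;      /* denominator */
};

/* Hypothetical management function: build a frac from two ints. */
struct frac
frac_make( int nu, int de )
{
   struct frac f;

   f.nu = nu;
   f.de = de;
   return f;
}

/* Hypothetical management function: product of two fracs (not reduced). */
struct frac
frac_mul( struct frac a, struct frac b )
{
   return frac_make( a.nu * b.nu, a.de * b.de );
}

/* Hypothetical management function: 1 if a and b denote the same rational
   number, 0 otherwise. Assumes nonzero denominators and no overflow. */
int
frac_equal( struct frac a, struct frac b )
{
   return a.nu * b.de == b.nu * a.de;
}
```

Client code can then write frac_equal( frac_mul( frac_make( 1, 2 ), frac_make( 2, 3 ) ), frac_make( 1, 3 ) ), which is true, without ever mentioning nu or de. If the representation of a frac were changed, only the three functions would need rewriting.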


3. THE typedef FACILITY
a)   Consider a declaration of a billing item such as struct item it; it is hard to declare items
     abstractly this way, since the declaration itself gives away the fact that this type is implemented using a
     structure. It would be better (more abstract) if we could just declare such an item as follows: item it;
b)   C's typedef facility provides a way to associate a single name with any C type, no matter how complex.
     The "typedef'd name" may then be used as a synonym for the associated type.
c)   Identifiers defined using the typedef facility are said to be user-defined types. Note that one cannot
     actually create new types, but only provide punchy, one-word names for types already representable in the
     C declaration syntax.
d)   A variable declaration with a user-defined type gives no clue regarding how that type is actually defined.
     The typedef facility thus helps programmers hide information from user code.
e)   We call a user-defined type, together with the set of related type management functions, an abstract data
     type (ADT for short).
f)   General Format of a Type Definition via typedef:

              if <declaration> is a variable declaration (say the variable's name is v), then
                                       typedef <declaration>
              defines v as a synonym for the type of v in <declaration>. (N.B.: only typedef <declaration> is
              coded; if <declaration> and typedef <declaration> were both coded, this would cause a compile-
              time error, since typedef names and variable names are in the same "name space".)

g)   Examples:

             The declaration unsigned size_t; makes size_t a variable of unsigned type, so
              the definition typedef unsigned size_t; makes size_t a synonym for unsigned.
              One can now write size_t x, y; instead of unsigned x, y;
              (This is exactly how many compilers define the size_t type; recall that size_t is a user-defined
              type.)
             The declaration int array[100]; makes array an array of 100 ints, so the definition
              typedef int array[100]; makes array a synonym for the type "array of 100 ints"
             The declaration struct item item; makes item a variable of struct item type, so
              typedef struct item item; makes item a synonym for struct item.
              One can now write item it; instead of struct item it;
             The declaration struct frac frac; makes frac a variable of struct frac type, so the
              definition typedef struct frac frac; makes frac a synonym for struct frac. Thus,
              frac q, r; declares both q and r to be fracs.




UNIT 14: MORE ON STRUCTURES AND DATA ABSTRACTION
We continue our examination of structures and data abstraction. Additional examples of structure type
definitions using the typedef facility are presented, as are structures containing members of aggregate types
and pointers to structures. Initialization of structure variables within declarations and passing structures to
function parameters are also covered. Lastly, we demonstrate how data abstraction allows client code to be
independent of any particular implementation of an abstract data type.


1. CONCISE TYPE DEFINITIONS FOR STRUCTURE TYPES
a)   As a general rule, programmers should aim for the highest practical degree of abstraction in their code. Thus,
     structure types should be typedef'd. There is a less verbose way to do this than shown last time.
b)   A structure type may be defined and typedef'd at the same time, as follows:

          typedef struct <tag>opt        as <type-name> will be used hereafter, we don't need <tag>
          {
              <declaration>
              . . .
          }
          <type-name>;           <type-name> is customarily the same as <tag>

c)   After the above typedef, variables may be declared to have this structure type as follows:

          <type-name> <variable-name> [, <variable-name>, ...]opt ;

d)   Examples:

          typedef struct item               <tag> not needed, but OK to have it
          {
             int      id;       /* id number */
             int      quantity; /* number of units */
             double   price;    /* price per unit */
          }
          item;

          typedef struct                               <tag> is omitted as it's not needed
          {
             int nu; /* numerator */
             int de; /* denominator */
          }
          frac;

          item   i1, i2;                      N.B.: much higher level of abstraction!
          frac   p, q, r;                             N.B.: much higher level of abstraction!




2. STRUCTURES WITH MEMBERS OF AGGREGATE TYPES
The members of a given structure type are specified by ordinary (non-initializing) declarations. There are no
restrictions on the type of a structure member, so a structure member may be of some aggregate type (except
that a structure cannot contain a member of the same type as the structure itself, to avoid infinite regress).

2.1 NESTED STRUCTURES
a)   If a structure type includes a structure-type member, that member is said to be a nested structure.
b)   Nested structures are ubiquitous in non-trivial applications, as many conceptual objects have non-scalar
     attributes.
c)   Examples:

          typedef struct mixed_number
          {
             int   whole;    /* whole part */
             frac fraction; /* fractional part */        nested structure
          }
          mixed_number;
          --------------------------------------------------------------------------
          typedef struct point
          {
             int x; /* x coordinate */
             int y; /* y coordinate */
          }
          point;

          typedef struct rect
          {
             point uleft;   /* upper-left corner */                    nested structure
             point lright; /* lower-right corner */                    nested structure
             int   color;   /* rendering color   */
          }
          rect;

d)   Members of nested structures may be accessed using the direct selection operator . in the obvious way.
     Recall that . is left associative, so parentheses are not needed when several . operators are adjacent.
e)   Examples:

          mixed_number   sp;          take sp to be a stock price, which is usually a mixed number
          ...
          printf( "share cost: %i %i/%i", sp.whole, sp.fraction.nu, sp.fraction.de );
          --------------------------------------------------------------------------
          rect r;
          ...
          printf( "coordinates of upper-left corner are: (%i,%i)",
                                                                 r.uleft.x, r.uleft.y );

2.2 ARRAYS AS STRUCTURE MEMBERS
a)   Arrays, the other aggregate type, may also be members of structures. Note that the . and [] operators
     have the same precedence and (left) associativity, so parentheses usually aren't necessary.
b)   Example:

          typedef struct address
          {
             char street[100];
             char city[50];
             char state[20];
             char zip[10];
          }
          address;


          typedef struct employee
          {
             char     name[100];
             address addr;                      nested structure
             ...
          }
          employee;

          employee emp;
          ...
          printf( "first letter of emp's state is: %c", emp.addr.state[0] );


3. POINTERS TO STRUCTURES
a)   Any valid type in C can be the base type of a pointer type, including structure types.
b)   If <base-type> is a structure type, then <base-type> * is the type "pointer to structure of type <base-type>"
c)   As one would expect, pointers to structures of different type are of different types themselves.
d)   If <base-type> is a structure type, then <name> in the array declaration <base-type>
     <name>[<#eles>], when used in an expression, is converted to type <base-type> *, as expected.
e)   Examples:

          item it, *pit = &it;                  *pit has type item, so pit has type item *
          ...
          (*pit).quantity += 10;                increase quantity member of it by 10 (the parentheses are
                                                necessary—see precedence chart)

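          [Figure: memory diagram of it and pit. The members id, quant. and price of it begin at the
          base address of it, at +2 and at +4 respectively, and it ends at +12; pit, stored elsewhere,
          holds the base address of it.]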




4. INITIALIZATION OF STRUCTURE VARIABLES UPON DECLARATION
a)   An initializer for a structure variable is a sequence of constant expressions of the appropriate types,
     separated by commas and placed between a pair of braces.
b)   The number of constant expressions must not exceed the number of members of the structure variable
     (members without a corresponding expression are initialized to zero), and the types of the constant
     expressions must be compatible with those of the corresponding members (in the same order).
c)   The above rules apply to nested structures recursively. A constant expression initializing an array member
     is just an array initializer as previously discussed.
d)   N.B.: structure initializers cannot be used outside of structure declarations; this is similar to the case for
     arrays. However, assigning a structure object to a structure variable is allowed and entails the member-by-
     member assignments one would desire; this is unlike the case for arrays, since an array variable cannot be
     the LHS of an assignment (and, moreover, on the RHS of an assignment, an expression of array type is
     taken as a pointer type, so no element-by-element copy would be done in any case).
e)   Examples:

          item it = { 12345, 10, 29.95 };                        ten $29.95 purchases of product #12345
          frac p = { 22, 7 }, q = p;                             p is 22/7 (approximately pi); q is 22/7 also
          mixed_number  sp = { 51, {5, 8} };                     sp is 51 5/8 (note nested initializer)
          rect r = { {0, 0}, {479, 639}, 3 };                    r has corners at (0,0) and (479,639), color 3




5. THE JOY OF DATA ABSTRACTION
We've seen the definition of the frac type, for representing fractions, as a structure type. We could also have
associated the frac type name with a different implementation:

          typedef int frac[2];

This defines frac as a synonym for "array of 2 ints"; we assume the first int represents the numerator and
the second int represents the denominator.

It's important to note that, whether one defines the frac type as a structure or as an array, the programmer
need only code frac q, r, ...; to declare variables to have the frac type. If all direct access to the
components of the frac type is done in a fixed set of type management functions (and nowhere else), then we
will have achieved a high degree of data abstraction, since the client code will then be independent of the
implementation details of the frac type. Most significantly, client code will not have to change if the
implementation of the frac type (that is, the type definition and management function definitions supporting
fraction operations) changes from the structure representation to the array representation or vice versa.




UNIT 15: SWITCH STATEMENTS AND ENUM TYPES
Multi-way branching with the switch statement and the use of enumerated types are discussed.



1. THE SWITCH STATEMENT
a)   In programming, one often decides among alternative processing paths based on the value of some
     expression. We know we can code such multi-way decisions by using a series of cascaded if-else
     statements, for example:

             if ( v == 1 )
             {
                <do stuff>
             }
             else if ( v == 2 )
             {
                <do some other stuff>
             }
             ...
             else if ( v == n )
             {
                <do something else again>
             }
             else
             {
                <do if v isn't 1,2,...,n>
             }

b)   In the above code, it isn't clear—without examining every if condition—that all we're doing is selecting one
     of several sequences of statements to execute based on the value of the same expression, v.
c)   C provides a clear and elegant way to code multi-way decisions based on the value of a single integer-
     valued expression—the switch statement.
d)   Typical Format of switch statements:

             switch ( <integer-valued-expr> )
             {
             case <integer-constant-expr1>:
                <zero or more statements>
                break;
             case <integer-constant-expr2>:
                <zero or more statements>
                break;
             ...
             case <integer-constant-exprn>:
                <zero or more statements>
                break;
             default:
                <zero or more statements>
             }

e)   Each case <integer-constant-expri>: clause is treated as a label for the immediately following statements.
f)   Execution of a switch statement begins by evaluating <integer-valued-expr>. If its value is the same as that
     of <integer-constant-expri>, then execution continues at the statement immediately following the
     case <integer-constant-expri>: label. If no <integer-constant-expri> matches the value of <integer-valued-
     expr>, execution continues at the statement immediately following the default: label.
g)   Executing a break statement inside a switch statement causes execution to continue at the first
     statement following the entire switch statement. (This is similar to the action of break statements inside
     loops.)
h)   You need not terminate the sequence of statements following each case label with a break statement, but,
     if you don't, execution will "fall-through" to the statements following the next case label. Because
     fall-through is so often unintended (you just forget to code the break statement), you must always comment
     your use of this feature to aid future maintenance.
i)   There need not be any statements at all following a case label. This enables you to associate multiple
     case labels with the same sequence of statements.
j)   Strictly speaking, the default case need not be specified, but it's good programming practice to handle the
     situation where <integer-valued-expr> doesn't evaluate to any of the expected, or legitimate, values.
     Therefore, you should always provide a default case. If all legitimate values are covered by other cases,
     the default case is a good place for error-handling code.
k)   The following code is precisely equivalent to that of (a), but much more perspicuous:

              switch ( v )
              {
              case 1:
                 <do stuff>
                 break;
              case 2:
                 <do some other stuff>
                 break;
              ...
              case n:
                 <do something else again>
                 break;
              default:
                 <do if v isn't 1,2,...,n>
              }



2. ENUMERATION TYPES
a)   Consider an int variable, affiliation, whose value represents a political party with which a voter may
     be affiliated. Say the value 1 stands for the Democratic party, 2 for the Republican party, and 3 represents
     an independent, or unaffiliated, voter.
b)   To perform different processing based on party affiliation, a switch statement is ideal:

              switch ( affiliation )
              {
              case 1:
                 blame_everything_on_the_rich();
                 break;
              case 2:
                 blame_everything_on_the_poor();
                 break;
              case 3:
                 blame_everything_on_democrats_and_republicans();
                 break;
              default:
                 send_voter_registration_literature();
              }

c)   The problem with the above code is that 1, 2 and 3 are magic numbers (literals without obvious
     significance). Case labels like Democratic, Republican, etc., would make things much clearer. Better
     to define appropriately named symbolic constants to represent these numbers. In C, we might use the pre-
     processor's macro facility for this, i.e., we could place the pre-processor commands

          #define DEMOCRATIC 1
          #define REPUBLICAN 2
          #define INDEPENDENT 3

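     before the declaration of any variable used to represent a party affiliation. In modern C, it's often
     considered better to declare typed integer constants as follows (the older usage, as shown above,
     persists however). N.B.: in C++ such constants may be used as case labels, but in standard C a
     const int object is not an integer constant expression, so a C compiler may reject it as a case
     label; the #define form has no such restriction: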

          const int DEMOCRATIC = 1;
          const int REPUBLICAN = 2;
          const int INDEPENDENT = 3;




     Once the appropriate symbolic constants have been established, we can code the example in (b) above as:

              switch ( affiliation )
              {
              case DEMOCRATIC:
                 blame_everything_on_the_rich();
                 break;
              case REPUBLICAN:
                 blame_everything_on_the_poor();
                 break;
              case INDEPENDENT:
                 blame_everything_on_democrats_and_republicans();
                 break;
              default:
                 send_voter_registration_literature();
              }

d)   The preceding code is certainly an improvement, but there are some infelicities in this approach:

             all those const declarations (or #defines) are a little verbose
             the int type must be used to declare variables to represent a party affiliation—it would be more
              perspicuous to have a "party" type whose values are 1, 2 and 3

     We should point out that the const approach is superior to using #defines. The compiler sees only the
     literal constants substituted for the macro names by the pre-processor—thus, the compiler has no type
     cues to help it detect a mis-use of a "party" type value, as it would via the const approach.

e)   All the above infelicities can be (at least partially) overcome through the use of an enumerated type. The
     definition of an enumerated type entails the specification of a list of identifiers, used as symbolic constants
     (called enumeration constants), whose values form the range of that type. You may then declare
     variables and parameters of that type, and thereby take advantage of the type-checking facilities of the
     compiler.
f)   Format of an Enumerated Type Definition:

              enum <tag>                          <tag> is an identifier that distinguishes a particular enum type
              {                                    (can be the same as a variable/function name in C, but keep
                                                    distinct for C++ compatibility)
                   <enum-constant-definition>,
                   <enum-constant-definition>,
                   ...
                   <enum-constant-definition>
              };

     where each <enum-constant-definition> is either a simple identifier, which is given a default value (see
     below), or an initializer of form <identifier> = <integer-constant-expr>, which overrides the default value. The
     type thus defined is enum <tag>.
g)   The default value for the first enumeration constant is 0. The default value for each succeeding
     enumeration constant is 1 greater than the previous enumeration constant (whether set by default or by
     explicit initialization).
h)   It is customary to use a zero-valued enumeration constant to represent a "bad" value (unless a "good"
     enumeration constant must receive the zero value for application purposes).
i)   Examples of Enumerated Type Definitions:

             enum party { Bad_party, Democratic, Republican, Independent };
              (Bad_party has value 0, Democratic 1, Republican 2 and Independent 3)
             enum small_num { Zero, One, Two, Three };
              (Zero has value 0, One value 1, Two value 2, and Three value 3)
             enum color { Bad_color, Black = 10, Blue, Green, Red = 20, White };
              (Bad_color has value 0, Black has value 10, Blue 11, Green 12, Red 20, and White 21)
             enum { Bad_type, Clothing, Linen, Food, Equipment };
              (this defines enumeration constants Bad_type, etc., but the enumerated type itself has no name, since there is no tag)




j)   Format of an Enumerated Type Variable Declaration:

              enum <tag> <variable-name>;              assumes enum <tag> type is already defined

k)   Examples of Enumerated Type Variable Declarations:

             enum party affiliation = Democratic;  initializing declarations OK
             enum color background_color, pallette[50];  enum arrays can be declared

l) As with structure types, it's best to use the typedef facility to maximize abstractness. The enumeration
   tag is not needed when typedef'ing an enumerated type.
m) Examples of Typedefs with Enumerated Types:

             typedef enum party party;      typedef for previously defined enum party type
             typedef enum { Bad_type, Clothing, Linen, Food, Equipment } itemtype;

n)   Using Typedef'd Enumerated Types in Variable Declarations:

             party affiliation,
                    focus_group[] = { Bad_party, Independent, Democratic, Republican };
             itemtype what_I_bought = Equipment;

o)   Finally, let's re-do our switch statement example one more time, this time the best way we can do it in C:

              switch ( affiliation )
              {
              case Democratic:
                 blame_everything_on_the_rich();
                 break;
              case Republican:
                 blame_everything_on_the_poor();
                 break;
              case Independent:
                 blame_everything_on_democrats_and_republicans();
                 break;
              default:
                 send_voter_registration_literature();
              }

p)   Question: What does the following switch statement do?

              switch ( count )
              {
              case 3:
                 printf( "A foolish consistency is the hobgoblin of little minds" );
              case 2:
                 printf( "A foolish consistency is the hobgoblin of little minds" );
              case 1:
                 printf( "A foolish consistency is the hobgoblin of little minds" );
              default:
                 break;
              }




UNIT 16: STRING PROCESSING
We introduce the rudiments of string processing in C: string constants, how strings are stored in memory, the
use of char arrays and char *'s as string variables, and some basic string operations.


1. MOTIVATION
a)   In order to do computations involving text, as opposed to purely numerically-oriented processing, we must
     be able to declare, initialize, copy, compare, read and write objects whose values represent chunks of text
     (character strings, or just strings).
b)   A simple example of a string processing application would be a program that reads in some text and writes
     it out with all abbreviations expanded.
c)   More sophisticated examples of text processing applications include spell-checkers, grammar-checkers and
     on-line thesauri.
d)   Of course, any program involving objects that include string information (e.g., an employee record could be
     a structure with an employee name member, a string) may be said to be doing string processing.
e)   Clearly, any general purpose programming language should provide decent string processing facilities. It
     may therefore come as a surprise that C does not have a built-in string type. Nevertheless, C does support
     string processing in its own idiosyncratic way.


2. STRING CONSTANTS
     We have frequently used such constants as the "control string" arguments of printf and scanf calls, and
     occasionally, as subsequent printf arguments in conjunction with the %s conversion specification.

2.1 SYNTAX
a)   Naively, a string constant is just a sequence of characters enclosed in double quotes. However, the issue
     of including non-graphic characters in string constants requires a more careful definition of string constant.
b)   Format of a String Constant:

              "<char-spec><char-spec>..."

     where each <char-spec> is either an ordinary graphic character (a,...,z,A,...,Z,0,1,...,9,+,-, etc.), or an
     escape character (\n,\t,\",\\, etc.).
c)   It is legitimate for no <char-spec>'s to appear within the double quotes, in which case we have a string
     constant representing an empty string, i.e., a string with no characters.
d)   Very long string constants (too long to conveniently fit on one line) can be split into several shorter string
     constants. As long as nothing but whitespace characters separate the pieces, the compiler will treat the
     adjacent string constants as one big string constant.
e)   Characters whose codes cannot be expressed by a graphic or standard escape character may have their
     character codes explicitly specified in hexadecimal. The general format is \xhh, where each h is replaced
     by a hexadecimal digit. For example, \x1A is a <char-spec> for the character with character code 1A
     in hexadecimal (decimal equivalent: 26); this is the character code of <Ctrl-Z> in ASCII.
f)   Examples:

             "Hello, World!"
             "Hi, Mom.\n"                              note that \n represents just one character (newline)
             "\"Murder,\" she says"                    note how the internal double quotes are escaped
             "wait!\\no!"                              if backslash not escaped, this string would print as
                                                         wait!
                                                         o!
             "Yo, " "my " "man"                        treated as if "Yo, my man" were coded
             "This is The End.\x1A"                    this string ends with the <Ctrl-Z> character

2.2 HOW STRING CONSTANTS ARE STORED IN MEMORY
a)   A string constant is allocated one byte for every <char-spec> it contains, plus one more byte to hold the
     special null character, which is used as a string delimiter. That is, in memory, a string constant value is
     always terminated with a null character just as, in source code, a string constant is always terminated by a
     final double quote character. Do not confuse the null character with the NULL pointer.
b)   The null character is specified as follows: \0. Its character code is always 0.



c)   Here's how the string constant "ABC" is stored in memory:


               contents        A      B      C      \0
               byte addr's     base   +1     +2     +3

d)   Thus, while the "length" of the string constant "ABC" is 3, its "size" (the amount of memory required for its
     storage) is 4. In general, strings of length n require n+1 bytes of memory.


3. THE "STRING" TYPE IN C
a)   As previously mentioned, C has no built-in string type.
b)   Instead, any sequence of char's delimited by a null character is considered to be a string.
c)   The natural place to store a string is clearly an array of char's.
d)   Thus, as far as C is concerned, the type of a string is char *. For example, in 2.2(c) above, "ABC"
     is treated as a char * whose value is the address of its first byte (of course, we think of its value as "ABC").
e)   Sometimes a char array itself is called a string if it's used to store a string. It is also customary to call a
     char * object a string if it points to an area of storage holding a string. This ambiguity can be confusing.
f)   Arrays of strings are frequently employed. They are arrays of pointers to strings held elsewhere in storage.
g)   Examples of String Declarations [draw the pictures that follow]:

              #define MAXLEN 3                                    greatest expected string length (not including
              ...                                                  the terminal null character)
              char s[MAXLEN+1] = {'Y','o','\0'};                  the +1 is for the null character
              char ss[] = {'Y','o','\0'};                         array size inferred from array initializer
              char t[MAXLEN+1] = "Yo";                            note abbreviated initialization syntax (use it!)
              char tt[] = "Yo";                                   array size inferred from string constant
              char *name = "Robert";                              name points to 'R' in "Robert"
              char *addr[] =                                      an array of strings
              {
                 "Rufus T. Firefly",
                 "5678 W. Chauncey Street",
                 "Brooklyn, NY 11299",
                 NULL                                             a NULL "sentinel" is often used to signal
              };                                                   the end of an array of strings

h)   Questions:

             What is the value of sizeof( s )?
             What is the length of string s?
             What is the value of sizeof( ss )?
             What is the length of string ss?
             What is the value of sizeof( name )?
             What is the length of string name?
             What is the value of sizeof( addr[0] )?
             What is the length of addr[0]?
             What is the value of sizeof( addr )?
             What is the value of sizeof( addr )/sizeof( addr[0] )?
             What is the type of addr?



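          [Figure: storage diagrams for the declarations in (g).
             s and t:  each is a 4-byte char array holding Y, o and \0 at the base address, +1 and +2
                       (the byte at +3 is unused)
             name:     a pointer holding the address of the R in the stored sequence R o b e r t \0
             addr[]:   elements 0, 1 and 2 point to "Rufus T. Firefly", "5678 W. Chauncey Street"
                       and "Brooklyn, NY 11299", each held elsewhere in storage]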




i)   The C standard library provides so many functions that manipulate null-delimited sequences of characters
     that it's almost like C does have a string type. String functions not related to I/O are declared in the
     standard header string.h; their names begin with str…. String I/O functions are declared in stdio.h.


4. STRING I/O—FIRST STEPS
4.1 READING IN STRING VALUES WITH scanf
a)   The %s conversion specifier in a scanf control string instructs scanf to read a whitespace-delimited string
     from the standard input stream and store it at the address specified by the corresponding scanf argument,
     which must be of char * type. We say the corresponding argument addresses the input buffer.
b)   The char * argument is usually the name of some char array. It may also be any expression of type
     char * that points to a char array or a dynamically allocated block of storage.
c)   %s conversion proceeds by skipping initial whitespace (this is the case for all scanf conversion
     specifications except %c and %[…]), then storing all input characters up to—but not including—the next
     whitespace character into the corresponding scanf argument. Finally, a null character is stored after the
     last input character, which makes a genuine string of the contents of the input buffer.
d)   N.B.: scanf cannot read in a string with embedded whitespace (like a full name or book title) with a single
     %s conversion. Multi-word strings require multiple %s conversions, each with a corresponding char *
     argument. It is difficult to preserve the whitespace "as-is" from the input stream. For example, if the input
     stream contains "Hi, Mom" (without the quotes), the call scanf( "%s", buffer ); only reads "Hi,"
     (again, without the quotes) into the input buffer. This makes scanf of limited usefulness for string input.



e)   Examples (refer to Examples 3(g) for variable declarations):

             scanf( "%s", s );                                reads whitespace-delimited string into s array
             char *p = s; scanf( "%s", p );                   reads whitespace-delimited string into s array
             scanf( "%s%s", s, t );                      reads two whitespace-delimited strings into s and t

f)   Questions:

             What happens if the next whitespace-delimited string is longer than 3 characters in the above?
             How could we faithfully read an arbitrary sentence into a char array with scanf?
             Can we use scanf to read in an empty string?

g)   All the scanf calls in (e) above are dangerous, since scanf will cheerfully overrun the input buffer if the
     next whitespace-delimited string in the input stream is longer than the buffer size.
h)   Luckily, you may supply a maximum field width as part of the %s conversion specifier. scanf will stop
     reading characters into the input buffer when the number of characters stored equals the maximum field
     width of the corresponding conversion specifier. The maximum field width must be specified as a positive
     decimal integer constant. Here are some examples:

             scanf( "%3s", s );                     reads at most 3 input characters into s
             char *p = s; scanf( "%3s", p );        reads at most 3 input characters into s
             scanf( "%s%3s", s, t );        reads at most 3 input characters into t

i)   Always specify a maximum field width when coding an %s conversion.
j)   Note that a null character is always stored to terminate the string, whether or not a maximum field width is
     specified. When a maximum field width is specified, at most (maximum field width + 1) characters are stored.
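
     To make the effect of a maximum field width concrete, here is a minimal sketch. It uses sscanf, the
     standard library relative of scanf that reads from a string instead of the standard input stream, purely
     so that the example needs no keyboard input; the %s conversion works the same way in both. The names
     WORDLEN and first_word are hypothetical.

```c
#include <stdio.h>

#define WORDLEN 7   /* hypothetical buffer capacity, not counting the null */

/* Stores the first whitespace-delimited word of line in word, which
   must have room for WORDLEN+1 chars.  The field width in "%7s" must
   be written as a literal, so it is kept in step with WORDLEN by hand.
   Returns 1 if a word was stored, 0 if line held only whitespace. */
int
first_word( const char *line, char *word )
{
    return sscanf( line, "%7s", word ) == 1;
}
```

     Given the line "Hi, Mom", first_word stores "Hi,"; given "dasypygial", it stores only the first seven
     characters, "dasypyg", followed by the null character.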

4.2 WRITING OUT STRING VALUES WITH printf
a)   The %s conversion specifier in a printf control string instructs printf to write a string value to the
     standard output stream from the corresponding printf argument, which must be of char * type and
     which specifies the "output buffer" address. (We've already used this conversion before; see Lab 4, etc.)
b)   %s conversion proceeds by writing out all the characters in the corresponding output buffer up to—but not
     including—the terminal null character. There is no special treatment for string values with initial, embedded
     or trailing whitespace.
c)   Examples (refer to Examples 3(g) for variable declarations):

             printf( "%s", s );                                 writes string in s array to stdout
             char *p = s; printf( "%s", p );                    writes string in s array to stdout
             printf( "%s\n%s", s, t );                          writes strings in s and t on separate lines
             for ( i = 0; addr[i] != NULL; ++i )                writes elements of addr string array
                 printf( "%s\n", addr[i] );                       on separate lines




UNIT 17: MORE ON STRING PROCESSING
We continue our coverage of string processing by presenting common operations on strings and the standard
library functions that perform them. We also extend our repertoire of standard library string I/O functions.


1. OBTAINING THE LENGTH OF A STRING
a)   Below, we give three ways to determine the length of a string, not including the terminal null character. We
     saved the best way for last.

          #include <stddef.h>         /* for size_t definition */
          #include <string.h>         /* for strlen prototype */

          #define MAXLEN 100
          ...
          char    s[MAXLEN+1] = "Zut!", *p = s;
          size_t len;
          ...
          for ( len = 0; s[len] != '\0'; ++len )
              ; /* do nothing */                  an empty statement (comment as shown)
          ...
          while ( *p++ != '\0' )
              ; /* do nothing */
          len = (size_t) (p - s - 1);
          ...
          len = strlen( s );

b)   Here's the prototype of the standard library function strlen (the definition could be similar to the above):

          size_t strlen( char *s );

c)   Do not confuse the sizeof operator and the strlen function.             In the above example, the value of
     sizeof( s ) is 101, but the value of strlen( s ) is 4.
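
     As a sketch of how the pointer-based loop above might be packaged as a function, here is one possible
     definition of a strlen-like function. The name my_strlen is hypothetical, and a real library's strlen
     may well be implemented differently.

```c
#include <stddef.h>   /* for size_t definition */

/* Counts the chars that precede the terminal null character of s. */
size_t
my_strlen( const char *s )
{
    const char *p = s;

    while ( *p != '\0' )
        ++p;                    /* advance to the terminal null */
    return (size_t) (p - s);    /* distance from start = length */
}
```

     For an array declared as char s[101] = "Zut!", my_strlen( s ) is 4, while sizeof( s ) remains 101
     no matter what string the array currently holds.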


2. COPYING A STRING
a)   Here are three progressively cleverer ways of copying a string to s from t:

          #include <stddef.h>         /* for size_t definition */
          #include <string.h>         /* for strcpy prototype */

          #define MAXLEN 100
          ...
          char    s[MAXLEN+1], t[MAXLEN+1] = "dasypygial", *p = s, *q = t;
          size_t i;
          ...
          for ( i = 0; (s[i] = t[i]) != '\0'; ++i )
              ; /* do nothing */
          ...
          while ( (*p++ = *q++) != '\0' )
              ; /* do nothing */
          ...
          strcpy( s, t );

b)   In the for loop header, note how the boolean expression both copies characters and terminates the loop
     when the null character terminating the string in t is copied to s.
c)   The while loop introduces a new operator, the unary, left-associative post-increment operator (++),
     whose sole operand may be of any scalar type. A glance at the operator precedence chart shows that
     *p++ has the implicit parenthesization *(p++). The value of a post-increment sub-expression like p++ is
     simply the original value of its operand, p. However, after the sub-expression value is used in the larger
     expression, the value of the post-increment operand is incremented. Thus, the expression *p++ = *q++
     copies the current value of *q to *p, then increments the pointers p and q so they point to the next
     characters in s and t, respectively. When q points to the terminal null character of the string in t, the value
     of the expression *p++ = *q++ is '\0' (which, recall, has int value zero); this ends the loop. There is
     also a unary post-decrement operator, which has analogous semantics.
d)   The prototype for the standard library function strcpy is given below. strcpy copies char's
     starting at address src to successive char's starting at address dest. The return value is the original
     value of dest. (The definition of strcpy could be similar to one of the first two copy methods above.)

          char * strcpy( char *dest, char *src );

e)   The return value of strcpy facilitates nesting string functions. For example, the value of the expression

          strlen( strcpy( s, t ) )

     is the length of the string s following the copy of string t to string s.
f)   The strcpy function is dangerous because the destination (or target) area may be too small to hold the
     source string. There is another standard library function, strncpy, whose prototype is given below. No
     more than size many char's are copied from src to dest in a strncpy call. Typically, the size
     parameter is set to the size of the destination string. The return value is the original value of dest.

          char *strncpy( char *dest, char *src, size_t size );

g)   Unfortunately, if the source string in a strncpy call has length greater than or equal to the size argument,
     a null character won't be stored, so the destination "string" won't really be a string at all. We now present a
     parameterized macro that is safer than both strcpy and strncpy. We call it STRSCPY (for STRing Safe
     CoPY). Note how the backslash-newline combination is used to split the long macro definition over several
     lines:

          #define STRSCPY(s,t,n) ( ((n) > 0) \
                                   ? (strncpy((s),(t),(n)-1), (s)[(n)-1]='\0', (s)) \
                                   : (s) )

h)   The STRSCPY macro introduces a new operator, the binary, left-associative sequential evaluation (also:
     sequence or comma) operator (,). This operator has the lowest precedence of all the C operators. The
     two operands may be expressions of any type, but the right operand must produce a value (i.e., it cannot be
     of void type). The operands are evaluated from left to right, and the value of a sequence of expressions
     separated by sequence operators is simply the value of the last expression in the sequence. For example,
     the expression (x = 2.0, y = 8.0 * x, z = sqrt( y )) has value 4.0. Despite this double
     duty for the comma, it's usually quite easy to distinguish the use of a comma as a sequence operator from
     its use as a function argument separator.
i)   In the STRSCPY macro definition above, the evaluation of the comma expression
     (strncpy((s),(t),(n)-1), (s)[(n)-1]='\0', (s)) unfolds as follows: first, strncpy is called;
     next, the nth character in string s (i.e., s[n-1]) is set to the null character; finally, the rightmost comma
     operand is evaluated, making the value of the entire comma expression the original value of s. Thus, the
     invocation STRSCPY(s,t,n) copies at most n-1 characters from t to s, and always terminates the destination
     string with a null character.
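
     The following sketch shows STRSCPY in use. The macro is repeated verbatim from above; NAMELEN and
     store_name are hypothetical names introduced only for this illustration.

```c
#include <string.h>   /* for strncpy and strlen prototypes */

#define STRSCPY(s,t,n) ( ((n) > 0) \
                         ? (strncpy((s),(t),(n)-1), (s)[(n)-1]='\0', (s)) \
                         : (s) )

#define NAMELEN 5     /* hypothetical capacity, not counting the null */

/* Copies src into dest, an array of NAMELEN+1 chars, truncating the
   copy if necessary; returns the length of the string stored. */
size_t
store_name( char *dest, const char *src )
{
    return strlen( STRSCPY( dest, src, NAMELEN+1 ) );
}
```

     Copying "dasypygial" this way stores the string "dasyp" and yields 5. Note that the macro evaluates
     its arguments more than once, so arguments with side effects (such as n++) should be avoided.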


3. COMPARING STRINGS
a)   Here are three progressively cleverer ways of comparing a string s with a string t, yielding a zero result if
     they are equal, a negative result if s is less than t, and a positive result if s is greater than t. (All
     comparisons of strings are made with respect to the character code, e.g., ASCII, used in the environment.)


          #include <string.h>         /* for strcmp prototype */

          #define MAXLEN 100
          ...
          char s[MAXLEN+1] = "Al", t[MAXLEN+1] = "Bill", *p = s, *q = t;
          int    i, result;
          ...
          for ( i = 0; s[i] == t[i] && s[i] != '\0'; ++i )
              ; /* do nothing */
          result = (s[i] == t[i]) ? 0 : (s[i] < t[i]) ? -1 : +1;
          ...
          for ( ; *p == *q && *p != '\0'; ++p, ++q )
              ; /* do nothing */
          result = (*p == *q) ? 0 : (*p < *q) ? -1 : +1;
          ...
          result = strcmp( s, t );

b)   Here's the prototype of the standard library function strcmp (the definition could be similar to the above):

          int strcmp( char *s, char *t );

c)   The positive and negative values returned by strcmp need not be +1 and -1, respectively, so your code
     must never rely on any particular non-zero return values.
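
     In practice, this means testing only the sign of the strcmp result, or whether it is zero. A minimal
     sketch, in which comes_before and same_string are hypothetical helper names:

```c
#include <string.h>   /* for strcmp prototype */

/* Returns 1 if string s sorts strictly before string t in the
   environment's character code, 0 otherwise. */
int
comes_before( const char *s, const char *t )
{
    return strcmp( s, t ) < 0;     /* test the sign, never == -1 */
}

/* Returns 1 if s and t hold identical strings, 0 otherwise. */
int
same_string( const char *s, const char *t )
{
    return strcmp( s, t ) == 0;
}
```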



4. MORE ON STRING I/O
4.1 READING IN ARBITRARY STRINGS WITH gets
a)   The standard library function gets reads in a whole line from the standard input stream, verbatim. Initial and/or
     embedded and/or trailing whitespace is preserved, unlike the case with scanf. Here's the prototype, which
     appears in the standard header stdio.h:

          char *gets( char *s );

b)   More precisely, when gets is called, characters are read from the standard input stream and stored at
     consecutive positions in the input buffer s until a newline ('\n') character is read. The newline character
     terminates the current input line. It is not stored in the input buffer. Instead, a null character is appended
     after the last input character stored in the buffer, so that the input buffer contains a true string.
c)   If a gets call is successful, the return value is the original value of its argument. Upon failure, NULL is
     returned (this will be the case if the end-of-file has been reached.)
d)   Since the gets function receives no other information besides the address of the input buffer, buffer
     overflow is a potential problem. (For this reason, gets was removed from the C standard as of C11.)
     Defensive programming requires that we use the related, but safer, fgets function instead. We will cover
     fgets soon, after we introduce some needed concepts of general file I/O in C.

4.2 WRITING OUT STRINGS WITH puts
a)   The standard library function puts writes a string to the standard output stream. Here's the prototype, which
     appears in the standard header stdio.h:

          int puts( char *s );

b)   More precisely, when puts is called, the characters of the string to which s points are written to the
     standard output stream, up to—but not including—the terminating null character. A newline character
     ('\n') is then written to the standard output stream (whether or not string s itself contains a newline).
c)   If an error occurs, puts returns the value of the symbolic constant EOF (#defined in stdio.h);
     otherwise, it returns some non-negative value.
d)   Note how well gets and puts are matched: the input function clips newlines and stores nulls; the output
     function clips nulls and writes newlines.
e)   There is a fputs function to go with the fgets function, to be covered in due course.




UNIT 18: CHARACTER PROCESSING FACILITIES AND FILE I/O
We now formally cover C's character processing facilities (we've already had a peek at them in Labs 16 and 17).
We also introduce the basics of file processing in C.


1. CHARACTER PROCESSING GENERALITIES
a)   Conceptually, a character is the smallest unit of text. In C, it is the smallest significant part of a string.
b)   Hence, to process text means processing its constituent characters. For example, text processing
     programs often need to test whether a character is a digit or a letter or whitespace or punctuation, etc.
     Conversion between upper- and lower-case letters may also be required.
c)   Convenient facilities for working with individual characters are therefore indicated.
d)   The standard header ctype.h provides easy-to-use character processing facilities. (We use the non-
     committal word "facilities" here advisedly since the ctype.h facilities are implemented as functions in
     some environments and as macros in others. Either way, their effects must be the same.)


2. THE CTYPE.H FACILITIES
a)   There are two kinds of facilities provided by ctype.h:
            character classification facilities, each of which tests whether a character belongs to a particular
             class of characters, and
            character conversion facilities, which translate characters in some fashion.

2.1 CHARACTER CLASSIFICATION FACILITIES
2.1.1 Generalities
a)   Here is the generic prototype for all the ctype.h classification facilities:

          int is<char-class>( int c );

b)   In the above generic prototype, <char-class> names a character class. If the character whose code is
     passed to parameter c belongs to character class <char-class>, then is<char-class> returns true (some non-
     zero value); otherwise, is<char-class> returns false (zero).
c)   Thus, all the ctype.h classification facilities are predicates.

2.1.2 Individual Facilities (there are others, see C Reference Manual)
a)   Prototypes and descriptions for particular character classification facilities are given below. There are other
     classification facilities, but the following are the most basic. Students may consult C: A Reference Manual
     for full particulars.

       int   isdigit(    int   c   );    is c a decimal digit ('0'–'9')?
       int   isalpha(    int   c   );    is c alphabetic ('a'–'z' or 'A'–'Z')?
       int   isalnum(    int   c   );    is c alphabetic or a decimal digit ('a'–'z', 'A'–'Z' or '0'–'9')?
       int   isupper(    int   c   );    is c upper-case alphabetic ('A'–'Z')?
       int   islower(    int   c   );    is c lower-case alphabetic ('a'–'z')?
       int   isspace(    int   c   );    is c a whitespace character (' ', '\t', '\n', '\r', '\f', or '\v')?
       int   iscntrl(    int   c   );    is c a "control" char (char code-dependent; in ASCII, codes 0–31 and 127)?
       int   isprint(    int   c   );    is c a "printable" character (i.e., not a control character)?
       int   ispunct(    int   c   );    is c a "punctuation" char (i.e., not whitespace, digit, alpha or control)?
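
     As an illustration of the classification facilities, the following sketch tallies the letters, digits and
     whitespace characters in a string. The function name tally_chars is hypothetical. Each char is
     converted to unsigned char before being classified, because the ctype.h facilities are only defined
     for argument values that are representable as unsigned char (or equal to EOF).

```c
#include <ctype.h>

/* Counts the letters, digits and whitespace characters in string s,
   storing the tallies through the three pointer parameters. */
void
tally_chars( const char *s, int *letters, int *digits, int *spaces )
{
    *letters = *digits = *spaces = 0;

    for ( ; *s != '\0'; ++s )
    {
        unsigned char c = (unsigned char) *s;  /* safe ctype.h argument */

        if ( isalpha( c ) )
            ++*letters;
        else if ( isdigit( c ) )
            ++*digits;
        else if ( isspace( c ) )
            ++*spaces;
    }
}
```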



2.1.3 A Classification Example—Testing Whether a String is an Identifier
          #include <ctype.h>
          ...
          int
          is_identifier( char *s )
          {
              unsigned char c;                                   unsigned, so c is a valid ctype.h argument

              if ( (c = *s++) == '\0' )                          an empty string is not an identifier
                 return 0;

              if ( !(isalpha( c ) || c == '_') )     identifiers begin with a letter or underscore…
                 return 0;
              while ( (c = *s++) != '\0' )           …continue with letters, digits or underscores
                 if ( !(isalnum( c ) || c == '_') )
                    return 0;

              return 1;
          }

2.2 CHARACTER CONVERSION FACILITIES
a)   There are two basic standard character conversion facilities: toupper and tolower. Each has a single
     parameter of int type, whose value is treated as a character code. Each returns a value of int type, also
     a character code.
b)   If the argument to toupper is a code for a lower-case letter, the corresponding upper-case letter code is
     returned. If the argument value is not a lower-case letter code, the unchanged argument value is returned.
     tolower works the same way, mutatis mutandis. Here are their prototypes:

          int toupper( int c );               return upper-case version of c, if c a lower-case letter
          int tolower( int c );               return lower-case version of c, if c an upper-case letter


2.2.1 A Conversion Example—Translating a String to all Upper-Case Letters
          #include <ctype.h>
          ...
          char *
          strupr( char *s )
          {
              char *t = s;                                                save original pointer for return value

              while ( (*s = toupper( (unsigned char) *s )) != '\0' )      chg. lower-case to upper-case letters
                 ++s;

              return t;
          }
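
     The conversion facilities are also handy for comparisons that should ignore case. Here is a minimal
     sketch; the function name equal_ignoring_case is hypothetical, and only equality (not ordering) is
     tested.

```c
#include <ctype.h>

/* Returns 1 if strings s and t are equal when case is ignored,
   0 otherwise. */
int
equal_ignoring_case( const char *s, const char *t )
{
    while ( tolower( (unsigned char) *s ) == tolower( (unsigned char) *t ) )
    {
        if ( *s == '\0' )
            return 1;           /* both strings ended together */
        ++s;
        ++t;
    }
    return 0;                   /* found a mismatch */
}
```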


3. INTRODUCTION TO GENERAL FILE PROCESSING
3.1 STREAMS AND FILES
a)   So far, we've done I/O using the standard functions scanf/printf, getchar/putchar, gets/puts.
b)   All the above input functions read from the standard input stream (keyboard); similarly, all the above output
     functions write to the standard output stream (screen). So…what exactly is a stream?
c)   Conceptually, a stream is any source of, or destination for, data. Up to now, the keyboard has been our
     only source of data, and the screen our only destination. In general, however, a disk file or other physical
     device (e.g., printer, CD-ROM, chemical instrument, music synthesizer, drafting machine) may be a data
     source or destination.
d)   C uses objects of the abstract data type FILE to hold information about streams. FILE objects serve as
     internal (i.e., in-memory) representatives of streams, which are external files or devices.
e)   The FILE type is defined, via the typedef facility, in the standard header file stdio.h.
f)   All the standard I/O functions operate on streams indirectly by operating on the associated FILE objects.



g) Thus, before reading/writing from/to a stream, one must create and initialize an internal FILE object to
   represent the external file or device. We call this opening the stream.
h) Once opened, we may process the stream, i.e., read from and/or write to it, via the appropriate standard
   I/O functions. Each of these I/O functions has (implicitly or explicitly) one argument of FILE * type, which
   addresses the FILE, and hence the stream, to be operated upon.
i) When we have finished reading and/or writing to a stream, and so no longer require it, we signal this by
   breaking the connection between the external stream and the internal FILE object that represents it. This
   is called closing the stream. Note that a stream cannot be accessed after it is closed.
j) This open/process/close progression characterizes all stream-handling programs.
k) The I/O functions we've covered so far implicitly operate on standard streams. The standard streams are
   represented by global objects of FILE * type, declared in stdio.h. These are automatically opened
   before main receives control and closed when main returns. This explains why we haven't seen explicit
   open/close operations or explicit use of FILE * objects previously.
l) Standard library functions that read from standard input implicitly use the FILE * object stdin, while
   functions that write to standard output implicitly use the FILE * object stdout. There is a third standard
   FILE * object, stderr, which represents a third standard stream, the standard error stream. The
   standard error stream is a useful destination for error/debugging messages (see example below). stderr
   has the same default association as stdout (the screen), but this may be changed to advantage.
m) (Under most OS's, a lot more than the foregoing happens when a stream is opened, e.g., security checking,
   preventing access by other programs, etc.)
n) (Under many OS's, a stream may have to be closed in order to allow for it to be opened by another
   program. For a stream to be re-opened by the same program, presumably in a different access mode—
   e.g., so one can read after writing, it must be closed first.)

3.2 OPENING STREAMS WITH fopen
a)   An external file (or device) is connected with an internal FILE object by the fopen function, which has the
     following prototype:

          FILE *fopen( char *name, char *openmode );

     where name is a string specifying the name of the file (or device) and openmode is a string specifying how a
     file may be accessed (see below).
b)   The fopen function returns a pointer to a FILE structure that it sets up for I/O with the named file (or
     device). If the named stream cannot be opened in the mode specified, fopen fails, returning the value
     NULL. If the fopen call is successful, we say the named stream is open.
c)   The return value must be saved in a FILE * variable, because, after fopen is called, the opened stream is
     no longer referred to by name, but rather by the associated FILE pointer.
d)   Always check that the return value of an fopen call is not NULL before proceeding.
e)   The format of stream names is environment-dependent. We'll use names suitable for most environments.
f)   We now present some particulars of the commonest open modes:

          "r"     open an existing file for input only ("read-only" access); it's an error if file doesn't exist

          "w"     open a file for output ("write-only" access); if file already exists, it is overwritten so that the
                  previous contents are lost; if file doesn't exist, a new file is created

          "a"     open a file for output in "append" mode; if file already exists, new output is written after the
                  existing contents; if file doesn't exist, a new file is created

g)   It is an error to read from a file opened for output ("w" or "a" open mode) or write to a file opened for input
     ("r" open mode).
h)   (A stream may be opened for input and output by appending a plus sign (+) to one of the above modes.)
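
     As a sketch of the "a" open mode and of checking the fopen return value, the following function appends
     one line to a named file. The name append_line is hypothetical. The function relies on fprintf and
     fclose, which are covered in the sections that follow.

```c
#include <stdio.h>

/* Appends line, followed by a newline, to the named file, creating the
   file if it doesn't exist.  Returns 1 on success, 0 on failure. */
int
append_line( const char *name, const char *line )
{
    FILE *fp = fopen( name, "a" );

    if ( fp == NULL )               /* always check fopen's return value */
        return 0;
    if ( fprintf( fp, "%s\n", line ) < 0 )
    {
        fclose( fp );               /* write failed: close and report */
        return 0;
    }
    return fclose( fp ) == 0;
}
```

     Calling append_line twice with the same file name leaves both lines in the file, in the order written;
     had the file been opened in "w" mode instead, the second call would have overwritten the first line.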

3.3 PROCESSING STREAMS WITH fscanf/fprintf
a)   The fscanf and fprintf functions are just like scanf and printf, respectively, but they each take an
     additional FILE * type parameter that specifies the stream to be accessed. Here are their prototypes:

          int fscanf( FILE *fp, char *fmt, ... );
          int fprintf( FILE *fp, char *fmt, ... );



c)   The FILE * argument in an fscanf call specifies the stream to be used as the input source; similarly, in
     an fprintf call, the FILE * argument specifies the stream to be used as the output destination.
d)   The remaining arguments, and the meanings of the return values, are the same as for scanf/printf.
e)   A call to scanf or printf behaves exactly like the corresponding call to fscanf or fprintf, with stdin or
     stdout, respectively, supplied as the stream argument:

          scanf( "...", ... );               behaves like      fscanf( stdin, "...", ... );
          printf( "...", ... );              behaves like      fprintf( stdout, "...", ... );
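
     Because fscanf takes the stream as a parameter, input-processing code can be written once and applied
     to any open stream. The sketch below uses the fscanf return value to decide when to stop; the name
     sum_ints is hypothetical. Note that the %i conversion treats a leading 0 as octal and a leading 0x as
     hexadecimal.

```c
#include <stdio.h>

/* Reads whitespace-separated integers from the open stream in until
   fscanf fails to convert one, accumulating their total in *sum.
   Returns the count of integers read. */
int
sum_ints( FILE *in, long *sum )
{
    int number, count = 0;

    *sum = 0;
    while ( fscanf( in, "%i", &number ) == 1 )
    {
        *sum += number;
        ++count;
    }
    return count;
}
```

     The call sum_ints( stdin, &total ) would total the numbers typed at the keyboard; passing a FILE *
     obtained from fopen would total the numbers in a file instead.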


3.4 CLOSING STREAMS WITH fclose
a)   A FILE object may be disconnected from the open stream it represents by calling the fclose function,
     which has the following prototype:

          int fclose( FILE *fp );

     where fp points to the FILE object representing the stream.
b)   The fclose function returns the int value 0 upon success. If an error is detected, fclose returns the
     value of the symbolic constant EOF. Casual programmers seldom check the return of an fclose call, but,
     of course, it is safest to do so. Attempting to close a stream that isn't currently open is a common error.
c)   If the fclose call is successful, the stream is said to be closed.




3.5 A SIMPLE STREAM PROCESSING EXAMPLE




          #include <stdio.h>
          #include <stddef.h>

          int
          main( void )
          {
              FILE *in, *out;                /* input and output file pointers */
              char *infile = "numbers.dat"; /* input file name */
              char *outfile = "newnums.dat"; /* output file name */
              int   number;                  /* most recent number read */

              if ( (in = fopen( infile, "r" )) == NULL )
              {
                 fprintf( stderr, "cannot open %s\n", infile );
                 return ( 1 );
              }
              if ( (out = fopen( outfile, "w" )) == NULL )
              {
                 fprintf( stderr, "cannot open %s\n", outfile );
                 return ( 1 );
              }

              while ( fscanf( in, "%i", &number ) == 1 )    reads file of whitespace-separated
                 fprintf( out, "%i\n", number );            numbers, writes out one-per-line

              if ( fclose( in ) == EOF )
              {
                 fprintf( stderr, "cannot close %s\n", infile );
                 return ( 1 );
              }
              if ( fclose( out ) == EOF )
              {
                 fprintf( stderr, "cannot close %s\n", outfile );
                 return ( 1 );
              }

              return ( 0 );
          }




UNIT 19: MORE ON FILE I/O AND COMMAND LINE ARGUMENTS
We wrap up our coverage of file processing, and discuss how to use command line arguments.


1. MORE STDIO.H STREAM I/O FUNCTIONS
1.1 CHARACTER I/O WITH fgetc/fputc
a)   The fgetc and fputc functions are just like getchar and putchar, respectively, but they each take an
     additional FILE * type parameter that specifies the stream to be accessed.
b)   Here are their prototypes:

          int fgetc( FILE *fp );
          int fputc( int c, FILE *fp );

c)   The FILE * argument in an fgetc call specifies the stream to be used as the input source; similarly, in an
     fputc call, the FILE * argument specifies the stream to be used as the output destination.
d)   A call to getchar or putchar behaves exactly like the corresponding call to fgetc or fputc, with stdin or
     stdout, respectively, supplied as the stream argument:

          getchar();              behaves like     fgetc( stdin );
          putchar( c );           behaves like     fputc( c, stdout );
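
     A typical use of this pair is copying one stream to another, one character at a time. The following is a
     minimal sketch; the name copy_stream is hypothetical, and both streams are assumed to be open already.

```c
#include <stdio.h>

/* Copies every character from the open stream in to the open stream
   out; returns the number of characters copied. */
long
copy_stream( FILE *in, FILE *out )
{
    int  c;                 /* int, not char, so EOF is distinguishable */
    long count = 0;

    while ( (c = fgetc( in )) != EOF )
    {
        if ( fputc( c, out ) == EOF )
            break;          /* write error: stop copying */
        ++count;
    }
    return count;
}
```

     The call copy_stream( stdin, stdout ) echoes the standard input stream to the standard output stream.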


1.2 STRING I/O WITH fgets/fputs
1.2.1 fgets
a)   The fgets function is similar to gets, but with some important differences. Here is the prototype:

          char *fgets( char *s, int n, FILE *fp );

b)   The FILE * type parameter, fp, specifies the stream from which to read and the int parameter, n,
     specifies the size of the input buffer to which s points. (A gets call specifies only the input buffer.)
c)   Characters are read from the stream associated with fp and stored in the input buffer to which s points until
     one of the following occurs (whichever comes first):

             a newline character ('\n') is read into the input buffer
             end-of-file is encountered
             n1 characters have been read into the input buffer

d)   A string-terminating null character ('\0') is appended after the last character read into the input buffer.
e)   Note that, if input is terminated because a newline was read, the newline is stored in the input buffer.
     (Recall that gets does not store line-terminating newlines!)
f)   Note further that no more than n characters can be stored in the input buffer, so, if n specifies the size of
     the input buffer, overflow is impossible. (Recall that gets has no such overflow protection!)
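g)   The return value of an fgets call is the original value of its first argument, s, upon success (same as gets).
     NULL is returned if end-of-file is encountered before any characters are read, or if an error occurs.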
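
     Since fgets stores the line-terminating newline and gets does not, programs often clip the newline after
     an fgets call. Here is a minimal sketch of such a wrapper; the name read_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from the open stream fp into the n-char buffer s,
   removing the line-terminating newline if fgets stored one.
   Returns s, or NULL if no line could be read. */
char *
read_line( char *s, int n, FILE *fp )
{
    size_t len;

    if ( fgets( s, n, fp ) == NULL )
        return NULL;

    len = strlen( s );
    if ( len > 0 && s[len-1] == '\n' )
        s[len-1] = '\0';        /* clip the newline, as gets would */
    return s;
}
```

     If the input line is longer than n-1 characters, no newline is stored, and the rest of the line remains in
     the stream to be read by the next call.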



1.2.2 fputs
a)   The standard library function fputs is like puts, with a couple of differences. One is that fputs takes an
     additional FILE * type parameter that specifies the stream to which output is sent. Here's the prototype:

          int fputs( char *s, FILE *fp );

b)   When fputs is called, all the characters of the string in the output buffer to which s points (up to, but not
     including the terminating null character) are written to the stream associated with fp. Unlike the case with
     the puts function, a newline character ('\n') is not written to the output stream following the last character
     of the string.
c)   If an error occurs, fputs returns the value of the symbolic constant EOF (#defined in stdio.h);
     otherwise, it returns some non-negative value.
d)   Note how well fgets and fputs are matched: assuming no buffer overflow, the input function stores a
     newline; the output function writes any newlines it finds in the output buffer.

1.3 BINARY I/O WITH fread/fwrite
a)   This pair of functions is intended for binary I/O, i.e., the transfer of bit-for-bit copies of internal data to
     external media. This is useful for, among other applications, record-oriented I/O of data held in structures.
b)   Consult C: A Reference Manual for full particulars.

1.4 REDIRECTING FILE POINTERS WITH freopen
a)   Suppose a program must perform the same processing on a number of files that may vary from execution
     to execution (say it must do word counts on a bunch of text files). How many FILE * variables should be
     declared?
b)   One way to do it is to just declare a single FILE * variable, and re-use it by associating it with each
     file in turn using ordinary fopen and fclose calls.
c)   There is a better way, however: using the freopen function. Here's the prototype:

          FILE *freopen( char *name, char *openmode, FILE *fp );

     where name is a string specifying the name of the stream and openmode is a string specifying how the
     stream may be accessed—just as for fopen; fp must point to some FILE object.
d)   When freopen is called, the following happens, in the order shown:

             if a stream is currently associated with fp it is closed, as though via fclose( fp );
             the named stream is opened in the open mode specified, as though by a call to fopen; however,
              the stream is associated with the FILE object to which fp points, not some new FILE object.
             the value of fp is returned, unless an error occurs, in which case NULL is returned.

e)   Thus, freopen re-associates a given FILE object (and so any FILE * that points to it) with a new
     stream. This re-association technique is particularly valuable for re-associating the standard FILE *'s
     stdin, stdout and stderr with files or devices other than the defaults. In fact, using freopen with a
     standard stream argument accomplishes the same thing as command line re-direction.
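
     The word-count scenario of 1.4(a) can be sketched as follows, counting characters rather than
     words to keep it short. The function name total_chars is hypothetical; only fopen, freopen, getc
     and fclose come from the standard library.

```c
#include <stdio.h>

/*
** count the characters in each of n named files, re-using a single
** FILE object via freopen; returns the total, or -1 if a file can't
** be opened
*/
long
total_chars( int n, char *names[] )
{
   FILE *fp;
   long  total = 0;
   int   i = 0, c;

   if ( n < 1 )
      return 0;
   if ( (fp = fopen( names[0], "r" )) == NULL )
      return -1;

   for (;;)
   {
      while ( (c = getc( fp )) != EOF )
         ++total;
      if ( ++i == n )
         break;
      /* close the current stream, then open names[i] on the same FILE object */
      if ( freopen( names[i], "r", fp ) == NULL )
         return -1;     /* do not use fp after a failed freopen */
   }

   fclose( fp );
   return total;
}
```

     Note that only one fopen and one fclose appear, no matter how many files are processed.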


2. COMMAND LINE ARGUMENT PROCESSING
2.1 MOTIVATION
a)   Most operating systems (e.g., DOS and UNIX) allow users to execute programs by specifying their names
     at the command prompt. Thus, an executable file name is a command name.
b)   Programs provided by the OS for purposes of file management, e.g., copy and word count programs, permit
     the user to specify the names of the files to be processed on the command line. For example, the command
     "copy filea fileb" tells DOS to copy the file named filea to the file named fileb.
c)   Often, users may specify options that modify the behavior of programs on the command line as well. For
     example, the command "rm goofus" tells UNIX to delete the file named goofus, but the command
     "rm -i goofus" tells UNIX to delete the same file only after requesting confirmation. By convention, options are
     introduced by a "switch" character, e.g., '-' in UNIX and '/' in DOS.
d)   The tokens (file names and options) that a user enters on the command line are called command line
     arguments. By convention, options precede file names on the command line. The question before us now
     is: how can we access these command line arguments in our programs?



2.2 ACCESSING COMMAND LINE ARGUMENTS VIA main WITH PARAMETERS
a)   Command line parameters may be accessed by coding an alternate form of the main function header which
     specifies two parameters instead of none. Here's the corresponding new prototype:

          int main( int argc, char *argv[] );

b)   The parameter argc (argument count) is passed the number of arguments on the command line, including
     the command name itself. Thus, argc is always at least 1. The parameter argv (argument values) is an
     array of strings that is initialized to contain all the command line arguments.
c)   argv[0] always points to the command name (executable file name) and argv[1],
     argv[2], ..., argv[argc-1] point to the remaining command line arguments, in the order that they appear
     on the command line. argv[argc] is always a NULL pointer. Here's the picture:

                   argv[]

                   0        -->   <executable file name>
                   1        -->   <first command line argument>
                   .                        .
                   .                        .
                   .                        .
              argc-1        -->   <last command line argument>
                argc        NULL




d)   Example (assuming the command line "lese auf wiedersehen"):

                               argv[]

                               0    -->    "lese"
                               1    -->    "auf"
                               2    -->    "wiedersehen"
                               3    NULL




e)   The actual argv[0] value one gets may be more complicated than the above. For example, most DOS-
     based environments provide a full path name for the executable file.
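
     Because argv[argc] is always a NULL pointer, the argument list can also be traversed without
     consulting argc at all. A minimal sketch, in which the function name count_args is hypothetical:

```c
#include <stdio.h>

/* count the entries of a NULL-terminated argv-style array, without using argc */
int
count_args( char *argv[] )
{
   int n = 0;

   while ( argv[n] != NULL )
      ++n;
   return n;
}
```

     Called from main as count_args( argv ), this should return the same value as argc.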




2.3 SOME SIMPLE COMMAND LINE ARGUMENT PROCESSING PROGRAMS


          /*
          ** args.c - echo command line arguments to standard output
          */

          #include <stdio.h>

          int
          main( int argc, char *argv[] )
          {
              int  i;

              for ( i = 0; i < argc; ++i )
                 printf( "command line argument #%i: %s\n", i, argv[i] );

              return ( 0 );
          }



          ---------------------------------------------------------------------------



          (compare the following program with cat.c of UNIT 7.1.1(g))

          /*
          ** cat2.c - copies the file named on the command line to standard output;
          **          if no file is named, copies standard input to standard output
          */

          #include <stdio.h>

          int
          main( int argc, char *argv[] )
          {
              int  c;    /* int, not char, so that EOF can be represented */

              if ( argc > 1 )    /* true iff there is a command line argument */
              {
                 if ( freopen( argv[1], "r", stdin ) == NULL )
                 {
                    fprintf( stderr, "%s: cannot open '%s'\n", argv[0], argv[1] );
                    return 1;
                 }
              }

              /* reads from file or stdin, depending on presence of command line argument */
              while ( (c = getchar()) != EOF )
                 putchar( c );

              return 0;
          }



          ---------------------------------------------------------------------------




          /*
          ** opts.c - demonstrates technique of parsing command line options
          */

          #include <stdio.h>

          int
          main( int argc, char *argv[] )
          {
              /* msg is an array, not a pointer to a string literal, because strupr changes it in place */
              char     msg[] = "This line is repeated according to -n<count> option";
              int      i;
              unsigned count = 1; /* default repeat count is 1 */

              /* parse command line options */
              for ( i = 1; i < argc && argv[i][0] == '-'; ++i )
              {
                 switch ( argv[i][1] )
                 {
                 case 'n':      /* repeat count follows -n, e.g., -n23 */
                    sscanf( argv[i]+2, "%u", &count );   /* like scanf, but from string */
                    break;
                 case 'c':      /* -c option means capitalize msg */
                    strupr( msg );   /* strupr def'd in UNIT 21.2.2.1 */
                    break;
                 case 'h':      /* 'h' for help ('?' is also common) */
                 default:
                    fprintf( stderr, "usage: %s [-n<count>] [-c]\n", argv[0] );
                    return 1;
                 }
              }

              while ( count-- )
                 printf( "%s\n", msg );

              return 0;
          }
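
     The sscanf call in opts.c does not detect a malformed count such as -nxyz. The following sketch
     shows one way the same parsing loop might be packaged as a function that validates the count
     with the standard strtoul function. The opts type and the opts_parse function are hypothetical
     names introduced here, not part of opts.c.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical container for the option settings recognized by opts.c */
typedef struct opts
{
   unsigned long count;       /* from -n<count>; default is 1 */
   int           capitalize;  /* nonzero if -c was given */
}
opts;

/*
** parse the leading options in argv; returns the index of the first
** non-option argument, or -1 if an option is unknown or malformed
*/
int
opts_parse( int argc, char *argv[], opts *op )
{
   int   i;
   char *end;

   op->count = 1;
   op->capitalize = 0;

   for ( i = 1; i < argc && argv[i][0] == '-'; ++i )
   {
      switch ( argv[i][1] )
      {
      case 'n':
         op->count = strtoul( argv[i] + 2, &end, 10 );
         if ( end == argv[i] + 2 || *end != '\0' )
            return -1;        /* no digits, or junk after the digits */
         break;
      case 'c':
         op->capitalize = 1;
         break;
      default:
         return -1;
      }
   }

   return i;
}
```

     Since options precede file names by convention, the return value is where file name processing
     would begin; a caller would print the usage message when -1 is returned.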


UNIT 20: ABSTRACT DATA TYPES: AN EXTENDED EXAMPLE
In this session we present an extended example of data abstraction in action, using the frac type (for
representing fractions) discussed previously.

The following points should be kept in mind as you study the example:

             A type is a set of values plus a set of operations on those values.

             The set of values of an abstract data type (ADT) in C is the set of values that can be
              assumed by the ADT's data component type (typically some structure type, as is the
              case below).

             The typedef facility is used to provide a meaningful one-word name for a structure
              type.

             The relevant operations on values of an ADT are implemented as functions which
              have some parameters and/or return values of that type.

             Users, or clients, of an ADT can declare variables to be of that type, and call type-
              related functions to manipulate objects of that type—but they never, NEVER peek
              inside the actual structure of such objects (the type-related functions do all the dirty
              work).

             C has no means to enforce the no-peeking rule (C++ etc. do)—it’s a “gentleperson’s
              agreement”.

             In well-modularized programs, the code comprising an ADT is often split into two
              files, a .h header file (e.g., frac.h) containing the relevant typedefs and function
              declarations (i.e., prototypes), and a .c implementation file (e.g., frac.c)
              containing the definitions of the functions declared in the corresponding header file.

             The header file is #included in every source file that wants to use the
              corresponding type. A .c/.h file pair that defines an ADT is often called a type
              manager or module (implementing that type).

             The interface of an ADT consists of the type name (e.g., frac) and the type-related
              function prototypes.

             The implementation of an ADT consists of the actual type definition (e.g., struct
              frac { int nu; int de; }) and type-related function definitions.

             A type manager's header file corresponds to an ADT's interface and a type
              manager's implementation file corresponds to an ADT's implementation. Note that
              the header file contains not only the entire abstract interface, but also the actual type
              definition, which is properly part of the implementation. This is a breach of
              abstraction, since client code which #includes the header file has access to the
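              innards of the type. Unfortunately, this cannot be avoided in C so long as clients
              declare and pass objects of the type by value, as they do below: the compiler must
              see the full definition to know the type's size. (C++, etc., can dodge this bullet,
              however.)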

             Structure-valued arguments are passed to structure parameters of functions in the
              obvious way: call-by-value, as always. What might this mean for large structures?




/*
** frac.c - working with a user-defined fraction type.
*/


/*- INCLUDES -----------------------------------------------*/

#include <stdio.h>


/*- TYPE DEFINITIONS ---------------------------------------*/

typedef struct frac
{
   int nu;     /* numerator */
   int de;     /* denominator */
}
frac;



/*- FUNCTION PROTOTYPES ------------------------------------*/

frac      frac_create( int n, int d );
int       frac_numer( frac a );
int       frac_denom( frac a );
frac      frac_read( void );
void      frac_write( frac a );
frac      frac_add( frac a, frac b );
frac      frac_adjust_asm( frac a );
frac      frac_div( frac a, frac b );
frac      frac_invert( frac a );
frac      frac_mult( frac a, frac b );
frac      frac_sub( frac a, frac b );
int       frac_eq( frac a, frac b );
int       frac_gt( frac a, frac b );
int       frac_lt( frac a, frac b );

/*- TEST DRIVER --------------------------------------------*/

int
main( void )
{
    frac q, r, s, t, bad = frac_create( 1, 0 );

    for (;;)
    {
       printf( "Enter three fractions: " );
       if ( frac_eq( (q = frac_read()), bad )
            || frac_eq( (r = frac_read()), bad )
            || frac_eq( (s = frac_read()), bad ) )
          break;
       frac_write( q ); printf( "\n" );
       frac_write( r ); printf( "\n" );
       frac_write( s ); printf( "\n" );
       printf( "%i/%i == %i/%i: %s\n",
                frac_numer( r ), frac_denom( r ),
                frac_numer( s ), frac_denom( s ),
                frac_eq( r, s ) ? "true" : "false" );
       /* . . . other relational operation tests */
       t = frac_add( r, s );
       printf( "%i/%i + %i/%i: %i/%i\n",
                frac_numer( r ), frac_denom( r ),
                frac_numer( s ), frac_denom( s ),
                frac_numer( t ), frac_denom( t ) );
       /* . . . other arithmetic operation tests */
       t = frac_div( r, s );
       printf( "%i/%i / %i/%i: %i/%i\n",
                frac_numer( r ), frac_denom( r ),
                frac_numer( s ), frac_denom( s ),
                frac_numer( t ), frac_denom( t ) );
       t = frac_add( frac_invert( frac_mult( r, s ) ), q );
       printf( "1/(%i/%i * %i/%i) + %i/%i: %i/%i\n",
                frac_numer( r ), frac_denom( r ),
                frac_numer( s ), frac_denom( s ),
                frac_numer( q ), frac_denom( q ),
                frac_numer( t ), frac_denom( t ) );
    }

    return 0;
}

/*- ARITHMETIC OPERATION IMPLEMENTATIONS -------------------*/

frac
frac_add( frac a, frac b )
{
   frac r;

     r.nu = a.nu*b.de + a.de*b.nu;
     r.de = a.de * b.de;

     return frac_adjust_asm( r );
}



frac
frac_div( frac a, frac b )
{
   return frac_mult( a, frac_invert( b ) );
}


frac
frac_mult( frac a, frac b )
{
   frac r;    /* r to be product of a and b */

     r.nu = a.nu * b.nu;
     r.de = a.de * b.de;

     return frac_adjust_asm( r );
}


frac
frac_sub( frac a, frac b )
{
   frac r;

     r.nu = a.nu*b.de - a.de*b.nu;
     r.de = a.de * b.de;

     return frac_adjust_asm( r );
}
frac
frac_adjust_asm( frac a )
{
   if ( a.de == 0 )
   {
      /* product is meaningless, use standard form 1/0 */
      a.nu = 1;
   }
   else if ( a.nu == 0 )
   {
      /* product is 0, use standard form 0/1 */
      a.de = 1;
   }
   else if ( a.de < 0 )
   {
      /* denominator is negative, reverse signs */
      a.nu = -a.nu;
      a.de = -a.de;
   }
   return a;
}


frac
frac_invert( frac a )
{
   int   temp = a.nu;   /* use to invert nu and de */

   a.nu = a.de;   /* complete inversion of nu and de */
   a.de = temp;   /* " */

   /* now adjust the inverted fraction */
   if ( a.nu == 0 || a.de == 0 )
   {
      /* original fraction is 0 or meaningless; inverse is 1/0 */
      a.nu = 1;
      a.de = 0;
   }
   else if ( a.de < 0 )
   {
      /* nu, de both negative or nu positive, de negative */
      a.nu = -a.nu;
      a.de = -a.de;
   }
    return a;
}




/*- RELATIONAL OPERATION IMPLEMENTATIONS -------------------*/



int
frac_eq( frac a, frac b )
{
    return (a.nu*b.de == a.de*b.nu) ? 1: 0;
}




int
frac_gt( frac a, frac b )
{
    return (a.nu*b.de > a.de*b.nu) ? 1: 0;
}




int
frac_lt( frac a, frac b )
{
    return (a.nu*b.de < a.de*b.nu) ? 1: 0;
}




/*- FRACTION CREATION FROM COMPONENTS & COMPONENT ACCESS ---*/



frac
frac_create( int n, int d )
{
   frac r;

    r.nu = n;
    r.de = d;

    return frac_adjust_asm( r );
}




int
frac_numer( frac a )
{
    return a.nu;
}




int
frac_denom( frac a )
{
    return a.de;
}




/*- I/O OPERATION IMPLEMENTATIONS --------------------------*/




frac
frac_read( void )
{
   frac r;    /* fraction to be initialized from user input */

    if ( scanf( "%i / %i", &r.nu, &r.de ) != 2 )
    {
       r.nu = 1;
       r.de = 0;
    }

    return frac_adjust_asm( r );
}




void
frac_write( frac a )
{
   printf( "%i/%i", a.nu, a.de );

    return;
}




UNIT 21: STRUCTURE POINTERS AND DYNAMIC ALLOCATION
We present how structures can be accessed using pointers and the indirect selection operator (->), including
the use of structure pointers as function parameters. An introduction to the concepts of dynamic storage
allocation follows, including coverage of the major pitfalls involved.


1. USING POINTERS TO STRUCTURES
1.1 MOTIVATION
a)   Structure arguments to functions are copied in their entirety to the corresponding formal parameters. This
     stands in contrast to the case for the other aggregate type, arrays. (Why? Because the value of an array
     variable is an address, but the value of a structure variable is the whole structure object, so the whole
     structure is copied under C's call-by-value rule.)
b)   If structure arguments are large, much time will be lost in the parameter passing process.
c)   It thus makes sense to avoid the time delays and storage overhead involved in passing large structures to
     function parameters by passing pointers to structures instead.
d)   Another reason to pass a pointer to structure instead of the structure itself is to enable a function to change
     the value of a structure (not a copy of it).

1.2 DECLARING STRUCTURE POINTERS (REVIEW)
a)   General Format of a Structure Pointer Declaration:

              <structure-type> *<variable-name>;

b)   Examples:

          item   it, *pit = &it;                        pit points to it
          item   *it_ra[15];                            it_ra is an array of 15 pointers to item structs
          frac   p = { 22, 7 }, *pp = &p;               pp points to p

c)   Recall that pointers to structures of different types are of different types themselves, so, e.g., pp and pit
     in Examples 1.2(b) above have different types.

1.3 ACCESSING STRUCTURES VIA POINTERS WITH THE -> OPERATOR
a)   Members of a structure type object are accessed via pointers using the -> (indirect selection) operator.
b)   This operator takes two operands: the left operand must be an expression of type "pointer to some
     structure type" and the right operand must be the name of a member of that structure type. Note that the
     order of the operands of the indirect selection operator is important.
c)   General Format of an Indirect Selection Expression:

              <struct-pointer-expression> -> <member-name>

d)   Note that the expression <struct-pointer-expression> -> <member-name> always has the same value as the
     expression (*<struct-pointer-expression>).<member-name>.
e)   Examples (refer to Examples 1.2(b) above):

             pit->id = 123;    set id member of it to 123 (same effect as it.id = 123;
                                 or (*pit).id = 123;)
             pit->price /= 2.0;                item it now at half-price
             pit->price * pit->quantity        total cost ("extended price") of item it
             it_ra[9]->price                   price of the item pointed to by it_ra[9]
             (*it_ra[9]).price                 same as above (precedence of [] higher than *)
             pp->nu = 22;      set numerator of p to 22 (like p.nu = 22; or (*pp).nu = 22;)
             pp->de = 7;       set denominator of p to 7 (like p.de = 7; or (*pp).de = 7;)

1.4 STRUCTURE POINTERS AS FUNCTION PARAMETERS
a)   Inside the function body, the indirect selection operator is used to access the members of the structure
     object that a structure pointer parameter addresses.
b)   Structure pointer parameters allow functions to change the objects addressed by those parameters.



c)   Example:

          item    item_ra[MAX_IDS] = { { 111, 5, 9.00 }, ... };

          double
          single_item_cost( item *itp )               /* argument must be an item pointer */
          {
             return itp->quantity * itp->price;
          }

          double
          bill_cost( item *b, size_t neles )
          {
             size_t   i;
             double   tot_cost = 0.0;

             for ( i = 0; i < neles; ++i )
                tot_cost += single_item_cost( &b[i] );   /* note & operator */

              return ( tot_cost );
          }
          ...
          printf( "cost = $%.2f\n", bill_cost( item_ra, MAX_IDS ) );
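
     Point 1.4(b) can be seen by comparing a function that receives a structure with one that
     receives a structure pointer. In this sketch the layout of the item type is an assumption
     inferred from the initializer in Example 1.4(c), and both function names are hypothetical.

```c
/* assumed layout of the item type; the actual typedef may differ */
typedef struct item
{
   int    id;
   int    quantity;
   double price;
}
item;

/* receives a copy of the structure: the caller's item is unchanged */
void
halve_price_copy( item it )
{
   it.price /= 2.0;
}

/* receives the structure's address: the caller's item is changed */
void
halve_price( item *itp )
{
   itp->price /= 2.0;
}
```

     Given item it = { 111, 5, 9.00 }, the call halve_price_copy( it ) leaves it.price at 9.00,
     while halve_price( &it ) reduces it to 4.50.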


2. DYNAMIC ALLOCATION
2.1 MOTIVATION
a)   Most structures one encounters in practice are larger than pointers. Structures can be hundreds or
     thousands of bytes (think of an employee record). Even a simple structure type like item requires more
     space than a pointer.
b)   Consider a program that reads values into an array of large structures—say the size of a structure object is
     1000 bytes. Suppose further that the program must be able to initialize anywhere from 0 to 100 array
     elements this way. The obvious way to accommodate these requirements is as follows:

                  0                           <big structure #0>
                  .                                   .
                  .                                   .
                  .                                   .
                k-1                          <big structure #k-1>
                  k                                  unused
                  .                                   .
                  .                                   .
                  .                                   .
                 99                                  unused

c)   Note that, if only 1 array element is initialized, 99000 bytes of storage are utterly wasted.
d)   Now assume there is a way to obtain additional storage, as needed, by explicit request. We could then use
     an array of pointers to the structure type in question. When initial values for the next structure are read in,
     we would request just enough space for one structure object, copy the values to this space, and store the
     address of this newly allocated storage area in the next available array element. Here's the picture:

           0     -->     <big structure #0>
           .                     .
           .                     .
           .                     .
         k-1     -->     <big structure #k-1>
           k     unused
           .
           .
           .
          99     unused

e)   Clearly, this is more conservative of storage than using an array of the structures themselves, so long as
     there are sufficiently many unused array elements.
f)   Moreover, when the structure size is appreciably larger than pointer size, the amount of space "wasted" for
     pointer storage is negligible compared to the corresponding array of structures. In the immediately preceding
     examples, the array of structures requires 100000 bytes and a "full" array of pointers to structures requires
     100000+100*sizeof( pointer ) bytes, which would be 100200 bytes, assuming 2-byte pointers: a 0.2%
     "overhead" for pointer storage in the worst case.
g)   Allocation of new storage by explicit request at execution time is called dynamic allocation.
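
     The storage arithmetic of 2.1(c) and 2.1(f) can be checked with two small functions. This is
     only a sketch: the function names are hypothetical, and it ignores any bookkeeping space the
     allocator itself may use.

```c
#include <stddef.h>

/* bytes needed for an array of nslots structures, whether used or not */
size_t
bytes_array_of_structs( size_t nslots, size_t struct_size )
{
   return nslots * struct_size;
}

/* bytes needed for an array of nslots pointers plus nused allocated structures */
size_t
bytes_array_of_ptrs( size_t nslots, size_t nused,
                     size_t struct_size, size_t ptr_size )
{
   return nslots * ptr_size + nused * struct_size;
}
```

     With 100 slots, 1000-byte structures and 2-byte pointers, one used element costs 1200 bytes
     under the pointer scheme against 100000 bytes for the array of structures; a full array costs
     100200 bytes, as stated above.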

2.2 DYNAMICALLY ALLOCATING STORAGE USING malloc
a)   The standard C library provides the malloc function for dynamically allocating storage as needed while a
     program is executing. An "explicit request" for new storage is just a call to the malloc function.
b)   In order to use malloc, you must #include the standard header file stdlib.h, which provides the
     following prototype (the void * type will be discussed shortly):

          void *malloc( size_t );

c)   The sole argument of a malloc call specifies the number of bytes of storage to be allocated, i.e., given the
     invocation malloc( n ), malloc tries to grab a block of n contiguous bytes of storage.
d)   There is a pool of storage from which malloc attempts to grant dynamic allocation requests. This pool is
     called the heap or free store.
e)   If malloc succeeds in finding a block of storage of the size requested, it returns the address of the first
     byte of this storage block. Of course, malloc also ensures that the storage just allocated is no longer
     considered part of the heap—this prevents allocating the same storage in response to multiple requests.
f)   If the amount of contiguous storage requested isn't available, the special value NULL is returned (NULL is
     #define’d in the standard header stddef.h).
g)   Every call to malloc must check its return value for validity against NULL.
h)   Two other standard functions, calloc and realloc, dynamically allocate storage, but malloc is the most
     basic, and the other allocation functions can be simulated with the help of malloc.
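
     As an illustration of 2.2(h), here is a sketch of how calloc's behavior (allocate space for a
     number of objects and set every byte to zero) might be simulated with malloc. The name
     my_calloc is hypothetical; a real program would simply call calloc.

```c
#include <stdlib.h>
#include <string.h>

/*
** hypothetical stand-in for calloc: allocates space for nobj objects
** of the given size and sets every byte of that space to zero
*/
void *
my_calloc( size_t nobj, size_t size )
{
   void *p;

   if ( size != 0 && nobj > (size_t)-1 / size )
      return NULL;               /* nobj * size would overflow */
   if ( (p = malloc( nobj * size )) != NULL )
      memset( p, 0, nobj * size );
   return p;
}
```

     As with malloc, the return value must be checked against NULL before it is used.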

2.3 RELEASING DYNAMICALLY ALLOCATED STORAGE USING free
a)   Just as the need for dynamically allocated storage may arise at some point during program execution, at
     some later point this storage may no longer be needed (think of temporary "scratch" space, or of a program
     that proceeds in several "phases", where the storage dynamically allocated in one phase may not be
     needed in the next phase).
b)   The standard C library provides the free function to release dynamically allocated storage back to the
     heap, so that it may be re-used to satisfy other requests for storage.
c)   In order to use free, you must #include the standard header file stdlib.h, which provides the following
     prototype:




          void free( void * );

c)   If the sole argument of a free call contains the address of the first byte of a block of storage obtained by a
     call to malloc, then free will return that block of storage to the heap, i.e., given the invocation
     free( p ), where p points to a block of dynamically allocated memory, free restores that block of memory
     to the heap.
d)   The argument to free must be a pointer that points to a storage area previously allocated via malloc.
e)   Never call free with a NULL-valued pointer argument.
f)   Example:




          #include <stdio.h>
          #include <stddef.h>
          #include <stdlib.h> /* for malloc and free decl's, etc. */
          ...
          char *s;
          ...
          if ( (s = malloc( 100 )) == NULL )
          {
              printf( "Yiii! Can't allocate storage!\n" );
              exit( 1 );
          }

          /* s can now be treated, in many ways, like an array of 100 chars */
          ...

          free( s );      /* now give back storage, so it can be re-used */
          s = NULL;       /* we explain the reason for doing this later */
          ...




2.4 THE void * TYPE
a)   Note the use of the void * (“pointer to void”, or “void pointer”) type in the declarations of the malloc and
     free functions. This type is the closest thing C has to a universal, or generic, pointer type.
b)   This type is handy because it’s assignment-compatible with every other pointer type (e.g., char *,
     int *, frac *, int (*)[30], etc.). Thus, an expression of type void * can be assigned to an object
     of any pointer type and an expression of any pointer type can be assigned to an object of type void *.
c)   In particular, when a function has void * as its return type (like malloc), the function’s return value may
     be assigned to any pointer type—an explicit type cast is not necessary (but couldn't hurt).
d)   Similarly, when a function has a parameter of type void * (like free), a pointer-valued expression of any
     type may be passed to it—an explicit type cast is not necessary.



e)   Examples:

          void *pv;
          int *pi;
          char *s;

          s = pi;                ERROR!! incompatible pointer types
          pi = s;                ERROR!! incompatible pointer types
          pv = pi;               OK, since pv is of void * type
          pi = pv;               OK, since pv is of void * type
          pv = s;                OK, since pv is of void * type
          s = pv;                OK, since pv is of void * type
          s = malloc( 100 );     OK, since malloc return value has void * type
          free( s );             OK, since free parameter type is void *
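
     A function that both accepts and returns void * shows the two conversions working together.
     In this sketch the helper mem_dup is hypothetical; it relies only on the standard malloc and
     memcpy functions.

```c
#include <stdlib.h>
#include <string.h>

/*
** hypothetical helper: returns a dynamically allocated copy of the
** n bytes starting at src, or NULL if the allocation fails
*/
void *
mem_dup( const void *src, size_t n )
{
   void *p = malloc( n );

   if ( p != NULL )
      memcpy( p, src, n );
   return p;
}
```

     Given int ia[3] and char s[] = "frac", both int *pi = mem_dup( ia, sizeof ia ); and
     char *ps = mem_dup( s, sizeof s ); are accepted without casts; each copy is later released
     with free.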




2.5 A SIMPLE COMPLETE EXAMPLE

          #include <stdio.h>
          #include <stddef.h>
          #include <stdlib.h>

          #define MAX_IDS 100

          <typedef for item type>
          <definition of single_item_cost; see Example 1.4(c)>

          int
          main( void )
          {
              item *itemp_ra[MAX_IDS];   /* note that itemp_ra is an array of pointers */
              size_t limit, k;
              int    number;
              double price, sum = 0.0;

              /* stop at MAX_IDS items so that itemp_ra cannot overflow */
              for ( limit = 0;
                    limit < MAX_IDS && scanf( "%lf%i", &price, &number ) == 2;
                    ++limit )
              {
                 if ( (itemp_ra[limit] = malloc( sizeof( item ) )) == NULL )
                    exit( 1 );
                 itemp_ra[limit]->price = price;
                 itemp_ra[limit]->quantity = number;
              }

              for ( k = 0; k < limit; ++k )
                 sum += single_item_cost( itemp_ra[k] );   /* & not needed (compare 1.4(c)) */
              printf( "Total cost of %lu different items is %.2f\n",
                      (unsigned long) limit, sum );

              for ( k = 0; k < limit; ++k )
                 free( itemp_ra[k] );

              return 0;
          }




2.6 COMMON PITFALLS
a)   Failing to check the return value of malloc against NULL. Using a NULL pointer to access storage usually
     causes a run-time error of some sort. The NULL pointer value is typically zero, and storage beginning at
     byte address zero seldom belongs to the executing program. One of the cardinal rules of defensive
     coding (coding that ensures all objects have appropriate values before using them) is: check the return
     value of a malloc call before proceeding.
b)   Failing to check that a pointer is non-NULL-valued before passing it as an argument to free. While a
     NULL argument will be ignored by a fully Standard-compliant C compiler, such arguments are still
     problematic in many compilers. NULL checking is especially important inside a function that has a pointer
     parameter whose associated storage is to be free'd from within that function. The defensive coder will
     always ensure that a free argument is non-NULL. (Note that, in Example 2.3(f), we assume that the
     malloc and free calls are both contained within the same function, so if the malloc call is successful
     we needn't check whether s is non-NULL before the call to free; of course, it wouldn't hurt us to do so).
c)   Dangling pointers. A dangling pointer is a non-NULL-valued pointer that points to storage that has already
     been freed. If such a pointer is used to access storage, say using the -> operator, the results can be
     disastrous (program termination in many cases). To avoid dangling pointers, take care to assign NULL to
     any pointer whose associated storage has been free’d.


              char *s = malloc( 100 );

              if ( s == NULL )
                 exit( 1 );
              free( s );
              s = NULL;  /* without this assignment s would still be non-NULL, but it would address
                            storage that no longer belongs to this program */


                    s   -->   <100-byte block of storage>

                    (after freeing storage, this pointer should be NULLed out)




d)   Orphaned storage. Suppose you've called malloc to allocate some space, and saved the return value in
     the pointer variable p. If you subsequently change p's value without either freeing its storage or saving its
     value in another pointer, then there will be no way to access the storage malloc'd to p—this storage has
     become "orphaned". Because you can no longer access this storage, you cannot release it back to the
     heap. This phenomenon is usually the cause of memory leakage, or progressive loss of accessible
     storage during the execution of a program that does dynamic allocation.



              char c, *p = malloc( 100 );

              if ( p == NULL )
                 exit( 1 );
              p = &c;    /* p no longer addresses the 100-byte block of dynamically allocated
                            storage which, presumably, we no longer need; if we view pointers to
                            objects as parents, it is clear that this block is now an orphan; note
                            that, as we have neglected to return the block to the free store, it
                            cannot be re-allocated to satisfy a subsequent request for storage */
              [Figure: p now addresses c; there is no path to the <100-byte block of storage>;
              it can no longer be accessed.]
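Two ways of retargeting a pointer without orphaning its storage are sketched below, under the assumption that the block is either still wanted (save its address first) or no longer wanted (free it first). The function name retarget_demo and the variable keep are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: returns 0 on success, 1 if malloc fails. */
int
retarget_demo( void )
{
   char  c = 'x';
   char *p = malloc( 100 );
   char *keep;               /* a second "parent" for the block */

   if ( p == NULL )
      return 1;

   /* Option 1: save the address before changing p */
   keep = p;
   p = &c;                   /* block is still reachable via keep */
   assert( *p == 'x' );
   free( keep );             /* so it can still be released       */
   keep = NULL;

   /* Option 2: free the block before changing p */
   p = malloc( 100 );
   if ( p == NULL )
      return 1;
   free( p );                /* release while p still addresses it */
   p = &c;                   /* now nothing is orphaned            */

   return ( *p == 'x' ) ? 0 : 1;
}
```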

UNIT 22: CONSTRUCTORS, ACCESSORS AND DESTRUCTORS
We show how to implement abstract data types (ADTs) in Object-Oriented Programming (OOP) fashion, using
constructors, destructors and accessors. As an example, we present an OOP-ified version of program rect.c
from Lab 13.


1. OBJECT-ORIENTED PROGRAMMING
a)   Object-Oriented Programming is a "philosophy" of programming in which programs are viewed as sets of
     interacting objects which represent instances of various conceptual types called classes.
b)   Things get done in an object-oriented program by having objects send messages to one another. For
     example, a message may request the value of some attribute of an object, called an instance value.
     Another type of message may tell the object to which it is sent to change a particular instance value
     according to some parameter(s) that accompany the message.
c)   The OOP philosophy is "data-centric" rather than "procedure-centric". Experience has proven that data-
     centric code is more flexible, reliable and re-usable.
d)   The OOP programming style is best supported by Object-Oriented languages like C++, of which the C
     language forms a subset (more or less). Object-Oriented languages directly support the message-passing
     paradigm. C does not.
e)   We can approximate some aspects of OOP style in C, however:
              OOP concept                      C approximation
              class                            ADT (where the ADT data component is of structure type)
              object (instance of class)       object (instance of ADT data component)
              instance value                   attribute (structure member) of ADT object
              sending message to an object     calling ADT function with object of ADT type as a parameter
f)   Good C programmers, who compartmentalize their programs into modules implementing ADTs, and who
     rely on calling ADT functions to get things done, are using OOP-like techniques.


2. ADTS, OOP-STYLE
2.1 CONSTRUCTORS
a)   The data components of ADTs are generally structures. Most structures are larger than pointers—often
     much larger. As a result, it pays to pass a pointer to an ADT object to ADT functions instead of the whole
     object itself. Of course, this must be done if an ADT function is to change an ADT object argument.
b)   To maintain abstractness (i.e., to hide the fact that we're using pointers), the ADT name should be
     typedef'd as a structure pointer type, not a structure type. But then declaring a variable to have the ADT
     type just allocates space for a structure pointer, so we must somehow allocate space for—and initialize—an
     instance of the corresponding structure.
c)   We call a function a constructor (ctor for short) if it creates (i.e., allocates space for) and initializes a new
     object. In particular, an ADT function is called a constructor if it creates and initializes a new instance of the
     data component of the same ADT. An ADT may have more than one constructor function.
d)   An ADT constructor function returns a pointer to the new ADT object. The pointer's value is the address of
     the (first byte of) storage allocated for the new ADT object.

2.2 ACCESSORS
a)   To preserve abstractness, we need a way to access attribute values of ADT objects that doesn't require
     knowledge of the innards of the ADT data structure.
b)   An accessor (asor for short) is a function that either retrieves or sets a value associated with an object. In
     particular, an ADT function is called an accessor if it retrieves or sets the value of an attribute of the data
     component of the same ADT.
c)   Accessors that retrieve object values are called readers, while accessors that set object values are called
     writers. Typically, there is at least one reader, and one writer, for every attribute of an ADT object.
d)   IMPORTANT RULE: Clients of an ADT must use ADT accessor functions to access ADT object values.
     This restriction makes it possible for the implementer of an ADT to change the internal structure of the ADT
     without invalidating client code. Programs that follow this rule have fewer bugs and are easier to maintain.

2.3 DESTRUCTORS
a)   When an ADT object is no longer needed, the storage it occupies should be returned to the heap, or free
     store, where it may be re-used to fulfill later dynamic storage allocation requests.
b)   A destructor (dtor for short) is a function that releases all the storage allocated for an object. In particular,
     an ADT function is called a destructor if it releases the storage for an object that was allocated by a
     constructor of the same ADT.
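
The constructor, accessor and destructor roles described in 2.1 through 2.3 can be sketched on a deliberately tiny ADT. The counter type and all counter_* function names below are hypothetical and exist only for illustration; the rect ADT in the next section follows the same pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ADT data component */
typedef struct counter_st
{
   int value;               /* the single attribute */
}
counter_st;

/* The ADT name is typedef'd as a structure POINTER type */
typedef counter_st *counter;

/* constructor: allocates and initializes; returns NULL on failure */
counter
counter_construct( int start )
{
   counter c = malloc( sizeof(counter_st) );

   if ( c != NULL )
      c->value = start;
   return c;
}

/* reader accessor: retrieves the attribute */
int
counter_value( counter c )
{
   assert( c != NULL );
   return c->value;
}

/* writer accessor: sets the attribute, returns the old value */
int
counter_set_value( counter c, int value )
{
   int old;

   assert( c != NULL );
   old = c->value;
   c->value = value;
   return old;
}

/* destructor: releases storage allocated by the constructor */
void
counter_destroy( counter c )
{
   if ( c != NULL )
      free( c );
}
```

A client would call counter_construct once, use only counter_value and counter_set_value to touch the object, and finish with counter_destroy followed by assigning NULL to its own variable.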


3. AN OOP-STYLE ADT REPRESENTING A RECTANGLE
The following program, rect2.c, shows how to implement an ADT in a (relatively) object-oriented fashion and
provides constructor/accessor/destructor examples. Please note the following carefully:

             The ADT is called rect like the ADT of Lab 13, but rect is a pointer type, not a
              structure type, for reasons of flexibility and efficiency as discussed above.

             The relevant operations on rect type objects are implemented as functions which
              have a parameter or return value of rect type. These are "ADT functions"—they are
              considered part of the rect ADT.

             The only function which is a client of the rect ADT happens to be the main function.
              Note that, although the rect type definition is available to main, this function never
              peeks inside objects of rect type directly—the rect ADT functions are called upon
              to perform all direct manipulation of rect objects.

             If we plan on using the rect type in more than one program, it would be wise to
              split the rect ADT code into two files: a header file containing the relevant
              typedefs and function declarations (i.e., prototypes), and an implementation file
              containing the definitions of the functions declared in the corresponding header file.
              Then, any source file that wants to use the rect ADT need only #include the
              ADT's header file.

/*
**   rect2.c - Demonstrates a user-defined abstract data
**             type for representing rectangles.
**
**   NOTE 1: This program utilizes the Borland Graphics
**   Interface (BGI) package for actual graphics rendering,
**   and is therefore highly non-portable. The BGI
**   functions called below are:
**
**   closegraph        shuts down the graphics system
**   initgraph         initializes graphics mode
**   graphresult       returns an initialization result code
**   grapherrormsg     converts error codes to error messages
**   line              draws line between two points on screen
**   setcolor          sets the color for subsequent drawing
**
**   NOTE 2: The non-standard library functions getch and
**   sleep are used. getch is similar to getchar, except
**   that user input is not echoed to the screen. Its use
**   below, for detecting a keypress without disturbing the
**   screen display, is typical. sleep causes the program
**   to pause.
*/


/* standard headers */
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>               /* malloc/free declarations */

/* non-standard headers */
#include <graphics.h>      /* for BGI decl's and def's */
#include <conio.h>         /* for getch declaration    */
#include <dos.h>           /* for sleep declaration    */


#define         MAX_RECTS   10    /*   max. # of rects to draw   */
#define         XMAX        639   /*   maximum x coordinate      */
#define         YMAX        479   /*   maximum y coordinate      */
#define         MAXCOLOR    15    /*   0 <= color <= MAXCOLOR    */

/*--- TYPEDEFS ----------------------------------------*/

typedef struct point
{
   int   x; /* x coordinate */
   int   y; /* y coordinate */
}
point;

typedef struct rect_st
{
   point uleft;    /* upper-left corner */
   point lright;   /* lower-right corner */
   int   color;    /* rendering color    */
}
rect_st;

typedef rect_st   *rect; /* a rect obj is a ptr!!! */


/*--- FUNCTION PROTOTYPES -----------------------------*/

/* graphics initialization/termination functions */
int   graphics_init( void );
void graphics_term( void );

/* rect type functions */
/* constructor */
rect rect_construct( int, int, int, int, int );
/* destructor */
void rect_destroy( rect );

/* accessors */
int   rect_color( rect );              /* reader */
int   rect_set_color( rect, int );     /* writer */

/* other rect functions */
void rect_draw( rect r );

/*--- FUNCTION DEFINITIONS ----------------------------*/

/*-----------------------------------------------------*/
/* main                                                */
/*                                                     */
/* Test driver for rect type. Note that no BGI calls */
/* are made (even though calls like rect_draw          */
/* devolve into calls to BGI routines). This keeps     */
/* main independent of the particular graphics         */
/* package used for actual rendering.                  */
/*                                                     */
/* PARAMETERS                                          */
/*    none                                             */
/*                                                     */
/* RETURNS                                             */
/*    0 for successful completion                      */
/*    1 for failure                                    */
/*-----------------------------------------------------*/

int
main( void )
{
    rect    ra[MAX_RECTS]; /* array of rect objects */
    size_t  count, i, j;   /* loop counters         */
    int     ulftx = 0, ulfty = 0,
            lrgtx = 100, lrgty = 100, color = 1;

   /* set up rect array ra */
   for ( count = 0; count < MAX_RECTS; ++count )
   {
      if ( (ra[count] =
           rect_construct(ulftx,ulfty,lrgtx,lrgty,color))
           == NULL )
         break;
      ulftx += 10; ulfty += 10;
      lrgtx += 20; lrgty += 20;
      color = (color + 1) % MAXCOLOR;
   }

   /* initialize graphics system */
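    if ( graphics_init() != 0 )
    {
       /* free the rectangles before giving up */
       for ( i = 0; i < count; ++i )
          rect_destroy( ra[i] );
       return 1;
    }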

    /* draw rectangles in ra 5 times, shifting colors */
    for ( i = 0; i < 5; ++i )
    {
       for ( j = 0; j < count; ++j )
          rect_draw( ra[j] );
       for ( j = 0; j < count; ++j )
          rect_set_color( ra[j],
                   (rect_color( ra[j] ) + 1) % MAXCOLOR );
       sleep( 1 );        /* pause for a second */
    }

    printf( "Press any key to quit" );
    getch();

    graphics_term();   /* terminate graphics mode */

    /* now free the storage allocated for rectangles */
    for ( i = 0; i < count; ++i )
       rect_destroy( ra[i] );

    return 0;
}

/*-----------------------------------------------------*/
/* graphics_init                                       */
/*                                                     */
/* Initializes BGI graphics system. The details of     */
/* what actually goes on in this process are highly    */
/* technical and do not concern us.                    */
/*                                                     */
/* PARAMETERS                                          */
/*    none                                             */
/*                                                     */
/* RETURNS                                             */
/*    0 for successful completion                      */
/*    1 for failure                                    */
/*-----------------------------------------------------*/

int
graphics_init( void )
{
    int  gdriver = DETECT; /* DETECT from graphics.h */
    int  gmode;
    int  errorcode;

    initgraph( &gdriver, &gmode, "" );
    errorcode = graphresult();
    if ( errorcode != grOk )
    {
       printf( "Graphics error: %s\n",
               grapherrormsg( errorcode ) );
       printf( "Press any key to halt:" );
       getch();
       return 1;
    }

    return 0;
}

/*-----------------------------------------------------*/
/* graphics_term                                       */
/*                                                     */
/* Shuts down BGI graphics processing in an orderly    */
/* manner; the details are unimportant.                */
/*                                                     */
/* PARAMETERS                                          */
/*    none                                             */
/*                                                     */
/* RETURNS                                             */
/*    no return value                                  */
/*-----------------------------------------------------*/

void
graphics_term( void )
{
   closegraph();

    return;
}

/*-----------------------------------------------------*/
/* rect_construct                                      */
/*                                                     */
/* Creates and initializes a new rectangle.            */
/*                                                     */
/* PARAMETERS                                          */
/*    ulftx : x coordinate of upper left corner        */
/*    ulfty : y coordinate of upper left corner        */
/*    lrgtx : x coordinate of lower right corner       */
/*    lrgty : y coordinate of lower right corner       */
/*    color : rendering color                          */
/*                                                     */
/* RETURNS                                             */
/*    initialized rect, if successful                  */
/*    NULL, if unsuccessful                            */
/*-----------------------------------------------------*/

rect
rect_construct( int ulftx, int ulfty,
                int lrgtx, int lrgty, int color )
{
   rect r;

    if ( (r = (rect) malloc( sizeof(rect_st) )) == NULL )
       return NULL;
    r->uleft.x = ulftx;
    r->uleft.y = ulfty;
    r->lright.x = lrgtx;
    r->lright.y = lrgty;
    r->color = color;

    return r;
}

/*-----------------------------------------------------*/
/* rect_destroy                                        */
/*                                                     */
/* Destroys a rectangle by freeing storage allocated   */
/* to it. If the input parameter is bad, prints an     */
/* appropriate error message and exits the program.    */
/*                                                     */
/* PARAMETERS                                          */
/*    r : rectangle to be destroyed                    */
/*                                                     */
/* RETURNS                                             */
/*    no return value                                  */
/*-----------------------------------------------------*/

void
rect_destroy( rect r )
{
   if ( r == NULL )
   {
      printf( "rect_destroy: bad parameter value\n" );
      exit( 1 );
   }

    free( r );

    return;
}

/*-----------------------------------------------------*/
/* rect_draw                                           */
/*                                                     */
/* Draws the input rectangle on screen. The lines      */
/* forming the sides of the rectangle are drawn in the */
/* order: top, right, bottom, left.                    */
/*                                                     */
/* PARAMETERS                                          */
/*    r : a rectangle object                           */
/*                                                     */
/* RETURNS                                             */
/*    no return value                                  */
/*-----------------------------------------------------*/

void
rect_draw( rect r )
{
   if ( r == NULL )
   {
      printf( "rect_draw: bad parameter value\n" );
      exit( 1 );
   }

    setcolor( r->color );
    line( r->uleft.x, r->uleft.y,
          r->lright.x, r->uleft.y );
    line( r->lright.x, r->uleft.y,
          r->lright.x, r->lright.y );
    line( r->lright.x, r->lright.y,
          r->uleft.x, r->lright.y );
    line( r->uleft.x, r->lright.y,
          r->uleft.x, r->uleft.y );

    return;
}

/*-----------------------------------------------------*/
/* rect_color                                          */
/*                                                     */
/* Returns the color of the input rectangle. If the    */
/* input parameter is bad, prints an appropriate error */
/* message and exits the program.                      */
/*                                                     */
/* PARAMETERS                                          */
/*    r : a rectangle object                           */
/*                                                     */
/* RETURNS                                             */
/*    color code of r                                  */
/*-----------------------------------------------------*/

<define rect_color>




/*-----------------------------------------------------*/
/* rect_set_color                                      */
/*                                                     */
/* Sets the color of the input rectangle to the input */
/* color. If one of the parameters is bad, prints      */
/* appropriate error msg and exits.                    */
/*                                                     */
/* PARAMETERS                                          */
/*    r     : rectangle whose color is to be set       */
/*    color : new color for rectangle r                */
/*                                                     */
/* RETURNS                                             */
/*    the old color value of r                         */
/*-----------------------------------------------------*/

<define rect_set_color>
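
The two accessor definitions are left as placeholders above. One possible way to fill them in is sketched here, modelled on the NULL check used by rect_draw and rect_destroy. What counts as a "bad" color is an assumption (here: anything outside 0..MAXCOLOR, following the comment on MAXCOLOR). The typedefs and MAXCOLOR are repeated only so that the fragment compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAXCOLOR 15          /* repeated from rect2.c */

typedef struct point { int x; int y; } point;

typedef struct rect_st
{
   point uleft;              /* upper-left corner  */
   point lright;             /* lower-right corner */
   int   color;              /* rendering color    */
}
rect_st;

typedef rect_st *rect;

/* reader: returns the color code of r */
int
rect_color( rect r )
{
   if ( r == NULL )
   {
      printf( "rect_color: bad parameter value\n" );
      exit( 1 );
   }

   return r->color;
}

/* writer: sets the color of r, returns the old color value */
int
rect_set_color( rect r, int color )
{
   int old;

   /* assumption: a color outside 0..MAXCOLOR counts as "bad" */
   if ( r == NULL || color < 0 || color > MAXCOLOR )
   {
      printf( "rect_set_color: bad parameter value\n" );
      exit( 1 );
   }

   old = r->color;
   r->color = color;

   return old;
}
```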

UNIT 23: MODULAR CODE AND MULTI-FILE PROGRAMS I
We devote this Unit to coverage of an extended example: a generic list ADT (having the typical list.h/list.c
module organization) which will later be used in the construction of a multi-file program. The basic concepts of
list processing are covered along the way. We also discuss issues that arise in building up programs out of
multiple source files.


1. ELEMENTS OF LIST PROCESSING
a)   When we discussed dynamic allocation, we showed how using an array of pointers to large objects can be
     more space-efficient than using an array of these objects themselves. Nevertheless, when a pointer array
     is used, we cannot allocate any more objects than there are pointers in the array. [See UNIT 16.]
b)   There is a more flexible and space-efficient way to store a collection of objects, whose number cannot be
     determined in advance, that doesn't involve the pre-allocation of any sort of array. Here's how: space is
     allocated for each object only as needed, but we allocate just enough extra space to store a pointer with
     each object. The objects thus allocated can be linked together via their pointer areas: the pointer stored
     with each object addresses the next object in the collection.




c)   A collection of objects linked together this way is called a linked list, or just a list. The pointer in each
     object is called a next pointer. The last object in a list is distinguished by having a NULL-valued next
     pointer. We call each data+pointer object an element of the list. Note that there are no wasted pointers.
d)   The type of data need not be the same in each list element.
e)   To build an ADT that manages such lists, whatever their contents, we define a list object to be a pointer to
     a small descriptor structure [show list.h listing]. Here's the picture (after 1 constructor and 3 add calls):

                       [Figure: list lst points to a descriptor structure with members count, front
                       and rear.]
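
How a "data+pointer" element can be laid out and linked is sketched below. This is an illustration only, not the list module itself: node_top, node_alloc, node_data and node_demo are hypothetical names, and the sketch assumes the user data needs no stricter alignment than a pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list overhead that precedes each user area */
typedef struct node_top
{
   struct node_top *next;   /* the "next pointer" */
}
node_top;

/* The user area starts right after the overhead structure */
#define node_data(p)  ( (void *) ((p) + 1) )

/* Allocates overhead + es bytes and links the new element after
** prev (if prev is non-NULL). Returns the new element, or NULL. */
node_top *
node_alloc( node_top *prev, size_t es )
{
   node_top *p = malloc( sizeof(node_top) + es );

   if ( p == NULL )
      return NULL;
   p->next = NULL;          /* last element has a NULL next pointer */
   if ( prev != NULL )
      prev->next = p;
   return p;
}

/* Hypothetical demo: builds a 3-element list of ints, sums them by
** following next pointers, then frees the list. Returns the sum,
** or -1 if an allocation fails. */
int
node_demo( void )
{
   node_top *front = NULL, *rear = NULL, *p, *r;
   int       i, sum = 0, ok = 1;

   for ( i = 1; i <= 3 && ok; ++i )
   {
      p = node_alloc( rear, sizeof(int) );
      if ( p == NULL )
         ok = 0;
      else
      {
         *(int *) node_data( p ) = i * 10;
         if ( front == NULL )
            front = p;
         rear = p;
      }
   }

   for ( p = front; p != NULL; p = p->next )
      sum += *(int *) node_data( p );

   for ( p = front; p != NULL; )   /* save next before freeing */
   {
      r = p;
      p = p->next;
      free( r );
   }

   return ok ? sum : -1;
}
```

Because each element carries its own next pointer, no array of pointers is pre-allocated and the list can grow for as long as malloc succeeds.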

2. ISSUES IN CODE MODULARIZATION
a)   As discussed previously [see UNIT 15], if an ADT is needed in a source file, the interface (header) file for
     that type is #included. Here, that means list.h. The implementation file, list.c, will define all
     list functions. There will be a client of the list type, lsclient.c, to be given later.
b)   It is possible for the same header file to be #included more than once in the process of compiling a single
     source file, since inclusions may be nested. This raises the possibility of "multiple declaration" and
     "re-definition" errors during compilation. To avoid such problems, headers should be enclosed in an
     "#include sandwich". [Point out the #include sandwich in list.h, and give a quick discussion of
     conditional compilation: #if, #else and #endif, together with the defined predicate.]
c)   While the C standard demands that a compliant compiler recognize at least the first 31 characters of an
     internal identifier, it guarantees far fewer for external identifiers (only 6 in C89), because linkers
     vary. Thus, if the linker only distinguishes identifiers by using the first 8 characters, say, the function
     names a_long_func_name_1 and a_long_func_name_2 will be viewed as the same, to disastrous effect (we
     say the two identifiers cannot be disambiguated). There is a way to dodge this problem: use cover
     macros. A cover macro is just a macro with a long, meaningful name that expands into a call to a
     function whose name is short enough not to cause disambiguation problems. [Point out the cover macros
     in list.h, noting carefully that the list_destroy macro does a little extra by passing the address of
     the macro argument to ldestroy; this is so a list_destroy call can prevent its argument from becoming
     a dangling pointer.]
d)   Yet another problem is "pollution of the name space", which occurs when more than one source file defines
     global objects (data or functions) with the same name, causing linker errors. Many file-level objects, though
     defined so as to be globally available, are only used in the source file where their declarations appear. The
     solution: prefixing the reserved word static to a global declaration renders the declaration private to the
     containing source file (it's still a file-level object, but cannot be accessed outside of the source file in which it
     is declared). [Point out the static function declarations in list.c.]
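
The three devices in (b) through (d) are sketched together below in a single file, so that the effect of the #include sandwich can be seen without a separate header. Everything here is hypothetical: the shape type, the SHAPE_H guard name, the shape_area_compute cover macro, and the sarea and clamp_nonneg functions are illustration only and are not part of the list module.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a header "shape.h" might contain (hypothetical) --- */
#if !defined(SHAPE_H)
#define SHAPE_H

typedef struct { int w, h; } shape;

/* cover macro: long readable name, short external name */
#define shape_area_compute(s)  sarea( (s) )

int sarea( shape * );

#endif /* if !defined(SHAPE_H) */

/* --- a second inclusion of the same header is now harmless --- */
#if !defined(SHAPE_H)
#define SHAPE_H
typedef struct { int w, h; } shape;   /* skipped: SHAPE_H is defined */
#endif

/* --- what "shape.c" might contain (hypothetical) --- */

/* static: private to this source file, so the name cannot clash
** with a clamp_nonneg defined in some other source file */
static int
clamp_nonneg( int v )
{
   return ( v < 0 ) ? 0 : v;
}

int
sarea( shape *s )
{
   assert( s != NULL );
   return clamp_nonneg( s->w ) * clamp_nonneg( s->h );
}
```

Clients write shape_area_compute( &sh ); the preprocessor turns that into a call to sarea, which is the only name the linker has to tell apart.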


3. MULTI-FILE PROGRAMS
a)   Recall that source files are compiled individually, yielding an object file for each source file. It's the job of
     the linker to tie the object files together into a single executable file [see UNIT 2.3.1 and UNIT 5.2(d)].
b)   The linker also brings in pre-compiled object code from the standard library.
c)   If a module is likely to be used often, its object code should be placed in a "user" library, so that using it
     becomes as easy as using standard library functions.
d)   Most environments provide a "make" facility or a "project" facility that helps automate the process of
     building a program from multiple source files.


4. AN OOP-STYLE LIST ADT MODULE
The following module, list.h/list.c, shows how to implement a generic list-handler in a (relatively) object-
oriented fashion. Please note the following carefully:

             This kind of pointer-intensive processing is not unusual in C; in fact, it's the norm.

             As with the rect ADT of UNIT 22, the list type is a pointer type, not a structure
              type, for the usual flexibility and efficiency reasons.

             The code comprising this ADT is split into two files, a .h header file (e.g., list.h)
              containing the relevant typedefs, cover macros and function declarations (i.e.,
              prototypes), and a .c implementation file (e.g., list.c) containing the definitions
              of the functions declared in the corresponding header file.

             We follow the OOP method, providing constructor/accessor/destructor functions (or
              macros, like list_first, list_last, list_isempty and list_length).

          /*
          ** list.h - list module interface file.
          */

          #if !defined(LIST_H)
          #define LIST_H

          /*- INCLUDES ---------------------------------*/

          #include <stdio.h>
          #include <stddef.h>

          /*- TYPEDEFS ---------------------------------*/

          /*
          ** Generic list element top section; defines list
          ** management overhead structure that precedes
          ** space requested by user for application-
          ** specific list element.
          */
          typedef struct list_ele_top
          {
             struct list_ele_top *next;
          }
          list_ele_top;

          /*
          ** A list is represented by a pointer to a header
          ** structure containing control variables for
          ** list management.
          */
          typedef struct
          {
             int            count; /* #elements in list */
             list_ele_top *front; /* ptr to 1st element */
             list_ele_top *rear; /* ptr to last element*/
          }
          list_hdr;

          typedef list_hdr *list;

          /*- MACROS ------------------------------------*/

          /*
          **   All list functions are given long names for
          **   readability via macros; the actual function
          **   names are kept short for portability.
          **   List functions should always be invoked via
          **   macros. Macro arguments below are interpreted
          **   as follows:
          **
          **   es = size of element to be added to list
          **   l = object of type list
          */

          #define  list_add(l,es)   ladd((l),(es))
          #define  list_clear(l)    lclear((l))
          #define  list_construct() lconstruct()
          #define  list_remove(l)   lremove((l))
          #define  list_destroy(l) ldestroy(&(l))
          #define  list_first(l) ( ((l)->front != NULL)   \
                                   ?(void *)((l)->front+1)\
                                   : NULL )
          #define list_isempty(l) (((l)->count) == 0)
          #define list_last(l) ( ((l)->rear != NULL)      \
                                   ? (void *)((l)->rear+1)\
                                   : NULL )
          #define list_length(l)    ((l)->count)

          /*- PUBLIC FUNCTION PROTOTYPES ----------------*/

          void *ladd( list, size_t );
          void lclear( list );
          list lconstruct( void );
          void ldestroy( list * );
          int   lremove( list );

          #endif /* if !defined(LIST_H) */

          /*
          ** list.c - list module implementation file.
          */

          /*- INCLUDES ----------------------------------*/

          #include   <stdio.h>
          #include   <stddef.h>
          #include   <stdlib.h>
          #include   <stdarg.h> /* for var. arg. list mgmt */

          #include "list.h"     /* in quotes, not brackets */

          /*- PRIVATE FUNCTION PROTOTYPES ---------------*/

          static void lcheck( list, char * );
          static void lerror( char *, ... ); /* var arg */

          /*- FUNCTION DEFINITIONS ----------------------*/

          /*---------------------------------------------*/
          /* ladd                                        */
          /*                                             */
          /* Adds element at rear of list. Caller        */
          /* supplies size of storage required to hold   */
          /* list element, and ladd allocates that amount*/
          /* plus whatever storage is needed to maintain */
          /* the list structure.                         */
          /*                                             */
          /* PARAMETERS                                  */
          /*   l : list to which element is to be added */
          /*   es : size of element to be added to list */
          /*                                             */
          /* RETURNS                                     */
          /*   a pointer to the user area of new element */
          /*---------------------------------------------*/

          void *
          ladd( list l, size_t es )
          {
             list_ele_top *p;

            lcheck( l, "ladd" );       /* valid list? */

            /* alloc requested space + list mgmt space */
            if ( (p = malloc( es + sizeof(list_ele_top) ))
                 == NULL )
              lerror("ladd: no space for list element\n");

            /* link allocated space into list */
            if ( l->front == NULL )
               l->front = p;
            else
               (l->rear)->next = p;

            l->rear = p;    /* new ele is last in list */
            p->next = NULL; /* rear ele next ptr is NULL*/
            ++l->count;     /* increment list count */

            /* return pointer to user area */
             return ( (void *) (p + 1) );
          } /* ladd */
          /*---------------------------------------------*/
          /* lcheck                                      */
          /*                                             */
          /* Checks integrity of a list. Terminates      */
          /* program with an error message if integrity */
          /* check fails.                                */
          /*                                             */
          /* PARAMETERS                                  */
          /*   l : list whose integrity is to be checked */
          /*   s : string to display if integrity check */
          /*       fails                                 */
          /*                                             */
          /* RETURN VALUE                                */
          /*   none                                      */
          /*---------------------------------------------*/

          static void
          lcheck( list l, char *s )
          {
             if ( (l == NULL)
                 || (l->front == NULL && l->rear != NULL)
                 || (l->front != NULL && l->rear == NULL)
                 || (l->front == NULL
                      && list_length( l ) != 0)
                 || (l->front != NULL
                      && list_length( l ) == 0)
                 || (list_length( l ) < 0) )
                lerror("lcheck: messed up list (%s)\n", s);

            return;

          } /* lcheck */

          /*---------------------------------------------*/
          /* lclear                                      */
          /*                                             */
          /* Removes all elements from a list.           */
          /*                                             */
          /*                                             */
          /* PARAMETERS                                  */
          /*   l : list whose elements are to be removed */
          /*                                             */
          /* RETURNS                                     */
          /*   no return value                           */
          /*---------------------------------------------*/

          void
          lclear( list l )
          {
             list_ele_top *p, *r;

            lcheck( l, "lclear" );         /* valid list? */

            /* free all list elements */

            for ( p = l->front;        p != NULL; )
            {
               r = p;
               p = p->next;
               free( r );
            }

            /* reset list control variables */

            l->front = l->rear = NULL;
            l->count = 0;

            return;

          } /* lclear */

          /*---------------------------------------------*/
          /* lconstruct                                  */
          /*                                             */
          /* Creates and returns a list (i.e., allocates */
          /* and initializes a list header.)             */
          /*                                             */
          /* PARAMETERS                                  */
          /*    none                                     */
          /*                                             */
          /* RETURNS                                     */
          /*    newly created list (i.e., ptr to alloc'd */
          /*    list header)                             */
          /*---------------------------------------------*/

          list
          lconstruct( void )
          {
             list l;

            if ( (l = (list) malloc( sizeof( list_hdr ) ))
                 == NULL )
               lerror( "lconstruct: no space for list\n" );

            l->front = NULL;
            l->rear = NULL;
            l->count = 0;

            return ( l );

          } /* lconstruct */

          /*---------------------------------------------*/
          /* ldestroy                                    */
          /*                                             */
          /* Destroys a list by freeing all storage      */
          /* allocated to it.                            */
          /*                                             */
          /* PARAMETERS                                  */
          /*    lp : ptr to list (list hdr ptr) to be */
          /*           destroyed                         */
          /*                                             */
          /* RETURNS                                     */
          /*    no return value                          */
          /*---------------------------------------------*/

          void
          ldestroy( list *lp )
          {
             list_ele_top *p, *r;

            lcheck( *lp, "ldestroy" );     /* valid list? */

            /* free all list elements */

            for ( p = (*lp)->front;     p != NULL; )
            {
               r = p;
               p = p->next;
               free( r );
            }

            free( *lp );     /* free list header */
            *lp = NULL;      /* prevent dangling pointer */

            return;

          } /* ldestroy */

          /*---------------------------------------------*/
          /* lerror                                      */
          /*                                             */
          /* Reports list handling errors.               */
          /*---------------------------------------------*/

          static void
          lerror( char *fmt, ... )
          {
             va_list ap;

            va_start( ap, fmt );
            vfprintf( stderr, fmt, ap );
            va_end( ap );

            exit( 1 );

          } /* lerror */

          /*---------------------------------------------*/
          /* lremove                                     */
          /*                                             */
          /* Removes element at front of list (frees     */
          /* storage associated with that element).      */
          /*                                             */
          /* PARAMETERS                                  */
          /*    l   : list from which first element is */
          /*           to be removed                     */
          /*                                             */
          /* RETURNS                                     */
          /*    1, if element successfully removed;      */
          /*    0, if list is empty                      */
          /*---------------------------------------------*/

          int
          lremove( list l )
          {
             list_ele_top *p;

            lcheck( l, "lremove" );              /* valid list? */

            p = l->front;
            if ( p == NULL )
               return 0;            /* list is empty, return */

            l->front = p->next;           /* pt past removed ele */

            /* if only ele is removed, adjust rear ptr */
            if ( l->front == NULL )
               l->rear = NULL;

            --l->count;               /* decrement list count */
            free( p );                /* free removed element */

            return 1;                 /* return success code */

          } /* lremove */
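
The "management header followed by user area" layout that ladd relies on (it allocates es + sizeof(list_ele_top) bytes and returns p + 1) can be sketched in isolation. This is a minimal illustration, not the module's code: node_top, node_alloc, node_header and node_free are hypothetical names, and node_top merely stands in for list_ele_top, whose full definition is not repeated here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for list_ele_top: the management */
/* data that sits immediately before each user area.      */
typedef struct node_top
{
   struct node_top *next;
} node_top;

/* Allocates a header plus es bytes of zeroed user area.  */
/* Returns a pointer to the user area, or NULL on failure. */
void *
node_alloc( size_t es )
{
   node_top *p = malloc( sizeof(node_top) + es );

   if ( p == NULL )
      return NULL;

   p->next = NULL;
   memset( p + 1, 0, es );   /* p + 1 is first byte past header */

   return p + 1;
}

/* Recovers the header from a user-area pointer by stepping */
/* back exactly one header.                                 */
node_top *
node_header( void *user )
{
   assert( user != NULL );
   return (node_top *) user - 1;
}

/* Frees a node given its user-area pointer. The address    */
/* passed to free must be the one malloc returned, i.e. the */
/* header address, not the user-area address.               */
void
node_free( void *user )
{
   if ( user != NULL )
      free( node_header( user ) );
}
```

A caller would write, for example, char *s = node_alloc( 6 ); and then treat s as an ordinary 6-byte buffer. One design caveat: the user area is only guaranteed the alignment of the header structure, which is adequate for strings but may not be for every element type.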

UNIT 24: MODULAR CODE AND MULTI-FILE PROGRAMS II
We conclude our coverage of multi-file programs and multi-file programming issues by presenting a program
comprising the generic list type module (list.h/list.c) and a source file that is a client of this type,
lsclient.c.


1. A MULTI-FILE LIST-PROCESSING PROGRAM
The following source file, lsclient.c, is a client of the list ADT. When it is compiled, and its object code
linked with that of the list module, a complete list-processing program is formed. The client, which creates
and manipulates a list of strings, doesn't do much more than test the basic functionality of the list type.
Please note the following carefully:

             lsclient.c can use the list type because it #includes the header list.h.

             We use a new skip_whitespace function to read past leading whitespace. This
              function employs the fgetc function [see UNIT 22.1.1] and the new standard ungetc
              function, which pushes a character back into an input stream so that it becomes the
              next character read from that stream.

             We use a new read_to_eoln function which is like gets, but which protects
              against buffer overflow while guaranteeing that the entire input line is consumed (even
              if part of it is not stored in the buffer because the line is too long).

             We use a new standard library function, fflush, to force the contents of the buffer
              associated with an output stream to be written to that stream. All the C standard I/O
              functions, like printf, use data transfer buffers (not seen by the programmer) as an
              intermediate stop between files and devices on one hand, and the user's input and
              output buffers on the other. Output data, for example, is collected in an intermediate
              transfer buffer which is flushed (i.e., its contents are written to the associated file or
              device) only under certain conditions. Typical conditions are: transfer buffer full, '\n'
              written to transfer buffer or (here's the one that concerns us now) explicit request, via
              the fflush function. The printf call immediately preceding the fflush call doesn't
              end its output with a '\n', so we call fflush to ensure that the prompt is
              displayed immediately.
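
The two library calls introduced above can be tried out in isolation with the following sketch. The helper names peek_char and show_prompt are hypothetical and are not part of lsclient.c; only fgetc, ungetc, fputs and fflush are standard library functions.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the next character of fp without consuming it,  */
/* or EOF. ungetc pushes the character back so that it is  */
/* the next one read from the stream.                      */
int
peek_char( FILE *fp )
{
   int c;

   assert( fp != NULL );

   c = fgetc( fp );
   if ( c != EOF )
      ungetc( c, fp );

   return c;
}

/* Writes a prompt that has no trailing '\n' and then      */
/* flushes stdout so the prompt is not left sitting in the */
/* transfer buffer. Returns 0 on success, EOF on failure.  */
int
show_prompt( const char *text )
{
   if ( fputs( text, stdout ) == EOF )
      return EOF;

   return fflush( stdout );
}
```

Calling peek_char( stdin ) and then fgetc( stdin ) yields the same character twice, which is the behavior skip_whitespace depends on. Only one character of pushback is guaranteed by the standard, so the sketch never calls ungetc twice in a row.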

          /*
          ** lsclient.c - a client of the list abstract
          **              data type.
          */


          /*- INCLUDES ----------------------------------*/

          #include   <stdio.h>
          #include   <stddef.h>
          #include   <string.h>
          #include   <ctype.h>

          #include "list.h"         /* quotes, not brackets */

          /*- TYPE DEFINITIONS --------------------------*/


          typedef enum
          {
             BAD_OP, QUIT, ADD, REMOVE, FIRST, LAST,
             LENGTH, ISEMPTY, CLEAR
          }
          optype;


          /*- FUNCTION PROTOTYPES -----------------------*/

          int skip_whitespace( FILE *fp );
          int read_to_eoln( FILE *, char *, size_t );

          /*- FUNCTION DEFINITIONS ----------------------*/

          /*---------------------------------------------*/
          /* main                                        */
          /*                                             */
          /* Test drives the list abstract data type.    */
          /* The list lst is used as a list of arbitrary */
          /* strings. Note that embedded whitespace is */
          /* allowed; leading whitespace is not stored. */
          /*---------------------------------------------*/

          int
          main ( void   )
          {
             list       lst;
             char       p[81], *pe;
             optype     op;
             int        opnum;   /* scanf "%i" needs an int */

            if ( (lst = list_construct()) == NULL )
            {
               fprintf( stderr,
                        "list_construct failure\n" );
               return 1;   /* cannot go on without a list */
            }


            for (;;)    /* manipulate list loop */
            {
               printf( "\nEnter operation [and string, "
                       "for ADD] where\n\n" );
               printf( "%i = QUIT\n%i = ADD\n"
                       "%i = REMOVE\n%i = FIRST\n"
                       "%i = LAST\n%i = LENGTH\n"
                       "%i = ISEMPTY\n%i = CLEAR\n"
                       "\n\n==> ",
                       QUIT, ADD, REMOVE, FIRST, LAST,
                       LENGTH, ISEMPTY, CLEAR );
               fflush( stdout ); /* flush stdout buffer */

               opnum = BAD_OP;   /* stays BAD_OP if no number */
               if ( scanf( "%i", &opnum ) == EOF )
                  break;         /* end of input: quit */
               op = (optype) opnum;
               if ( op == QUIT )
                  break;

               switch ( op )
               {
               case ADD:
                  skip_whitespace( stdin );
                  read_to_eoln( stdin, p, sizeof(p) );
                  if ( (pe = list_add(lst,strlen(p)+1 ))
                         == NULL )
                     fprintf(stderr,"list_add failure\n");
                  else
                     strcpy( pe, p );
                  break;
               case REMOVE:
                  /* show first elem. before removing it, */
                  /* since removal frees its storage      */
                  if ( list_isempty( lst ) )
                     printf( "list is empty\n" );
                  else
                  {
                     if ( (pe = list_first(lst)) != NULL )
                        printf("Removing elem. \"%s\"\n", pe);
                     list_remove( lst );
                  }
                  break;
               case FIRST:
                  if ( !list_isempty(lst) )
                     if ( (pe = list_first(lst)) == NULL )
                         fprintf( stderr,
                                  "list_first failure\n" );
                     else
                         printf( "First elem. = \"%s\"\n",
                                                  pe );
                  else
                     printf( "list is empty\n" );
                  break;
               case LAST:
                  if ( !list_isempty(lst) )
                     if ( (pe = list_last(lst)) == NULL )
                         fprintf( stderr,
                                  "list_last failure\n" );
                     else
                         printf( "Last element = \"%s\"\n",
                                                   pe );
                  else
                     printf( "list is empty\n" );
                  break;
               case LENGTH:
                  printf( "%i\n", list_length( lst ) );
                  break;
               case ISEMPTY:
                  printf( "%s\n", list_isempty( lst )
                                   ? "true" : "false" );
                  break;
               case CLEAR:
                  list_clear( lst );
                  printf( "list length = %i\n",
                           list_length( lst ) );
                  break;
               default:
                  printf( "bad operation--try again\n" );
                  read_to_eoln( stdin, p, sizeof(p) );
               } /* switch */

               printf( "\n" );

            } /* for */

            list_destroy( lst );

            return 0;

          } /* main() */

          /*---------------------------------------------*/
          /* skip_whitespace                             */
          /*                                             */
          /* Reads characters from input file until a    */
          /* non-whitespace character is encountered.    */
          /* The non-whitespace character is pushed back */
          /* into the input file's buffer, so that it    */
          /* becomes the first character read by         */
          /* subsequent input operations.                */
          /*                                             */
          /* PARAMETERS                                  */
          /*    fp     : input file pointer              */
          /*                                             */
          /* RETURNS                                     */
          /*    number of whitespace characters skipped, */
          /*    or EOF (on failure)                      */
          /*---------------------------------------------*/

          int
          skip_whitespace( FILE *fp )
          {
             int      c;
             size_t   k = 0;

              while ( (c = fgetc( fp )) != EOF
                      && isspace( c ) )
                 ++k;

              if ( c == EOF )
                 return EOF;

              ungetc( c, fp );     /* push back char */

              return k;
          }

          /*---------------------------------------------*/
          /* read_to_eoln                                */
          /*                                             */
          /* Like standard I/O gets functions, but with */
          /* a few twists:                               */
          /*                                             */
          /*    1. If buffer filled before end-of-line, */
          /*       the remainder of the line (including */
          /*       '\n') is consumed w/o being stored.   */
          /*    2. '\0' is always last character stored. */
          /*    3. Returns number of characters stored, */
          /*       not buffer address like gets/fgets.   */
          /*                                             */
          /* PARAMETERS                                  */
          /*    fp      : input file pointer             */
          /*    s       : points to input buffer         */
          /*    mxsize : size of input buffer            */
          /*                                             */
          /* RETURNS                                     */
          /*    number of chars stored in s buffer (not */
          /*    including '\0'), or EOF (on failure)     */
          /*---------------------------------------------*/

          int
          read_to_eoln( FILE *fp, char *s, size_t mxsize )
          {
              size_t  i;
              int     c = 0;  /* defined even if nothing is read */

              for ( i = 0;
                    i < mxsize-1 && (c = fgetc(fp)) != EOF
                                  && c != '\n';
                    ++i )
                 *s++ = (char) c;
              *s = '\0';

              if ( c == EOF && i == 0 )
                 /* report EOF only if no chars were read */
                 return EOF;
              else if ( i == mxsize-1 )
                 /* '\n' not read; consume rest of line */
                 while ( (c = fgetc(fp)) != EOF
                         && c != '\n' )
                    ; /* do nothing */

              return i;
          }
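
To see the truncate-and-consume behavior of read_to_eoln without typing at a terminal, the function can be driven from a temporary file. In the sketch below, read_to_eoln is restated compactly so that the example compiles on its own, and make_input is a hypothetical helper that is not part of lsclient.c. The sketch assumes tmpfile succeeds on the system in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compact restatement of read_to_eoln (same behavior as   */
/* the listing above for any mxsize of at least 1).        */
int
read_to_eoln( FILE *fp, char *s, size_t mxsize )
{
   size_t i = 0;
   int    c = 0;

   assert( mxsize > 0 );   /* need room for at least '\0' */

   while ( i < mxsize - 1 && (c = fgetc( fp )) != EOF
                          && c != '\n' )
      s[i++] = (char) c;
   s[i] = '\0';

   if ( c == EOF && i == 0 )
      return EOF;          /* EOF only if nothing was read */

   if ( i == mxsize - 1 )  /* buffer full: discard the rest */
      while ( (c = fgetc( fp )) != EOF && c != '\n' )
         ; /* do nothing */

   return (int) i;
}

/* Hypothetical helper: returns a temporary file holding   */
/* text, positioned at its start, or NULL on failure.      */
FILE *
make_input( const char *text )
{
   FILE *fp = tmpfile();

   if ( fp != NULL )
   {
      fwrite( text, 1, strlen( text ), fp );
      rewind( fp );
   }

   return fp;
}
```

With a 4-byte buffer and the input "abcdefg\nxy\n", the first call stores "abc" and discards "defg" along with the newline, so the second call starts cleanly at "xy". A third call finds no more input and reports EOF.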

								