CSC 311 DATA STRUCTURES & ALGORITHMS I
Mr. Nixon Adu-Boahen
nixon.adjei@gcuc.edu.gh
FIRST SEMESTER (2011)

WHAT IS INVOLVED IN THIS COURSE…
• Analysis of algorithms
• Running time of algorithms
• Big-O notation
• Logic
• Important C++ concepts (pointers, structs, classes)
• Fundamental dynamic data structures including linear lists, queues and linked lists, arrays, mapping functions and access tables
• We will also implement algorithms in C++

1. INTRODUCTION

Recommended Text:
• Handout (Nixon Adu-Boahen)
• Modern Software Development Using Java (Paul T. Tymann and G. Michael Schneider)
• Algorithms and Data Structures (N. Wirth - PDF file)

WHAT IS A COMPUTER ALGORITHM?
An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.

FEATURES OF ALGORITHM
Besides being a finite set of rules that gives a sequence of operations for solving a specific type of problem, an algorithm has five important features:
1. Definiteness >> each step is precisely defined.
2. Input (zero or more) >> there are zero or more quantities which are externally supplied.
3. Output (one or more) >> at least one quantity is produced.
4. Finiteness >> if we trace out the instructions of an algorithm, then for all cases the algorithm will terminate after a finite number of steps.
5. Effectiveness >> every instruction must be sufficiently basic that it can in principle be carried out by a person using only pencil and paper. It is not enough that each operation be definite; it must also be feasible.

DIFFERENCE BETWEEN ALGORITHM AND PROGRAM
A computer algorithm is a detailed step-by-step method for solving a problem using a computer.
A program is an implementation of one or more algorithms.
A program does not necessarily satisfy the fourth condition of algorithms (finiteness).
An important example of such a program is a computer's operating system, which never terminates (except for system crashes) but continues in a wait loop until more jobs are entered.

NOTION OF ALGORITHM
[Figure: the algorithmic solution. A problem is solved by an algorithm; a computer runs the algorithm on the input to produce the output.]

DESIGNING CORRECT ALGORITHMS AND PROGRAMS
We describe algorithms using natural language and mathematical notation. Algorithms, as such, cannot be executed by a computer. The formulation of an algorithm in a programming language is called a program. Designing correct algorithms and translating a correct algorithm into a correct program are nontrivial and error-prone tasks. You should therefore be prepared to dedicate a substantial amount of your time to writing programs.

EXAMPLE OF COMPUTATIONAL PROBLEM: SORTING
Statement of a problem:
- Input: A sequence of n numbers <a1, a2, …, an>
- Output: A reordering <a'1, a'2, …, a'n> of the input sequence so that a'i ≤ a'j whenever i < j
Instance: The sequence <10, 4, 3, 2, 8>
Algorithms:
• Selection sort
• Insertion sort
• Merge sort
• And many others

SELECTION SORT
Input: array A[1], … A[n]
Output: array A sorted in increasing order
Algorithm:
    for i = 1 to n-1    // n is the number of elements in the array
    begin
        examine a[i] to a[n] and suppose the smallest integer is at a[j]
        swap a[i] and a[j];
    end;
Homework 1: Write a C++ program that accepts 10 integers from a user, places them in an array, uses the algorithm above to sort the integers, and prints out the numbers in increasing order.

BUBBLE SORT
Input: array A[1], … A[n]
Output: array A sorted in increasing order
Algorithm:
    for pass = 1 to n-1    // n is the number of elements in the array
    begin
        for i = 1 to n-pass
        begin
            compare a[i] to a[i+1];
            if the smaller integer is at a[i+1] then swap a[i] and a[i+1];
        end;
    end;
Homework 1: Write a C++ program that accepts 10 integers from a user, places them in an array, uses the algorithm above to sort the integers, and prints out the numbers in increasing order.

WHY STUDY ALGORITHMS?
• Theoretical importance
  – It is the core of Computer Science.
• Practical importance
  – A practitioner's toolkit of known algorithms
  – Framework for designing and analyzing algorithms for new problems.

DEFINITIONS
Computer Science can be defined as the study of data, its representation and transformation by a digital computer.
A data type is a term which refers to the kinds of data that variables may hold. E.g. in C++, the data types are int, float, double, bool, etc.
Data object is a term referring to a set of elements, say D. For example, the data object integers refers to D = {0, ±1, ±2, ±3, ±4, …}. The data object alphabetic characters refers to A = {'A', 'B', 'C', …, 'Z'}.

DEFINITIONS CONT.
A data structure is a way to store and organize data in order to facilitate access and modifications.
No single data structure works well for all purposes, and so it is important to know the strengths and limitations of several of them.

HOW TO CREATE PROGRAMS
The process of creating quality programs is broken down into five phases: requirements, design, analysis, coding, and verification.
Requirements: Make sure you understand the information you are given (the input) and what results you are to produce (the output). Try to write down a rigorous description of the input and output which covers all cases.
Design: You may have several data objects (such as a maze, a polynomial, or a list of names). For each object there will be some basic operations to perform on it (such as print the maze, add two polynomials, or find a name in the list). Assume that these operations already exist in the form of procedures and write an algorithm which solves the problem according to the requirements. Use a notation which is natural to the way you wish to describe the order of processing.

HOW TO CREATE PROGRAMS CONT.
Analysis: If you can think of another algorithm, then write it down. Next try to compare the two algorithms you have in hand.
It may already be possible to tell if one will be more desirable than the other. If you cannot distinguish between the two, choose one to work on for now.
Refinement and coding: You must now choose representations for your data objects (a maze as a two-dimensional array of zeros and ones, a polynomial as a one-dimensional array of degree and coefficients, a list of names possibly as an array) and write algorithms for each of the operations on these objects.
Verification: Verification consists of three distinct aspects: program proving, testing and debugging. Each of these is an art in itself. Before executing your program you should attempt to prove it is correct. Testing is the art of creating sample data upon which to run your program. If the program fails to run correctly, then debugging is needed to determine what went wrong and how to correct it.

TOP-DOWN DESIGN
• If we look at a problem as a whole, it may seem impossible to solve because it is so complex. Examples:
  – writing a tax computation program
  – writing a word processor
• Complex problems can be solved using top-down design, also known as stepwise refinement, where
  – We break the problem into parts
  – Then break the parts into parts
  – Soon, each of the parts will be easy to do

ADVANTAGES OF TOP-DOWN DESIGN
• Breaking the problem into parts helps us to clarify what needs to be done.
• At each step of refinement, the new parts become less complicated and, therefore, easier to figure out.
• Parts of the solution may turn out to be reusable.
• Breaking the problem into parts allows more than one person to work on the solution.

2. LOGIC

LOGIC: BASIC DEFINITIONS
Definition: A proposition is a statement that is either true or false, but not both.
Definition: The value of a proposition is called its truth value. It is denoted by T if it is true, and F if it is false.
Example 1: The statement “Peter Mensah is the president of Ghana” is a proposition with truth value false.
Example 2: The statement “Do your homework” is not a proposition because it is not a statement that can be true or false.

LOGICAL CONNECTIVES
Connectives are used to create a proposition from several other propositions. Such propositions are called compound propositions.
The most common connectives are:
  – NEGATION (¬ or !)
  – AND (∧)
  – OR (∨)
  – XOR (⊕)
  – IMPLICATION (→)
  – BICONDITIONAL or IF AND ONLY IF (↔)

CONNECTIVE EXAMPLES
• Let p be the proposition “The sky is clear.”
• Let q be the proposition “It is raining.”
• Some examples that combine these are:
  – The sky is clear and it is raining. (p∧q)
  – The sky is clear and it is not raining. (p∧¬q)
  – It is raining if and only if the sky is not clear. (q↔¬p)

TRUTH TABLES
• Truth tables are used to show the relationship between the truth values of individual propositions and the compound propositions based on them.
• Example:
    p   q   p∧q
    T   T    T
    T   F    F
    F   T    F
    F   F    F

NEGATION
• If p is a proposition, the negation of p, denoted ¬p, means “it is not p.”
• Example: Let p be the statement “this class has 30 students.” Then ¬p is the statement “this class does not have 30 students.”
• It should be obvious that the negation of a proposition has the opposite truth value. In other words, if p is true, then ¬p is false.
• The truth table for ¬p is
    p   ¬p
    T   F
    F   T

AND
• Let p and q be propositions. The proposition “p and q,” denoted by p∧q, is true if and only if both p and q are true.
• p∧q is called the conjunction of p and q.
• The truth table for p∧q is
    p   q   p∧q
    T   T    T
    T   F    F
    F   T    F
    F   F    F

OR
• Let p and q be propositions. The proposition “p or q,” denoted by p∨q, is false if and only if both p and q are false. In other words, it is true if either p or q is true, and false otherwise.
• p∨q is called the disjunction of p and q.
• The truth table for p∨q is
    p   q   p∨q
    T   T    T
    T   F    T
    F   T    T
    F   F    F

XOR
• Let p and q be propositions. The proposition “p exclusive or q,” denoted by p⊕q, is true if and only if either p or q is true, but not both.
• When the term OR is used in conversation, often the correct interpretation is XOR.
• The truth table for p⊕q is
    p   q   p⊕q
    T   T    F
    T   F    T
    F   T    T
    F   F    F

IMPLICATION
• Let p and q be propositions. The proposition “p implies q,” denoted by p→q, is false if and only if p is true and q is false.
• p→q is called an implication.
• The truth table for p→q is
    p   q   p→q
    T   T    T
    T   F    F
    F   T    T
    F   F    T

BICONDITIONAL
• Let p and q be propositions. The proposition “p if and only if q,” denoted by p↔q, is true if and only if p and q have the same truth value.
• p↔q is called a biconditional.
• The truth table for p↔q is
    p   q   p↔q
    T   T    T
    T   F    F
    F   T    F
    F   F    T

CONSTRUCTING TRUTH TABLES
• Construct the truth table for the proposition ((p∧q)∨¬q)
• We do this step by step as follows:
    p   q   p∧q   ¬q   ((p∧q)∨¬q)
    T   T    T    F        T
    T   F    F    T        T
    F   T    F    F        F
    F   F    F    T        T

EVERYDAY LOGIC
• Logic is used in many places:
  – Writing
  – Speaking
  – Search engines
  – Mathematics
  – Computer programs
• A proper understanding of logic is useful, as the following examples will demonstrate.

LOGIC AT WORK
• Situation: Your boss said “If you are productive, you can have one week vacation or two weeks off from work.”
• Problem: You were productive, so you took one week vacation and two weeks off from work. Your boss got mad because you had both.
• Solution: A simple miscommunication. By having one week vacation and two weeks off from work, you had one week vacation or two weeks off. But as is often the case in conversation, he really meant XOR, not OR.

LOGIC IN PROGRAMMING I
• Situation: If x is greater than 0 and is less than or equal to 10, you need to increment it.
• Problem: You tried the following, but it seems too complicated, and doesn’t compile.
    if(0<x<10 OR x=10) x++;
• Solution: Try:
    if (x > 0 && x <= 10) x++;

SOME TERMINOLOGY
• Definition: A tautology is a proposition that is always true.
• Definition: A contradiction is a proposition that is always false.
• Definition: A proposition that is neither a tautology nor a contradiction is a contingency.

PROPOSITIONAL EQUIVALENCE
Definition #1: Propositions p and q are called logically equivalent if p↔q is a tautology.
Definition #2: Propositions p and q are logically equivalent if and only if they have the same truth table.
Notation: If p and q are equivalent, we write p⇔q.
Example: The propositions ¬p∨q and p→q are logically equivalent. We can see this by constructing the truth tables.
    p   q   p→q   ¬p   ¬p∨q
    T   T    T    F     T
    T   F    F    F     F
    F   T    T    T     T
    F   F    T    T     T
• Hence they are logically equivalent.

HOMEWORK ... (SUBMIT ON 19\09\11)
1. Construct the truth table for the following proposition:
   a) (p→q)∧p
2. Is ¬(p→q)→¬q a tautology? Prove it.
3. Show that p↔q and (p∧q)∨(¬p∧¬q) are logically equivalent.
4. Show that (p→q) ↔ (¬q∧p) is a contradiction.
5. Show that (q∧¬p) is a contingency.

3. THE ARRAY STRUCTURE
The array is a homogeneous structure for storing components of the same type.
The array is probably the most widely used data structure; in some languages it is even the only one available.
The array is a random-access structure, because all components can be selected at random and are equally quickly accessible.
In order to denote an individual component, the name of the entire structure is augmented by the index selecting the component. This index must be an integer between 0 and n-1, where n is the size of the array.

THE ARRAY STRUCTURE CONT.
In C++, an array is defined as follows:
    dataType ArrayName[array_size];
e.g.
    double marks[25];
    Student GCUCStudent[25];
[Figure: array cells indexed 0, 1, 2, 3, …, N-1]

ARRAY INITIALIZATION
By default, arrays are uninitialized. When they are declared, they may be assigned a value:
    float x[7] = {-1.1, 0.2, 2.0, 4.4, 6.5, 0.0, 7.7};
or,
    float x[7] = {-1.1, 0.2};
In the above declaration, the elements 2 ... 6 are set to zero. Also:
    int a[] = {3, 8, 9, 1};
is valid; the compiler assumes the array size to be 4.

USING ARRAYS
Accessing an array out of bounds will not be identified by the compiler.
What do the following operations mean?
    A[5] = A[4]+1
    K[9]++
    N[12+3]=0

THE ARRAY STRUCTURE CONT.
    // Arrays1.cpp
    // gets four ages from user, displays them
    #include <iostream>
    using namespace std;

    int main()
    {
        int j;
        int age[4];                  // array 'age' of 4 ints
        for (j = 0; j < 4; j++)      // get 4 ages
        {
            cout << "Enter an age: ";
            cin >> age[j];           // read array element
        }
        for (j = 0; j < 4; j++)      // display 4 ages
            cout << " " << age[j] << " " << endl;
        return 0;
    }

THE ARRAY STRUCTURE CONT.
Here’s a sample interaction with the program:
    Enter an age: 44
    Enter an age: 16
    Enter an age: 23
    Enter an age: 68
     44
     16
     23
     68
The first for loop gets the ages from the user and places them in the array, while the second retrieves the elements from the array and displays them.

EXERCISE 1 ...(SUBMIT ON 25\09\10)
On a certain island, a worker pays 10 percent of his monthly salary as tax. Write a C++ program that makes use of array(s) to take the 12 monthly salaries of a worker one after the other and prints out:
(i) The total tax the worker paid in the year
(ii) The worker’s average take-home pay (salary - tax) for the year.
E.g.
    Enter the amount for month 1: 100
    ......
    Enter the amount for month 12: 50
    the total tax you paid is: 35
    your average take-home pay is: 100

EXERCISE 2 ....(TRY AND SUBMIT 25/09/10)
Write a C++ program that makes use of arrays. Your program should accept 10 numbers from a user, and it should output the standard deviation of the numbers. Below is the formula for standard deviation.
[Formula: standard deviation expressed in terms of n, x and x̄]
• Where n is the number of items
• Where x stands for each of the values
• Where x̄ is the mean of the numbers

MULTIDIMENSIONAL ARRAYS
We’ve looked at arrays of one dimension: a single variable specifies each array element. But arrays can have higher dimensions (2D, 3D, etc.).
In a two-dimensional (2D) array, the array is defined with two size specifiers, each enclosed in brackets:
e.g.
    double marks[STUDENTS][SUBJECTS];

2-DIMENSIONAL ARRAYS
A 2D array is declared as follows:
    const int ROWS = 3;
    const int COLS = 5;
    int A[ROWS][COLS];
2D array initialization (any one of the following equivalent forms):
    int a[2][3] = {1,2,3,4,5,6};
    int a[2][3] = {{1,2,3},{4,5,6}};
    int a[ ][3] = {{1,2,3},{4,5,6}};
Although a is stored in a contiguous block of memory, we may think of it as a 2D rectangle of data.

SAMPLE PROGRAM
Below is a program, STUDSUBJ, that uses a two-dimensional array to store the exam marks of several students in several subjects:
    // studsubj.cpp
    // displays mark sheet of students.
    #include <iostream>
    using namespace std;

    const int STUDENTS = 5;                  // array dimensions
    const int SUBJECTS = 3;

    int main()
    {
        int stud, subj;
        double marks[STUDENTS][SUBJECTS];    // two-dimensional array

        for (stud = 0; stud < STUDENTS; stud++)      // get array values
            for (subj = 0; subj < SUBJECTS; subj++)
            {
                cout << "Enter mark for student " << stud + 1;
                cout << ", subject " << subj + 1 << ": ";
                cin >> marks[stud][subj];            // put number in array
            }

        cout << "\n\t\tSubject\n";
        cout << "\t\t1\t2\t3";
        for (stud = 0; stud < STUDENTS; stud++)      // display array values
        {
            cout << "\nStudent " << stud + 1;
            for (subj = 0; subj < SUBJECTS; subj++)
                cout << "\t" << marks[stud][subj];   // get number from array
        }
        cout << endl;
        return 0;
    } // end main

The above program accepts the marks from the user and then displays them in a table:
    Enter mark for student 1, subject 1: 40
    Enter mark for student 1, subject 2: 50
    Enter mark for student 1, subject 3: 60
    ....................................
    Enter mark for student 5, subject 1: 65
    Enter mark for student 5, subject 2: 73
    Enter mark for student 5, subject 3: 58

                    Subject
                    1       2       3
    Student 1       40      50      60
    ..................
    Student 5       65      73      58

MULTIDIMENSIONAL ARRAYS CONT.
In the program above, the array is defined with two size specifiers, each enclosed in brackets:
    double marks[STUDENTS][SUBJECTS];
You can think about marks as a two-dimensional array, laid out like a checkerboard. Another way to think about it is that marks is an array of arrays.
It is an array of STUDENTS elements, each of which is an array of SUBJECTS elements.
There can be arrays of more than two dimensions. A three-dimensional array is an array of arrays of arrays. It is accessed with three indexes:
    elem = dimen3[x][y][z];

A TWO DIMENSIONAL ARRAY IN MEMORY
[Figure: a two-dimensional array laid out in memory]

ASSIGNMENT
In a certain class, 20 students registered for 5 subjects. Write a C++ program that receives the marks of each student for each subject and prints out the following:
(i) The average mark of each student.
(ii) The average mark of each subject.
(iii) The highest mark scored in each subject.

ASSIGNMENT
An example of the interaction your program should have is:
    Enter mark for student 1 subject 1:
    Enter mark for student 1 subject 2:
    ............
    Enter mark for student 20 subject 5:
    The average mark for student 1 is:
    The average mark for student 2 is:
    .....
    The highest mark scored in subject 1 is:
    The highest mark scored in subject 2 is:
    ....

EXERCISE
Write a program that accepts the dimension of a square matrix (for example 3, for a 3x3 matrix) and takes the values of 2 matrices (A, B) from the user. The program should then output the following:
(i) The new matrix when the two matrices are added (i.e. A+B)
(ii) The new matrix when the two matrices are subtracted (i.e. A-B)
(iii) The new matrix when the two matrices are multiplied (i.e. A*B)
An example of the interaction your program should have with the user is:
    Enter the dimension for the matrix: 3
    Enter matrix A[0][0]: 2
    ..............
    Enter matrix A[2][2]: 4
    Enter matrix B[0][0]: 2
    ......
    Enter matrix B[2][2]:
    Below is the sum of the two matrices
    aa bb cc
    dd ee ff
    gg hh ii
    Below is the subtraction of the two matrices
    aa bb cc
    dd ee ff
    gg hh ii

INTRODUCTION TO ADVANCED C++ CONCEPTS

INTRODUCTION TO POINTERS
We will introduce and discuss one of the most powerful features of the C++ programming language, known as pointers.

CALL-BY-VALUE
    #include <iostream>
    #include <cstdlib>       // for system()
    using namespace std;

    void change(int v) {
        v = 2;               // changes only the local copy
    }

    int main() {
        int v = 1;
        change(v);
        cout << v << endl;   // prints 1
        system("PAUSE");     // Windows-only: waits for a key press
        return 0;
    }

CALL-BY-VALUE CONT…
In the program above, the function cannot change the value of v as defined in main(), since a copy of it is made in the function.
To allow a function to modify the value of a variable passed to it, we need a mechanism known as call-by-reference, which uses the address of variables (pointers).

POINTERS
Normally, a variable directly contains a specific value. A pointer, however, contains the memory address of a variable that, in turn, contains a specific value.
Hence a variable name directly references a value, and a pointer indirectly references a value.
Referencing a value through a pointer is often called indirection.

DECLARING POINTERS
Pointers, like any other variables, must be declared before they can be used.
For example:
    int count, *countPtr;
The statement above declares the variable count to be of type int and countPtr to be of type int * (pointer to an int value); countPtr is called a pointer to int.
Each variable being declared as a pointer must be preceded by an asterisk (*).

DECLARING POINTERS CONT..
For example,
    double *xaxis, *yaxis, c;
indicates that both xaxis and yaxis are pointers to double values, while c is an ordinary double.
When * appears in a declaration, it is not an operator; rather, it indicates that the variable being declared is a pointer.
Pointers can be declared to point to objects of any data type.
Pointers have a legal range which includes the special address 0 and a set of positive integers which are the machine addresses of a particular system.

POINTERS CONT…
Pointers should be initialized either when they are declared or in an assignment.
A pointer may be initialized to 0, NULL or an address.
A pointer with the value 0 or NULL points to nothing and is known as a null pointer.
Symbolic constant NULL is defined in the header <cstddef> (and in several other standard library headers).

POINTERS: ASSIGNMENT
Initially, count holds an undetermined value (shown as ? in the diagram).
    countPtr = NULL;      // countPtr points to nothing
    countPtr = &count;    // countPtr now points to count

POINTER: ASSIGNMENT CONT…
    count = 5;            // the variable that countPtr points to now holds 5

DEREFERENCING POINTERS
    count = 10;
    cout << *countPtr << endl;    // This prints 10
    count = 23;
    cout << *countPtr << endl;    // This prints 23
    *countPtr = 100;              // count now holds 100
    cout << count << endl;        // This prints 100
• In many ways the dereference operator * is the inverse of the address operator &.

Pointers may be assigned when both sides have the same type:
    int *p, x;
    p = 10;            /* Illegal */
    p = (int *) 10;    /* Legal */
    p = &x;            /* Legal */

THE SWAP FUNCTION
    #include <iostream>
    #include <cstdlib>       // for system()
    using namespace std;

    void swap(int *p, int *q);

    int main() {
        int a = 3, b = 7;
        cout << a << " " << b << endl;
        swap(&a, &b);
        cout << a << " " << b << endl;
        system("PAUSE");
        return 0;
    }

    void swap(int *p, int *q) {
        int tmp;
        tmp = *p;
        *p = *q;
        *q = tmp;
    }

DISSECTING SWAP()
Before the call: a = 3, b = 7; p points to a, q points to b; tmp is uninitialized.
    tmp = *p;
    *p = *q;
    *q = tmp;
After the three statements: a = 7, b = 3, tmp = 3; p still points to a and q to b.

POINTERS AND ARRAYS
The concept of an array is very much like that of a pointer.
The identifier of an array is equivalent to the address of its first element, just as a pointer is equivalent to the address of the first element that it points to.
For example:
    int a[5];
    int *p;
declares an array of 5 elements, and a is the address of the start of the array.
Assigning:
    p = a;
is completely valid and the same as:
    p = &a[0];

POINTERS AND ARRAYS CONT…
    Address        1000   1004   1008   1012   1016
    Array index       0      1      2      3      4
(p points to the element at index 0.)
To assign p to point to the next element, we could use
    p = a + 1;
or
    p = &a[1];

POINTER ARITHMETICS
Arithmetic operations on pointers are a little different from those on regular integer data types.
Only addition and subtraction operations are allowed on pointers.
Both addition and subtraction behave differently with pointers according to the size of the data type to which they point.

POINTER ARITHMETICS CONT.
For example, let’s assume that in a given compiler for a specific machine, char takes 1 byte, short takes 2 bytes and long takes 4 bytes.
    char *mychar;
    short *myshort;
    long *mylong;
Let’s assume they point to memory locations 1000, 2000 and 3000 respectively. So if we write
    mychar++;
    myshort++;
    mylong++;
mychar will contain the value 1001, myshort will contain the value 2002, and mylong will contain the value 3004.
Hence, when adding one to a pointer, we are making it point to the following element of the same type with which it has been defined.

DYNAMIC MEMORY
In most of our programs, we only have as much memory available as we declared for our variables, and we can determine the size of all of them in the source code before the execution of the program.
What if we need a variable amount of memory that can only be determined during runtime?
What if we need some user input to determine the amount of memory space to allocate?

OPERATORS NEW []
In order to request dynamic memory space (e.g. for an array) we use the operator new followed by a data type specifier and the number of elements we require within brackets []. It returns a pointer to the beginning of the new block of memory allocated. Its form is:
    pointer = new type [size]
E.g.
    int *marks;
    marks = new int[5];
The system dynamically assigns space for five elements of type int and returns a pointer to the first element of the sequence, which is assigned to marks.
To avoid the program terminating when there is no memory space to allocate, we can use a special object called nothrow, declared in the header <new>.

OPERATOR NEW[] CONT.
For example
    marks = new int[5];    // if it fails, an exception of type bad_alloc is thrown,
                           // which terminates the program if it is not caught
Now nothrow can be used:
    marks = new (nothrow) int[5];
When nothrow is used and a memory allocation fails, instead of throwing an exception or terminating the program, new returns a null pointer, and the program continues its execution.
So if the allocation of the memory fails, the failure can be detected by checking whether marks received a null pointer value:
    int *marks;
    marks = new (nothrow) int[5];
    if (marks == 0) {
        // error assigning memory. Take measures
    }

OPERATOR DELETE []
Since the necessity of dynamic memory is usually limited to specific moments within a program, once it is no longer needed it should be freed so that the memory becomes available again for other requests of dynamic memory.
This is the purpose of the operator delete, whose format is:
    delete [] pointer;
E.g.
    delete [] marks;    // frees the space allocated to the pointer marks in the previous code

PROGRAM EXAMPLE
    #include <iostream>
    #include <new>          // for nothrow
    using namespace std;

    int main()
    {
        int i, n;
        int *p;
        cout << "How many numbers would you like to type? ";
        cin >> i;
        p = new (nothrow) int[i];
        if (p == 0)
            cout << "Error: memory could not be allocated";
        else
        {
            for (n = 0; n < i; n++)
            {
                cout << "Enter number: ";
                cin >> p[n];
            }
            cout << "You have entered: ";
            for (n = 0; n < i; n++)
                cout << p[n] << ", ";
            delete[] p;
        }
        return 0;
    }

RESULTS OF THE PROGRAM WHEN RUN
    How many numbers would you like to type? 5
    Enter number: 100
    Enter number: 200
    Enter number: 300
    Enter number: 400
    Enter number: 500
    You have entered: 100, 200, 300, 400, 500,

THINGS TO NOTE…
You can see how, by using pointers, we can allocate dynamic memory space, as in
    marks = new (nothrow) int[i];
When a user enters a value so big that the system cannot satisfy the request, new returns a null pointer. For example, in the previous program we will get the text "Error: memory could not be allocated".

SUMMING AN ARRAY
    #include <iostream>
    #include <cstdlib>      // for system()
    using namespace std;

    int main() {
        int a[] = {10, 15, 20, 25, 30, 35, 40};
        const int n = sizeof(a) / sizeof(a[0]);   // number of elements in a
        int *p, sum = 0;
        p = a;              // p points to the first element of the array
        for (int i = 0; i < n; i++) {
            sum = sum + *p;
            p++;            // move to the next element
        }
        cout << "The sum is: " << sum << endl;
        system("PAUSE");
        return 0;
    }

ASSIGNMENT
1. Change the code of the standard deviation to take dynamic numbers.
2. Change the code of the student/subject assignment to take a dynamic number of students and subjects.
NB: Use pointers.

INTRODUCTION TO STRUCTURES AND CLASSES

STRUCTURES
A structure is a container that is used to hold a group of data elements grouped together under one name.
These data elements, known as members, can have different types and different lengths.
Data structures are declared in C++ using the following syntax:
    struct structure_name {
        member_type1 member_name1;
        member_type2 member_name2;
        member_type3 member_name3;
        .
        .
    } [object_names];
where structure_name is a name for the structure type and object_names is an optional list of variables of that type.

STRUCTURE MEMBERS
• Each thing in a structure is called a member.
• Each member has a name, a type and a value.
• Names follow the rules for variable names.
• Types can be any defined type.

STRUCTURE EXAMPLE - STUDENT RECORD
• We want to save the following details for each and every student.
• Student Record:
  – Name: a string
  – HW Grades: an array of 3 doubles
  – Final Exam: a double

EXAMPLE STRUCTURE DEFINITION
    struct StudentRecord {
        string name;     // student name
        double hw[3];    // homework grades
        double exam;     // exam mark
    };

USING A STRUCT
• By defining a structure you create a new data type.
• Once a struct is defined, you can create variables of the new type.
E.g.
    struct StudentRecord {
        string name;     // student name
        double hw[3];    // homework grades
        double exam;     // exam mark
    };

    StudentRecord stu;             // global variable of the new type

    int main() {
        StudentRecord s1, s2;      // local variables of the new type
        // ...
    }

ACCESSING MEMBERS
• You can treat the members of a struct just like variables.
• You need to use the member access operator '.'
(pronounced "dot"):
    cout << stu.name << endl;
    stu.hw[2] = 82.3;
    stu.exam = 50;
    stu.hw[1] = 0;
    stu.hw[0] = 23;

EXAMPLE CODE
    #include <iostream>
    #include <string>
    #include <cstdlib>      // for system()
    using namespace std;

    struct movies {
        string title;
        int year;
    } mine, yours;

    void printmovie(movies movie);

    int main()
    {
        mine.title = "Expendables";
        mine.year = 2010;
        cout << "Enter title: ";
        cin >> yours.title;         // reads a single word
        cout << "Enter year: ";
        cin >> yours.year;
        cout << "My favorite movie is:\n ";
        printmovie(mine);
        cout << "And yours is:\n ";
        printmovie(yours);
        system("PAUSE");
        return 0;
    }

    void printmovie(movies movie)
    {
        cout << movie.title;
        cout << " (" << movie.year << ")\n";
    }

WHEN THE ABOVE CODE IS RUN, THE FOLLOWING APPEARS
    Enter title: Salt
    Enter year: 2010
    My favorite movie is:
     Expendables (2010)
    And yours is:
     Salt (2010)

EXAMPLE CODE
Structures are a feature that can be used to represent databases, especially if we consider the possibility of building arrays of them.
The C++ examples that follow use arrays of structures to save records.

EXERCISE
In GCUC you are only allowed onto the campus if you provide your username and password and they exist in our database. Write a program and manually save the usernames and passwords of 5 different students. Ask the user for a username and password and check whether they exist in your database.
If they do, print out “You are admitted”; otherwise print out “you are not a student.”

CODE TO ACCEPT MOVIES
    // array of structures
    #include <iostream>
    #include <string>
    #include <cstdlib>      // for system()
    using namespace std;

    struct movies_t {
        string title;
        int year;
    };

    const int N = 3;

    void printmovie(movies_t movie);

    int main()
    {
        movies_t mov[N];
        for (int n = 0; n < N; n++)
        {
            cout << "Enter title: ";
            getline(cin, mov[n].title);    // titles may contain spaces
            cout << "Enter year: ";
            cin >> mov[n].year;
            cin.ignore(1000, '\n');        // discard the rest of the line before the next getline
        }
        cout << "\nYou have entered these movies:\n";
        for (int n = 0; n < N; n++)
            printmovie(mov[n]);
        system("PAUSE");
        return 0;
    }

    void printmovie(movies_t movie)
    {
        cout << movie.title;
        cout << " (" << movie.year << ")\n";
    }

WHEN THE ABOVE CODE IS RUN, THESE WILL BE THE RESULTS.
    Enter title: Blade Runner
    Enter year: 1982
    Enter title: Matrix
    Enter year: 1999
    Enter title: Taxi Driver
    Enter year: 1976

    You have entered these movies:
    Blade Runner (1982)
    Matrix (1999)
    Taxi Driver (1976)

Another code example, this time to receive marks:
    #include <iostream>
    #include <string>
    #include <cstdlib>      // for system()
    using namespace std;

    struct StudentRecord {
        string name;
        double hw[3];
        double exam;
    };

    int main() {
        StudentRecord abc[5];
        for (int i = 0; i < 5; i++) {
            cout << "Enter name of the next student: ";
            cin >> abc[i].name;
            for (int j = 0; j < 3; j++) {
                cout << "Enter mark for homework " << j + 1 << ": ";
                cin >> abc[i].hw[j];
            }
            cout << "Enter exam mark: ";
            cin >> abc[i].exam;
            cout << endl;
        }
        cout << "The data you entered are as follows";
        cout << "\n\tName\tHW1\tHW2\tHW3\tExam\n";
        for (int i = 0; i < 5; i++) {
            cout << i + 1 << ".\t" << abc[i].name << "\t" << abc[i].hw[0] << "\t"
                 << abc[i].hw[1] << "\t" << abc[i].hw[2] << "\t" << abc[i].exam << endl;
        }
        system("PAUSE");
        return 0;
    }

ASSIGNMENT
Write a program that makes use of structures to accept the names and 12-month salaries of 5 employees and prints out
(i) the name and average salary of each employee;
(ii) the end-of-year bonus of each employee, where the bonus is 10% of the employee's annual salary plus 30.
CLASSES
Class definitions include data members and member functions.
Member function = method.
A class can be used to define a data type:
  Objects created from the class are the data items.
  Methods define the operations.
  Data members store the attributes of the items.

CLASS EXAMPLE - TIC TAC TOE BOARD
General form:

  class className {
    class body
  };

Example:

  class TTTBoard{
  public:                       // member access specifier
    TTTBoard();                 // constructor
    int markX(int r, int c);    // member functions
    int markO(int r, int c);
    void showBoard(void);
    int gameOver(void);
  private:                      // member access specifier
    char last_mark;             // data members
    char board[9];
    int done;
  };

CONSTRUCTING A TTTBOARD

  TTTBoard::TTTBoard(){   //constructor
    last_mark=' ';        //space means no mark
    for (int j=0;j<9;j++) board[j]=' ';
    done = 0;             //not done - just beginning!
  }

Executed when a TTTBoard object is instantiated.
Forces consistent initialization.

MAKING YOUR MARK

  int TTTBoard::markX(int r, int c){
    if (done || last_mark=='X') return 0;   //not X's turn
    int j=(r-1)*3+(c-1);                    //translate row, column (counted from 1) to an index
    if (board[j]!=' ') return 0;            //spot taken
    last_mark = board[j]= 'X';              //remember last mark
    gameOver();                             //check state of game
    return 1;                               //successful marking
  }

Methods can enforce correct marking of the board.

SHOWING THE BOARD

  void TTTBoard::showBoard(void){
    for (int j=0; j<9; j+=3){     // j is the index of the first cell in each row
      for (int c=0; c<3; c++)
        cout << board[j+c];
      cout << endl;
    }
  }

Methods provide high-level operations.
Changing the printed format will be easy.

MEMBER ACCESS SPECIFIERS
public: accessible using the member access operators, as in object.public_member
private: (the default for classes) only accessible from within member functions or friend functions
protected: applies when using inheritance

CONSTRUCTOR
A constructor is a public member function with the same name as the class.
No return type is specified.
Overloading is common.
Used to initialize data members.
Called whenever an object is instantiated.
Do not call constructors from your code.

SPECIAL CONSTRUCTOR TYPES
Default constructor
  No parameters, or all parameters have defaults - creates an object out of nothing.
  If you do not specify any constructor, then a default constructor that does nothing is automatically included.
Copy constructor
  Makes a copy of an object of the same class.
  Used if an object is passed by value.

A FRACTION CLASS EXAMPLE

  class Fraction{
  public:
    Fraction();                  // Default - creates a fraction of value 0
    Fraction(int n);             // Conversion - creates a fraction n/1
    Fraction(const Fraction & f);// Copy - creates a copy of f
    Fraction(int n, int d);      // Other - creates a fraction n/d
    //other stuff later
  private:
    int num, den;
  };

DESTRUCTOR
Called when an object is destroyed, just before its memory is reclaimed.

  ~className(void);

No return type, no arguments.
Often not needed - if omitted, a default destructor (that does nothing) is added.
Important when data members use dynamically allocated storage.

SCOPE
Class data members and member functions have class scope.
They can be freely used from within member and friend functions.
They can be used outside of this setting if they are attached to an object with a member access operator (. or ->) and access is allowed.

INTERFACE VS. IMPLEMENTATION
Class definitions belong in header files.
Member function definitions belong in associated source (.cpp) files.
This keeps the implementation hidden from the clients.
Implementation changes may be accomplished without recompiling other parts of the application.

HEADER FILES

  //Fraction.h
  #ifndef FRACTION_H
  #define FRACTION_H
  class Fraction{
    …
  };
  #endif

Modules needing the Fraction class definition should use #include "Fraction.h".
The conditional directives (#ifndef, #define, #endif) prevent duplicate definitions in large projects.

SET AND GET FUNCTIONS
Member functions that provide access to private data members of a class:

  void setDenominator(int d);
  int getDenominator(void);

Set functions can control changes to data members: setDenominator(0) should be ignored.
These functions may "translate" data.

UTILITY FUNCTIONS
Private member functions are called only from other functions in the class (or from friends).
They hold tasks shared by several member functions.
They hold subtasks created to reduce complexity.

CONSTRUCTOR INVOCATIONS

  Fraction F;        // Default
  Fraction G(-3);    // Conversion
  Fraction H(1,2);   // Other
  Fraction I(G);     // Copy
  F = 3;             // Conversion
  foo(G);            // Copy (assuming call by value)

OBJECTS UNDER CONSTRUCTION
Global objects: constructed first - before main starts - and destructed last - at program termination.
Automatic local objects: constructed when execution enters the block in which they are defined and destructed when execution leaves their scope.
Static local objects: constructed once when execution first reaches their definition and destructed at program termination.

CLASS COMPOSITION AND CONSTRUCTOR CALLS

  class Time{
  public:
    Time(int, int, int);
  private:
    int h, m, s;
  };

  class Date{
  public:
    Date();
    Date(int, int, int, Time);
  private:
    Time t;
    int mo, da, ye;
  };

  Date x;               // Time has no default constructor, so add a member initializer:
                        //   Date():t(0,0,0){… }
  Time z(3,30,17);
  Date y(11,9,53,z);    // using member initializers:
                        //   Date(int m,int d,int y,Time c) :t(c),mo(m),da(d),ye(y){ }

ABSTRACT DATA TYPE
An Abstract Data Type (ADT) consists of:
  A data structure declaration.
  Operations performed on the data structure, e.g. create, destroy, or manipulate.
It provides data encapsulation (information hiding).
An ADT is implemented as a class in languages such as C++ and Java.

ADT SPECIFICATION
The specification of an ADT describes how the operations (functions, procedures, or methods) behave in terms of inputs and outputs.
A specification of an operation consists of:
  Calling prototype
  Preconditions
  Postconditions
The calling prototype includes:
  Name of the operation
  Parameters and their types
  Return value and its type
The preconditions are statements assumed to be true when the operation is called.
The postconditions are statements assumed to be true when the operation returns.

OPERATIONS FOR ADT
Constructors: create a new object and return a reference to it.
Access functions: return information about an object, but do not modify it.
Manipulation procedures: modify an object, but do not return information.
Destructors: deallocate an object.

MORE ON ADT’S
State of an object: the current value of the object’s data.

DATA STRUCTURES (DS)
A data structure represents the way an ADT is implemented.
We will look at simple data structures for representing dynamic sets. These data structures are often implemented using dynamically allocated objects and pointers.
The main operations on these data structures are the insertion and the deletion of an element, searching for an element, finding the minimum or maximum element, finding the successor or predecessor of an element, etc.

1.1 STACKS
A stack is a linear structure in which insertions and deletions are always made at one end, called the top.
Objects can be inserted into a stack at any time, but only the most recently inserted (“last”) object can be removed at any time.
E.g., Internet Web browsers store the addresses of recently visited sites on a stack, and the undo function in Microsoft Word keeps its history on a stack.
The updating policy of stacks is called Last In First Out (LIFO).

STACKS CONT.
[Figure: push and pop operations at the top of a stack]

STACK ABSTRACT DATA TYPE CONT.
A stack is an abstract data type (ADT) supporting the following two methods:
  Push(x): insert object x at the top of the stack.
  Pop(): remove from the stack and return the top object on the stack. An error occurs if the stack is empty.
Stack supporting methods:
  Size(): return the number of objects in the stack.
  isEmpty(): return a boolean indicating if the stack is empty.
  Top(): return the top object on the stack, without removing it. An error occurs if the stack is empty.

APPLICATIONS OF STACKS
Page-visited history in a web browser.
Undo sequence in a text editor.
Chain of method calls in the C++ runtime environment or the Java Virtual Machine.

ARRAY IMPLEMENTATION OF STACK
A stack can be implemented with an N-element array S, with elements stored from S[0] to S[t], where t is an integer that gives the index of the top element in S (t = -1 when the stack is empty).

  Algorithm push(obj):
    if size() = N then
      indicate that a stack-full error has occurred
    else
      t = t + 1
      S[t] = obj

  Algorithm pop():
    if isEmpty() then
      indicate that a stack-empty error has occurred
    else
      a = S[t]
      S[t] = null
      t = t - 1
      return a

Write the algorithms for Top(), Size(), IsEmpty(), Push() and Pop().

ARRAY IMPLEMENTATION OF STACK CONT.
In a push operation, when the array is full, we can replace the array with a larger one.
How large should the new array be? There are 2 strategies:
  Incremental strategy: increase the size by a constant c.
  Doubling strategy: double the size.

1.2 QUEUES
A queue is a container of objects that are inserted and removed according to the first in first out (FIFO) principle.
Objects can be inserted into a queue at any time, but only the element that has been in the queue the longest can be removed at any time.
We say that elements enter the queue at the rear and are removed from the front.
Examples: queues in banks, jobs waiting for a shared printer, etc.

QUEUES CONT.
[Figure: enqueue at the rear of a queue, dequeue at the front]

QUEUE ADT
The queue ADT supports the following two fundamental methods:
  enqueue(o): insert object o at the rear of the queue.
  dequeue(): remove and return from the queue the object at the front; an error occurs if the queue is empty.
Queue supporting methods:
  size(): return the number of objects in the queue.
  isEmpty(): return a Boolean value indicating whether the queue is empty.
  front(): return, but do not remove, the front object in the queue; an error occurs if the queue is empty.

APPLICATIONS OF QUEUES
Waiting lines.
Access to shared resources (e.g., printer).

ARRAY IMPLEMENTATION OF QUEUE
A queue can be implemented in an N-element array Q, with elements stored in the cells from Q[f] up to, but not including, Q[r], where indices wrap around (mod N).
f is the index of the cell of Q storing the first element of the queue (if not empty); r is the index of the next available array cell in Q (if Q is not full).
The queue is empty when f = r. One cell is always left unused, so the queue is full when it holds N-1 elements; otherwise a full queue could not be told apart from an empty one.

ARRAY IMPLEMENTATION OF QUEUE CONT.

  Algorithm dequeue():
    if isEmpty() then
      throw a QueueEmptyException
    else
      temp = Q[f]
      Q[f] = null
      f = (f + 1) mod N
      return temp

  Algorithm enqueue(obj):
    if size() = N-1 then
      throw a QueueFullException
    else
      Q[r] = obj
      r = (r + 1) mod N

LINKED LISTS
Simple data structures such as arrays (sequential mappings) have the property that successive nodes of the data object are stored a fixed distance apart.
These sequential storage schemes proved adequate given the functions one wished to perform (access to an arbitrary node in a table, insertion or deletion of nodes within a stack or queue).
However, when a sequential mapping is used for ordered lists, operations such as insertion and deletion of arbitrary elements become expensive.
E.g., consider the following list of all the three-letter English words ending in AT:
(BAT, CAT, EAT, FAT, HAT, JAT, LAT, MAT, OAT, PAT, RAT, SAT, TAT, VAT, WAT)

LINKED LISTS CONT.
Suppose we want to add the word GAT. If we are using an array to keep this list, then the insertion of GAT will require us to move elements already in the list either one location higher or lower.
We must either move HAT, JAT, LAT,..., WAT or else move BAT, CAT, EAT, FAT. If we have to do many such insertions into the middle, then neither alternative is attractive because of the amount of data movement.
A solution to this problem of data movement in sequential representations is achieved by using linked representations.
Unlike a sequential representation, where successive items of a list are located a fixed distance apart, in a linked representation these items may be placed anywhere in memory.
To access elements in the list in the correct order, with each element we store the address or location of the next element in that list. Thus associated with each data item in a linked representation is a pointer to the next item.

LINKED LISTS CONT.
It is much easier to make an arbitrary insertion or deletion using a linked list than using a sequential list.
To insert the data item GAT between FAT and HAT the following steps are adequate:
  get a node which is currently unused; let its address be x;
  set the data field of this node to GAT;
  set the link field of x to point to the node after FAT, which contains HAT;
  set the link field of the node containing FAT to x.
The important thing is that when we insert GAT we do not have to move any other elements which are already in the list.
We have overcome the need to move data at the expense of the storage needed for the second field, link.

1.3 SINGLY LINKED LIST
A singly linked list is a concrete data structure consisting of a sequence of nodes.
Each node stores:
  an element
  a link to the next node
Traversing in only one direction is possible.
[Figure: singly linked list; the header points to nodes A, B, C, D in turn and the last link is null]
We can implement a queue with a singly linked list:
  The front element is stored at the first node.
  The rear element is stored at the last node.
LIST ADT (SEQUENCE OF ELEMENTS)
The List ADT supports the following referring methods:
  First(): return the position of the first element; an error occurs if list S is empty
  Last(): return the position of the last element; an error occurs if S is empty
  isFirst(p): return a Boolean indicating whether the given position p is the first one
  isLast(p): return a Boolean indicating whether the given position p is the last one
  before(p): return the position of the element in S preceding the one at position p; error if p is first
  after(p): return the position of the element in S succeeding the one at position p; error if p is last

LIST ADT (SEQUENCE OF ELEMENTS) CONT.
The List ADT supports the following update methods:
  replaceElement(p,e): p - position, e - element
  swapElements(p,q): p, q - positions
  insertFirst(e): e - element
  insertLast(e): e - element
  insertBefore(p,e): p - position, e - element
  insertAfter(p,e): p - position, e - element
  remove(p): p - position
We will write algorithms for the methods above.

1.4 DOUBLY LINKED LIST
A node in a doubly linked list stores two references: a next link, and a prev link which points to the previous node in the list (traversing in two directions is possible).
[Figure: doubly linked list; header and trailer nodes enclose nodes A, B, C, D, and the outermost links are null]

DOUBLY LINKED LIST CONT.
Element insertion: to insert a new node after a certain node.
Pseudocode for insertAfter(p,e):

  Algorithm insertAfter(p,e):
    Create a new node v
    v.element = e
    v.prev = p            //link v to its predecessor
    v.next = p.next       //link v to its successor
    (p.next).prev = v     //link p’s old successor to v
    p.next = v            //link p to its new successor, v
    return v              //the position for the element e

DOUBLY LINKED LIST CONT.
Element removal: to remove a node from the list.
Pseudocode for remove(p):

  Algorithm remove(p):
    t = p.element
    (p.prev).next = p.next
    (p.next).prev = p.prev
    p.prev = null
    p.next = null
    return t

LIST UPDATE COMPLEXITY
What is the cost (complexity) of both insertion and removal update?
If the address of the element at position p is known, the cost of an update is O(1).
If only the address of the header is known, the cost of an update is O(p) (we need to traverse the list from position 0 up to p).

SEARCHING A LINKED LIST
The following procedure finds the first element with key k in list L. It returns a pointer to that element. If no element with key k is found, the special pointer NIL is returned.

  List_Search(L,k)
    x := head[L]
    while x != NIL and key[x] != k do
      x := next[x]
    return x

ANALYSIS OF ALGORITHMS
There are often several different algorithms which correctly solve the same problem. How can we choose among them? There can be several different criteria:
  Ease of implementation
  Ease of understanding
  Efficiency in time and space
The first two are somewhat subjective. However, efficiency is something we can study with mathematical analysis, and so gain insight as to which is the fastest algorithm for a given problem.
To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it.

ANALYSIS OF ALGORITHMS: TIME EFFICIENCY
Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size.
Basic operation: the operation that contributes most towards the running time of the algorithm.
Several factors affect the running time of a program. Some, such as the compiler and computer used, are obviously beyond the scope of any theoretical model. The other main factors are the algorithm used and the input to the algorithm.
Typically, the size of the input is the main consideration. We define two functions, Tavg(n) and Tworst(n), as the average and worst-case running time, respectively, used by an algorithm on input of size n. Clearly, Tavg(n) <= Tworst(n).
Running Time
• Usually a function of the input size
  - the running time for input of size n is T(n)

Growth rates of Algorithms
An algorithm can be said to exhibit a growth rate on the order of a mathematical function f(n) if, beyond a certain input size, the function f(n) times a positive constant provides an upper bound or limit for the run-time of that algorithm.
In other words, for any input size n greater than some n0, and for some constant c, the running time of that algorithm will never be larger than c × f(n).
Common growth rates are:
  constant growth rate: T(n) = c
  logarithmic growth rate: T(n) = c*log n
  linear growth rate: T(n) = c*n
  quadratic growth rate: T(n) = c*n^2
  exponential growth rate: T(n) = c*2^n

Growth Rates
[Figure: running time plotted against input size (n) for logarithmic, linear, quadratic, cubic and exponential growth]

Best-case, Worst-Case, Average-Case
• Hold the size of the input fixed
  - Best case
    • input that requires the fewest basic operations to obtain the result
  - Worst case
    • input that requires the most basic operations to obtain the result
  - Average case
    • expected number of steps for an arbitrary input

Asymptotic Analysis
• It is a method of analyzing algorithms that ignores constants in the running time T(n) and focuses on analyzing T(n) as n "gets large"
• This is the most general measure of efficiency of an algorithm
• It is commonly used when comparing algorithms designed to solve the same problem

Upper Bounds
• An upper bound is the highest growth rate that an algorithm's running time can have
• We classify T(n)'s in groups with similar growth rates
• These groups are called big-O groups
  - Each is named after a standard representative of the group

O(F(N))
Definition: T(n) is in the set O(f(n)) if and only if there exist positive constants c and n0 such that
  |T(n)| <= c|f(n)| for all n > n0
In practice, f(n) is a standard upper bound function: n, log n, n^2, etc.
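As a worked instance of this definition, take the hypothetical running time T(n) = 3n + 2 (chosen here only for illustration). One valid choice of constants is c = 4 and n0 = 2:

```latex
T(n) = 3n + 2 \;\le\; 3n + n = 4n \quad \text{whenever } n \ge 2,
\qquad \text{so } |T(n)| \le 4\,|n| \text{ for all } n > n_0 = 2
\;\Longrightarrow\; T(n) \in O(n).
```

Other pairs work equally well, for example c = 5 and n0 = 1; the definition only asks that some pair exists.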
BIG-O NOTATION
Describes an upper bound for the running time of an algorithm.
Upper bounds for sequential search running times:
  worst case:   O(n)   T(n) = c1*n + c2
  best case:    O(1)   T(n) = c1
  average case: O(n)   T(n) = c1*n/2 + c2

EXAMPLE OF BIG O NOTATION
In typical usage, the formal definition of O notation is not used directly; rather, the O notation for a function f(x) is derived by the following simplification rules:
(i) If f(x) is a sum of several terms, the one with the largest growth rate is kept, and all others are omitted.
(ii) If f(x) is a product of several factors, any constants (factors in the product that do not depend on x) are omitted.
For example, let f(x) = 6x^4 - 2x^3 + 5, and suppose we wish to use O notation to describe its growth rate as x approaches infinity.
This function is the sum of three terms: 6x^4, -2x^3, and 5. Of these three terms, the one with the highest growth rate is the one with the largest exponent as a function of x, namely 6x^4. So, by rule (i), only 6x^4 is kept.
Now apply rule (ii): 6x^4 is a product of 6 and x^4, in which the first factor does not depend on x. Omitting this factor results in the simplified form x^4. Thus, we say that f(x) is big-oh of x^4, or mathematically we can write f(x) = O(x^4).

RUNNING TIME FOR MATRIX ADDITION

  // a, b and sum are n-by-m matrices; the result is named sum so that it
  // does not clash with the column index c
  for (r=0; r<n; r++)
    for (c=0; c<m; c++)
      sum[r][c] = a[r][c] + b[r][c];

Inner loop: 3 array dereferences, add, assign, loop comparison and increment: m * O(1) == O(m)
Outer loop: inner loop + loop comparison and increment: n * O(m) == O(m*n)

RUNNING TIME FOR LIST INSERTATFRONT (ARRAY)
Depends on the current list size (n).
Must shift the array contents - two array dereferences and an assignment per element: n * O(1)
What if the list elements are strings or linked lists and assignment must copy the data? Is each assignment still O(1)? The answer depends on what we are trying to measure!
After shifting, we dereference the array and assign - total cost: O(n)

RUNNING TIME CALCULATIONS
A Simple Example
Here is a simple program fragment to calculate the sum of the cubes 1^3 + 2^3 + ... + n^3:

  unsigned int sum( int n )
  {
    unsigned int i, partial_sum;
    /*1*/ partial_sum = 0;
    /*2*/ for( i=1; i<=n; i++ )
    /*3*/   partial_sum += i*i*i;
    /*4*/ return( partial_sum );
  }

The analysis of the above program is simple. The declarations count for no time. Lines 1 and 4 count for one unit each. Line 3 counts for three units per time executed (two multiplications and one addition) and is executed n times, for a total of 3n units. Line 2 has the hidden costs of initializing i, testing i <= n, and incrementing i. The total cost of all these is 1 to initialize, n + 1 for all the tests, and n for all the increments, which is 2n + 2. We ignore the costs of calling the function and returning, for a total of 2 + 3n + (2n + 2) = 5n + 4. Thus, we say that this function is O(n).

GENERAL RULES
RULE 1-FOR LOOPS: The running time of a for loop is at most the running time of the statements inside the for loop (including tests) times the number of iterations. The loop below is O(n):

  for( i=0; i<n; i++ ){
    j++;
  }

RULE 2-NESTED FOR LOOPS: Analyze these inside out. The total running time of a statement inside a group of nested for loops is the running time of the statement multiplied by the product of the sizes of all the for loops. As an example, the following program fragment is O(n^2):

  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ )
      k++;

RULE 3-CONSECUTIVE STATEMENTS: These just add (which means that the maximum is the one that counts). As an example, the following program fragment, which has O(n) work followed by O(n^2) work, is also O(n^2):

  for( i=0; i<n; i++ )
    a[i] = 0;
  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ )
      a[i] += a[j] + i + j;

RULE 4-IF/ELSE: For the fragment

  if( cond )
    S1
  else
    S2

the running time of an if/else statement is never more than the running time of the test plus the larger of the running times of S1 and S2.
GENERAL RULES CONTINUED
The previous rules also apply to the other kinds of loop (i.e. while, do...while).
For example, what is the running time of each of the following loops?

1.
  i=0
  do{
    print i
    i++
  }while(i<n)

2.
  while(i<n){
    for(j=0; j<n; j++){
      while(k<n){
        cout<<i+j+k;
        k++;
      }
    }
    i++;
  }