Posted on: 1/10/2012
E-528-529, Sector-7, Dwarka, New Delhi-110075 (Nr. Ramphal Chowk and Sector 9 Metro Station)
Ph. 011-47350606, (M) 7838010301-04
www.eduproz.in
Educate Anytime...Anywhere...

"Greetings For The Day"

About EduProz

We, at EduProz, started our voyage with a dream of making higher education available for everyone. Since its inception, EduProz has been working as a stepping-stone for students coming from varied backgrounds. The best part is that the classrooms for distance learning or correspondence courses for both the management (MBA and BBA) and Information Technology (MCA and BCA) streams are free of cost.

Experienced faculty members, a state-of-the-art infrastructure and a congenial environment for learning are a few of the things that we offer to our students. Our panel of industrial experts, coming from various industrial domains, leads students not only to secure good marks in examinations, but also to get an edge over others in their professional lives. Our study materials are sufficient to keep students abreast of the present nuances of the industry. In addition, we give importance to regular tests and sessions to evaluate our students' progress.

Students can attend regular classes of distance learning MBA, BBA, MCA and BCA courses at EduProz without paying anything extra. Our centrally air-conditioned classrooms, well-maintained library and well-equipped laboratory facilities provide a comfortable environment for learning.

Honing specific skills is essential to succeed in an interview. Keeping this in mind, EduProz has a career counselling and career development cell where we help students to prepare for interviews. Our dedicated placement cell has been helping students to land their dream jobs on completion of the course.
EduProz is strategically located in Dwarka, West Delhi (walking distance from Dwarka Sector 9 Metro Station and a 4-minute drive from the national highway); students can easily come to our centre from anywhere in Delhi and neighbouring Gurgaon, Haryana, and avail of a quality-oriented education facility at no extra cost.

Why choose EduProz for distance learning?
• EduProz provides classroom facilities free of cost.
• At EduProz, classroom teaching is conducted by experienced faculty.
• Classrooms are spacious and fully air-conditioned, ensuring a comfortable ambience.
• The course fee is not unduly expensive.
• Placement assistance and student counselling facilities are available.
• EduProz, unlike several other distance learning courses, strives to help and motivate pupils to get high grades, thus ensuring that they are well placed in life.
• Students are groomed and prepared to face interview boards.
• Mock tests, unit tests and examinations are held to evaluate progress.
• Special care is taken in the personality development department.

"HAVE A GOOD DAY"

Karnataka State Open University (KSOU) was established on 1st June 1996 with the assent of H.E. the Governor of Karnataka as a full-fledged University in the academic year 1996, vide Government notification No/EDI/UOV/dated 12th February 1996 (Karnataka State Open University Act - 1992). The Act was promulgated with the object of incorporating an Open University at the State level for the introduction and promotion of Open University and Distance Education systems in the education pattern of the State and the country, and for the co-ordination and determination of the standards of such systems. Keeping in view the educational needs of our country in general, and of the State in particular, the policies and programmes have been geared to cater to the needy.
Karnataka State Open University is a UGC recognised University of Distance Education Council (DEC), New Delhi, a regular member of the Association of Indian Universities (AIU), Delhi, a permanent member of the Association of Commonwealth Universities (ACU), London, UK, and the Asian Association of Open Universities (AAOU), Beijing, China, and is also associated with the Commonwealth of Learning (COL).

Karnataka State Open University is situated at the North-Western end of the Manasagangotri campus, Mysore. The campus, which is about 5 km from the city centre, has a serene atmosphere ideally suited for academic pursuits. The University at present houses the Administrative Office, Academic Block, Lecture Halls, a well-equipped Library, Guest House Cottages, a moderate Canteen, a Girls Hostel and a few cottages providing limited accommodation to students coming to Mysore to attend the Contact Programmes or Term-end examinations.

Unit 1 Arrays, Pointers and Structures

This unit covers the definition and concept of an array, single and double dimension arrays, the definition of pointers, declaring pointer variables, pointer operators, pointers and arrays, pointers and functions, structures, declaring and initializing a structure, processing of a structure, and structures with arrays.

Introduction

The need for arrays is best understood by asking what we would do without them. If we had to write a program to add three integers, we could declare a, b and c as three integers and display the result as "a+b+c". What if we had to add 100 numbers, and only after accepting all of them? We would need 100 different variables with 100 different identifiers, with no guarantee that their memory locations are related to one another. This approach is cumbersome and lengthy. Hence the need for arrays.

Arrays are the basis for creating any new data structure; an understanding of arrays is essential and forms the backbone of this field. Using arrays, one can declare and define multiple variables of the same type with one identifier. For example,
int a[10] is a declaration of 10 variables which are clustered together.

Pointers are a special type of variable that hold an address value. They are special because they simply point to another variable. Pointers are a powerful tool; for instance, they are used to set up complicated data structures and to link variables together. In general, variables are defined with a type, and the address of an ordinary variable can be found by using the address operator '&'. Thus, in C and C++, variables that tell the computer where data is placed are known as pointer variables.

Objectives

At the end of this unit, you will be able to understand:
· Arrays and their usage in programming languages
· A brief introduction to pointers
· Pointer operators
· Implementation concepts of pointers using arrays and functions
· Structures and their usage.

1.1 Definition and Concept of an Array

An array is a list or collection of data items which are stored in the form of a table and referred to by a common variable name, called the "array name" or "subscript variable name".

· Advantages of using an array:
1) A large number of data items can be stored under a single variable name.
2) Arrays save memory space.
3) Arrays help to arrange (sort) the data items in a particular order [ascending / descending].
4) Searching for data items is faster.

Concept of an Array:

Arrays can be classified into two types:
1. Single dimensional array
2. Double / multi dimensional array, or matrix.

· Concept of Single Dimensional Array

Data items are stored in one single column under a specified subscript variable name, using a subscript range.

Note: In a single dimensional array, values are stored or arranged column-wise within the array's range, and each element in the range is addressed through the common single variable name.
Concept of Two Dimensional or Double Dimension Array

In a double dimensional array, values are referred to by a specified row and column using the subscript variable name. A double dimensional array is also called a matrix.

Example: Holding the quantities of different brands of electric bulbs with respect to their wattages (W).

The above example is a 4 x 5 matrix; all cell values are referred to by the subscript name (array name) QTY with respect to row and column.

Array Used in 'C' Language

In any High Level Language [HLL], when we use an array the following 3 steps must be followed in sequence.
1. Defining or declaring the array (creating a specified blank table in main memory)
2. Storing values in the array (accepting values from a source and storing them)
3. Reading or retrieving values from the array for any processing task.

Single-Dimensional Arrays (One Dimensional Array)

· Declaring or defining an array:

    <data type> <array name or subscript variable> [<array range or order>];

Example:

    int num[10];
    char name[20];
    float avg[100];

· Initializing an array:

Example:

    int num[10] = {40, 50, 10, 20, 11, 20, 15, 100, 90, 17};
    char name[6] = {'s', 'h', 'a', 'k', 't', 'i'};
    char Colorx[5] = {'W', 'h', 'i', 't', 'e'};
    float avg[3] = {55.99, 78.50, 80.70};

Example: Write a program to store the sales amounts of 10 salesmen in an array and find the total sale and the best sale amount.
    /* Storing 10 sales amounts in an array and finding the total sale
       and the best sale amount */
    #include <stdio.h>

    int main(void)
    {
        const int n = 10;
        int sale_amt[100];
        int i, tot_amt = 0, best = 0;

        /* storing values in the array */
        for (i = 0; i < n; i++)
        {
            printf("Enter sales amount: \n");
            scanf("%d", &sale_amt[i]);
        }

        /* calculate the total sale and find the best amount */
        for (i = 0; i < n; i++)
        {
            printf("\n sale amount %d = %d", i, sale_amt[i]);
            tot_amt = tot_amt + sale_amt[i];
            if (sale_amt[i] > best)
                best = sale_amt[i];
        }

        /* printing total and best sale amount */
        printf("\n Total sale amount = %d", tot_amt);
        printf("\n Best sale amount = %d\n", best);
        return 0;
    }

Sorting an array: [Using Bubble sorting Technique]

Example: Write a program to read N observations and print them in ascending order.

    /* sorting an array */
    #include <stdio.h>

    int main(void)
    {
        int num[100], i, j, temp, n;

        printf("Enter the number of observations (at most 100): ");
        scanf("%d", &n);

        /* entering the values of the observations */
        printf("Enter the observations = \n");
        for (i = 0; i < n; i++)
        {
            scanf("%d", &num[i]);
        }

        /* sorting observations in ascending order: element i is compared
           with every later element and exchanged when out of order */
        for (i = 0; i < n; i++)
        {
            for (j = i + 1; j < n; j++)
            {
                if (num[i] > num[j])
                {
                    temp = num[i];
                    num[i] = num[j];
                    num[j] = temp;
                }
            }
        }

        /* printing the sorted array */
        printf("Observations in ascending order \n");
        for (i = 0; i < n; i++)
            printf("%d \n", num[i]);
        return 0;
    }

Example: Write a program to search for a given number in an array and display an appropriate message.
[linear search]

    /* Program for linear search for a given number */
    #include <stdio.h>

    int main(void)
    {
        int a[100], i, n, sc, key, pos = 0;

        printf("Enter the array limit (at most 100): \n");
        scanf("%d", &n);

        /* enter array values */
        printf("Enter the element values \n");
        for (i = 0; i < n; i++)
            scanf("%d", &a[i]);

        printf("Enter the key value for searching: \n");
        scanf("%d", &key);

        /* linear search: if the key occurs more than once,
           pos holds its last occurrence */
        sc = 0;
        for (i = 0; i < n; i++)
        {
            if (a[i] == key)
            {
                sc = 1;
                pos = i;
            }
        }

        if (sc == 1)
            printf("\n The element is found in location : %d\n", pos + 1);
        else
            printf("\n Given value is not found\n");
        return 0;
    }

1.1.3 Two Dimensional Arrays. [Matrix]

It is possible for an array to have two or more dimensions. We shall go through two-dimensional arrays only. A two dimensional array is called a matrix.

· Declaration of a two dimensional array:

    <type> <subscript name> [row range] [col range];

Example:

    int a[3][3];     /* declaration of a 3 x 3 matrix of int type */
    float x[3][3];   /* declaration of a 3 x 3 matrix of float type */

· Initialization of an array:

Example:

    int a[3][2] = { {10, 15}, {25, 11}, {9, 3} };

· Storing values in an array: with the help of two for() statements forming a nested loop, we can store the element values in the array.

· Reading values from an array: the values are retrieved for processing, or simply to print the array.

Key notes for arrays in C:
1. Selecting an array name is similar to selecting a variable name or identifier in C.
2. The subscript range starts from zero (0) and goes up to one less than the declared size.
3. The declared size must be a positive integer constant.

Example: Write a program to define the order of a matrix and find the sum of all its elements.
    #include <stdio.h>

    int main(void)
    {
        /* array declaration */
        int a[10][10];
        int i, j, sum = 0, m, n;

        printf("Enter the order of matrix (at most 10 x 10) \n");
        printf("Enter Row Range: \n");
        scanf("%d", &m);
        printf("Enter Col Range: \n");
        scanf("%d", &n);

        /* storing values in the array */
        printf("Enter element values \n");
        for (i = 0; i < m; i++)
        {
            for (j = 0; j < n; j++)
                scanf("%d", &a[i][j]);
        }

        /* printing the matrix and finding the sum of its elements */
        printf("Printing given matrix \n");
        for (i = 0; i < m; i++)
        {
            for (j = 0; j < n; j++)
            {
                printf("%d ", a[i][j]);
                sum = sum + a[i][j];
            }
            printf("\n");
        }
        printf("Sum of all elements = %d\n", sum);
        return 0;
    }

Self Assessment Questions
1. Define an array. Write the advantages of using arrays in a program.
2. Write the syntax for declaring an array, with an example of each kind.
3. What points should be remembered when using an array in a program?

1.2 Pointers

Definition of pointer

"A pointer is a variable that can hold the address of the variables, structures and functions that are used in the program. It contains only the memory location of the variable rather than its contents."

Pointers are used with the following:
1. Basic data type variables.
2. Array subscript variables.
3. Function names.
4. Structure and union names.

Advantages of Pointers:
1. Pointers can point to different data types and structures.
2. Manipulation of data at different memory locations is easier.
3. They help to achieve clarity and simplicity.
4. More compact and efficient coding.
5. Multiple values can be returned via functions.
6. Dynamic memory allocation.

Declaring a pointer variable

Pointers are declared like normal variables, but when we declare them we must specify what they are going to point to. If we declare a pointer to point to an integer, then it cannot be used to point to a floating-point value, and so on.

Pointer Operators:

To declare and refer to a pointer variable, C provides two special operators, & and *.
Types of pointer variable declaration:

Example:

    char *cptr;          /* pointer to character type variables */
    int *iptr, *num;     /* pointers to integer type variables */
    float *fptr;         /* pointer to float type variables */
    char *name[15];      /* array of 15 pointers to character */

Note: The * symbol is part of the variable's type.

Example:

    long int *x, *y;
    float *avg, *ratio;

Example: Program to assign pointer values (using the operators & and *).

    #include <stdio.h>

    int main(void)
    {
        int *x, y;    /* x is a pointer to an integer variable */

        y = 10;
        x = &y;       /* address of y is stored in pointer x */
        printf("Address of y = %p \n", (void *)&y);
        printf("Value of y = %d \n", y);
        printf("Address of y = %p \n", (void *)x);
        printf("Value of y = %d \n", *x);
        return 0;
    }

Output (the address is illustrative only; its value and printed form vary from system to system):

    Address of y = 65555
    Value of y = 10
    Address of y = 65555
    Value of y = 10

Note:
i) 65555 is the address of y (&y); an address is always an unsigned positive value.
ii) The last statement obtains the value of y indirectly by using *x, where *x means the value at the address stored in x. Therefore * is called the indirection operator when used in conjunction with pointers.

Example: Program to assign values using the operators * and &.

    #include <stdio.h>

    int main(void)
    {
        int x, y, *ipt;    /* ipt is a pointer to an integer variable */

        x = 8;
        ipt = &x;          /* address of x is stored in ipt */
        y = *ipt;          /* content pointed to by ipt goes to y */
        printf("The value of y is = %d \n", y);
        return 0;
    }

Output:

    The value of y is = 8

Note: Variable y is assigned the value at the address stored in ipt. Since ipt contains the address of x, and the value at the address of x is 8, *ipt is equal to 8.

Example: Program to use arithmetic operations with pointers.

    #include <stdio.h>

    int main(void)
    {
        int a, *ipt;    /* ipt is a pointer to an integer variable */
        int m, n, k;

        a = 150;
        ipt = &a;       /* address of a is assigned to the pointer */
        m = (*ipt)++;   /* m = 150, a becomes 151 */
        n = (*ipt)--;   /* n = 151, a becomes 150 */
        k = (*ipt)++;   /* k = 150, a becomes 151 */
        printf("value of m = %d \n", m);
        printf("value of n = %d \n", n);
        printf("value of k = %d \n", k);
        return 0;
    }

Pointers and Arrays

There is a close association between pointers and arrays: array elements can be accessed using pointers.

Example: Program to read 10 array elements and print them using pointer techniques.

    #include <stdio.h>

    int main(void)
    {
        int a[10], *arpt, i;

        printf("Enter array values\n");
        for (i = 0; i < 10; i++)
            scanf("%d", &a[i]);

        /* arpt points to the array */
        arpt = a;

        /* printing by technique 1 */
        for (i = 0; i < 10; i++)
            printf("%d \n", *(arpt + i));

        /* printing by technique 2 */
        for (i = 0; i < 10; i++)
            printf("%d \n", *(arpt++));
        return 0;
    }

Note: arpt is a pointer variable. In the first technique, the expression *(arpt + i) in the for loop starts from element 0, i.e. *(arpt + 0). In the second technique, *arpt refers to element 0 in the first cycle, and the increment operator is then applied to the pointer itself instead of adding the loop index to the pointer.

Example: Program to read n elements and find the biggest element among them.

    #include <stdio.h>

    int main(void)
    {
        int a[100], *arpt, i, big, n;

        printf("Enter number of elements (at most 100): \n");
        scanf("%d", &n);
        printf("Enter the elements: \n");
        for (i = 0; i < n; i++)
            scanf("%d", &a[i]);

        /* the address of the first element is stored in arpt */
        arpt = a;
        big = *arpt;    /* first element value stored in big */
        for (i = 1; i < n; i++)
        {
            if (big < *(arpt + i))
                big = *(arpt + i);
        }
        printf("The biggest among the elements = %d \n", big);
        return 0;
    }

Pointers used in function

This is the mechanism by which pointers can be passed as arguments to a function. Thus, the data items of the calling program can be accessed by the called program. No values are copied when pointers are passed as arguments, unlike in the call by value method.
Another important point is that, if the values are changed in the function, this modifies the original contents of the actual parameters; this is not true in the case of the call by value method. When pointers are passed as arguments, we must follow these points.

a. In the calling program, the function is invoked with the function name and the addresses of the actual parameters enclosed within parentheses.

Example:

    <function name>(&var1, &var2, &var3, ..., &var n)

    var1 ... var n are all actual parameters.

b. In the parameter list of the called program, each and every formal parameter (pointer) must be preceded by the indirection operator (*).

Example:

    <data type> <function name>(*v1, *v2, *v3, ..., *vn)

    v1 ... vn are all formal parameters (pointers).

Example: Program to illustrate the call by reference method to interchange the values of 2 integer variables.

    #include <stdio.h>

    void interchange(int *n1, int *n2);    /* function prototype */

    int main(void)
    {
        int num1, num2;

        printf("Enter any two integer numbers\n");
        scanf("%d %d", &num1, &num2);
        printf("before interchanging \n");
        printf("num1 = %d and num2 = %d\n", num1, num2);
        interchange(&num1, &num2);
        printf("after interchanging \n");
        printf("num1 = %d and num2 = %d\n", num1, num2);
        return 0;
    }

    void interchange(int *n1, int *n2)
    {
        int temp;
        temp = *n1;
        *n1 = *n2;
        *n2 = temp;
    }

Pointers used in an Array

Pointers can be used with arrays to increase the efficiency of program execution. Pointers can be used with single dimensional or multi-dimensional arrays.

Pointers used in a Single Dimensional Array:

The name of an array itself designates a memory location, and this location is the address of the very first element of the array, that is, &num[0], where num is the array name.

Example: Write a program to use an array of 5 elements and illustrate the relationship between the elements of an array and their addresses.
    #include <stdio.h>

    int main(void)
    {
        int arrlist[5];
        int *ptr, index, value = 3;

        ptr = arrlist;
        for (index = 0; index < 5; index++)
        {
            *(ptr + index) = value++;
            printf("*(ptr+index)=%d\tarrlist[index]=%d \n",
                   *(ptr + index), arrlist[index]);
        }
        return 0;
    }

Output:

    *(ptr+index)=3    arrlist[index]=3
    *(ptr+index)=4    arrlist[index]=4
    *(ptr+index)=5    arrlist[index]=5
    *(ptr+index)=6    arrlist[index]=6
    *(ptr+index)=7    arrlist[index]=7

Example: Write a program to find the sum of 5 elements of a static array using pointers with a function.

    #include <stdio.h>

    int addnum(int *ptr);    /* function prototype */

    int main(void)
    {
        static int array[5] = {200, 400, 600, 800, 1000};
        int sum;

        sum = addnum(array);
        printf("Sum of all array elements = %d \n", sum);
        return 0;
    }

    int addnum(int *ptr)
    {
        int total = 0, index;

        for (index = 0; index < 5; index++)
            total += *(ptr + index);
        return total;
    }

Self Assessment Questions [1.2 to 1.6]
1. Define pointer.
2. Write the advantages of using pointers in programs.
3. Explain the pointer operators with an example.

Structures

Definition: A structure is a meaningfully organized collection of data items of different types under a unique name, which we call the structure name. In 'C', such related data items or fields of different types are declared by using the reserved word 'struct'.

1.7.1 Declaration of structure

Each and every structure must be defined or declared before it is used in a program.

Syntax:

    struct <structure name>
    {
        <type1> <field/data1>;
        <type2> <field/data2>;
        <type3> <field/data3>;
        ...
        <type n> <field/data n>;
    };

Example:

    struct student
    {
        int rollno;
        char name[30];
        char address[30];
        char city[15];
        float marks;
    };

1.7.2 Initialization of structure

Initializing the members of a structure is similar to initializing a static type declaration.

Example:

    struct student st = {122, "Sakshi", "Arvind Appt.", "Manipal", 560};

Embedded Structure declaration: [Nested]

A structure declared within another structure is called an embedded structure.
Structures of this type are declared mainly in two ways:
a) The structure may be completely defined within the other structure.
b) There may be separate structures: the embedded structure is declared first and the other structure is declared next.

1.7.3 Processing of Structure

Processing a structure is mainly concerned with accessing its members. Each member of a structure is accessed with the . (dot) operator. To access a particular member, the dot operator is placed between the name of the structure variable and the name of the structure member.

Examples: emp.emp_name, emp.empno, emp.salary, etc.

1. Write a program to accept student details (roll_no, name, city, marks) using a structure and print the details.

    #include <stdio.h>

    struct std
    {
        int rollno;
        char name[30];
        char city[15];
        int marks;
    } st;    /* structure definition; st is the structure variable */

    int main(void)    /* main program */
    {
        printf("enter the Roll no \n");
        scanf("%d", &st.rollno);
        printf("enter the Name \n");
        scanf("%29s", st.name);
        printf("enter the city \n");
        scanf("%14s", st.city);
        printf("enter the Marks \n");
        scanf("%d", &st.marks);

        /* printing details */
        printf("Roll Number : %d\n", st.rollno);
        printf("Name : %s\n", st.name);
        printf("City : %s\n", st.city);
        printf("Marks : %d\n", st.marks);
        return 0;
    }

1.7.4 Structure used with an Array

We know that data items of different types cannot be stored in an array. To overcome this limitation, a structure, along with its members, can be stored as the element of an array, giving an array of structures.

Example: Storing the details of 10 students as structures in an array.

Self Assessment Questions
1. Define structure.
2. Write the advantages of structures over arrays in programs.
3. Give one suitable example of a structure used with an array.

Summary

Arrays are the basis for creating any new data structure; an understanding of arrays is essential and plays a vital role for the programmer when implementing code. Using arrays, one can declare and define multiple variables of the same type with one identifier. For example,
int a[10] is a declaration of 10 variables which are clustered together. An array also saves memory space and makes sorting and searching homogeneous data easy.

Pointers are a special type of variable that hold an address value. They are special because they simply point to another variable. Pointers are a powerful tool; for instance, they are used to set up complicated data structures and to link variables together.

1.9 Terminal Questions
1. Define array. Write the syntax, with examples, for declaring single and double dimension arrays in 'C'.
2. Write a 'C' program to read N observations and print them in ascending order.
3. Accept an array of elements and divide each element in the array by 3.
4. Find the total number of occurrences of a given number 'n' in an array of 10 numbers entered by the user.
5. Numbers in an array are stored in linear fashion; find the biggest and the smallest of 10 numbers in the given array.
6. Array elements are stored from the 0th location; relocate the elements to start from the 4th location.
7. Find the number of occurrences of each number in the array.
8. Check whether the given array is a palindrome or not.
9. Reverse the given array without using extra memory.
10. Store a string in an array and find the frequency of occurrence of each character in the array.
11. Without using string functions, find the length of a string.
12. Define pointer. Discuss the advantages of using pointers in a program.
13. Explain the pointer operators with an example of each.
14. Illustrate with 'C' programs the use of pointers with arrays and pointers with functions.
15. Define structure. Write the syntax, with an appropriate example, for declaring a structure.

Unit 2 Overview of Data Structures

This unit covers an overview of data structures: the definition of a data structure, data types and structured data types, abstract data types, pre and post conditions, linear data structures and their implementation methods using C, and non-linear data structures.
Introduction

Data structures represent places to store data for use by a computer program. As you would imagine, this describes a spectrum of data storage techniques, from the very simple to the very complex. We can look at this progression, from the simple to the complex, in the following way.

At the lowest level, there are data structures supplied and supported by the CPU (or computer chip) itself. These vary from chip to chip, but are almost always of a very primitive sort. They typically include the simple data types, such as integers, characters, floating point numbers, and bit strings. To some extent, the data types supported by a chip reflect its hardware design: how wide (how many bits) the registers are, how wide the data bus is, whether the ALU has an accumulator, and whether the ALU supports floating point operations.

At the second level of the data structures spectrum are the data structures supported by particular programming languages. These vary a lot from language to language. Most languages offer arrays, and many offer arrays of arrays (matrices). Most of the popular languages provide support for some sort of record structure. In C these are structs and in Pascal these are records. A few offer strings as a first class data type (e.g. C++ and Java). A few languages support linked lists directly in the language (e.g. Lisp and Scheme). Object oriented languages often offer general lists, stacks, and even trees.

At the top level of this taxonomy are those data structures that are created by the programmer, using a particular programming language. In this regard, it is important to note what tools are provided by a language to facilitate the implementation of complex data structures envisioned by a programmer. Things such as arrays, arrays of arrays, pointers, and record structures are all helpful in this regard.
Using the available tools, a programmer can build general lists, stacks, queues, deques, trees (of many types), graphs, sets, and much, much more.

In this book we will focus on the data structures in the top level, those that are usually created by the application programmer. These are the data structures that, generally, impact the problem solution and implementation in the most dramatic ways: size, efficiency, readability, and maintainability.

Objectives

At the end of this unit, you will be able to understand:
· The meaning of, and a brief introduction to, data structures
· The various abstraction levels
· A brief introduction to abstract data types and their properties
· Operations and implementation methods of pre and post conditions
· Concepts and methods of linear and non-linear data structures.

2.1.1 What is a Data Structure?

A data structure is the organization of data in a computer's memory or in a file. The proper choice of a data structure can lead to more efficient programs. Some example data structures are: array, stack, queue, linked list, binary tree, hash table, heap, and graph. Data structures are often used to build databases. Typically, data structures are manipulated using various algorithms.

Based on the concept of Abstract Data Types (ADT), we define a data structure by the following three components.
1) Operations: specifications of the external appearance of a data structure
2) Storage structures: organizations of data implemented in lower-level data structures
3) Algorithms: descriptions of how to manipulate information in the storage structures to obtain the results defined for the operations

When working with and collecting information on any subject, it doesn't take very long before you have more data than you know how to handle. Enter the data structure.
In his book Algorithms, Data Structures and Problem Solving with C, Mark Allen Weiss writes, "A data structure is a representation of data and the operations allowed on that data." Webopedia states, "the term data structure refers to a scheme for organizing related pieces of information."

Definition of data structure

"A specification, an application and an implementation view of a collection of one or more items of data, and the operations necessary and sufficient to interact with the collection. The specification is the definition of the data structure as an abstract data type. The specification forms the programming interface for the data structure. The application level is a way of modeling real-life data in a specific context. The implementation is a concrete data type expressed in a programming language. There may be intermediate levels of implementation, but ultimately the data structure implementation must be expressed in terms of the source language primitive data types."

The Abstract Level

The abstract (or logical) level is the specification of the data structure: the "what" but not the "how." At this level, the user or data structure designer is free to think outside the bounds of any one programming language. For instance, a linear list type would consist of a collection of list nodes such that they form a sequence. The operations defined for this list might be insert, delete, sort and retrieve.

The Application Level

At the application or user level, the user is modeling real-life data in a specific context. In our list example, we might specify what kind of items are stored in the list and how long the list is. The context will determine the definitions of the operations. For example, if the list were a list of character data, the operations would have a different meaning than if we were talking about a grocery list.

Implementation Level

The implementation level is where the model becomes compilable, executable code.
We need to determine where the data will reside and allocate space in that storage area. We also need to create the sequence of instructions that will cause the operations to perform as specified.

Self Assessment Questions
1. Define data structure. Explain its three components.
2. Discuss the data structure implementation in terms of the source language primitive data types.

Data Types and Structured Data Type

A data type consists of:
• a domain (= a set of values)
• a set of operations.

Example: the Boolean or logical data type provided by most programming languages.
• Two values: true, false.
• Many operations, including AND, OR, NOT, etc.

Structural and Behavioral Definitions

There are two different approaches to specifying a domain: we can give a structural definition or we can give a behavioral definition. Let us see what these two are like.

Behavioral Definition of the domain for 'Fraction'

The alternative approach to defining the set of values for fractions does not impose any internal structure on them. Instead it just adds an operation that creates fractions out of other things, such as

    CREATE_FRACTION(N, D)

where N is any integer and D is any non-zero integer. The values of type fraction are defined to be the values that are produced by this function for any valid combination of inputs. The parameter names were chosen to suggest its intended behavior: CREATE_FRACTION(N, D) should return a value representing the fraction N/D (N for numerator, D for denominator).

You are probably thinking that this is crazy: CREATE_FRACTION could be any old random function, so how do we guarantee that CREATE_FRACTION(N, D) actually returns the fraction N/D? The answer is that we have to constrain the behavior of this function by relating it to the other operations on fractions.
For example, one of the key properties of multiplication is that, for non-zero N and D:
NORMALIZE((N/D) * (D/N)) = 1/1
This turns into a constraint on CREATE_FRACTION:
NORMALIZE(CREATE_FRACTION(N,D) * CREATE_FRACTION(D,N)) = CREATE_FRACTION(1,1)
So you see that CREATE_FRACTION cannot be any old function. Its behavior is highly constrained, because we can write down lots and lots of constraints like this. That is the reason we call this sort of definition behavioral: the definition is strictly in terms of a set of operations and constraints, or axioms, relating the behavior of the operations to one another. In this style of definition, the domain of a data type (the set of permissible values) plays an almost negligible role. Any set of values will do, as long as we have an appropriate set of operations to go along with it.

Common Structures
Let us stick with structural definitions for the moment, and briefly survey the main kinds of data types from a structural point of view.

• Atomic Data Types
First of all, there are atomic data types. These are data types that are defined without imposing any structure on their values. Boolean, our first example, is an atomic type. So are characters, as these are typically defined by enumerating all the possible values that exist on a given computer.

• Structured Data Types
The opposite of atomic is structured. A structured data type has a definition that imposes structure upon its values. As we saw above, fractions normally are a structured data type. In many structured data types, there is an internal structural relationship, or organization, that holds between the components. For example, if we think of an array as a structured type, with each position in the array being a component, then there is a structural relationship of ‘followed by’: we say that component N is followed by component N+1.

• Structural Relationships
Not all structured data types have this sort of internal structural relationship.
Fractions are structured, but there is no internal relationship between the sign, numerator, and denominator. But many structured data types do have an internal structural relationship, and these can be classified according to the properties of this relationship.

• Linear Structure
The most common organization for components is a linear structure. A structure is linear if it has these 2 properties:
Property P1: Each element is ‘followed by’ at most one other element.
Property P2: No two elements are ‘followed by’ the same element.
An array is an example of a linearly structured data type. We generally write a linearly structured data type like this: A->B->C->D (this is one value with 4 parts).
- counter example 1 (violates P1): A points to both B and C: B<-A->C
- counter example 2 (violates P2): A and B both point to C: A->C<-B

2.2.2 Abstract Data Types

Handling Problems
A model of a problem focuses only on problem-related aspects, and in building it you try to define the properties of the problem. These properties include:
• the data which are affected, and
• the operations which are identified by the problem.

It is said that “computer science is the science of abstraction.” But what exactly is abstraction? Abstraction is “the idea of a quality thought of apart from any particular object or real thing having that quality.” For example, we can think about the size of an object without knowing what that object is. Similarly, we can think about the way a car is driven without knowing its model or make.

As an example, consider the administration of employees in an institution. The head of the administration comes to you and asks you to create a program which allows the employees to be administered. This is not very specific. For example, what employee information is needed by the administration? What tasks should be allowed? Employees are real persons who can be characterized by many properties; a few are: name, size, date of birth, shape, social number, room number, hair color, hobbies.

Certainly not all of these properties are necessary to solve the administration problem. Only some of them are problem specific. Consequently you create a model of an employee for the problem. This model includes only the properties which are needed to fulfill the requirements of the administration, for instance name, date of birth and social number. These properties are called the data of the (employee) model. Now you have described real persons with the help of an abstract employee.

Of course, the pure description is not enough. There must be some operations defined with which the administration is able to handle the abstract employees. For example, there must be an operation which allows you to create a new employee once a new person enters the institution. Consequently, you have to identify the operations which can be performed on an abstract employee. You also decide to allow access to the employees’ data only through the associated operations. This allows you to ensure that data elements are always in a proper state. For example, you are able to check whether a provided date is valid.

Abstraction is used to suppress irrelevant details while at the same time emphasizing relevant ones. The benefit of abstraction is that it makes it easier for the programmer to think about the problem to be solved. To sum up, abstraction is the structuring of a nebulous problem into well-defined entities by defining their data and operations. Consequently, these entities combine data and operations. They are not decoupled from each other.

• Abstract Data Types
A variable in a procedural programming language such as Fortran, Pascal, C, etc. is an abstraction. The abstraction comprises a number of attributes: name, address, value, lifetime, scope, type, and size. Each attribute has an associated value. For example, if we declare an integer variable in C or C++ as int x; we say that the name attribute has value “x” and that the type attribute has value “int”.
Unfortunately, the terminology can be somewhat confusing: the word “value” has two different meanings. In one instance it denotes one of the attributes, and in the other it denotes the quantity assigned to an attribute. For example, after the assignment statement x = 5, the value attribute has the value five.

The name of a variable is the textual label used to refer to that variable in the text of the source program. The address of a variable denotes its location in memory. The value attribute is the quantity which that variable represents. The lifetime of a variable is the interval of time during the execution of the program in which the variable is said to exist. The scope of a variable is the set of statements in the text of the source program in which the variable is said to be visible. The type of a variable denotes the set of values which can be assigned to the value attribute and the set of operations which can be performed on the variable. Finally, the size attribute denotes the amount of storage required to represent the variable.

The process of assigning a value to an attribute is called binding. When a value is assigned to an attribute, that attribute is said to be bound to the value. Depending on the semantics of the programming language, and on the attribute in question, the binding may be done statically by the compiler or dynamically at run-time. For example, in Java the type of a variable is determined at compile time (static binding). On the other hand, the value of a variable is usually not determined until run-time (dynamic binding).

Here we are concerned primarily with the type attribute of a variable. The type of a variable specifies two sets:
o a set of values; and,
o a set of operations.
For example, when we declare a variable, say x, of type int, we know that x can represent an integer in the range [-2^31, 2^31 - 1] (for a 32-bit int) and that we can perform operations on x such as addition, subtraction, multiplication, and division.
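The attributes described above can be inspected directly in C. The following is a minimal sketch, not part of the original text: the struct int_attributes and the function describe_int are hypothetical names introduced only for illustration, and the actual range of int depends on the compiler and platform (limits.h reports it as INT_MIN and INT_MAX).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical record of some attributes of an int variable. */
struct int_attributes {
    const void *address; /* address attribute: location in memory      */
    int value;           /* value attribute: bound at run time          */
    size_t size;         /* size attribute: storage required, in bytes  */
    int min;             /* smallest value in the type's set of values  */
    int max;             /* largest value in the type's set of values   */
};

/* Collects the attributes of the variable that x points to. */
struct int_attributes describe_int(const int *x)
{
    struct int_attributes a;
    a.address = x;
    a.value = *x;
    a.size = sizeof *x;  /* determined at compile time */
    a.min = INT_MIN;     /* platform dependent */
    a.max = INT_MAX;     /* platform dependent */
    return a;
}
```

Calling describe_int(&x) after int x = 5; yields a value attribute of 5 and an address equal to &x. The name and scope attributes exist only in the source text, so they have no run-time counterpart here.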
The type int is an abstract data type in the sense that we can think about the qualities of an int apart from any real thing having that quality. In other words, we do not need to know how ints are represented, nor how the operations are implemented, to be able to use them or reason about them.

In designing object-oriented programs, one of the primary concerns of the programmer is to develop an appropriate collection of abstractions for the application at hand, and then to define suitable abstract data types to represent those abstractions. In so doing, the programmer must be conscious of the fact that defining an abstract data type requires the specification of both a set of values and a set of operations on those values. Indeed, it is only since the advent of the so-called object-oriented programming languages that we see programming languages which provide the necessary constructs to properly declare abstract data types. For example, in Java, the class construct is the means by which both a set of values and an associated set of operations are declared. Compare this with the struct construct of C or Pascal’s record, which only allow the specification of a set of values!

Properties of Abstract Data Types
The employee example quoted before shows that with abstraction you create a well-defined entity which can be properly handled. These entities define the data structure of a set of items. For example, each administered employee has a name, date of birth and social number. The data structure can only be accessed with defined operations. This set of operations is called the interface and is exported by the entity. An entity with the properties just described is called an abstract data type (ADT). Let us try to put the characteristics of an ADT in a more formal way:

Definition
An abstract data type (ADT) is characterized by the following properties:
1. It exports a type.
2. It exports a set of operations. This set is called the interface.
3. Operations of the interface are the one and only access mechanism to the type’s data structure.
4. Axioms and preconditions define the application domain of the type.
With the first property it is possible to create more than one instance of an ADT, as exemplified by the employee example.

Returning to the example of the fraction data type, how might we actually implement this data type in C?

Implementation 1:
    typedef struct { int numerator, denominator; } fraction;

    int main(void)
    {
        fraction f;
        f.numerator = 1;
        f.denominator = 2;
        /* …………… */
        return 0;
    }

Implementation 2:
    #define numerator 0
    #define denominator 1
    typedef int fraction[2];

    int main(void)
    {
        fraction f;
        f[numerator] = 1;
        f[denominator] = 2;
        /* …………… */
        return 0;
    }

These are just 2 of many different possibilities. Obviously these differences are in some sense extremely trivial: they do not affect the domain of values or the meaning of the operations of fractions.

Generic Abstract Data Types
ADTs are used to define a new type from which instances can be created. For instance, one might want lists of apples, of cars or even of lists. Semantically, the definition of a list is always the same. Only the type of the data elements changes according to what type the list should operate on. This additional information can be specified by a generic parameter which is supplied at instance creation time. Thus an instance of a generic ADT is actually an instance of a particular variant of the ADT. A list of apples can therefore be declared as follows:
    List<Apple> listOfApples;
The angle brackets enclose the data type for which a variant of the generic ADT List should be created. listOfApples offers the same interface as any other list, but operates on instances of type Apple.

Notation
As ADTs provide an abstract view to describe properties of sets of entities, their use is independent of any particular programming language. We therefore introduce a notation here. Each ADT description consists of two parts:
o Data: This part describes the structure of the data used in the ADT in an informal way.
o Operations: This part describes valid operations for this ADT; hence, it describes its interface. We use the special operation constructor to describe the actions which are to be performed once an entity of this ADT is created, and destructor to describe the actions which are to be performed once an entity is destroyed. For each operation the provided arguments as well as preconditions and postconditions are given.

As an example the description of the ADT Integer is presented. Let k be an integer expression:

ADT Integer is
  Data
    A sequence of digits optionally prefixed by a plus or minus sign. We refer to this signed whole number as N.
  Operations
    constructor
      Creates a new integer.
    add(k)
      Creates a new integer which is the sum of N and k. Consequently, the postcondition of this operation is sum = N+k. Do not confuse this with assignment statements as used in programming languages. It is rather a mathematical equation which yields “true” for each value of sum, N and k after add has been performed.
    sub(k)
      Similar to add, this operation creates a new integer which is the difference of both integer values. Therefore the postcondition for this operation is sum = N-k.
    set(k)
      Sets N to k. The postcondition for this operation is N = k.
    ……
end

The description above is a specification for the ADT Integer. Please notice that we use words for names of operations, such as “add”. We could use the more intuitive “+” sign instead, but this may lead to some confusion: you must distinguish the operation “+” from the mathematical use of “+” in the postcondition. The name of the operation is just syntax, whereas the semantics is described by the associated pre- and postconditions. However, it is always a good idea to combine both to make reading of ADT specifications easier.

Real programming languages are free to choose an arbitrary implementation for an ADT. For example, they might implement the operation add with the infix operator “+”, leading to a more intuitive look for the addition of integers.
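The ADT Integer specification above could be realized in C roughly as follows. This is only one possible sketch, not an implementation given by the notation itself: the names Integer, integer_create, integer_add, integer_sub, integer_set and integer_value are hypothetical, the constructor's initial value of 0 is an assumption, and the postconditions are checked with assert purely for illustration.

```c
#include <assert.h>

/* Data: the signed whole number N. */
typedef struct {
    int n;
} Integer;

/* constructor: creates a new integer (initial value 0 is an assumption). */
Integer integer_create(void)
{
    Integer i = { 0 };
    return i;
}

/* add(k): creates a new integer; postcondition sum = N + k.
 * Implicit precondition: N + k does not overflow int. */
Integer integer_add(Integer self, int k)
{
    Integer sum = { self.n + k };
    assert(sum.n == self.n + k); /* postcondition as an equation, not an assignment */
    return sum;
}

/* sub(k): creates a new integer; postcondition sum = N - k.
 * Implicit precondition: N - k does not overflow int. */
Integer integer_sub(Integer self, int k)
{
    Integer sum = { self.n - k };
    assert(sum.n == self.n - k);
    return sum;
}

/* set(k): sets N to k; postcondition N = k. */
void integer_set(Integer *self, int k)
{
    self->n = k;
    assert(self->n == k);
}

/* Observer added for the sketch so callers never touch the field directly. */
int integer_value(Integer self)
{
    return self.n;
}
```

Note that add and sub leave N unchanged and return a new integer, as the specification says, while set modifies N in place, which is why it takes a pointer.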
Programming with Abstract Data Types
By organizing our program this way, i.e. by using abstract data types, we can change implementations extremely quickly: all we have to do is re-implement three very trivial functions, no matter how large our application is.

In general terms, an abstract data type is a specification of the values and the operations that has 2 properties:
1. it specifies everything you need to know in order to use the datatype
2. it makes absolutely no reference to the manner in which the datatype will be implemented.

When we use abstract data types, our programs divide into two pieces:
The Application: the part that uses the abstract datatype.
The Implementation: the part that implements the abstract datatype.
These two pieces are completely independent. It should be possible to take the implementation developed for one application and use it for a completely different application with no changes. If programming in teams, implementers and application-writers can work completely independently once the specification is set.

Specification
Let us now look in detail at how we specify an abstract datatype. We will use ‘stack’ as an example. The data structure stack is based on the everyday notion of a stack, such as a stack of books or a stack of plates. The defining property of a stack is that you can only access the top element of the stack; all the other elements are underneath the top one and cannot be accessed except by removing all the elements above them one at a time.

The notion of a stack is extremely useful in computer science. It has many applications, and is so widely used that microprocessors often are stack-based or at least provide hardware implementations of the basic stack operations.

First, let us see how we can define, or specify, the abstract concept of a stack. The main thing to notice here is how we specify everything needed in order to use stacks without any mention of how stacks will be implemented.

Self Assessment Questions
1.
Define structural and behavioral definitions.
2. Define abstract data type.
3. Discuss the properties of an ADT.

Pre and Post Conditions

Preconditions
These are properties of the inputs that are assumed by an operation. If they are satisfied by the inputs, the operation is guaranteed to work properly. If the preconditions are not satisfied, the operation’s behavior is unspecified: it might work properly (by chance), it might return an incorrect answer, or it might crash.

Postconditions
These specify the effects of an operation. They are the only things you may assume have been done by the operation. They are only guaranteed to hold if the preconditions are satisfied.

Note: The definition of the values of type ‘stack’ makes no mention of an upper bound on the size of a stack. Therefore, the implementation must support stacks of any size. In practice, there is always an upper bound: the amount of computer storage available. This limit is not explicitly mentioned, but is understood. It is an implicit precondition on all operations that there is storage available, as needed. Sometimes this is made explicit, in which case it is advisable to add an operation that tests whether there is sufficient storage available for a given operation.

Operations
The operations specified before are core operations: any other operation on stacks can be defined in terms of these. These are the operations that we must implement in order to implement ‘stacks’; everything else in our program can be independent of the implementation details. It is useful to divide operations into four kinds of functions:
1. Those that create stacks out of non-stacks, e.g. CREATE_STACK, READ_STACK, CONVERT_ARRAY_TO_STACK
2. Those that ‘destroy’ stacks (the opposite of create), e.g. DESTROY_STACK
3. Those that ‘inspect’ or ‘observe’ a stack, e.g. TOP, IS_EMPTY, WRITE_STACK
4. Those that take stacks (and possibly other things) as input and produce other stacks as output, e.g. PUSH, POP

A specification must say what an operation’s inputs and outputs are, and definitely must mention when an input is changed. This falls short of completely committing the implementation to procedures or functions (or whatever other means of creating ‘blocks’ of code might be available in the programming language). Of course, these details eventually need to be decided in order for code to actually be written. But these details do not need to be decided until code-generation time; throughout the earlier stages of program design, the exact interface (at code level) can be left unspecified.

Checking Pre Conditions
It is very important to state in the specification whether each precondition will be checked by the user or by the implementer. For example, the precondition for POP may be checked either by the procedure(s) that call POP or within the procedure that implements POP. Either way is possible. Here are the pros and cons of the 2 possibilities:

User Guarantees Preconditions
The main advantage, if the user checks preconditions and therefore guarantees that they will be satisfied when the core operations are invoked, is efficiency. For example, consider the following:
    PUSH(S, 1);
    POP(S);
It is obvious that there is no need to check whether S is empty: this precondition of POP is guaranteed to be satisfied because it is a postcondition of PUSH.

Implementation Checks Preconditions
There are several advantages to having the implementation check its own preconditions:
1. It sometimes has access to information not available to the user (e.g. implementation details about space requirements), although this is often a sign of a poorly constructed specification.
2. Programs won’t bomb mysteriously: errors will be detected (and reported?) at the earliest possible moment. This is not true when the user checks preconditions, because the user is human and occasionally might forget to check, or might think that checking was unnecessary when in fact it was needed.
3.
Most important of all, if we ever change the specification, and wish to add, delete, or modify preconditions, we can do this easily, because the precondition occurs in exactly one place in our program.

There are arguments on both sides. The literature specifies that procedures should signal an error if their preconditions are not satisfied. This means that these procedures must check their own preconditions. That is what our model solutions will do too. We will thereby sacrifice some efficiency for a high degree of maintainability and robustness.

An additional possibility is to selectively include or exclude the implementation’s condition checking code, e.g. using #ifdef:
    #ifdef SAFE
    if (!condition) error("condition not satisfied");
    #endif
This code will get included only if we supply the -DSAFE argument to the compiler (or otherwise define SAFE). Thus, in an application where the user checks carefully for all preconditions, we have the option of omitting all checks by the implementation.

Self Assessment Questions
1. Explain pre and post conditions with a suitable example.
2. Discuss the advantages of having the implementation check preconditions.

Linear Data Structure

The Array Data Structure
As an example, most programming languages have an array type as one of the built-in types. We will define an array as a homogeneous, ordered, finite, fixed-length list of elements. To further define these terms in the context of an array:
a) homogeneous: every element is of the same type
b) ordered: there is a next and previous in the natural order of the structure
c) finite: there is a first and last element
d) fixed-length: the list size is constant

Mapping the array to the three levels of a data structure:
1. At the abstract level
· Accessing mechanism is direct, random access
· Construction operator
· Storage operator
· Retrieval operator
2. At the application level
· Used to model lists (characters, employees, etc.).
3.
At the implementation level
· Allocate memory through static or dynamic declarations
· Accessing functions provided: [ ] and =.

Using an Array and Lists as a Data Structure
An array can be used to implement containers. Given an index (i.e. subscript), values can be quickly fetched and/or stored in an array. Adding a value to the end of an array is fast (particularly if a variable is used to indicate the end of the array); however, inserting a value into an array can be time consuming because existing elements must be shifted. Since array elements are typically stored in contiguous memory locations, looping through an array can be done easily and efficiently. When the elements of an array are sorted, binary searching can be used to find particular values in the array. If the array elements are not sorted, then a linear search must be used. After an array has been defined, its length (i.e. number of elements) cannot be changed.

Arrays: Fast and Slow
The following are some comments on the efficiency of arrays:
a) Changing the length of an array can be slow.
b) Inserting elements at the end of an array is fast (assuming the index of the end of the array is stored; if you have to search for the end of the array, then this operation is slow).
c) Inserting elements near the beginning of an array can be slow.
d) Accessing an array element using an index is fast.
e) Searching a non-sorted array for a value can be slow.
f) Searching a sorted array for a value can be fast.

Elementary Data Structures
“Mankind’s progress is measured by the number of things we can do without thinking.” Elementary data structures such as stacks, queues, lists, and heaps will be the “off-the-shelf” components we build our algorithms from. There are two aspects to any data structure:
1) The abstract operations which it supports.
2) The implementation of these operations.
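The two aspects listed above can be illustrated with the stack discussed earlier. The sketch below is an assumption-laden example rather than a reference implementation: the names Stack, stack_create, stack_push, stack_pop, stack_top, stack_is_empty and stack_is_full are hypothetical, the fixed capacity STACK_CAPACITY is an arbitrary choice, and the implementation checks its own preconditions with assert.

```c
#include <assert.h>
#include <stdbool.h>

#define STACK_CAPACITY 16 /* arbitrary upper bound chosen for this sketch */

/* Implementation aspect: an array plus a count of used slots. */
typedef struct {
    int items[STACK_CAPACITY];
    int count;
} Stack;

/* Abstract operation CREATE_STACK: makes s an empty stack. */
void stack_create(Stack *s) { s->count = 0; }

/* Abstract operation IS_EMPTY. */
bool stack_is_empty(const Stack *s) { return s->count == 0; }

/* Makes the implicit storage precondition explicit and testable. */
bool stack_is_full(const Stack *s) { return s->count == STACK_CAPACITY; }

/* Abstract operation PUSH. Precondition: the stack is not full. */
void stack_push(Stack *s, int value)
{
    assert(!stack_is_full(s));
    s->items[s->count++] = value;
}

/* Abstract operation TOP. Precondition: the stack is not empty. */
int stack_top(const Stack *s)
{
    assert(!stack_is_empty(s));
    return s->items[s->count - 1];
}

/* Abstract operation POP. Precondition: the stack is not empty. */
void stack_pop(Stack *s)
{
    assert(!stack_is_empty(s));
    s->count--;
}
```

An application that only calls these functions would keep working if the array were later replaced by a linked structure, since none of the callers depend on the fields of Stack.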
The fact that we can describe the behavior of our data structures in terms of abstract operations explains why we can use them without thinking, while the fact that we have different implementations of the same abstract operations enables us to optimize performance.

In this book we consider a variety of abstract data types (ADTs), including stacks, queues, deques, ordered lists, sorted lists, hash tables, trees and priority queues. In just about every case, we have the option of implementing the ADT using an array or using some kind of linked data structure. Because they are the base upon which almost all of the ADTs are built, we call the array and the linked list the foundational data structures. It is important to understand that we do not view the array or the linked list as ADTs, but rather as alternatives for the implementation of ADTs.

Arrays
Probably the most common way to aggregate data is to use an array. In C an array is a variable that contains a collection of objects, all of the same type. For example,
    int a[5];
allocates an array of five integers and assigns it to the variable a. The elements of an array are accessed using integer-valued indices. In C the first element of an array always has index zero. Thus, the five elements of array a are a[0], a[1], …, a[4]. All arrays in C have a length, the value of which is equal to the number of array elements.

How are C arrays represented in the memory of the computer? The C language specification leaves many of the details up to the system implementers. However, Figure illustrates a typical implementation scenario. The elements of an array occupy consecutive memory locations. That way, given i, it is possible to find the position of a[i] in constant time.

On the basis of Figure, we can now estimate the total storage required to represent an array. Let S(n) be the total storage (memory) needed to represent an array of n ints. S(n) is given by
    S(n) ≥ sizeof(int[n]) = n × sizeof(int)
where the function sizeof(x) is the number of bytes used for the memory representation of an instance of an object of type x. In C the sizes of the primitive data types are fixed constants for a given implementation. Hence
    sizeof(int) = O(1)
In practice, an array object may contain additional fields. For example, it is reasonable to expect that there is a field which records the position in memory of the first array element. In any event, the overhead associated with a fixed number of fields is O(1). Therefore, S(n) = O(n).

Multi-Dimensional Arrays
A multi-dimensional array of dimension n (i.e. an n-dimensional array or simply n-D array) is a collection of items which is accessed via n subscript expressions. For example, in a language that supports it, the (i, j)th element of the two-dimensional array x is accessed by writing x[i,j].

The C programming language does not really support multi-dimensional arrays. It does, however, support arrays of arrays. In C a two-dimensional array x is really an array of one-dimensional arrays:
    int x[3][5];
The expression x[i] selects the ith one-dimensional array; the expression x[i][j] selects the jth element from that array.

The built-in multi-dimensional arrays suffer the same indignities that simple one-dimensional arrays do:
· Array indices in each dimension range from zero to length - 1, where length is the array length in the given dimension.
· There is no array assignment operator.
· The number of dimensions and the size of each dimension are fixed once the array has been allocated.

Self Assessment Questions
1. Write the advantages of linear data structures.
2. Write points on the efficiency of arrays in the context of data structures.

What the application needs
The following terms describe a data structure from the point of view of the application, which only cares how it behaves and not how it is implemented.

List
Generic term for a collection of objects. May or may not contain duplicates. The application may or may not require that it be kept in a specified order.
Ordered list
A list in which the order matters to the application. Therefore, for example, the implementer cannot scramble the order to improve efficiency.

Set
A list where the order does not matter to the application (the implementer can pick the order so as to optimize performance) and in which there are no duplicates.

Multi-set
Like a set, but may contain duplicates.

Double-ended queue (deque)
An ordered list in which insertion and deletion occur only at the two ends of the list. That is, elements cannot be inserted into the middle of the list or deleted from the middle of the list.

Stack
An ordered list in which insertion and deletion both occur only at one end (e.g. at the start).

Queue
An ordered list in which insertion always occurs at one end and deletion always occurs at the other end.

Ordered Lists and Sorted Lists
The simplest yet one of the most versatile containers is the list. In this section we consider lists as abstract data types. A list is a series of items. In general, we can insert and remove items from a list and we can visit all the items in a list in the order in which they appear.

In this section we consider two kinds of lists: ordered lists and sorted lists. In an ordered list the order of the items is significant. Consider, for example, a list of the chapter titles of a book. The order of the items in the list corresponds to the order in which they appear in the book. However, since the chapter titles are not sorted alphabetically, we cannot consider the list to be sorted. Since it is possible to change the order of the chapters in a book, we must be able to do the same with the items of the list. As a result, we may insert an item into an ordered list at any position.

On the other hand, a sorted list is one in which the order of the items is defined by some collating sequence. For example, the index of this book is a sorted list. The items in the index are sorted alphabetically. When an item is inserted into a sorted list, it must be inserted at the correct position.
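Inserting an item at the correct position of a sorted list can be sketched as follows for an array-based list of ints. This is an illustrative assumption, not the text's own implementation: SortedList, sorted_list_init and sorted_list_insert are hypothetical names, the capacity is arbitrary, and ascending numeric order stands in for the collating sequence.

```c
#include <assert.h>
#include <stddef.h>

#define SORTED_LIST_CAPACITY 8 /* arbitrary capacity for this sketch */

typedef struct {
    int items[SORTED_LIST_CAPACITY];
    size_t count;
} SortedList;

/* Makes list an empty sorted list. */
void sorted_list_init(SortedList *list) { list->count = 0; }

/* Inserts value so that items[0..count-1] stays in ascending order.
 * Precondition: the list is not full.
 * Larger items are shifted one slot to the right, so the cost is O(n). */
void sorted_list_insert(SortedList *list, int value)
{
    assert(list->count < SORTED_LIST_CAPACITY);
    size_t pos = list->count;
    while (pos > 0 && list->items[pos - 1] > value) {
        list->items[pos] = list->items[pos - 1];
        pos--;
    }
    list->items[pos] = value;
    list->count++;
}
```

In contrast, an ordered list would take the target position as an extra argument, because there the caller, not the collating sequence, decides where the item goes.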
Ordered Lists
An ordered list is a list in which the order of the items is significant. However, the items in an ordered list are not necessarily sorted. Consequently, it is possible to change the order of items and still have a valid ordered list. An ordered list is a searchable container. A searchable container is a container that supports the following operations:
1) insert: used to put objects into the container;
2) withdraw: used to remove objects from the container;
3) find: used to locate objects in the container;
4) isMember: used to test whether a given object instance is in the container.

Sorted Lists
The next type of searchable container that we consider is a sorted list. A sorted list is like an ordered list: it is a searchable container that holds a sequence of objects. However, the position of an item in a sorted list is not arbitrary. The items in the sequence appear in order, say, from the smallest to the largest. Of course, for such an ordering to exist, the relation used to sort the items must be a total order.

Lists: Array-Based Implementation
Deleting and inserting an item requires moving the existing items up or down (O(n) in the worst case).

Linked Lists
A linked list makes use of pointers, and it is dynamic. It is made up of a series of objects called nodes. Each node contains a pointer to the next node. Removing a node redirects the pointer of its predecessor to its successor (insertion works in the opposite way).

Comparison of List Implementations
Array-Based Lists [average and worst cases]:
· Insertion and deletion are O(n)
· Direct access is O(1)
· Array must be allocated in advance
· No overhead if all array positions are full
Linked Lists:
· Insertion and deletion are O(1) once the position is known
· Direct access is O(n)
· Finding the predecessor is O(n)
· Space grows with the number of elements
· Every element requires overhead.
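The linked-list costs in the comparison above can be seen in a small singly-linked sketch. The names Node, node_insert_after, node_remove_after and node_at are hypothetical, and the use of a dummy head node (so that every real node has a predecessor) is a design assumption of this example, not something the text prescribes.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next; /* per-element overhead: one pointer */
} Node;

/* Inserts a new node holding value right after prev: O(1).
 * Returns the new node, or NULL if allocation fails. */
Node *node_insert_after(Node *prev, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = prev->next;
    prev->next = n;
    return n;
}

/* Removes the node that follows prev: O(1).
 * Precondition: prev has a successor. */
void node_remove_after(Node *prev)
{
    Node *victim = prev->next;
    assert(victim != NULL);
    prev->next = victim->next; /* predecessor now points to the successor */
    free(victim);
}

/* Direct access: follows index links from first, so it is O(n).
 * Returns NULL if the list has fewer than index + 1 nodes. */
Node *node_at(Node *first, size_t index)
{
    while (first != NULL && index > 0) {
        first = first->next;
        index--;
    }
    return first;
}
```

With a dummy head declared as Node head = { 0, NULL };, the list's first real node is head.next, and inserting at the front is simply node_insert_after(&head, value).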
Linked Lists
Elements of an array are connected by contiguity:
· Reside in contiguous memory
· Static (compile time) allocation (typically)
Elements of a linked list are connected by pointers:
· Reside anywhere in memory
· Dynamic (run time) allocation

Implementation methods
There are a variety of options for the person implementing a list (or set or stack or whatever).

a) array
We all know what arrays are. Arrays are included here because a list can be implemented using a 1-D array. If the maximum length of the list is not known in advance, code must be provided to detect array overflow and expand the array. Expanding requires allocating a new, longer array, copying the contents of the old array, and deallocating the old array. Arrays are commonly used when two conditions hold. First, the maximum length of the list can be accurately estimated in advance (so array expansion is rarely needed). Second, insertion and deletion occur only at the ends of the list. (Insertion and deletion in the middle of an array-based list are slow.)

b) linked list
A list implemented by a set of nodes, each of which points to the next. An object of class (or struct) “node” contains a field pointing to the next node, as well as any number of fields of data. Optionally, there may be a second “list” class (or struct) used as a header for the list. One field of the list class is a pointer to the first node in the list. Other fields may also be included in the “list” object, such as a pointer to the last node in the list, the length of the list, etc. Linked lists are commonly used when the length of the list is not known in advance and/or when it is frequently necessary to insert and/or delete in the middle of the list.

c) doubly-linked vs. singly-linked lists
In a doubly-linked list, each node points to the next node and also to the previous node. In a singly-linked list, each node points to the next node but not back to the previous node.
d) circular list
A linked list in which the last node points to the first node. If the list is doubly-linked, the first node must also point back to the last node.

Non Linear Data Structures

Trees

We now consider one of the most important non-linear information structures: trees. A tree is often used to represent a hierarchy. This is because the relationships between the items in the hierarchy suggest the branches of a botanical tree. For example, a tree-like organization chart is often used to represent the lines of responsibility in a business, as shown in Figure. The president of the company is shown at the top of the tree and the vice-presidents are indicated below her. Under the vice-presidents we find the managers, and below the managers the rest of the clerks. Each clerk reports to a manager, each manager reports to a vice-president, and each vice-president reports to the president.

It just takes a little imagination to see the tree in Figure. Of course, the tree is upside-down. However, this is the usual way the data structure is drawn. The president is called the root of the tree and the clerks are the leaves.

A tree is extremely useful for certain kinds of computations. For example, suppose we wish to determine the total salaries paid to employees by division or by department. The total of the salaries in division A can be found by computing the sum of the salaries paid in departments A1 and A2 plus the salary of the vice-president of division A. Similarly, the total of the salaries paid in department A1 is the sum of the salaries of the manager of department A1 and of the two clerks below her. Clearly, in order to compute all the totals, it is necessary to consider the salary of every employee. Therefore, an implementation of this computation must visit all the employees in the tree. An algorithm that systematically visits all the items in a tree is called a tree traversal.
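The salary computation described above can be sketched as a recursive traversal in C. This is only an illustration under stated assumptions: the struct employee layout (a first-report pointer plus a next-peer pointer) and the function name total_salary are hypothetical and do not come from the text.

```c
#include <stddef.h>

/* A node of the organization tree. Children are kept as a chain:
   first_report is the leftmost child, next_peer is the next sibling. */
struct employee {
    const char *title;
    long salary;
    struct employee *first_report;
    struct employee *next_peer;
};

/* Total salary of `e` and of everyone below `e` in the tree.
   Every node in the subtree is visited exactly once. */
long total_salary(const struct employee *e)
{
    long total;
    const struct employee *child;

    if (e == NULL)
        return 0;
    total = e->salary;                       /* visit the node itself */
    for (child = e->first_report; child != NULL; child = child->next_peer)
        total += total_salary(child);        /* then each subtree in turn */
    return total;
}
```

Calling total_salary on the manager of a department gives the department total, and calling it on a vice-president gives the division total, because the same function is applied to a smaller subtree at each level.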
In the same chapter we consider several different kinds of trees as well as several different tree traversal algorithms. In addition, we show how trees can be used to represent arithmetic expressions and how we can evaluate an arithmetic expression by doing a tree traversal.

The following is a mathematical definition of a tree:

Definition (Tree) A tree T is a finite, non-empty set of nodes,
T = {r} ∪ T_1 ∪ T_2 ∪ … ∪ T_n,
with the following properties:
1. A designated node of the set, r, is called the root of the tree; and
2. The remaining nodes are partitioned into n ≥ 0 subsets T_1, T_2, …, T_n, each of which is a tree.
For convenience, we shall use the notation T = {r, T_1, T_2, …, T_n} to denote the tree T.

Notice that the definition is recursive: a tree is defined in terms of itself! Fortunately, we do not have a problem with infinite recursion, because every tree has a finite number of nodes and because in the base case a tree has n = 0 subtrees.

It follows from the definition that the minimal tree is a tree comprised of a single root node. For example, Ta = {A} is such a tree. The following, Tb = {B, {C}}, is also a tree. Finally, Tc = {D, {E, {F}}, {G, {H, {I}}, {J, {K}, {L}}, {M}}} is a tree as well.

How do Ta, Tb and Tc resemble their arboreal namesake? The similarity becomes apparent when we consider the graphical representation of these trees shown in Figure. To draw such a pictorial representation of a tree T = {r, T_1, T_2, …, T_n}, first draw the root r, then draw each of the subtrees T_1, T_2, …, T_n beside each other below the root. Finally, lines are drawn from r to the roots of each of the subtrees.

Figure: Examples of trees.

Of course, trees drawn in this fashion are upside down. Nevertheless, this is the conventional way in which tree data structures are drawn. In fact, it is understood that when we speak of "up" and "down," we do so with respect to this pictorial representation. For example, when we move from a root to a subtree, we will say that we are moving down the tree.
The inverted pictorial representation of trees is probably due to the way that genealogical lineal charts are drawn. A lineal chart is a family tree that shows the descendants of some person, and it is from genealogy that much of the terminology associated with tree data structures is taken.

Figure shows one representation of the tree Tc defined in Equation. In this case, the tree is represented as a set of nested regions in the plane. In fact, what we have is a Venn diagram which corresponds to the view that a tree is a set of sets.

Figure: An alternate graphical representation for trees.

Binary Tree

Used to implement lists whose elements have a natural order (e.g. numbers) and either (a) the application would like the list kept in this order or (b) the order of elements is irrelevant to the application (e.g. this list is implementing a set).

Each element in a binary tree is stored in a "node" class (or struct). Each node contains pointers to a left child node and a right child node. In some implementations, it may also contain a pointer to the parent node. A tree may also have an object of a second "tree" class (or struct) which acts as a header for the tree. The "tree" object contains a pointer to the root of the tree (the node with no parent) and whatever other information the programmer wants to squirrel away in it (e.g. the number of nodes currently in the tree).

In a binary tree used this way (a binary search tree), elements are kept sorted in left-to-right order across the tree. That is, if N is a node, then the value stored in N must be larger than every value stored in the left subtree of N and less than every value stored in the right subtree of N. Variant trees may have the opposite order (smaller values to the right rather than to the left) or may allow two different nodes to contain equal values.

Hash Tables

A very common paradigm in data processing involves storing information in a table and then later retrieving the information stored there. For example, consider a database of driver's license records.
The database contains one record for each driver's license issued. Given a driver's license number, we can look up the information associated with that number.

Similar operations are done by the C compiler. The compiler uses a symbol table to keep track of the user-defined symbols in a C program. As it compiles a program, the compiler inserts an entry in the symbol table every time a new symbol is declared. In addition, every time a symbol is used, the compiler looks up the attributes associated with that symbol to see that it is being used correctly.

Typically the database comprises a collection of key-and-value pairs. Information is retrieved from the database by searching for a given key. In the case of the driver's license database, the key is the driver's license number, and in the case of the symbol table, the key is the name of the symbol.

In general, an application may perform a large number of insertion and/or look-up operations. Occasionally it is also necessary to remove items from the database. Because a large number of operations will be done, we want to do them as quickly as possible.

Hash tables are a very practical way to maintain a dictionary. As with bucket sort, a hash table assumes we know that the distribution of keys is fairly well-behaved. The underlying idea is that looking up an item in an array takes constant time once you have its index. A hash function is a mathematical function which maps keys to integers. In bucket sort, our hash function mapped the key to a bucket based on the first letters of the key. "Collisions" were the sets of keys mapped to the same bucket. If the keys were uniformly distributed, then each bucket contained very few keys! The resulting short lists were easily sorted, and could just as easily be searched.

We examine data structures which are designed specifically with the objective of providing efficient insertion and find operations. In order to meet the design objective, certain concessions are made. Specifically, we do not require that there be any specific ordering of the items in the container.
In addition, while we still require the ability to remove items from the container, it is not our primary objective to make removal as efficient as the insertion and find operations.

Ideally we would build a data structure for which both the insertion and find operations are O(1) in the worst case. However, this kind of performance can only be achieved with complete a priori knowledge. We need to know beforehand specifically which items are to be inserted into the container. Unfortunately, we do not have this information in the general case. So, if we cannot guarantee O(1) performance in the worst case, then we make it our design objective to achieve O(1) performance in the average case.

The constant time performance objective immediately leads us to the following conclusion: our implementation must be based in some way on an array rather than a linked list. This is because we can access the kth element of an array in constant time, whereas the same operation in a linked list takes O(k) time.

In the previous section, we considered two searchable containers: the ordered list and the sorted list. In the case of an ordered list, the cost of an insertion is O(1) and the cost of the find operation is O(n). For a sorted list, the cost of insertion is O(n) and the cost of the find operation is O(log n) for the array implementation.

Clearly, neither the ordered list nor the sorted list meets our performance objectives. The essential problem is that a search, either linear or binary, is always necessary. In the ordered list, the find operation uses a linear search to locate the item. In the sorted list, a binary search can be used to locate the item because the data is sorted. However, in order to keep the data sorted, insertion becomes O(n).

In order to meet the performance objective of constant time insert and find operations, we need a way to do them without performing a search. That is, given an item x, we need to be able to determine directly from x the array position where it is to be stored.
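As a minimal sketch of determining the array position directly from the item, assume that the keys are character strings and that the table has TABLE_SIZE slots. The names hash_position and TABLE_SIZE, the table size 101 and the base-128 treatment of the key are all assumptions made for this illustration.

```c
#include <stddef.h>

#define TABLE_SIZE 101u   /* number of slots; a prime chosen only for illustration */

/* Map a string key to an array position in 0 .. TABLE_SIZE-1.
   The key is read as a base-128 number (first character most significant)
   and reduced modulo TABLE_SIZE at every step, so the intermediate
   value stays small and cannot overflow. */
unsigned hash_position(const char *key)
{
    unsigned h = 0;
    size_t i;

    for (i = 0; key[i] != '\0'; i++)
        h = (h * 128u + (unsigned char)key[i]) % TABLE_SIZE;
    return h;
}
```

The position is computed from the key alone, without looking at any other item in the table, which is what makes constant time insert and find possible in the average case. Two different keys can still map to the same position; this sketch does not deal with such collisions.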
Hash Functions

It is the job of the hash function to map keys to integers. A good hash function:
1. Is cheap to evaluate.
2. Tends to use all positions from 0…M with uniform frequency.
3. Tends to put similar keys in different parts of the table (Remember the Shifletts!!)

The first step is usually to map the key to a big integer, for example

h = Σ_{i=0}^{keylength} 128^i × char(key[i])

This last number must be reduced to an integer whose size is between 1 and the size of our hash table. One way is by

h(k) = k mod M

where M is best a large prime not too close to 2^i - 1, which would just mask off the high bits. This works on the same principle as a roulette wheel!

Self Assessment Questions
1. Define Trees. Discuss its usage in different applications.
2. Write note on: a) Binary Tree b) Hash Tables

Summary

This unit gives an overview of data structures, their concepts and their applications. Data structures represent places to store data for use by a computer program. As you would imagine, this describes a spectrum of data storage techniques, from the very simple to the very complex. We can look at this progression from the simple to the complex. At the lowest level, there are data structures supplied and supported by the CPU (or computer chip) itself. These vary from chip to chip, but are almost always of the very primitive sort. They typically include the simple data types, such as integers, characters, floating point numbers, and bit strings. In this context we discussed the various structured data types, abstract data types, and linear and non-linear data structures.

Terminal Questions
1. Define Data Structure. Explain the types of structured data type.
2. Explain Abstract data types with its characteristics.
3. Discuss the linear data structure with suitable example.
4. Discuss the various types of data structure applications.
5.
Write note on: a) Elementary Data Structures b) Ordered list c) Linked list d) Queue e) Stack f) Binary tree g) Hash tables

Unit 3 Overview of Stack

This unit discusses the stack and its operations, the related algorithms for push and pop, and stack implementations using arrays and structures, with the stack operations illustrated in C.

Introduction

Definitions and operations: We know that in a cafeteria the plates are placed one above the other and every new plate is added at the top. When a plate is required, it is taken off from the top and used. We call this process stacking of plates. Thus, the operations that can be performed if plates are stacked are:
· Addition/insertion of a plate at one end
· Deletion of a plate at the same end

Using this analogy, a stack is defined as a special type of data structure where items are inserted from one end, called the top of the stack, and items are deleted from the same end. Here, the last item inserted will be on top of the stack. Since deletion is done from the same end, the Last item Inserted is the First item to be deleted Out from the stack, and so a stack is also called a Last In First Out (LIFO) data structure.

Objectives

At the end of this unit, you will be able to understand the:
• Stack definition and its operations
• POP and PUSH operation implementation in C
• Various stack applications
• Stack implementation using arrays and structures

Operations of Stack

The various operations that can be performed on stacks are:
· Insert an item into the stack
· Delete an item from the stack
· Display the contents of the stack

From the definition of a stack it is clear that it is a collection of items of a similar type, and naturally we can use an array (an array is a collection of items of the same data type) to hold the items of the stack. Since an array is used, its size is fixed. So, let us assume that 5 items 30, 20, 25, 10 and 40 are to be placed on the stack. The items can be inserted one by one as shown in following figure.
It is clear from this figure that initially the stack is empty and top points to the bottom of the stack. As the items are inserted, the top pointer is incremented and it points to the topmost item. Here, the items 30, 20, 25, 10 and 40 are inserted one after the other. After inserting 40 the stack is full. In this situation it is not possible to insert any new item. This situation is called stack overflow.

When an item is to be deleted, it should be deleted from the top as shown in following figure. Since items are inserted from one end, deletions in a stack should be done from the same end. So, as the items are deleted, the item below the top item becomes the new top item, and so the position of the topmost item is decremented as shown in above figure. The items deleted, in order, are 40, 10, 25, 20 and 30. Finally, when all items are deleted, top points to the bottom of the stack. When the stack is empty, it is not possible to delete any item, and this situation is called stack underflow.

So, the main operations to be performed on stacks are insertion and deletion. Inserting an item into the stack when the stack is not full is called the push operation, and deleting an item from the stack when the stack is not empty is called the pop operation. Other operations that can be performed are displaying the contents of the stack, checking whether the stack is empty or not, etc. Let us see how push and pop operations are implemented.

Self Assessment Questions
1. Define stack with its different operations.
2. Discuss the insertion and deletion of an element to/from the stack with suitable example.

Insert/Push operation

To design a C function, to start with let us assume that three items are already added to the stack and the stack is identified by s as shown in figure a. Here, the index top points to 30, which is the topmost item. Here, the value of top is 2. Now, if an item 40 is to be inserted, first increment top by 1 and then insert an item.
The corresponding C statements are:

    top = top + 1;
    s[top] = item;

These two statements can also be written as

    s[++top] = item;

But, as we insert an item we must take care of the overflow situation, i.e., when top reaches STACK_SIZE-1 the stack is full, and an appropriate error message has to be reported as shown below:

    if (top == STACK_SIZE - 1)
    {
        printf("Stack overflow\n");
        return;
    }

Here, STACK_SIZE should be #defined; it is called a symbolic constant, the value of which cannot be modified. If the above condition fails, the item has to be inserted. Now, the C code to insert an item into the stack can be written as

    if (top == STACK_SIZE - 1)
    {
        printf("Stack overflow\n");
        return;
    }
    s[++top] = item;

It is clear from this code that as the item is inserted, the contents of the stack identified by s and top are affected, and so they should be passed and used as pointers as shown in below example.

Example 1: C function to insert an integer item

    void push(int item, int *top, int s[])
    {
        if (*top == STACK_SIZE - 1)
        {
            printf("Stack overflow\n");
            return;
        }
        s[++(*top)] = item; /* Increment top and then insert an item */
    }

Note: The above example inserts an item of integer data type into the stack. To insert an item of character data type, the changes needed are shown in below example. (It replaces the integer version rather than accompanying it, since C does not allow two functions with the same name.)

Example 2: C function to insert a character item

    void push(char item, int *top, char s[])
    {
        if (*top == STACK_SIZE - 1)
        {
            printf("Stack overflow\n");
            return;
        }
        s[++(*top)] = item; /* Insert an item on the stack */
    }

Delete/Pop operation

Deleting an element from the stack is called the 'pop' operation. This can be achieved by first accessing the top element s[top] and then decrementing top by one as shown below:

    item = s[top--];

Each time an item is deleted, top is decremented, and finally, when the stack is empty, top will be -1 and so it is not possible to delete any item from the stack. The above statement has to be executed only if the stack is not empty.
Hence, the code to delete an item from the stack can be written as

    if (top == -1)
    {
        return 0; /* Indicates empty stack */
    }
    /* Access the item and delete */
    item = s[top--];
    return item;

As the value of top changes every time an item is deleted, top should be passed as a pointer variable. The complete function is shown in below example 1. Example 2 shows how to delete a character item from the stack. Note that both functions use the return value 0 to indicate an empty stack, so a stored item with the value 0 cannot be told apart from that indication.

Example 1: C function to delete an integer item

    int pop(int *top, int s[])
    {
        int item;
        if (*top == -1)
        {
            return 0; /* Indicates empty stack */
        }
        item = s[(*top)--]; /* Access the item and delete */
        return item; /* Send the item deleted to the calling function */
    }

Example 2: C function to delete a character item

    char pop(int *top, char s[])
    {
        char item;
        if (*top == -1)
        {
            return 0; /* Indicates empty stack */
        }
        item = s[(*top)--]; /* Access the item and delete */
        return item; /* Send the item deleted to the calling function */
    }

Display

Assume that the stack contains three elements as shown below:

The item 30 is at the top of the stack and the item 10 is at the bottom of the stack. Usually, the contents of the stack are displayed from the bottom of the stack until the top of the stack is reached. So, the first item to be displayed is 10, the next item to be displayed is 20 and the final item to be displayed is 30. So, the code corresponding to this can take the following form

    for (i = 0; i <= top; i++)
    {
        printf("%d\n", s[i]);
    }

But the above statement should not be executed when the stack is empty, i.e., when top takes the value -1. So, the modified code can be written as shown in below example.

Example 1: C function to display the contents of the stack

    void display(int top, int s[])
    {
        int i;
        if (top == -1)
        {
            printf("Stack is empty\n");
            return;
        }
        printf("Contents of the stack\n");
        for (i = 0; i <= top; i++)
        {
            printf("%d\n", s[i]);
        }
    }

Self Assessment Questions
1. Explain the POP and PUSH operations with an example.
2. Write steps to display elements from the STACK.
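The pop functions above return 0 for an empty stack, which looks the same as a stored item whose value is 0. One possible way around this, offered only as a sketch and not as the method used in this unit, is to return a success flag and hand the item back through a pointer. The names try_push and try_pop are hypothetical.

```c
#define STACK_SIZE 5

/* Push `item` onto the stack held in s[0 .. *top].
   Returns 1 on success, 0 on overflow (the stack is left unchanged). */
int try_push(int item, int *top, int s[])
{
    if (*top == STACK_SIZE - 1)
        return 0;
    s[++(*top)] = item;      /* increment top, then store the item */
    return 1;
}

/* Pop the topmost item into *item.
   Returns 1 on success, 0 on underflow (*item is left unchanged). */
int try_pop(int *item, int *top, int s[])
{
    if (*top == -1)
        return 0;
    *item = s[(*top)--];     /* fetch the item, then decrement top */
    return 1;
}
```

With this interface the caller tests the return value, for example if (try_pop(&item, &top, s)) ..., so that a popped value of 0 is reported as an ordinary item.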
Stack implementation using arrays

In the previous sections we have seen how stacks can be implemented using arrays. The complete program to perform operations such as push, pop and display is provided in below example. The two semicolons (i.e., ;;) in the for loop indicate that the for loop is an infinite loop.

Example: C program to implement the stack using arrays

    #include <stdio.h>
    #include <stdlib.h>
    #define STACK_SIZE 5
    /* Include the function push (Example 1 under Insert/Push operation) */
    /* Include the function pop (Example 1 under Delete/Pop operation) */
    /* Include the function display (Example 1 under Display) */
    int main(void)
    {
        int top; /* Points to top of the stack */
        int s[STACK_SIZE]; /* Holds the stack items */
        int item; /* Item to be inserted or deleted item */
        int choice; /* User choice for push, pop and display */
        top = -1; /* Stack is empty to start with */
        for (;;)
        {
            printf("1: Push 2: Pop\n");
            printf("3: Display 4: Exit\n");
            printf("Enter the choice\n");
            scanf("%d", &choice);
            switch (choice)
            {
            case 1:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                push(item, &top, s);
                break;
            case 2:
                item = pop(&top, s);
                if (item == 0)
                    printf("Stack is empty\n");
                else
                    printf("Item deleted = %d\n", item);
                break;
            case 3:
                display(top, s);
                break;
            default:
                exit(0);
            }
        }
    }

Output

    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    1
    Enter the item to be inserted
    10
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    1
    Enter the item to be inserted
    20
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    3
    Contents of the stack
    10
    20
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    2
    Item deleted = 20
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    2
    Item deleted = 10
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    2
    Stack is empty
    1: Push 2: Pop
    3: Display 4: Exit
    Enter the choice
    4

Applications of stack

A stack is very useful in situations when data have to be stored and then retrieved in the reverse order. Some applications of stack are listed below:

i.
Function Calls: We have already seen the role stacks play in nested function calls. When the main program calls a function named F, a stack frame for F gets pushed on top of the stack frame for main. If F calls another function G, a new stack frame for G is pushed on top of the frame for F. When G finishes its processing and returns, its frame gets popped off the stack, restoring F to the top of the stack.

ii. Large number Arithmetic: As another example, consider adding very large numbers. Suppose we wanted to add 353,120,457,764,910,452,008,700 and 234,765,000,129,654,080,277. First of all, note that it would be difficult to represent the numbers as integer variables, as they cannot hold such large values. The problem can be solved by treating the numbers as strings of numerals, storing them on two stacks, and then performing the addition by popping digits from the stacks.

iii. Evaluation of arithmetic expressions: Stacks are useful in the evaluation of arithmetic expressions. Consider the expression

    5 * 3 + 2 + 6 * 4

The expression can be evaluated by first multiplying 5 and 3, storing the result in A, then adding 2 and A and saving the result in A. We then multiply 6 and 4 and save the answer in B. We finish off by adding A and B and leaving the final answer in A.

    A = 5 3 * = 15
    A = 15 2 + = 17
    B = 6 4 * = 24
    A = 17 24 + = 41

We can write this sequence of operations as follows:

    5 3 * 2 + 6 4 * +

This notation is known as postfix notation and is evaluated as described above. We shall shortly show how this form can be generated using a stack.

Basically there are 3 types of notations for expressions. The standard form is known as the infix form. The other two are the postfix and prefix forms.

    Infix: operator is between operands      A + B
    Postfix: operator follows operands       A B +
    Prefix: operator precedes operands       + A B

Note that not all infix expressions can be evaluated by using the left to right order of the operators inside the expression.
However, the operators in a postfix expression are ALWAYS in the correct evaluation order. Thus evaluation of an infix expression is done in two steps. The first step is to convert it into its equivalent postfix expression. The second step involves evaluation of the postfix expression. We shall see in this section how stacks are useful in carrying out both the steps. Let us first examine the basic process of infix to postfix conversion.

Infix to postfix conversion:

Example 1:
    a + b * c          Infix form (precedence of * is higher than of +)
    a + (b * c)        Make the multiplication explicit
    a + (b c *)        Convert the multiplication
    a (b c *) +        Convert the addition
    a b c * +          Remove parentheses: postfix form
Note that there is no need of parentheses in postfix forms.

Example 2:
    (A + B) * C        Infix form
    (A B +) * C        Convert the addition
    (A B +) C *        Convert the multiplication
    A B + C *          Postfix form
No need of parentheses anywhere.

Example 3:
    a + ((b * c) / d)
    a + ((b c *) / d)  (precedence of * and / is the same and they are left associative)
    a + (b c * d /)
    a b c * d / +

• More examples

    Infix                                   Postfix
    (a + b) * (c - d)                       a b + c d - *
    a - b / (c + d * e)                     a b c d e * + / -
    ((a + b) * c - (d - e)) / (f + g)       a b + c * d e - - f g + /

Order of precedence for operators:
    multiplication (*) and division (/)
    addition (+) and subtraction (-)
The association is assumed to be left to right, i.e. a + b + c = (a + b) + c = a b + c +

Evaluating a Postfix Expression

We can evaluate a postfix expression using a stack. Each operator in a postfix string applies to the two operands that precede it. Each time we read an operand we push it onto a stack. When we reach an operator, its associated operands (the top two elements on the stack) are popped out from the stack. We then perform the indicated operation on them and push the result on top of the stack so that it will be available for use as one of the operands for the next operator. The following example shows how a postfix expression can be evaluated using a stack.
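The evaluation procedure just described can be sketched in C as follows. The sketch assumes single-digit operands and the four operators + - * /; the function name eval_postfix, the internal stack size of 32 and the convention of returning 1 or 0 for success or failure are choices made for this illustration only.

```c
#include <ctype.h>

#define EVAL_STACK_SIZE 32

/* Evaluate a postfix expression made of single-digit operands and the
   operators + - * /. Spaces are ignored. On success the value is stored
   in *result and 1 is returned; 0 is returned for a malformed expression,
   a stack overflow, or a division by zero. */
int eval_postfix(const char *expr, int *result)
{
    int s[EVAL_STACK_SIZE];
    int top = -1;                  /* -1 means the stack is empty */
    int a, b;
    const char *p;

    for (p = expr; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isdigit((unsigned char)*p)) {
            if (top == EVAL_STACK_SIZE - 1)
                return 0;          /* overflow */
            s[++top] = *p - '0';   /* operand: push it */
            continue;
        }
        if (top < 1)
            return 0;              /* an operator needs two operands */
        b = s[top--];              /* right operand is on top */
        a = s[top--];              /* left operand is below it */
        switch (*p) {
        case '+': s[++top] = a + b; break;
        case '-': s[++top] = a - b; break;
        case '*': s[++top] = a * b; break;
        case '/':
            if (b == 0)
                return 0;
            s[++top] = a / b;
            break;
        default:
            return 0;              /* unknown symbol */
        }
    }
    if (top != 0)
        return 0;                  /* exactly one value must remain */
    *result = s[top];
    return 1;
}
```

For the expression 6523+8*+3+* the stack contents develop as 6 5 2 3, then 6 5 5, then 6 5 5 8, then 6 5 40, then 6 45, then 6 45 3, then 6 48, and finally 288, which is the value the function stores in *result.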
Example: 6 5 2 3 + 8 * + 3 + *

The process stops when there are no more operators left in the string. The result of evaluating the expression is obtained just by popping off the single element. More examples will be done in the lecture and recitation labs.

Self Assessment Questions
1. Discuss the various STACK applications with suitable example of each.
2. Explain how Stacks are useful in evaluation of arithmetic expressions with example.
3. Write a suitable example of each of the following arithmetic notations: infix, postfix and prefix forms.

Stacks using structures

So far we have seen how a stack is implemented using an array, and various applications of stacks. In this approach, whenever the function push() is called, we have to pass three parameters, namely item, top and s, where item is the element to be pushed and top is an integer value which is the index of the topmost element in the array s. But as the number of parameters increases, the functions become harder to write and to call correctly. In such cases, we group all related items under a common name using a structure and pass the structure as a parameter, which simplifies the function interfaces. In our stack implementation, instead of passing two parameters top and s, we can pass only one parameter if we use a structure. So, a stack can be declared as a structure containing two members, viz., an array to store the elements of the stack and an integer indicating the position of the topmost element in the array. The declaration can take the following form:

    #define STACK_SIZE 5
    struct stack
    {
        int items[STACK_SIZE];
        int top;
    };
    typedef struct stack STACK;

Once this definition is done, we can use a variable s to access the contents of the stack and to obtain the position of the topmost element. The declaration for this can take the form

    STACK s;

The position of the topmost element and the element itself can be accessed (using the '.' operator) by specifying s.top and s.items[s.top]. If the declaration is of the form STACK *s, the position of the topmost element and the topmost element can be accessed by specifying s->top and s->items[s->top].

Let us implement stacks using structures also. We know that the stack is empty if the position of the topmost element is -1. The function is_empty(), which returns true whenever the stack is empty and returns false whenever the stack is not empty, is shown in below example 1.

Example 1: Function to check whether the stack is empty or not

    int is_empty(STACK *s)
    {
        if (s->top == -1) return 1; /* Stack empty */
        return 0; /* Stack is not empty */
    }

The function is_full() returns true if the stack is full and returns false if the stack is not full. This function is shown in below example 2.

Example 2: Function to check whether the stack is full or not

    int is_full(STACK *s)
    {
        if (s->top == STACK_SIZE - 1) return 1; /* Stack is full */
        return 0; /* Stack is not full */
    }

The function to insert an integer item into the stack is shown in below example 3.

Example 3: Function to insert an integer item into the stack

    void push(int item, STACK *s)
    {
        if (is_full(s))
        {
            printf("Stack Overflow\n");
            return;
        }
        s->top++; /* Update top to point to next item */
        s->items[s->top] = item; /* Insert the item into the stack */
    }

The function to insert a character item into the stack is shown in below example 4. (For the character versions, the items array in the structure must be declared as char.)

Example 4: Function to insert a character item into the stack

    void push(char item, STACK *s)
    {
        if (is_full(s))
        {
            printf("Stack Overflow\n");
            return;
        }
        s->top++; /* Update top to point to next item */
        s->items[s->top] = item; /* Insert the item into the stack */
    }

The function to delete an integer item from the stack is shown in below example 5.

Example 5: Function to delete an integer item from the stack

    int pop(STACK *s)
    {
        int item;
        if (is_empty(s))
        {
            printf("Stack Underflow\n");
            return 0;
        }
        item = s->items[s->top]; /* Access the top element */
        s->top--; /* Update the pointer to point to previous item */
        return item; /* Return the top item to the calling function */
    }

The function to delete a character item from the stack is shown in below example 6.

Example 6: Function to delete a character item from the stack

    char pop(STACK *s)
    {
        char item;
        if (is_empty(s))
        {
            printf("Stack Underflow\n");
            return 0;
        }
        item = s->items[s->top]; /* Access the top element */
        s->top--; /* Update the pointer to point to previous item */
        return item; /* Return the top item to the calling function */
    }

The function to display the contents of the stack is shown in below example 7.

Example 7: Function to display the contents of the stack

    void display(STACK s)
    {
        int i;
        if (is_empty(&s))
        {
            printf("Stack is empty\n");
            return;
        }
        printf("The contents of the stack\n");
        for (i = 0; i <= s.top; i++)
        {
            printf("%d\n", s.items[i]);
        }
    }

The C program to simulate the working of a stack using structures is shown in below example.
Example 8: C program to simulate the stack operations using structures

    #include <stdio.h>
    #include <stdlib.h>
    #define STACK_SIZE 5
    struct stack
    {
        int items[STACK_SIZE];
        int top;
    };
    typedef struct stack STACK;
    /* Include example 1: Check for empty stack */
    /* Include example 2: Check for stack full or not */
    /* Include example 3: To insert an item on the stack */
    /* Include example 5: To delete an item from the stack */
    /* Include example 7: To display the contents of the stack */
    int main(void)
    {
        int item; /* Item to be inserted */
        int choice; /* Push, pop, display or quit */
        STACK s; /* To store items */
        s.top = -1; /* Stack is empty initially */
        for (;;)
        {
            printf("1: Push 2: Pop\n");
            printf("3: Display 4: Exit\n");
            printf("Enter the choice\n");
            scanf("%d", &choice);
            switch (choice)
            {
            case 1:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                push(item, &s);
                break;
            case 2:
                item = pop(&s);
                if (item != 0)
                {
                    printf("Item deleted = %d\n", item);
                }
                break;
            case 3:
                display(s);
                break;
            default:
                exit(0);
            }
        }
    }

Sample C programs to represent the Stack Implementation:

Example 1

    // Stack implementation using arrays: insert elements into the stack
    #include <stdio.h>
    #define Max 5
    int Staff[Max], top = -1; /* After the first push, top counts the stored items */
    void display()
    {
        int i;
        if ((top == -1) || (top == 0))
        {
            printf("\n Stack is empty \n");
        }
        else
        {
            printf("\n Stack elements are \n");
            for (i = top - 1; i >= 0; i--)
                printf("%5d", Staff[i]);
        }
    }
    void push()
    {
        int ele;
        char ch;
        if (top == -1)
            top = 0;
        do
        {
            if (top >= Max)
            {
                printf("\n STACK IS FULL");
                break;
            }
            else
            {
                printf("\nENTER THE ELEMENT TO BE INSERTED\n");
                scanf("%d", &ele);
                Staff[top++] = ele;
                display();
            }
            printf("\nDO U WANT 2 ADD MORE ELEMENTS:?\n");
            scanf("\n%c", &ch);
        } while ((ch == 'y') || (ch == 'Y'));
    }
    void pop()
    {
        if ((top == -1) || (top == 0))
        {
            printf("\nstack is under flow\n");
        }
        else
        {
            printf("\n %d is deleted from stack\n", Staff[--top]);
            display();
        }
    }

Example 2

    // Stack implementation using push and pop (uses the functions of Example 1)
    int main(void)
    {
        char c;
        int choice;
        do
        {
            printf("\n Enter the choice\n");
            printf("1->push\n");
            printf("2->pop\n");
            scanf("%d", &choice);
            if (choice == 1)
                push();
            else if (choice == 2)
                pop();
            else
                printf("\ninvalid choice");
            printf("\ndo u want 2 continue:?");
            scanf("\n%c", &c);
        } while ((c == 'y') || (c == 'Y'));
        return 0;
    }

Example 3

    // Stack implementation using array
    #include <stdio.h>
    #define Max 5
    int a[Max], top = -1; /* After the first push, top counts the stored items */
    void push()
    {
        int ele;
        char ch;
        if (top == -1)
            top = 0;
        do
        {
            if (top >= Max)
            {
                printf("Stack is full");
                break;
            }
            else
            {
                printf("Enter element to be inserted\n");
                scanf("%d", &ele);
                a[top++] = ele;
            }
            printf("Do you want to add more elements: ?\n");
            scanf("\n%c", &ch);
            printf("%c", ch);
        } while ((ch == 'y') || (ch == 'Y'));
    }
    void pop()
    {
        if (top <= 0)
        {
            printf("stack is underflow\n");
        }
        else
        {
            printf("%d is deleted from stack\n", a[--top]); /* Remove the topmost item */
        }
    }
    int main(void)
    {
        char c;
        int choice;
        do
        {
            printf("Enter Your Choice\n");
            printf("1 -> Push\n");
            printf("2 -> Pop\n");
            scanf("%d", &choice);
            if (choice == 1)
                push();
            else if (choice == 2)
                pop();
            else
                printf("invalid choice");
            printf("Do You Want to continue\n");
            scanf("\n%c", &c);
        } while ((c == 'y') || (c == 'Y'));
        return 0;
    }

Summary

A stack is defined as a special type of data structure where items are inserted from one end, called the top of the stack, and items are deleted from the same end. Here, the last item inserted will be on top of the stack. Since deletion is done from the same end, the Last item Inserted is the First item to be deleted Out from the stack, and so a stack is also called a Last In First Out (LIFO) data structure. A stack is very useful in situations when data have to be stored and then retrieved in the reverse order. Some applications of stack are: Function Calls, Large number Arithmetic, and Evaluation of arithmetic expressions.

Terminal Questions
1. Define Stack with neat diagram. Discuss the Push and Pop Operation.
2.
Write a C program to implement the stack operations using arrays.
3. Discuss the various stack applications with a suitable example of each.
4. Explain, with an example, how stacks are useful in the evaluation of arithmetic expressions.
5. Explain the evaluation of a postfix expression using a suitable example.
6. Illustrate a C program that represents the stack implementation of the push and pop operations.

Unit 4 Overview of Queues

This unit gives an overview of queues, the different types of queues with their operations, and the implementation of queue operations using C.

Introduction

A queue is a list in which items are added at one end and removed from the other. In this respect, a queue is like the line of customers waiting to be served by a bank teller. As customers arrive, they join the end of the queue while the teller serves the customer at the head of the queue. As a result, a queue is used when a sequence of activities must be done on a first-come, first-served basis. A queue is a linear list for which all insertions are made at one end of the list and all deletions (and usually all accesses) are made at the other end, i.e., First In First Out (FIFO).

Objectives

At the end of this unit, you will be able to understand:
· A brief introduction to queues and their operations
· The different types of queues and their implementation in C
· Ordinary queue
· Double ended queue
· Circular queue
· Priority queue

Queues and its Operations

A queue is defined as a special type of data structure where elements are inserted at one end and deleted from the other end. The end where elements are inserted is called the 'rear end' (r) and the end where elements are deleted is called the 'front end' (f). In a queue, elements are always inserted at the rear end and deleted from the front end. The pictorial representation of the queue is shown in the figure below. Here, the front end is denoted by f and the rear end is denoted by r.
So, the first item inserted into the queue is 10, the second item inserted is 20 and the last item inserted is 30. Any new element has to be inserted to the right of 30, and that item becomes the last element in the queue. The first item to be deleted from the queue is the item at the front of the queue, i.e., 10. It is clear from these operations that the First item Inserted is the First item to be deleted Out from the queue. So a queue is also called a First In First Out (FIFO) data structure.

This data structure is useful in time-sharing systems where many user jobs wait in the system queue for processing. These jobs may request the services of the CPU, main memory or external devices such as a printer. Each job is given a fixed time for processing and the jobs use the resource one after the other. This is the case of an ordinary queue, where the priority is the same for all jobs and whichever job is submitted first is processed first.

The primitive operations that can be performed on queues are:
· Insert an item into the queue
· Delete an item from the queue
· Display the contents of the queue

Other useful operations are qempty(), which returns true if the queue is empty and false otherwise, and qfull(), which returns true if the queue is full and false otherwise.

Sometimes jobs have to be processed based on preference. A queue in which a job is processed based on its priority is called a priority queue. Let us see the various types of queues in the next section.

Self Assessment Questions
1. Define Queue.
2. Discuss the representation of a queue with a neat diagram.

Different types of queues

The different types of queues are:
· Ordinary queue
· Double ended queue
· Circular queue
· Priority queue

Ordinary queue

This queue operates on a first come, first served basis. Items are inserted at one end and deleted at the other end in the same order in which they were inserted.
Here, the first element inserted is the first element to go out of the queue. A queue can be represented using an array as shown in fig. The operations that can be performed on these queues are:
· Insert an item at the rear end
· Delete an item at the front end
· Display the contents of the queue

Let us discuss how these operations can be designed and implemented.

a) Insert at the rear end

Consider a queue with QUEUE_SIZE as 5 and assume 4 items are present, as shown in fig. (a) below. Here, the two variables f and r are used to access the elements located at the front end and the rear end respectively. It is clear from this figure that at most 5 elements can be inserted into the queue. Any new item has to be inserted to the right of item 40, i.e., at q[4]. This is possible if we increment r by 1 so that it points to the next location and then insert the item 50. The queue after inserting the item 50 is shown in fig. (b) above. So, the formal version of the function can be written as:

void insert_rear()
{
    q[++r] = item;
}

Note that as we insert elements, r is incremented by 1 each time, and finally the queue may be full. When the queue is full, it is not possible to insert any element and this condition is called overflow, i.e., when r reaches QUEUE_SIZE-1, the queue is full. The function shown in the example below returns 1 if the queue is full; otherwise, it returns 0.

Example 1 : Function to check whether queue is full

int qfull(int r)
{
    /* returns true if queue is full otherwise false */
    return ( r == QUEUE_SIZE-1 ) ? 1 : 0;
}

Once the function qfull() returns true, it is not possible to insert an item; otherwise, an item can be inserted. The function insert_rear() can be modified as shown in the example below.

Example 2: Function to insert an item at the rear end of queue

void insert_rear(int item, int q[], int *r)
{
    if ( qfull(*r) )       /* Is queue full ? */
    {
        printf("Queue overflow\n");
        return;
    }
    /* Queue is not full */
    q[++(*r)] = item;      /* Update rear pointer and insert an item */
}

b) Delete from the front end

The first item to be deleted from the queue is the item at the front end of the queue. It is clear from the queue shown in fig. (a) below that the first item to be deleted is 10. Once this item is deleted, the next item in the queue, i.e., 20, becomes the first element and the resulting queue is of the form shown in fig. (b). So, the variable f should point to 20, indicating that 20 is at the front of the queue. This can be achieved by accessing the item 10 first and then incrementing the variable f. The formal version of the function can be written as:

void delete_front()
{
    printf("The deleted item is %d\n", q[f++]);
}

But as we delete items one after the other, the queue finally becomes empty. Consider the queue shown in fig. (c). Once the item 40 is deleted, f points to the next location and the queue is empty, i.e., whenever the front end identified by f is greater than the rear end identified by r, the queue is empty. This condition is called underflow of the queue. The function to check for underflow is shown in the example below.

Example 3: Function to check for underflow of queue

int qempty(int f, int r)
{
    /* returns true if queue is empty otherwise returns false */
    return ( f > r ) ? 1 : 0;
}

The function qempty() returns true when the queue is empty and false when it is not. If the queue is empty, it is not possible to delete an element. So, the formal version of the function delete_front() can be modified to check for underflow, and the modified version is shown in the example below.
Example 4: Function to delete an item from the front end

void delete_front(int q[], int *f, int *r)
{
    if ( qempty(*f, *r) )
    {
        printf("Queue underflow\n");
        return;
    }
    printf("The element deleted is %d\n", q[(*f)++]);
    if ( *f > *r )     /* Queue became empty: reset both ends */
    {
        *f = 0;
        *r = -1;
    }
}

Consider the situation shown in fig. (d) above. Deleting the item 50 from the queue results in an empty queue. At this stage, suppose an item has to be inserted. Without the reset at the end of delete_front(), the function insert_rear() shown in the examples above would display the message "Queue overflow", because the rear pointer r has reached QUEUE_SIZE-1. It is clear from the queue shown in figure (d) that the queue is not full, and yet an item could not be inserted. For this reason, whenever the queue becomes empty, we reset the front end identified by f to 0 and the rear end identified by r to -1. So, the initial values of the front pointer f and the rear pointer r should also be 0 and -1 respectively.

c) Display queue contents

The contents of the queue can be displayed only if the queue is not empty. If the queue is empty, an appropriate message is displayed. The function display is shown in the example below.

Example 5: Function to display the contents of queue

void display(int q[], int f, int r)
{
    int i;
    if ( qempty(f, r) )
    {
        printf("Queue is empty\n");
        return;
    }
    printf("Contents of queue is\n");
    for (i = f; i <= r; i++)
        printf(" %d\n", q[i]);
}

The complete C program to implement the different operations on an ordinary queue is shown in the example below.

Example : C program to implement an ordinary queue

#include <stdio.h>
#include <stdlib.h>    /* for exit() */
#define QUEUE_SIZE 5

/* Include function to check for overflow shown in example 4.2.1 Eg. 1 */
/* Include function to check for underflow shown in example 4.2.1 Eg. 3 */
/* Include function to insert an item at the rear end shown in example 4.2.1 Eg. 2 */
/* Include function to delete an item at the front end shown in example 4.2.1 Eg. 4 */
/* Include function to display the contents of queue shown in example 4.2.1 Eg. 5 */

int main(void)
{
    int choice, item, f, r, q[QUEUE_SIZE];
    /* Queue is empty */
    f = 0;     /* Front end of queue */
    r = -1;    /* Rear end of queue */
    for (;;)
    {
        printf("1: Insert 2: Delete\n");
        printf("3: Display 4: Exit\n");
        printf("Enter the choice\n");
        scanf("%d", &choice);
        switch ( choice )
        {
        case 1:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_rear(item, q, &r);
            break;
        case 2:
            delete_front(q, &f, &r);
            break;
        case 3:
            display(q, f, r);
            break;
        default:
            exit(0);
        }
    }
}

Output

1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
Queue underflow
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
3
Queue is empty
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
10
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
20
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
30
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
3
Contents of queue is
 10
 20
 30
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
The element deleted is 10
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
The element deleted is 20
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
The element deleted is 30
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
3
Queue is empty
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
4

Disadvantage of ordinary queue

Consider the queue shown in fig. This situation arises when 5 elements, say 10, 20, 30, 40 and 50, are inserted and the first four items are then deleted. If we try to insert an element, say 60, we get an overflow condition because r has the value QUEUE_SIZE-1 (where QUEUE_SIZE is the maximum number of elements that can be stored in the queue), and it is not possible to insert any element. Even though the queue is not full, no item can be inserted. This disadvantage can be overcome by using a circular queue, which is discussed under Circular queue.
Double ended queue (Deque)

Another type of queue, called a double-ended queue or deque, is discussed in this section. A deque is a special type of data structure in which insertions and deletions can be done either at the front end or at the rear end of the queue. The operations that can be performed on deques are:
· Insert an item at the front end
· Insert an item at the rear end
· Delete an item from the front end
· Delete an item from the rear end
· Display the contents of the queue

The three operations insert rear, delete front and display, and the associated operations to check for underflow and overflow, have already been discussed under 'Ordinary queue'. In this section, the other two operations, i.e., inserting an item at the front end and deleting an item from the rear end, are discussed.

a) Insert at the front end

Consider the queue shown in fig. (a) above. Observe that the front end identified by f is 0 and the rear end identified by r is -1, i.e., the queue is empty. Here, an item can be inserted by first incrementing r by 1 and then inserting the item. If the front pointer f is not equal to 0, as shown in fig. (b) above, an item can be inserted by decrementing the front pointer f by 1 and then inserting the item at that position. In all other cases it is not possible to insert an item at the front end. For example, consider the queue shown in figure (c) above. Here, the item 10 is already present in the first position identified by f, and so it is not possible to insert an item. The complete C function to insert an item is shown in the example below.

Example 1: Function to insert an item at the front end

void insert_front(int item, int q[], int *f, int *r)
{
    if ( *f == 0 && *r == -1 )
        q[++(*r)] = item;
    else if ( *f != 0 )
        q[--(*f)] = item;
    else
        printf("Front insertion not possible\n");
}

b) Delete from the rear end

To delete an element from the rear end, first access the rear element and then decrement the rear end identified by r. As an element is deleted, the queue may become empty.
If the queue is empty, reset the front pointer f to 0 and the rear pointer r to -1, as was done in an ordinary queue. We delete an element only if the queue is not empty. The complete C function to delete an item from the rear end is shown in the example below.

Example 2: Function to delete an item from the rear end of queue

void delete_rear(int q[], int *f, int *r)
{
    if ( qempty(*f, *r) )
    {
        printf("Queue underflow\n");
        return;
    }
    printf("The element deleted is %d\n", q[(*r)--]);
    if ( *f > *r )     /* Queue became empty: reset both ends */
    {
        *f = 0;
        *r = -1;
    }
}

Example : C program to implement double-ended queue

#include <stdio.h>
#include <stdlib.h>    /* for exit() */
#define QUEUE_SIZE 5

/* Include function to check for overflow 4.2.1 Eg. 1 */
/* Include function to check for underflow 4.2.1 Eg. 3 */
/* Include function to insert an item at the front end 4.2.3 Eg. 1 */
/* Include function to insert an item at the rear end 4.2.1 Eg. 2 */
/* Include function to delete an item at the front end 4.2.1 Eg. 4 */
/* Include function to delete an item at the rear end 4.2.3 Eg. 2 */
/* Include function to display the contents of queue 4.2.1 Eg. 5 */

int main(void)
{
    int choice, item, f, r, q[QUEUE_SIZE];
    f = 0;
    r = -1;
    for (;;)
    {
        printf("1: Insert_front 2: Insert_rear\n");
        printf("3: Delete_front 4: Delete_rear\n");
        printf("5: Display 6: Exit\n");
        printf("Enter the choice\n");
        scanf("%d", &choice);
        switch ( choice )
        {
        case 1:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_front(item, q, &f, &r);
            break;
        case 2:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_rear(item, q, &r);
            break;
        case 3:
            delete_front(q, &f, &r);
            break;
        case 4:
            delete_rear(q, &f, &r);
            break;
        case 5:
            display(q, f, r);
            break;
        default:
            exit(0);
        }
    }
}

Circular queue

In an ordinary queue, as an item is inserted, the rear end identified by r is incremented by 1. Once r reaches QUEUE_SIZE-1, we say the queue is full.
Note that even if some elements are deleted from the queue, an item cannot be inserted, because the rear end identified by r is still equal to QUEUE_SIZE-1. For details, refer to 'Disadvantage of ordinary queue'. This disadvantage is overcome using a circular queue. The pictorial representation of a circular queue and its equivalent representation using an array are given side by side in the figure on the next page.

Assume that the circular queue contains only one item, as shown in fig. (a) below. In this case, the rear end identified by r is 0 and the front end identified by f is also 0. Since the rear end is incremented during insertion, the value of r just before inserting the first item should be -1, so that after insertion both f and r point to the item 10. (Note: an item is inserted only at the rear end, and so only r is incremented by 1 and not f.) So, naturally, the initial values of f and r should be 0 and -1.

The configuration shown in fig. (b) below is obtained after inserting 20 and 30. To insert an item, the rear pointer r has to be incremented first. For this, either of the two statements shown below can be used.

r = r + 1
or
r = (r + 1) % QUEUE_SIZE

Both statements increment r by 1 as long as r is less than QUEUE_SIZE-1, but we prefer the second statement. Let us see why this method is preferred over the simple statement r = r + 1.

The queue shown in fig. (c) is obtained after inserting 40 and 50. Note that at this point the queue is full. Suppose we delete two items, 10 and 20, one after the other. The resulting queue is shown in fig. (d). Now try to insert an item 60. If the statement r = r + 1 is used to increment the rear pointer, the value of r will be 5. But because this is a circular queue, r should point to 0. This can be achieved using the statement

r = (r + 1) % QUEUE_SIZE

After execution of the above statement, r will be 0. Because r and f now wrap around, comparing them is no longer enough to check for overflow or underflow; instead, we use a variable count that contains the number of items in the queue at any instant.
So, as an item is inserted, increment count by 1, and as an item is deleted, decrement count by 1. This ensures that at any point of time count contains the total number of elements in the queue. So, if the queue is empty, count will be 0, and if the queue is full, count will be QUEUE_SIZE. Thus we can easily implement the two functions qfull() and qempty(), which are shown in examples 1 and 2 below respectively.

Example 1 : Function to check queue overflow

int qfull(int count)
{
    /* return true if Q is full; otherwise false */
    return ( count == QUEUE_SIZE ) ? 1 : 0;
}

Example 2: Function to check queue underflow

int qempty(int count)
{
    /* return true if Q is empty; otherwise false */
    return ( count == 0 ) ? 1 : 0;
}

Example 3 : Function to insert an item at the rear end

void insert_rear(int item, int q[], int *r, int *count)
{
    if ( qfull(*count) )
    {
        printf("Overflow of queue\n");
        return;
    }
    *r = (*r + 1) % QUEUE_SIZE;    /* increment rear pointer */
    q[*r] = item;                  /* insert the item */
    *count += 1;                   /* update the counter */
}

a) To insert at the rear end

Note that insertion is done only when the queue is not full. So, if the queue is not full, to insert an item, increment the rear end identified by r by 1 (wrapping around to 0 after QUEUE_SIZE-1) and then insert. Also, as an item is inserted, increase the value of count by 1. The variable count is used to check for overflow or underflow. The function to insert an item into the queue is shown in example 3 above.

b) To delete from front end

As in an ordinary queue, the element at the front end of the queue has to be deleted. So, access the item at the front end by specifying q[f] and update the front end identified by f to point to the next front item. The front end identified by f can be incremented using the following statement:

f = (f + 1) % QUEUE_SIZE;

As an item is deleted from the queue, count should be decremented by 1. The complete C function to delete an element from the circular queue is shown in example 4 below.
Example 4 : Function to delete an item from the front end of queue

void delete_front(int q[], int *f, int *count)
{
    if ( qempty(*count) )
    {
        printf("Underflow of queue\n");
        return;
    }
    printf("The deleted element is %d\n", q[*f]);    /* Access the item */
    *f = (*f + 1) % QUEUE_SIZE;                      /* Point to next first item */
    *count -= 1;                                     /* update counter */
}

c) To display queue contents

If the queue is not empty, the elements in the queue should be displayed from the front end identified by f to the rear end identified by r. The total number of items to be displayed can be obtained from the variable count. This can be achieved by initializing the index variable i to the front end identified by f and incrementing i, count number of times, using the statement

i = (i + 1) % QUEUE_SIZE;

As the index variable i points to each item in the queue in turn, the queue contents can be accessed and displayed one by one. The function to display the contents of the circular queue is shown in example 5 below.

Example 5: Function to display the contents of the queue

void display(int q[], int f, int count)
{
    int i, j;
    if ( qempty(count) )
    {
        printf("Q is empty\n");
        return;
    }
    printf("Contents of queue is\n");
    i = f;                            /* Point to front of queue */
    for (j = 1; j <= count; j++)
    {
        printf("%d ", q[i]);          /* access the item */
        i = (i + 1) % QUEUE_SIZE;     /* Point to next item */
    }
    printf("\n");
}

The complete C program to implement a circular queue is shown in example 6 below.

Example 6: C program to implement circular queue

#include <stdio.h>
#include <stdlib.h>    /* for exit() */
#define QUEUE_SIZE 5

/* function to check for overflow shown in example 1 */
/* function to check for underflow shown in example 2 */
/* function to insert an item at the rear end shown in example 3 */
/* function to delete an item from the front end shown in example 4 */
/* function to display the contents of the queue shown in example 5 */

int main(void)
{
    int choice, item, f, r, count, q[QUEUE_SIZE];
    f = 0;
    r = -1;
    count = 0;    /* queue is empty */
    for (;;)
    {
        printf("1: Insert 2: Delete\n");
        printf("3: Display 4: Exit\n");
        printf("Enter the choice\n");
        scanf("%d", &choice);
        switch ( choice )
        {
        case 1:
            printf("Enter the item to be inserted\n");
            scanf("%d", &item);
            insert_rear(item, q, &r, &count);
            break;
        case 2:
            delete_front(q, &f, &count);
            break;
        case 3:
            display(q, f, count);
            break;
        default:
            exit(0);
        }
    }
}

Output

1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
Underflow of queue
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
3
Q is empty
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
10
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
20
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
30
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
40
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
50
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
60
Overflow of queue
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
2
The deleted element is 10
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
60
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
1
Enter the item to be inserted
70
Overflow of queue
1: Insert 2: Delete
3: Display 4: Exit
Enter the choice
4

The C program to simulate the working of a circular queue of names using an array is shown in example 7 below.
Example 7: C program to implement circular queue with items as strings

#include <stdio.h>
#include <stdlib.h>    /* for exit() */
#include <string.h>
#define QUEUE_SIZE 5

/* function to check for underflow */
int qempty(int count)
{
    return ( count == 0 ) ? 1 : 0;
}

/* function to check for overflow */
int qfull(int count)
{
    return ( count == QUEUE_SIZE ) ? 1 : 0;
}

/* function to insert an item at the rear end */
void insert_rear(char item[], char q[][30], int *r, int *count)
{
    if ( qfull(*count) )
    {
        printf("Overflow of queue\n");
        return;
    }
    *r = (*r + 1) % QUEUE_SIZE;
    strcpy(q[*r], item);
    *count += 1;
}

/* function to delete an item from the front end */
void delete_front(char q[][30], int *f, int *count)
{
    if ( qempty(*count) )
    {
        printf("Underflow of queue\n");
        return;
    }
    printf("The deleted element is %s\n", q[*f]);
    *f = (*f + 1) % QUEUE_SIZE;
    *count -= 1;
}

/* function to display the contents of the queue */
void display(char q[][30], int f, int count)
{
    int i, j;
    if ( qempty(count) )
    {
        printf("Q is empty\n");
        return;
    }
    printf("Contents of queue is\n");
    i = f;
    for (j = 1; j <= count; j++)
    {
        printf("%s\n", q[i]);
        i = (i + 1) % QUEUE_SIZE;
    }
    printf("\n");
}

int main(void)
{
    int choice, f, r, count;
    char item[20], q[QUEUE_SIZE][30];
    f = 0;
    r = -1;
    count = 0;    /* queue is empty */
    for (;;)
    {
        printf("1: Insert 2: Delete\n");
        printf("3: Display 4: Exit\n");
        printf("Enter the choice\n");
        scanf("%d", &choice);
        switch ( choice )
        {
        case 1:
            printf("Enter the item to be inserted\n");
            scanf("%19s", item);    /* at most 19 characters fit in item[20] */
            insert_rear(item, q, &r, &count);
            break;
        case 2:
            delete_front(q, &f, &count);
            break;
        case 3:
            display(q, f, count);
            break;
        default:
            exit(0);
        }
    }
}

Self Assessment Questions
1. Explain insertion at the rear end of an ordinary queue with a suitable example.
2. Discuss the deletion of an item from a queue with a suitable example.
3. Write a C program to display the contents of a queue.
Sample C programs to represent the Queue Implementation:

Example 1

// Queue implementation using array
#include <stdio.h>
#define Max 5    // maximum number of elements

int Queue[Max], front = -1, rear = 0;

void Display()
{
    int i;
    if (front + 1 == rear)
        printf("Empty Queue\n");
    else
    {
        printf("Elements of Queue \n");
        for (i = front + 1; i < rear; i++)
            printf(" %d\t", Queue[i]);
        printf("\n");
    }
}

void InsertElement()
{
    int ele;
    if (rear == Max)
        printf("Queue is Full");
    else
    {
        printf("Enter Element to be Inserted\n");
        scanf("%d", &ele);
        Queue[rear++] = ele;
        Display();
    }
}

void DeleteElement()
{
    if (front + 1 == rear)
        printf("Empty Queue \n");
    else
    {
        printf("%d is Deleted from Queue\n", Queue[++front]);
        Display();
    }
}

int main(void)
{
    char c;
    int choice;
    do
    {
        printf("Select Your Choice\n");
        printf("1 -> Insert\n");
        printf("2 -> Delete\n");
        scanf("%d", &choice);
        if (choice == 1)
            InsertElement();
        else if (choice == 2)
            DeleteElement();
        else
            printf("Invalid choice");
        printf("\nDo You Want to Continue \n");
        scanf(" %c", &c);
    } while ((c == 'y') || (c == 'Y'));
    return 0;
}

Example 2

// Stack implementation using pointers
#include <stdio.h>
#include <stdlib.h>    /* for malloc() and free() */
#define NewNode (Node *)malloc(sizeof(Node))

typedef struct node
{
    int item;
    struct node *next;
} Node;

Node *Push(Node *);
Node *Pop(Node *);
void Display(Node *);

int main(void)
{
    int ch;
    Node *start = NULL;
    do
    {
        printf("Select choice (1: Push, 2: Pop, 0: Exit)\n");
        scanf("%d", &ch);
        switch (ch)
        {
        case 0:
            break;
        case 1:
            start = Push(start);
            break;
        case 2:
            start = Pop(start);
            break;
        default:
            printf("Invalid Choice\n");
            break;
        }
        Display(start);
    } while (ch != 0);
    return 0;
}

Node *Push(Node *s)
{
    Node *tmp = NewNode;
    tmp->next = NULL;
    printf("Enter the Item \n");
    scanf("%d", &tmp->item);
    if (s == NULL)
        s = tmp;
    else
    {
        tmp->next = s;    /* new node becomes the top of the stack */
        s = tmp;
    }
    return (s);
}

Node *Pop(Node *s)
{
    Node *tmp;
    if (s != NULL)
    {
        printf("Element Deleted: %d\n", s->item);
        tmp = s;
        s = s->next;
        free(tmp);
    }
    else
        printf("Stack Underflow\n");
    return (s);
}

void Display(Node *s)
{
    Node *tmp = s;
    if (tmp != NULL)
    {
        while (tmp != NULL)
        {
            printf("%d ->", tmp->item);
            tmp = tmp->next;
        }
        printf("Null\n");
    }
    else
        printf("Stack is empty\n");
}

Example 3

// Queue implementation using pointers
#include <stdio.h>
#include <stdlib.h>    /* for malloc() and free() */
#define NewNode (Node *)malloc(sizeof(Node))

typedef struct node
{
    int item;
    struct node *next;
} Node;

Node *Insert(Node *f, Node *r);
Node *Delete(Node *f, Node *r);
void Display(Node *r, Node *f);

int main(void)
{
    Node *front = NULL, *rear = NULL;
    int ch, flag = 0;
    do
    {
        printf("Enter Your Choice (1: Insert, 2: Delete, 0: Exit)\n");
        scanf("%d", &ch);
        switch (ch)
        {
        case 0:
            break;
        case 1:
            rear = Insert(front, rear);
            if (flag == 0)     /* first item: front and rear are the same node */
            {
                front = rear;
                flag = 1;
            }
            break;
        case 2:
            front = Delete(front, rear);
            if (front == NULL)
            {
                rear = NULL;
                flag = 0;
            }
            break;
        default:
            printf("Invalid Choice\n");
            break;
        }
    } while (ch != 0);
    return 0;
}

/* The list is linked from the rear node towards the front node. */
Node *Insert(Node *f, Node *r)
{
    Node *tmp = NewNode;
    scanf("%d", &tmp->item);
    tmp->next = NULL;
    if (r == NULL)
    {
        r = tmp;
        f = tmp;
    }
    else
    {
        tmp->next = r;
        r = tmp;
        Display(r, f);
    }
    return (r);
}

Node *Delete(Node *f, Node *r)
{
    Node *ptr;
    if (f == NULL)
        printf("Empty Queue\n");
    else if (f == r)
    {
        printf("Element Deleted: %d\n", f->item);
        free(r);
        f = r = NULL;
    }
    else
    {
        ptr = r;
        while (ptr->next->next != NULL)    /* find the node just before front */
            ptr = ptr->next;
        ptr->next = NULL;
        printf("Element Deleted: %d\n", f->item);
        free(f);
        f = ptr;
    }
    Display(r, f);
    return (f);
}

void Display(Node *r, Node *f)
{
    Node *ptr = r;
    if (ptr == NULL)
        printf("Queue Empty\n");
    else
    {
        while (ptr != f)
        {
            printf("%d ->", ptr->item);
            ptr = ptr->next;
        }
        printf("%d ->", ptr->item);
        printf("\n");
    }
}

Summary

A queue is a linear list for which all insertions are made at one end of the list; all deletions (and usually all accesses) are made at the other end.
(This discipline is known as first in, first out, or FIFO.) This unit discussed the various types of queues (ordinary queue, double-ended queue, circular queue and priority queue) and their implementation in C, covering operations such as insertion, deletion and display.

Terminal Questions

1. Define Queue with neat diagram.
2. Explain the different types of Queues.
3. Discuss the Insert/Delete operation from the rear and front end of the queue.
4. Illustrate the 'C' program to implement an ordinary queue.
5. Write note on: Double ended queue (Deque)

Unit 5 Linked Lists

This unit covers the introduction of lists, the linear list, the linked list, typical basic linked-list operations, the singly-linked list and the circular singly linked list with their operations, and the doubly linked list with its operations. Various list operations are illustrated using C.

Introduction

A list is an aggregate of data in a useful order, for example the numbers 1, 2, 3, 4, 5. Simple data structures such as arrays use sequential mapping, which has the property that successive nodes of the data object are stored a fixed distance apart. These sequential storage schemes prove adequate for the functions one usually wishes to perform (access to an arbitrary node in a table, insertion or deletion of nodes within a stack or queue). The sequential property of a linear list is basic to its definition and use. The simplest linear list structure, the array, is found in almost every programming language.

Functions defined to operate on a list:
· insert : insert a new entry into a list
· delete : delete an entry from a list
· length : compute the length of a list
· next : return the next element in a list
· search : search whether an element is in a list

Objectives

At the end of this unit, you will be able to understand the:
· Introduction of Linear list
· Different types of Linked List and its Operations.
· Implementation of Singly-Linked Lists in C
· Implementation of Circular singly linked list in C
· Implementation of Doubly linked lists in C

Linear list

A linear list is a sequence of n >= 0 nodes x[1], x[2], x[3], ..., x[n] whose essential structural properties involve only the relative positions between items as they appear in a line. The only things we care about in such structures are the facts that, if n > 0, x[1] is the first node and x[n] is the last one, and that if 1 < k < n, the kth node x[k] is preceded by x[k-1] and followed by x[k+1].

Linear lists can be divided into two categories: general lists and restricted lists. In a restricted list, data can only be added or deleted at the ends of the structure, and processing is restricted to operations at the ends of the list. Two restricted list structures are the Last In First Out (LIFO) stack and the First In First Out (FIFO) queue.

There are four operations performed on linear lists:
1. Insertion
2. Deletion
3. Retrieval
4. Traversal

The first three apply to all lists. List traversal is not applicable to restricted lists.

Depending on the type of linear list, an insertion can be made at the beginning of the list or at the end of the list. While there are no restrictions on inserting data into a random list, computer algorithms generally insert data at the end of the list. Random lists are used in only a few applications: data gathering, and applications that require randomness, such as simulation studies or games. When inserting data into an ordered list, the data must be inserted so that the ordering is maintained. This may require inserting at the beginning, in the middle, or at the end of the list. When the insertion falls in the middle, a search algorithm is used to find the position.

Deletion from a general list requires that the list be searched for the data to be deleted. Any sequential search can be used to locate the data. Once the data is located, it is removed from the list.
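The ordered insertion and search-based deletion described above can be sketched on an array-backed list. This is only an illustrative sketch: the function names ordered_insert and delete_value, and the choice of an int array with a separate element count, are assumptions made for the example and do not come from the unit.

```c
#include <stdio.h>

/* Insert value into the ordered list held in list[0 .. *n-1].
   Returns 1 on success, 0 if the array is already full. */
int ordered_insert(int list[], int *n, int capacity, int value)
{
    int pos = 0, i;
    if (*n >= capacity)
        return 0;
    /* sequential search for the first element greater than value */
    while (pos < *n && list[pos] <= value)
        pos++;
    /* shift the tail one slot to the right to open a gap at pos */
    for (i = *n; i > pos; i--)
        list[i] = list[i - 1];
    list[pos] = value;
    (*n)++;
    return 1;
}

/* Delete the first occurrence of value.
   Returns 1 if it was found and removed, 0 otherwise. */
int delete_value(int list[], int *n, int value)
{
    int pos = 0, i;
    /* sequential search for the data to be deleted */
    while (pos < *n && list[pos] != value)
        pos++;
    if (pos == *n)
        return 0;
    /* close the gap left by the removed element */
    for (i = pos; i < *n - 1; i++)
        list[i] = list[i + 1];
    (*n)--;
    return 1;
}
```

Both functions shift elements, so each costs time proportional to the length of the list; that shifting is what the linked representations later in this unit avoid.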
List retrieval requires that data be located in a list and presented to the calling module without changing the contents of the list. List traversal is a special case of retrieval in which all the elements are retrieved in sequence. List traversal requires a looping algorithm rather than a search. Each execution of the loop processes one element in the list. The loop terminates when all elements have been processed.

Linked list

A linked list is a collection of records, called nodes, each containing at least one field (member) that gives the location of the next node in the list. In the simplest case, each node contains two members: a 'data member' (the value of the list item) and a 'link member' (a value locating the next node). The linked list is a very flexible dynamic data structure. It is a low-level structure upon which high-level data structures can be built.

Typical basic linked-list operations

• Create : Makes a new linked list.
• Insert : Puts a new node in its place in the list. (It can be coded to take place at the beginning of the list, at the end of the list, or at a position found by seeking a target value.)
• Remove : Removes a node from the list.
• Traverse : Allows the user to visit each node in the list. (The purpose of the visit is defined by the list user, and it is certain to vary from application to application.)
• isEmpty : Returns a true/false indication of whether or not there are any nodes in the list.
• isFull : Returns a true/false indication of whether or not the list is full. (When the data structure is static, this operation may perform a test. In the typical dynamic list, it simply returns false.)

Singly-Linked Lists

The singly-linked list is the most basic of all the linked data structures. A singly-linked list is simply a sequence of dynamically allocated objects, each of which refers to its successor in the list. Despite this obvious simplicity there are myriad implementation variations.
The figure shows several of the most common singly-linked list variants. The basic singly-linked list is shown in Figure (a). Each element of the list refers to its successor, and the last element contains the null reference. One variable, labeled head in Figure (a), is used to keep track of the list.

Figure: Singly linked list variations

The basic singly-linked list is inefficient in those cases when we wish to add elements to both ends of the list. While it is easy to add elements at the head of the list, to add elements at the other end (the tail) we need to locate the last element. If the basic singly-linked list is used, the entire list needs to be traversed in order to find its tail.

Figure (b) shows a way to make adding elements to the tail of a list more efficient. The solution uses a second variable, tail, which refers to the last element of the list. Of course, this time efficiency comes at the cost of the additional space used to store the variable tail.

The singly-linked list labeled (c) in the figure illustrates two common programming tricks. First, there is an extra element at the head of the list called a sentinel. This element is never used to hold data and it is always present. The principal advantage of using a sentinel is that it simplifies the programming of certain operations. For example, since there is always a sentinel standing guard, we never need to modify the head variable. The disadvantage of a sentinel such as that shown in (c) is that extra space is required, and the sentinel needs to be created when the list is initialized.

Second, the list (c) is also a circular list. Instead of using a null reference to mark the end of the list, the last element of the list refers to the sentinel. The advantage of this programming trick is that insertion at the head of the list, insertion at the tail of the list, and insertion at an arbitrary position of the list are all identical operations.
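To make the last point concrete, here is a minimal sketch of variant (c), a circular list with a sentinel. The names Cell, list_create and insert_after are assumptions chosen for this illustration; they are not taken from the unit.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct cell {
    int item;
    struct cell *next;
} Cell;

/* Create an empty list: a lone sentinel whose link refers to itself.
   Returns NULL if memory cannot be allocated. */
Cell *list_create(void)
{
    Cell *sentinel = malloc(sizeof *sentinel);
    if (sentinel != NULL) {
        sentinel->item = 0;          /* never used to hold data */
        sentinel->next = sentinel;
    }
    return sentinel;
}

/* Insert item immediately after the cell pos.
   Returns the new cell, or NULL if memory cannot be allocated. */
Cell *insert_after(Cell *pos, int item)
{
    Cell *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->item = item;
    c->next = pos->next;
    pos->next = c;
    return c;
}
```

With this one function, inserting at the head is insert_after(sentinel, x), inserting at the tail is insert_after(tail, x) for a caller that keeps track of the last cell, and inserting in the middle is insert_after(p, x) for any cell p. No case needs to test for an empty list or update a head variable.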
Of course, it is also possible to make a circular, singly-linked list that does not use a sentinel. Figure (d) shows a variation in which a single variable is used to keep track of the list, but this time the variable, tail, refers to the last element of the list. Since the list is circular, the first element follows the last element of the list. Therefore, it is relatively simple to insert both at the head and at the tail of this list. This variation minimizes the storage required, at the expense of a little extra time for certain operations.

The next figure illustrates how the empty list (i.e., the list containing no list elements) is represented for each of the variations. Notice that the sentinel is always present in list variant (c). On the other hand, in the list variants which do not use a sentinel, the null reference is used to indicate the empty list.

Figure: Empty singly-linked lists

In the following sections, we will present the implementation details of a generic singly-linked list. We have chosen to present variation (b), the one which uses a head and a tail, since it supports append and prepend operations efficiently.

Example 1 : Illustrate the Singly Linked list operations using Pointers.

    // Singly Linked List using Pointers
    // Note: <conio.h> and <alloc.h> are Turbo C headers; on a standard compiler use <stdlib.h> for malloc/free.
    #include <stdio.h>
    #include <conio.h>
    #include <alloc.h>
    #define NewNode (Node *)malloc(sizeof(Node))

    typedef struct node {
        int item;
        struct node *next;
    } Node;

    Node *Create(Node *);
    void Display(Node *);
    int Count(Node *);
    Node *Insert(Node *);
    Node *Delete(Node *);
    void Search(Node *);

    void main()
    {
        Node *start = NULL;
        int ch;
        start = Create(start);
        Display(start);
        do {
            printf("\n Singly Linked List Operations \n");
            printf("1-> Count\n");
            printf("2-> Display\n");
            printf("3-> Insert\n");
            printf("4-> Delete\n");
            printf("5-> Search\n");
            printf("\nEnter your choice \n");
            scanf("%d", &ch);
            switch (ch) {
            case 1:
                printf("\nNo of Nodes = %d\n", Count(start));
                break;
            case 2:
                Display(start);
                break;
            case 3:
                start = Insert(start);
                break;
            case 4:
                start = Delete(start);
                break;
            case 5:
                Search(start);
                break;
            default:
                printf("Invalid Selection \n");
                break;
            }
        } while (ch != 0);
    }

    // Reads elements until -99 is entered and appends each one to the list.
    Node *Create(Node *s)
    {
        Node *tmp = NULL, *t1 = NULL;
        int num;
        t1 = s;
        do {
            printf("\nEnter the element\n");
            scanf("%d", &num);
            if (num != -99) {
                tmp = NewNode;
                tmp->item = num;
                tmp->next = NULL;
                if (s == NULL)
                    s = t1 = tmp;
                else {
                    t1->next = tmp;
                    t1 = t1->next;
                }
            } else
                printf("Linked List Created Successfully \n");
        } while (num != -99);
        return (s);
    }

    void Display(Node *s)
    {
        if (s == NULL)
            printf("Empty Linked List \n");
        else {
            while (s != NULL) {
                printf("%5d", s->item);
                s = s->next;
            }
            printf("\n");
        }
    }

    int Count(Node *s)
    {
        int total = 0;
        while (s != NULL) {
            total++;
            s = s->next;
        }
        return (total);
    }

    // Inserts a new node at position pos (1 = front, Count+1 = end).
    Node *Insert(Node *s)
    {
        Node *t1 = NULL, *tmp = NewNode;
        int pos;
        printf("Enter the position to be inserted \n");
        scanf("%d", &pos);
        if (pos > 0 && pos <= Count(s) + 1) {
            printf("Enter the element to be inserted \n");
            scanf("%d", &tmp->item);
            if (pos == 1) {
                tmp->next = s;
                s = tmp;
            } else {
                t1 = s;
                while (pos > 2) {      // stop at the node before position pos
                    t1 = t1->next;
                    pos--;
                }
                tmp->next = t1->next;
                t1->next = tmp;
            }
        } else {
            printf("Invalid position \n");
            free(tmp);                 // the new node is not used
        }
        return (s);
    }

    // Deletes the node at position pos (1 = front).
    Node *Delete(Node *s)
    {
        Node *t1 = s, *tmp = NULL;
        int pos;
        printf("Enter the position to be deleted \n");
        scanf("%d", &pos);
        if (pos > 0 && pos <= Count(s)) {
            if (pos == 1) {
                s = s->next;
                free(t1);
            } else {
                while (pos > 2) {      // stop at the node before position pos
                    t1 = t1->next;
                    pos--;
                }
                tmp = t1->next;
                t1->next = tmp->next;
                free(tmp);
            }
        } else
            printf("Invalid Position \n");
        return (s);
    }

    void Search(Node *s)
    {
        int ele;
        int flag = 0;
        printf("Enter the element to be searched\n");
        scanf("%d", &ele);
        if (s != NULL) {
            while (s != NULL) {
                if (s->item == ele) {
                    printf("\n%d is present \n", ele);
                    flag = 1;
                }
                s = s->next;
            }
            if (flag == 0)
                printf("Element Not Found \n");
        } else
            printf("List is empty, key element can't be searched\n");
    }

Self Assessment Questions :

1. Discuss the linked list with its operation.
2. Explain the basic singly linked list with neat diagram.

Circular singly linked list

In the linked lists discussed in earlier sections, the link field of the last node contained a NULL pointer. In this section, we discuss linear lists again with a slight modification: the address of the first node is stored in the link field of the last node. The resulting list is called a circular singly linked list or a circular list. The pictorial representation of a circular list is shown in the following figure.

Circular lists have some advantages over singly linked lists that are not circular. In a singly linked list, given the address of any node x, only those nodes which follow x are reachable; the nodes that precede x are not. To reach the nodes that precede x, it is necessary to preserve a pointer variable, say first, which contains the address of the first node before traversing. But in a circular list, if the address of any node x is known, one can traverse the entire list from that node, and so all nodes are reachable. So, in a circular linked list, the search for the predecessor of the node x can be initiated from x itself, and there is no need for a pointer variable that points to the first node of the list.

The disadvantage of circular lists is that we may end up in an infinite loop unless proper care is taken to detect the end of the list.
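The end-of-list test mentioned above can be sketched as follows. Because no link in a circular list is NULL, a traversal stops when it arrives back at the node where it started. The struct layout (info and link fields) follows the one used later in this unit; the function names count_from and predecessor are assumptions made for this example.

```c
#include <stdio.h>

struct node {
    int info;
    struct node *link;
};

/* Count the nodes of a circular list, starting from any node x.
   The loop ends when the traversal returns to x. */
int count_from(const struct node *x)
{
    const struct node *p = x;
    int n = 0;
    if (x == NULL)
        return 0;
    do {
        n++;
        p = p->link;
    } while (p != x);
    return n;
}

/* Find the predecessor of x by starting the search from x itself. */
struct node *predecessor(struct node *x)
{
    struct node *p = x;
    if (x == NULL)
        return NULL;
    while (p->link != x)
        p = p->link;
    return p;
}
```

Comparing against NULL instead of against the starting node is the usual way to fall into the infinite loop the text warns about.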
Since the list is circular, any node can be considered as the first node and its predecessor as the last node. The following two conventions can be used:

1. A pointer variable first can be used to designate the starting point of the list. Using this approach, to get the address of the last node, the entire list has to be traversed from the first node.
2. In the second technique, a pointer variable last can be used to designate the last node, and the node that follows last can be designated as the first node of the list.

We use the second approach in our discussions for convenience.

Note: Whatever operations are possible using singly linked lists can also be performed using circular lists.

A circular list can be used as a stack, queue or deque. The basic operations required to implement these data structures are insert_front, insert_rear, delete_front, delete_rear and display. Let us implement these functions one by one.

Insert a node at the front end

Consider the list shown in the following fig. (a). The list contains 4 nodes, and a pointer variable last contains the address of the last node.

Step 1: To insert an item 50 at the front of the list, obtain a free node temp from the availability list and store the item in the info field, as shown in dotted lines in fig. (b). This can be accomplished using the following statements:

    temp = getnode();
    temp->info = item;

Step 2: Copy the address of the first node (i.e., last->link) into the link field of the newly obtained node temp. The statement to accomplish this task is:

    temp->link = last->link;

Step 3: Establish a link between the newly created node temp and the last node. This is achieved by copying the address of the node temp into the link field of the last node. The corresponding code is:

    last->link = temp;

Now the item is successfully inserted at the front of the list. All these steps have been designed by assuming that the list already exists.
If the list is empty, make temp itself the last node and establish a link between the first node and the last node. Repeatedly insert items using the above procedure to create a list. The C function to insert an item at the front of the circular linked list is shown in the example below.

Example 1: Function to insert an item at the front end of the list.

    NODE insert_front(int item, NODE last)
    {
        NODE temp;
        temp = getnode();               /* Create a new node to be inserted */
        temp->info = item;
        if (last == NULL)               /* Empty list: temp is the only node */
            last = temp;
        else                            /* Insert at the front end */
            temp->link = last->link;
        last->link = temp;              /* Link last node to the new first node */
        return last;                    /* Return the last node */
    }

Insert a node at the rear end

Consider the list shown in fig. (a) below. This list contains 4 nodes, and last is a pointer variable that contains the address of the last node.

Figure to insert at the rear end

Let us insert the item 80 at the end of this list. After successfully inserting 80, the list shown in fig. (c) is obtained. The steps to insert an item at the rear end of the list are as follows.

Step 1: Obtain a free node temp from the availability list and store the item in the info field, as shown in dotted lines in fig. (b). This can be accomplished using the following statements:

    temp = getnode();
    temp->info = item;

Step 2: Copy the address of the first node (i.e., last->link) into the link field of the newly obtained node temp. The statement to accomplish this task is:

    temp->link = last->link;

Step 3: Establish a link between the newly created node temp and the last node. This is achieved by copying the address of the node temp into the link field of the last node. The corresponding code is:

    last->link = temp;

Step 4: The new node is made the last node using the statement:

    return temp;                        /* make new node as the last node */

These steps have been designed by assuming that the list already exists.
If the list is empty, make temp itself the first node as well as the last node. The C function for this is shown in the example below.

Example : Function to insert an item at the rear end of the list

    NODE insert_rear(int item, NODE last)
    {
        NODE temp;
        temp = getnode();               /* Create a new node to be inserted */
        temp->info = item;
        if (last == NULL)               /* Empty list: temp is the only node */
            last = temp;
        else                            /* Insert at the rear end */
            temp->link = last->link;
        last->link = temp;              /* Link old last node to the new node */
        return temp;                    /* Make the new node the last node */
    }

Note: Compare the functions insert_front() and insert_rear(). All statements in both functions are the same, except that insert_front() returns the address of the last node and insert_rear() returns the address of the new node.

Delete a node from the front end

Consider the list shown in the figure below. This list contains 5 nodes, and last is a pointer variable that contains the address of the last node.

Figure to delete an item from the front end

To delete the front node (see the sequence of numbers 1, 2, 3 shown in the above fig.), the steps to be followed are:

Step 1:
    first = last->link;                 /* obtain address of the first node */
Step 2:
    last->link = first->link;           /* link the last node and new first node */
Step 3:
    printf("The item deleted is %d\n", first->info);
    freenode(first);                    /* delete the old first node */

Now the node identified as the first node is deleted. These steps have been designed by assuming that the list already exists. If there is only one node, delete that node and assign NULL to last, indicating that the list is empty. The corresponding code can be:

    /* If there is only one node, delete it */
    if (last->link == last) {
        printf("The item deleted is %d\n", last->info);
        freenode(last);
        last = NULL;
        return last;
    }

All the steps designed so far are to be executed only if the list is not empty. If the list is empty, display the appropriate message.
The complete code to delete an item from the front end is shown in the example below.

Example : Function to delete an item from the front end

    NODE delete_front(NODE last)
    {
        NODE first;
        if (last == NULL) {
            printf("List is empty\n");
            return NULL;
        }
        if (last->link == last) {       /* Only one node is present */
            printf("The item deleted is %d\n", last->info);
            freenode(last);
            return NULL;
        }
        /* List contains more than one node */
        first = last->link;             /* obtain node to be deleted */
        last->link = first->link;       /* store new first node in link of last */
        printf("The item deleted is %d\n", first->info);
        freenode(first);                /* delete the old first node */
        return last;
    }

Delete a node from the rear end

To delete a node from the rear end, it is required to obtain the address of the predecessor of the node to be deleted. Consider the list shown in the fig., where the pointer variable last contains the address of the last node.

Figure to delete a node from the rear end

To delete the node pointed to by last, the steps to be followed are:

Step 1: Obtain the address of the predecessor of the node to be deleted. This can be accomplished by traversing from the first node until the link field of a node contains the address of the last node. The corresponding code is:

    prev = last->link;
    while (prev->link != last) {
        prev = prev->link;
    }

Step 2: The first node and the last but one node (i.e., prev) are linked. This can be accomplished using the statement:

    prev->link = last->link;

Step 3: The last node can be deleted using the statement:

    freenode(last);

After executing these statements, return the node pointed to by prev as the new last node using the statement:

    return (prev);

All these steps have been designed by assuming that the list already exists. If there is only one node, delete that node and assign NULL to the pointer variable last, indicating that the list is empty. If the list is empty at the beginning, display the appropriate message.
The C function to delete an item from the rear end of a circular list is shown in the example below.

Example : Function to delete an item from the rear end

    NODE delete_rear(NODE last)
    {
        NODE prev;
        if (last == NULL) {
            printf("List is empty\n");
            return NULL;
        }
        if (last->link == last) {       /* Only one node is present */
            printf("The item deleted is %d\n", last->info);
            freenode(last);
            return NULL;
        }
        /* Obtain address of previous node */
        prev = last->link;
        while (prev->link != last) {
            prev = prev->link;
        }
        prev->link = last->link;        /* prev node is made the last node */
        printf("The item deleted is %d\n", last->info);
        freenode(last);                 /* delete the old last node */
        return prev;                    /* return the new last node */
    }

The C function to display the contents of a circular list is shown in the example below; the reader should work through how the function operates.

Example: Function to display the contents of the circular list

    void display(NODE last)
    {
        NODE temp;
        if (last == NULL) {
            printf("List is empty\n");
            return;
        }
        printf("Contents of the list is\n");
        /* start at the first node and stop just before the last node */
        for (temp = last->link; temp != last; temp = temp->link)
            printf("%d ", temp->info);
        printf("%d\n", temp->info);     /* print the last node */
    }

The C program to implement a deque using a circular linked list is shown in the example below.

Example : Program to implement deque using circular list

    /* Note: <alloc.h> and <process.h> are Turbo C headers; on a standard
       compiler <stdlib.h> declares malloc, free and exit. */
    #include <stdio.h>
    #include <alloc.h>
    #include <process.h>

    struct node {
        int info;
        struct node *link;
    };
    typedef struct node *NODE;

    /* function to get a node from the availability list */
    /* function to return node to availability list */
    /* function to insert an item at the front end */
    /* function to insert an item at the rear end */
    /* function to delete an item from the front end */
    /* function to delete an item from the rear end */
    /* function to display the contents of the list */

    void main()
    {
        NODE last;
        int choice, item;
        last = NULL;
        for (;;) {
            printf("1: Insert_Front 2: Insert_Rear\n");
            printf("3: Delete_Front 4: Delete_Rear\n");
            printf("5: Display 6: Exit\n");
            printf("Enter the choice\n");
            scanf("%d", &choice);
            switch (choice) {
            case 1:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                last = insert_front(item, last);
                break;
            case 2:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                last = insert_rear(item, last);
                break;
            case 3:
                last = delete_front(last);
                break;
            case 4:
                last = delete_rear(last);
                break;
            case 5:
                display(last);
                break;
            default:
                exit(0);
            }
        }
    }

Note: Some of the disadvantages of singly linked lists and circular lists are:

1. Using singly linked lists and circular lists, it is not possible to traverse the list backwards.
2. To find the predecessor of a node in a singly linked list, the list must be traversed from the first node. In a circular list, the predecessor can be obtained by traversing the whole list from the node specified. For example, if the position of the current node is 15, to find the position of node 14, the whole list has to be traversed.

All these disadvantages can be overcome by using doubly linked lists.

Self Assessment Questions :

1. Explain the Circular singly linked list with neat diagram.
2. Explain the steps to insert a node at the rear end in circular singly linked list.
3. Explain the steps to delete the node pointed to by last.

Doubly linked lists

Figure: Doubly linked lists

In singly linked lists (including circular lists), each node contains the address of the next node. If there is one more field which contains the address of the previous node, it is possible to obtain the address of the predecessor of a specified node directly. Using such lists, it is possible to traverse the list in both forward and backward directions. A list where both forward and backward traversal is possible must have two link fields. The link field which contains the address of the left node is called the left link and is denoted by llink; the link field which contains the address of the right node is called the right link and is denoted by rlink.
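A node with the two link fields just described, and traversal in both directions, can be sketched as below. The struct name dnode and the functions walk_forward and walk_backward are assumptions made for this illustration, and the sketch further assumes that the two ends of the list are marked by NULL links.

```c
#include <stddef.h>

struct dnode {
    int info;
    struct dnode *llink;   /* address of the left (previous) node */
    struct dnode *rlink;   /* address of the right (next) node */
};

/* Copy up to max items into out[], walking right from the first node.
   Returns the number of items copied. */
int walk_forward(const struct dnode *first, int out[], int max)
{
    const struct dnode *p;
    int n = 0;
    for (p = first; p != NULL && n < max; p = p->rlink)
        out[n++] = p->info;
    return n;
}

/* Copy up to max items into out[], walking left from the last node.
   Returns the number of items copied. */
int walk_backward(const struct dnode *last, int out[], int max)
{
    const struct dnode *p;
    int n = 0;
    for (p = last; p != NULL && n < max; p = p->llink)
        out[n++] = p->info;
    return n;
}
```

With llink available, the predecessor of a node p is simply p->llink, a single step rather than a traversal.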
Such a list, where each node has two links, is called a doubly linked list or a two-way list. The list can be traversed in the forward direction from the first node, whose address is stored in a pointer variable first, to the last node. It can also be traversed in the backward direction from the last node, whose address is stored in a pointer variable last, to the first node. The pictorial representation of a doubly linked list is shown in the above figure.

In the doubly linked list shown in figure (a), the left link of the leftmost node and the right link of the rightmost node are NULL. The list shown in fig. (b) is a doubly linked circular list. In this list, the left link of the leftmost node contains the address of the rightmost node, and the right link of the rightmost node contains the address of the leftmost node. The list shown in fig. (c) is a doubly linked circular list with a header node. In this type of list, the left link of the header node contains the address of the last node, and the right link of the last node contains the address of the header node. An empty list with a header node can be represented as shown in fig. (d), where the left link and right link of the header node point to the header node itself.

All problems that can be solved using singly linked lists can also be solved using doubly linked lists. It is left to the reader to implement all the problems solved so far using doubly linked lists and doubly linked circular lists. In what follows, we implement the operations using doubly linked lists with a header node. Using a header node, problems can be solved very easily and effectively.

Insert a node at the front end

Consider the list shown in the fig. below. To insert a node pointed to by temp at the front of the list, the address of the first node, pointed to by cur, should be known. This can be accomplished by using the statement

    cur = head->rlink;

The node pointed to by temp can then be easily inserted between the header node and the node pointed to by cur (follow the dotted lines).
This can be accomplished by using the following statements:

    head->rlink = temp;
    temp->llink = head;
    temp->rlink = cur;
    cur->llink = temp;

Figure to insert a node at the front end

The C function to insert at the front of a doubly linked circular list with a header node is shown in the example below.

Example : Function to insert a node at the front end of the list

    NODE insert_front(int item, NODE head)
    {
        NODE temp, cur;
        /* Node to be inserted */
        temp = getnode();
        temp->info = item;
        /* Obtain address of first node */
        cur = head->rlink;
        /* Insert between header node and first node */
        head->rlink = temp;
        temp->llink = head;
        temp->rlink = cur;
        cur->llink = temp;
        /* Return the header node */
        return head;
    }

Insert a node at the rear end

To insert a node at the rear end of the list, consider the list shown in the fig. and try to write the corresponding code. After writing the code, compare it with the function insert_front(). Note that the two functions are the same except that llink and rlink have been exchanged. This shows how much the header node simplifies the code. The C function to insert an item at the rear end of the list is shown in the example.

Figure to insert a node at the rear end

Example : Function to insert an item at the rear end

    NODE insert_rear(int item, NODE head)
    {
        NODE temp, cur;
        /* Node to be inserted */
        temp = getnode();
        temp->info = item;
        /* Obtain address of the last node */
        cur = head->llink;
        /* Insert at the end of the list */
        head->llink = temp;
        temp->rlink = head;
        temp->llink = cur;
        cur->rlink = temp;
        /* Return the header node */
        return head;
    }

Delete a node from the front end

Consider the list shown in below fig.
Find the address of the first node, which is the node to be deleted, using the statement:

    cur = head->rlink;

Also obtain the successor of the node to be deleted using the statement:

    next = cur->rlink;

Figure to delete a node at the front end

Once the addresses of the node to be deleted, its predecessor and its successor are known, the node at the front end can be deleted by following the sequence of numbers shown in the above fig. The steps to be followed are shown below.

Step 1, 2: Establish a link between the header node and the successor of the node to be deleted, i.e., next, in both directions. This can be accomplished using the statements:

    head->rlink = next;
    next->llink = head;

Step 3: Once these statements are executed, the node to be deleted, i.e., cur, is isolated and can be deleted. The corresponding statements are:

    printf("The item deleted is %d\n", cur->info);
    freenode(cur);

Finally, return the address of the header node. Note that all these steps have been designed by assuming that the list already exists. If the list is empty, display the appropriate message. The C function to delete an item from the front end of the list is shown in the example below.

Example: Function to delete a node from the front end

    NODE delete_front(NODE head)
    {
        NODE cur, next;
        if (head->rlink == head) {
            printf("Deque is empty\n");
            return head;
        }
        cur = head->rlink;              /* first node is known */
        next = cur->rlink;              /* second node, which will be the first node */
        /* adjust pointers and delete the node */
        head->rlink = next;
        next->llink = head;
        printf("The node to be deleted is %d\n", cur->info);
        freenode(cur);
        return head;
    }

Example : Program to implement insert at front/rear/before/after a node, to delete front/rear/based on item/all nodes etc.
    /* Note: <alloc.h> and <process.h> are Turbo C headers; on a standard
       compiler <stdlib.h> declares malloc, free and exit. */
    #include <stdio.h>
    #include <alloc.h>
    #include <process.h>

    struct node {
        int info;
        struct node *llink;
        struct node *rlink;
    };
    typedef struct node *NODE;

    /* function to insert at the front end */
    /* function to insert at the rear end */
    /* function to delete the first node */

    /* function to display the contents of the list */
    void display(NODE head)
    {
        NODE temp;
        if (head->rlink == head) {
            printf("Deque is empty\n");
            return;
        }
        printf("Contents of the deque is\n");
        for (temp = head->rlink; temp != head; temp = temp->rlink)
            printf("%d ", temp->info);
        printf("\n");
    }

    void main()
    {
        NODE head;
        int choice, item;
        head = getnode();               /* header node of the empty list */
        head->rlink = head;
        head->llink = head;
        for (;;) {
            printf("1: Insert_front 2: Insert_rear\n");
            printf("3: Delete_front 4: Delete_rear\n");
            printf("5: Display ");
            printf("6: Exit\n");
            printf("Enter the choice\n");
            scanf("%d", &choice);
            switch (choice) {
            case 1:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                head = insert_front(item, head);
                break;
            case 2:
                printf("Enter the item to be inserted\n");
                scanf("%d", &item);
                head = insert_rear(item, head);
                break;
            case 3:
                head = delete_front(head);
                break;
            case 4:
                head = delete_rear(head);
                break;
            case 5:
                display(head);
                break;
            default:
                exit(0);
            }
        }
    }

Self Assessment Questions:

1. Discuss the doubly linked list with neat diagram.
2. Write C function to insert a node at the front end of the list.
3. Write the steps to delete a node from the front end with neat diagram.

Summary

This unit discussed the various types of lists and their operations. The sequential property of a linear list is basic to its definition and use. The simplest linear list structure, the array, is found in almost every programming language. The linked list is a very flexible dynamic data structure. It is a low-level structure upon which high-level data structures can be built. The unit also discussed the implementation in C of the various linked list operations for the circular singly linked list and the doubly linked list.
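The doubly linked deque program in this unit calls getnode(), freenode() and delete_rear(), but their bodies are not listed. The following is one possible sketch, not the unit's own code: it assumes that getnode() simply wraps malloc and stops the program on failure, that freenode() wraps free, and that delete_rear() mirrors delete_front() with llink and rlink exchanged. The insert_rear() function is repeated from the unit so that the block is self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *llink;
    struct node *rlink;
};
typedef struct node *NODE;

/* Assumed helper: obtain a node from the heap, exit on failure. */
NODE getnode(void)
{
    NODE x = malloc(sizeof(struct node));
    if (x == NULL) {
        printf("Out of memory\n");
        exit(1);
    }
    return x;
}

/* Assumed helper: return a node to the heap. */
void freenode(NODE x)
{
    free(x);
}

/* Insert at the rear end (as given in the unit). */
NODE insert_rear(int item, NODE head)
{
    NODE temp = getnode();
    NODE cur = head->llink;        /* current last node */
    temp->info = item;
    head->llink = temp;
    temp->rlink = head;
    temp->llink = cur;
    cur->rlink = temp;
    return head;
}

/* Sketch: delete the last node of a circular list with a header node. */
NODE delete_rear(NODE head)
{
    NODE cur, prev;
    if (head->llink == head) {     /* only the header node is present */
        printf("Deque is empty\n");
        return head;
    }
    cur = head->llink;             /* last node, to be deleted */
    prev = cur->llink;             /* node that becomes the new last node */
    head->llink = prev;
    prev->rlink = head;
    printf("The item deleted is %d\n", cur->info);
    freenode(cur);
    return head;
}
```

Because the header node is always present, delete_rear() needs no special case for a list with a single data node: prev is then the header itself and both of its links end up pointing back to it.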
Terminal Questions
1. What is a List? Discuss the functions defined to operate on a List.
2. Explain the typical basic operations on a linked list.
3. Explain the singly linked list with a neat diagram.
4. Illustrate the 'C' program for singly linked list operations using pointers.
5. Write a note on the Circular Singly Linked List.
6. Explain how to insert/delete a node at the rear and front end of a circular singly linked list.
7. Write a 'C' program to implement a deque using a circular linked list.
8. Explain the doubly linked list with a neat diagram.
9. Explain the operations to insert and delete a node in a doubly linked list.

Unit 6 Trees

This unit covers the Introduction of Trees, Definitions, Binary tree, Storage representation of a Binary tree, Various operations on binary trees using linked representation, Binary Search tree [BST], and operations of BST.

Introduction
So far we have discussed data structures in which a linear ordering was maintained using arrays and linked lists. These data structures and their relationships are invariably expressed in a single dimension. For some problems it is not possible to maintain this linear ordering using linked lists. Using non-linear data structures such as trees, graphs and multi-linked structures, more complex relations can be expressed. In this unit on trees, we begin with some definitions and then cover the various operations that can be performed on trees, along with various applications.

Objectives
At the end of this unit, you will be able to understand the:
• Overview of Tree Concepts
• Binary Tree and its types
• Various operations on binary trees
• Binary Search Tree [BST]
• Tree operation implementation using C

Overview of Tree Concepts
A tree consists of a finite set of elements, called nodes, and a finite set of directed lines, called branches, that connect the nodes. The number of branches associated with a node is the degree of the node.
When the branch is directed towards the node, it is an indegree branch; when the branch is directed away from the node, it is an outdegree branch. The sum of the indegree and outdegree branches equals the degree of the node.

If the tree is not empty, then the first node is called the root. The indegree of the root is, by definition, zero. With the exception of the root, all the nodes in a tree must have an indegree of exactly one. All nodes in the tree can have zero, one or more branches leaving them; that is, they may have an outdegree of zero, one or more. Consider the tree shown in the figure:

Parent: A node is a parent if it has successor nodes, i.e., its outdegree is greater than zero. (Your parents)
Child: A node is a child node if its indegree is one. (You)
Siblings: Two or more nodes with the same parent are siblings. (Your brothers and sisters)

An ancestor is any node in the path from the root to the node. A descendant is any node on the path below the parent node; that is, all nodes in the paths from a given node to a leaf are descendants of the node. A tree may be divided into subtrees. A subtree is any connected structure below the root.

Elements of Tree
Siblings: Two nodes that have the same parent. In a binary tree the two children are called the left and the right. In the figure to the right, D and E are siblings. In this example m = 2, so the tree is a binary tree.
Internal Nodes: Nodes that are neither the root nor a leaf are called internal nodes.

Graph: A graph G = (V, E), where V is the set of vertices and E is the set of edges. In the graph shown in fig. 6.1.a,

    V = {1, 2, 3, 4}
    E = {(1,2), (1,3), (2,3), (3,4), (4,2)}

where E is represented as a set of ordered pairs (x, y); the pair (x, y) is in E if and only if there is an edge from vertex x to vertex y, with x as the initial node and y as the terminal node.

A graph in which every edge is directed is called a directed graph or digraph, as shown in fig. 6.1.a. A graph in which every edge is undirected is called an undirected graph and is shown in fig. 6.1.b. A graph is said to be a mixed graph if some of the edges are undirected and some of the edges are directed, as shown in fig. 6.1.c. If there is a maximum of one edge between a pair of nodes in the graph, the graph is called a simple graph. The graphs shown in fig. 6.1 are all simple graphs. In the graph shown in fig. 6.1.d, since vertex 4 is not adjacent to any other node, vertex 4 is called an isolated node.

The total number of edges leaving a node is called the outdegree of that node, and the number of edges incident on a node is called the indegree of that node. The sum of the outdegree and the indegree is the total degree of a node. The outdegree of node 2 in fig. 6.1.a is 1 and its indegree is 2. So, the total degree of node 2 is 3.

In a graph, if one vertex, say u, is reachable from vertex v and v is reachable from u, then the path from v to v through u, or from u to u through v, is called a circuit or a cycle. There is a cycle C = ((2,3), (3,4), (4,2)) in the graph shown in fig. 6.1.a. A simple digraph having no cycle is called an acyclic graph.

A directed tree is an acyclic digraph which has only one node with indegree 0, while all other nodes have indegree 1. The node with indegree 0 is called the root node, and all other nodes are reachable from the root node. The nodes that are reachable from a node u are called descendants of u. The nodes from which u is reachable, starting from the root node, are called ancestors of node u. The nodes that are reachable from u using only one edge are called children of node u, and node u is the father of all those children. All the nodes that are left descendants of a node u form the left subtree of u, and the right descendants form the right subtree of node u.
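The degree definitions above can be checked with a short sketch. It assumes the edge set E of fig. 6.1.a is stored as an array of (initial, terminal) pairs; the array name edges and the function names are illustrative and do not come from the text.

```c
#define NUM_EDGES 5

/* Directed edges of fig. 6.1.a, stored as (initial node, terminal node). */
static const int edges[NUM_EDGES][2] = {
    {1, 2}, {1, 3}, {2, 3}, {3, 4}, {4, 2}
};

/* Outdegree: number of edges leaving vertex v. */
int outdegree(int v)
{
    int i, count = 0;
    for (i = 0; i < NUM_EDGES; i++)
        if (edges[i][0] == v)
            count++;
    return count;
}

/* Indegree: number of edges incident on (arriving at) vertex v. */
int indegree(int v)
{
    int i, count = 0;
    for (i = 0; i < NUM_EDGES; i++)
        if (edges[i][1] == v)
            count++;
    return count;
}

/* Total degree: sum of outdegree and indegree. */
int total_degree(int v)
{
    return outdegree(v) + indegree(v);
}
```

For node 2 this gives an outdegree of 1, an indegree of 2 and a total degree of 3, matching the values stated for fig. 6.1.a.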
A node in a directed tree that has an outdegree of 0 is called a leaf node or a terminal node, i.e., a node with an empty left child and an empty right child is called a leaf node.

Figure 6.2 Binary trees

In the directed tree shown in figure 6.2.a, the ancestors of node 35 are 80, 60 and 100. The descendants of the node 60 are 80, 35, 30 and 40. The root node is 100 and the terminal nodes are 70, 35, 30 and 40. Note that the indegree of 100 is zero, all other nodes have an indegree of 1, and all the leaf nodes have an outdegree of 0. All non-leaves are called internal nodes, and leaf or terminal nodes are called external nodes.

The level of any node u is given by the number of edges in the path from the root node to u. So, the level of the root node is zero. In fig. 6.2.a, the node 35 is at level 3 and the node 100 is at level 0. The height or depth of a tree is one more than the maximum level in the tree. So, the height of the tree shown in fig. 6.2.a is 4, that of the tree shown in fig. 6.2.b is 1, and the height of an empty binary tree is zero.

Self Assessment Questions
1. Discuss the various Graphs with a neat diagram.
2. Define Directed tree.

Binary Tree
A binary tree is a directed tree in which the outdegree of each node is less than or equal to two, i.e., each node in the tree can have 0, 1 or 2 children. An empty tree is also a binary tree. A binary tree can also be defined recursively as follows. A binary tree is a finite set of nodes that is either empty or divided into three subsets with the following properties:
1. The first subset contains only one element, and it is called the root of the tree. If root contains NULL, it is called an empty binary tree.
2. The second subset is a binary tree called the left subtree. The left subtree can be empty.
3. The third subset is a binary tree called the right subtree. The right subtree can be empty.

The trees shown in fig. 6.2 are all binary trees.

Binary tree operations
A binary tree is a tree in which no node can have more than two subtrees. In other words, a node can have zero, one, or two subtrees.
In other words, a tree in which every parent has one or two children (but not more than that) is called a binary tree. The "root" component of a binary tree is the forefather of all children, but it can have only up to two children: one "right" child and one "left" child. These children can become fathers, and each can produce only two children. In fact, a child can be a son and also become a father, grandfather and great-grandfather. The figure shows five binary trees, all with three nodes. We can have binary trees with any number of nodes.

"A child cannot have more than one father. If it does, then the tree is not a binary tree. A binary tree node cannot have more than two subtrees."

6.3.1 Strictly binary tree
If the outdegree of every node in a tree is either 0 or 2, then the tree is said to be a strictly binary tree, i.e., each node has either two children or none (an empty left and an empty right child). The trees shown in fig. 6.2.e and fig. 6.2.f are strictly binary trees.

Example: A binary tree is said to be strictly binary if every non-leaf node has non-empty left and right subtrees. The figure shows a strictly binary tree.

Figure: A strictly binary tree

The binary tree in the above fig is not strictly binary because the non-leaf node E has no right subtree and the non-leaf node C has no left subtree.

6.3.2 Complete binary tree
A strictly binary tree in which the number of nodes at any level i is $2^i$ is said to be a complete binary tree. The tree shown in fig. 6.2.f is a strictly binary tree and at the same time a complete binary tree.

Example: In a complete binary tree,
    Number of nodes at level 0 is 1, i.e., $2^0$
    Number of nodes at level 1 is 2, i.e., $2^1$
    Number of nodes at level 2 is 4, i.e., $2^2$
    Number of nodes at level 3 is 8, i.e., $2^3$
    ...
    Number of nodes at the last level d is $2^d$

It is clear from the figure that all the nodes in the final level d are leaf nodes, and the total number of leaf nodes is $2^d$.
In a complete binary tree, we can easily find the number of non-leaf nodes. The total number of nodes in the complete binary tree is

    $2^0 + 2^1 + 2^2 + \dots + 2^d$

The sum of this geometric series is given by

    $S = \dfrac{a(r^n - 1)}{r - 1}$, where $a = 1$, $n = d + 1$ and $r = 2$

So, the total number of nodes is $n_t = 2^{d+1} - 1$. The nodes at level d are all leaf nodes. So, the number of non-leaf nodes is given by $2^{d+1} - 1 - 2^d$, which is equal to $2^d - 1$.

Almost complete binary tree
A tree of depth d is an almost complete binary tree if the tree is complete up to level d-1, i.e., the total number of nodes at level d-1 is $2^{d-1}$. The total number of nodes at level d may be equal to $2^d$. If the total number of nodes at level d is less than $2^d$, then the number of nodes at level d-1 should still be $2^{d-1}$, and in level d the nodes should be present only from left to right. The trees shown in fig. 6.3.a to fig. 6.3.c are all almost complete binary trees and the rest are not. The nodes in an almost complete binary tree can be numbered level by level from left to right, as shown in fig. 6.3.c.

Figure 6.3 Binary trees

Storage representation of a binary tree
Trees can be represented using the sequential allocation technique (using arrays) or by allocating the memory for a node dynamically (using the linked allocation technique). In the linked allocation technique, a node in a tree has three fields:
• info, which contains the actual information
• llink, which contains the address of the left subtree
• rlink, which contains the address of the right subtree

So, a node can be represented using a structure as shown below:

    struct node
    {
        int info;
        struct node *llink;
        struct node *rlink;
    };
    typedef struct node *NODE;

A pointer variable root can be used to point to the root node always. If the tree is empty, the pointer variable root points to NULL, indicating that the tree is empty.
The pointer variable root can be declared and initialized as

    NODE root = NULL;

Memory can be allocated or de-allocated using the functions getnode() and freenode(), as we discussed in linked lists in unit 9.

A tree can also be represented using an array, which is called the sequential representation (array representation). Consider the trees shown in fig. 6.4.a and fig. 6.4.b. The nodes are numbered sequentially from 0. The node at position 0 is considered the root node. Given the position i of any node, 2i+1 gives the position of its left child and 2i+2 gives the position of its right child. If i is the position of a left child, i+1 gives the position of the corresponding right child, and if i is the position of a right child, i-1 gives the position of the corresponding left child. Given the position i of any node other than the root, the position of its parent is given by (i-1)/2, using integer division. If i is odd, the node is a left child; otherwise, it is a right child.

The different ways in which a tree can be represented using an array are shown below.

• In the first representation, some of the locations may be used and some may not be used. For this, a flag field named used indicates whether a memory location is used to represent a node or not. If the flag field used is 0, the corresponding memory location is not used, which indicates the absence of a node at that position. So, each node contains two fields:
    info, where the information is stored
    used, which indicates the presence or the absence of a node
The structure declaration for this is:

    #define MAX_SIZE 200

    struct node
    {
        int info;
        int used;
    };
    typedef struct node NODE;

An array A of type NODE can be used to store different items, and the declaration for this is shown below:

    NODE A[MAX_SIZE];

• In an alternate representation, instead of using a separate flag field used to check the presence of a node, one can initialize each location in the array to 0, indicating that the node is not used. A non-zero value in a location indicates the presence of the node.

In section 6.7, a program is shown to create the tree and traverse it in different orders using this approach. In the next section, let us concentrate on how the linked allocation technique is used to create and manipulate the tree data structure, along with various applications.

Self Assessment Questions
1. Define Binary tree. Discuss its properties.
2. Define Strictly Binary Tree.
3. Explain the Complete Binary tree.
4. Explain tree storage representation using the sequential technique with a suitable example.

Various operations on binary trees using linked representation
Let us see and implement the various operations that can be performed on binary trees.

Insertion Operation
Suppose the node pointed to by temp, whose information field contains the item 'I', has to be inserted as shown in figure 6.5. Let 'd' be an array which contains only the directions describing where the node temp has to be inserted. If 'd' contains 'LRLR', then starting from the root node move first towards the left (L), then right (R), then left (L) and finally right (R). If the pointer then points to NULL, the node temp can be inserted at that position; otherwise, the node temp cannot be inserted. To achieve this, one has to start from the root node.
Let us use two pointers prev and cur, where prev always points to the parent node and cur points to the child node. Initially cur points to the root node and prev points to NULL. To start with, one can write the following statements:

    prev = NULL;
    cur = root;

Now keep moving the pointer cur towards the left if the direction is 'L'; otherwise, move it towards the right. The pointer variable prev always points to the parent node and cur points to the child node. Once all directions are used up, if cur points to NULL, insert the node temp towards the left or right of prev based on the last direction. Otherwise, display an error message. In the following function, an index variable i is used to access the directions. The C function to insert a node is shown in the example below.

Example: Function to insert an item into a tree

    /* needs <stdio.h>, <string.h> for strlen() and <ctype.h> for toupper() */
    NODE insert(int item, NODE root)
    {
        NODE temp;              /* Node to be inserted */
        NODE cur;               /* Child node */
        NODE prev;              /* Parent node */
        char direction[10];     /* Directions where the node has to be inserted */
        int i;                  /* Index into the directions */

        temp = getnode();       /* Obtain a node from the availability list */
        temp->info = item;      /* Copy the necessary information */
        temp->llink = temp->rlink = NULL;

        if (root == NULL) return temp;    /* Node is inserted for the first time */

        printf("Give the directions where you want to insert\n");
        scanf("%9s", direction);          /* at most 9 directions fit in the array */

        /* Convert the directions to upper case, one character at a time */
        for (i = 0; direction[i] != '\0'; i++)
            direction[i] = toupper((unsigned char)direction[i]);

        prev = NULL;
        cur = root;

        /* find the position to insert */
        for (i = 0; i < (int)strlen(direction) && cur != NULL; i++)
        {
            prev = cur;                   /* Parent */
            if (direction[i] == 'L')      /* If direction is (L) move towards left */
                cur = cur->llink;
            else                          /* Otherwise move towards right */
                cur = cur->rlink;
        }

        if (cur != NULL || i != (int)strlen(direction))
        {
            printf("Insertion not possible\n");
            freenode(temp);
            return root;
        }

        /* insert the node at the leaf level */
        if (direction[i-1] == 'L')
            prev->llink = temp;           /* Attach the node to the left of the parent */
        else
            prev->rlink = temp;           /* Attach the node to the right of the parent */

        return root;
    }

Traversals
Traversing is the most common operation that can be performed on trees. In a traversal, each node in the tree is processed or visited exactly once, systematically one after the other. The different traversal techniques are Inorder, Preorder and Postorder. Each traversal technique can be expressed recursively.

Algorithm for tree traversals:
• The preorder traversal of a binary tree can be recursively defined as follows:
    1. Process the root Node [N]
    2. Traverse the Left subtree in preorder [L]
    3. Traverse the Right subtree in preorder [R]
• The inorder traversal of a binary tree can be recursively defined as follows:
    1. Traverse the Left subtree in inorder [L]
    2. Process the root Node [N]
    3. Traverse the Right subtree in inorder [R]
• The postorder traversal of a binary tree can be recursively defined as follows:
    1. Traverse the Left subtree in postorder [L]
    2. Traverse the Right subtree in postorder [R]
    3. Process the root Node [N]

Self Assessment Questions:
1. List the various operations that can be performed on a Binary tree.
2. Write a C function to insert an item into a tree.
3. Explain the tree traversal techniques with a suitable example for each.
4. Write an algorithm for preorder traversal of a binary tree.

Binary Search Tree (BST)
A binary search tree is a binary tree in which, for each node x in the tree, the elements in the left subtree are less than info(x) and the elements in the right subtree are greater than or equal to info(x). Every node in the tree should satisfy this condition, if a left subtree or right subtree exists. Other common operations performed on binary search trees are:
• Insertion: an item is inserted
• Searching: search for a specific item in the tree
• Deletion: deleting a node from a given tree
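The recursive traversals described earlier in this unit can be sketched in C as follows. The sketch assumes the struct node layout used throughout the unit; make_node() is a hypothetical helper introduced only to build a small tree, and the functions collect the visited items into a caller-supplied array (an illustrative choice, so that the visiting order can be inspected) rather than printing them. Inorder traversal visits the left subtree, then the root, then the right subtree.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *llink;
    struct node *rlink;
};
typedef struct node *NODE;

/* Hypothetical helper: build a node with the given children. */
NODE make_node(int item, NODE left, NODE right)
{
    NODE x = malloc(sizeof(struct node));
    if (x == NULL)
        exit(1);
    x->info = item;
    x->llink = left;
    x->rlink = right;
    return x;
}

/* Preorder: root, left subtree, right subtree (N L R). */
void preorder(NODE root, int out[], int *n)
{
    if (root == NULL)
        return;
    out[(*n)++] = root->info;
    preorder(root->llink, out, n);
    preorder(root->rlink, out, n);
}

/* Inorder: left subtree, root, right subtree (L N R). */
void inorder(NODE root, int out[], int *n)
{
    if (root == NULL)
        return;
    inorder(root->llink, out, n);
    out[(*n)++] = root->info;
    inorder(root->rlink, out, n);
}

/* Postorder: left subtree, right subtree, root (L R N). */
void postorder(NODE root, int out[], int *n)
{
    if (root == NULL)
        return;
    postorder(root->llink, out, n);
    postorder(root->rlink, out, n);
    out[(*n)++] = root->info;
}
```

When the tree is a binary search tree as defined above, the inorder traversal visits the items in non-decreasing order. For a tree with root 100, left child 50 (with children 30 and 70) and right child 200, inorder yields 30, 50, 70, 100, 200. The caller must set the counter to 0 before each traversal and provide an array large enough for all nodes.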
Insertion Operation
Creating a binary search tree is nothing but repeated insertion of an item into the existing tree. So, we concentrate on how to insert an item into the tree. In a BST (Binary Search Tree), the items in the left subtree of a node temp are less than info(temp) and the items in the right subtree are greater than or equal to info(temp).

Consider the BST shown in fig. 6.8. Suppose the node pointed to by temp (with item 140) has to be inserted. The item 140 is compared with the root node, i.e., 100. Since it is greater, the right subtree of the root node has to be considered. Now compare 140 with 200 and, since it is less, consider the left subtree of the node 200. Now, compare 140 with 150. Since it is less, consider the left subtree of node 150. Since the left subtree of node 150 is empty, the node containing the item 140 has to be inserted towards the left of node 150.

Thus, to find the appropriate place and insert an item, the search should start from the root node. This is achieved by using two pointer variables prev and cur. The pointer variable prev always points to the parent of the cur node. Initially cur points to the root node and prev points to NULL, i.e.,

    prev = NULL;
    cur = root;

Now, as long as the item is less than info(cur), keep moving the pointer variable cur towards the left. Otherwise, move it towards the right. The pointer variable prev always points to the parent node and cur points to the child node. Once cur points to NULL, insert the node temp towards the left of prev if the item is less than info(prev); otherwise, insert it towards the right. The code corresponding to this can be:

    prev = NULL;
    cur = root;

    /* find the position where to insert */
    while (cur != NULL)
    {
        prev = cur;
        cur = (item < cur->info) ? cur->llink : cur->rlink;
    }

    if (item < prev->info)
        prev->llink = temp;
    else
        prev->rlink = temp;

The above steps can be executed provided the tree exists. If the tree is empty initially, then make the node pointed to by temp itself the root node.
The complete C function to insert an item into a binary search tree is shown in the example below.

Example 6.8: Function to insert an item into an ordered binary tree (with duplicate elements)

    NODE insert(int item, NODE root)
    {
        NODE temp, cur, prev;

        temp = getnode();        /* Obtain new node from the availability list */
        temp->info = item;       /* Copy appropriate data */
        temp->llink = NULL;
        temp->rlink = NULL;

        if (root == NULL) return temp;    /* Insert a node for the first time */

        /* find the position to insert */
        prev = NULL;
        cur = root;
        while (cur != NULL)
        {
            /* Obtain parent and appropriate left or right child */
            prev = cur;
            cur = (item < cur->info) ? cur->llink : cur->rlink;
        }

        if (item < prev->info)       /* If node to be inserted is less than parent */
            prev->llink = temp;      /* Insert towards left of the parent */
        else
            prev->rlink = temp;      /* otherwise, insert towards right of the parent */

        return root;                 /* Return the root of the tree */
    }

In the above function, duplicate items are inserted towards the right. To avoid insertion of duplicate elements, the above function can be modified as shown in the example below.

Note: An ordered binary tree can also use the opposite convention, in which the elements towards the left of each node are greater and the elements towards the right of each node are less than or equal.

Example: Function to insert an item into an ordered binary tree (no duplicate items are allowed)

    NODE insert(int item, NODE root)
    {
        NODE temp, cur, prev;

        temp = getnode();        /* Obtain new node from the availability list */
        temp->info = item;       /* Copy appropriate data */
        temp->llink = NULL;
        temp->rlink = NULL;

        if (root == NULL) return temp;    /* Insert a node for the first time */

        /* find the position to insert */
        prev = NULL;
        cur = root;
        while (cur != NULL)
        {
            /* Obtain parent */
            prev = cur;

            /* do not insert duplicate item */
            if (item == cur->info)
            {
                printf("Duplicate items not allowed\n");
                freenode(temp);
                return root;
            }

            /* Find the appropriate left or right child */
            cur = (item < cur->info) ? cur->llink : cur->rlink;
        }

        if (item < prev->info)       /* If node to be inserted is less than parent */
            prev->llink = temp;      /* Insert towards left of the parent */
        else
            prev->rlink = temp;      /* otherwise, insert towards right of the parent */

        return root;                 /* Return the root of the tree */
    }

Searching
Start searching from the root node and move downward towards the left or right, based on whether the item lies in the left subtree or the right subtree. This is achieved by comparing the item with the root node of the appropriate subtree (left subtree or right subtree). If the two items are the same, then the search is successful and the address of that node is returned. If the item is less than info(root), search in the left subtree; otherwise, search in the right subtree. If the item is not found in the tree, root finally points to NULL, and returning root indicates that the search is unsuccessful. The iterative function to search for an item is shown in the example below.

Example: Function to search for an item in BST using iteration

    NODE iterative_search(int item, NODE root)
    {
        /* search for the item */
        while (root != NULL && item != root->info)
        {
            root = (item < root->info) ? root->llink : root->rlink;
        }
        return root;
    }

The recursive function to search for an item is shown in the example below.

Example: Function to search for an item in BST using recursion

    NODE recursive_search(int item, NODE root)
    {
        if (root == NULL || item == root->info) return root;

        if (item < root->info)
            return recursive_search(item, root->llink);

        return recursive_search(item, root->rlink);
    }

Other operations
Other useful operations are shown below.
• find maximum: to return the maximum item in the tree
• find minimum: to return the minimum item in the tree
• to find the height of the tree
• to count the number of nodes
• to count the leaf nodes
• deletion: deleting a node from the tree

To find maximum value in a BST
Given a binary search tree, the node with the maximum value is obtained by traversing towards the right and obtaining the rightmost node in the tree.
If there is no right subtree, then return root itself as the node containing the item with the highest value. The corresponding C function is shown in the example below.

Example: Function to return the address of the highest item in BST

    NODE maximum(NODE root)
    {
        NODE cur;

        if (root == NULL) return root;

        cur = root;
        while (cur->rlink != NULL)
            cur = cur->rlink;        /* obtain rightmost node in BST */

        return cur;
    }

To find minimum value in a BST
Given a binary search tree, the node with the least value is obtained by traversing towards the left and obtaining the leftmost node in the tree. If there is no left subtree, then return root itself as the node containing the item with the least value. The corresponding C function is shown in the example below.

Example: Function to return the address of the least item in BST

    NODE minimum(NODE root)
    {
        NODE cur;

        if (root == NULL) return root;

        cur = root;
        while (cur->llink != NULL)
            cur = cur->llink;        /* obtain leftmost node in BST */

        return cur;
    }

Height of tree
The height of the tree is nothing but the maximum level in the tree plus one. The recursive definition of the height of the tree is:

    height(root) = 0                                                      if root = NULL
    height(root) = 1 + max(height(root->llink), height(root->rlink))      otherwise

The corresponding C function to find the height of the tree is shown in the example below.

Example: Function to find the height of the tree

    /* function to find maximum of two numbers */
    int max(int a, int b)
    {
        return (a > b) ? a : b;
    }

    /* Function to find the height of the tree */
    int height(NODE root)
    {
        if (root == NULL) return 0;

        return 1 + max(height(root->llink), height(root->rlink));
    }

Count nodes in a tree
The number of nodes in the tree is obtained by traversing the tree using any of the traversal techniques and incrementing a counter whenever a node is visited. The variable count can be a global variable, and it is initialized to zero to start with. In this example, the inorder traversal is used to visit each node. The C function to obtain the number of nodes in a tree is shown in the example below.
Example: Function to count the number of nodes in a tree

    int count = 0;    /* global counter, set to zero before each call */

    void count_node(NODE root)
    {
        if (root != NULL)
        {
            count_node(root->llink);
            count++;
            count_node(root->rlink);
        }
    }

Count leaves in a tree
As in the previous case, visit each node in the given tree. Whenever a leaf is encountered, increase count by one. A leaf or terminal node is one which has an empty left child and an empty right child. The function to count the leaves in a binary tree is shown in the example below. The variable count can be a global variable, and it is initialized to zero to start with.

Example: Function to count the leaves or terminal nodes in a tree

    void count_leaf(NODE root)
    {
        if (root != NULL)
        {
            count_leaf(root->llink);     /* Traverse recursively towards left */

            /* does the node have an empty left and an empty right child? */
            if (root->llink == NULL && root->rlink == NULL)
                count++;

            count_leaf(root->rlink);     /* Traverse recursively towards right */
        }
    }

Delete a node from the tree
To delete a node, we should first search for the node to be deleted. If the required node is present, then that node has to be deleted. Otherwise, the message "Item not found" has to be displayed. The search for a node has been discussed earlier in section 6.5.2. It is very important to remember that once the required node is found and deleted, the ordering of the binary search tree should be maintained, i.e., even after deleting a node, the elements in the left subtree should be lesser and the elements in the right subtree should be greater or equal. In a binary search tree, the node to be deleted falls under one of two cases:
1. An empty left subtree and a non-empty right subtree, or an empty right subtree and a non-empty left subtree. (A node having an empty left child and an empty right child is also deleted using this case.)
2. A non-empty left subtree and a non-empty right subtree.

Case 1: Consider fig. 6.9 (a) and (b), where cur denotes the node to be deleted; in both cases one of the subtrees is empty and the other is non-empty. The node identified by parent is the parent of the node cur. The non-empty subtree can be obtained and saved in a variable q. The corresponding code is:

    if (cur->llink == NULL)          /* If left subtree is empty */
        q = cur->rlink;              /* obtain the address of non empty right subtree */
    else if (cur->rlink == NULL)     /* If right subtree is empty */
        q = cur->llink;              /* obtain the address of non empty left subtree */

Now, the non-empty subtree identified by q should be attached to the parent of the node to be deleted, and then the cur node is deleted. This is explained immediately after case 2.

Case 2: Consider fig. 6.10 and fig. 6.11, where cur denotes the node to be deleted. In both cases both the subtrees are non-empty. The node identified by parent is the parent of the node cur. The node can be deleted using the following procedure in sequence:

1. Find the inorder successor of the node to be deleted. The corresponding code is:

    suc = cur->rlink;                /* Inorder successor always lies towards right */
    while (suc->llink != NULL)       /* and then keep traversing left */
    {
        suc = suc->llink;
    }

2. Attach the left subtree of the node to be deleted to the left of the successor of the node to be deleted. The corresponding code is:

    suc->llink = cur->llink;         /* Attach left of node to be deleted to left of its successor */

3. Obtain the right subtree of the node to be deleted. The corresponding code is:

    q = cur->rlink;                  /* Right subtree is obtained */

4. Attach the right subtree of the node to be deleted to the parent of the node to be deleted. This is explained as shown in the next figures 6.10 and 6.11.
Attach the node q to parent: If parent of the node to be deleted does not exist, then return q itself as the root node using the statement if (parent = = NULL) return q; If a parent exists for the node to be deleted, attach the subtree pointed to by q, to the parent of the node to be deleted. In this case, attach q to parent based on the direction. If cur is the left child, attach q to left (parent) otherwise attach q to right(parent). This can be achieved by using the following statement /* connecting parent of the node to be deleted to q */ if( cur = = parent->llink ) parent->llink = q else parent->rlink = q; Once the node q is attached to the parent, the node pointed to by cur can be deleted and then return the address of the root node. The corresponding statements are: freenode (cur); return root; All these statements have been written by assuming the node to be deleted and its parent is known. So, just before deleting, search for the specified node, obtain its parent and then delete the node. The complete function to delete an item from the tree is shown in below example. Example 1 : Function to delete an item from the tree NODE delete_ item (int item, NODE root) { NODE cur, parent, suc, q; if( root = = NULL) { printf(”Tree is empty! Item not found\n”); return root; } /*obtain the position of the node to be deleted and its parent */ parent = NULL; cur = root; while ( cur != NULL && item != cur->info ) { parent = cur; cur = ( item < cur->info ) ? 
cur->llink : cur->rlink; } if ( cur = = NULL) { printf(”Item not found\n”); return root; } /* Item found and delete it */ *************************************************************/ /* CASE 1 */ *************************************************************/ if ( cur->llink = = NULL) /* If left subtree is empty */ q = cur->rlink; /* obtain the address of non empty right subtree */ else if ( cur->rlink = = NULL) /* If right subtree is empty */ q = cur->llink; /* obtain the address of non empty left subtree */ else { /************************************************/ /* CASE 2 */ /************************************************/ /* obtain the inorder successor */ suc = cur->rlink; /* Inorder successor lies towards right */ while (suc->llink != NULL) /* and immediately keep traversing left */ { suc = suc->llink; } suc->llink = cur->llink; /* Attach left of node to be deleted to left */ /* of successor of the node to be deleted */ q = cur->rlink; /* Right subtree is obtained */ } /* If parent does not exist return q itself as the root */ if (parent = = NULL) return q; /* connecting parent of the node to be deleted to q */ if ( cur = = parent->llink) parent->llink = q; else parent->rlink = q; freenode(cur); return root; } The complete program for creating a tree, traversing and deleting a specified item is shown in below example Example 2 : C program to create a tree, traverse a tree and delete a item from the tree #include <stdio.h> #include <alloc.h> #include <process.h> #include <string.h> struct node { int info; struct node *llink; struct node *rlink; }; typedef struct node* NODE; /* function to delete an item from the tree if it exists using above function example */ void main( ) { NODE root, temp; int choice, item; root = NULL; for (;;) { printf(”1: delete 4: Exit\n”); printf(”Enter the choice\n”); scanf(”%d”, &choice); switch (choice ) { case 1: printf(”Enter the item to be deleted\n”); scanf(”%d”,& item); root = delete_ item(item, root); break; default: 
            exit(0);
        }
    }
}

Self Assessment Questions:
1. Define BST with its common operations.
2. Explain the steps involved in the insertion of an item into a BST.

Summary
The advantage of linked lists is that they overcome the limitations of sequential storage representation. Their disadvantage is that they are still sequential lists: to access a particular node it is necessary to move through the items one after another. To overcome this problem we use a data structure called a tree. The binary search tree is organized in such a way that all of the items less than the item in a chosen node are contained in the left subtree and all the items greater than the item in the chosen node are contained in the right subtree. In this manner one does not have to search the entire tree for a particular item, as one must in a linked list traversal.

Terminal Questions
1. Define Tree. Explain Binary tree with its properties.
2. Explain the storage representation of a Binary tree with the sequential representation of a tree.
3. List and describe the various operations on a binary tree using linked representation.
4. Explain the types of algorithms for tree traversal.
5. Write a note on Binary Search Tree (BST).
6. Explain the two cases that arise when deleting a node from a tree.

Unit 7 Graphs
This unit covers the Overview of Graphs, Adjacency lists & Adjacency Matrix, Depth – First Traversal, Breadth – First Traversal and Spanning Trees.

Introduction
In computer science, a graph is a kind of data structure, specifically an abstract data type (ADT), that consists of a set of nodes and a set of edges that establish relationships (connections) between the nodes. The graph ADT follows directly from the graph concept in mathematics. A graph G is defined as follows: G = (V, E), where V is a finite, non-empty set of vertices (singular: vertex) and E is a set of edges (links between pairs of vertices).
When the edges in a graph have no direction, the graph is called undirected; otherwise it is called directed. In practice, some information is associated with each node and edge. Graph data structures are non-hierarchical and therefore suitable for data sets where the individual elements are interconnected in complex ways. For example, a computer network can be simulated with a graph. Hierarchical data sets can be represented by a binary or nonbinary tree.

Objectives
At the end of this unit, you will be able to understand the:
• Overview of Graphs
• Brief introduction of Adjacency lists & Adjacency Matrix
• Depth – First Traversal
• Breadth – First Traversal
• Spanning Trees and two algorithms for finding the minimum spanning tree.

Overview of Graphs
A graph is a collection of nodes, called vertices, and line segments, called arcs or edges, that connect pairs of nodes. A path is a sequence of vertices in which each vertex is adjacent to the next one. In Figure 7.1, {A, B, C, E} is one path and {A, B, E, F} is another. Note that both directed and undirected graphs have paths. In an undirected graph, you may travel in either direction.

A cycle is a path consisting of at least three vertices that starts and ends with the same vertex. In Figure 7.1(b), B, C, D, E, B is a cycle. Note, however, that the same vertices in Figure 7.1(a) do not constitute a cycle, because in a digraph a path can only follow the direction of the arc, whereas in an undirected graph a path can move in either direction along the edge. A loop is a special case of a cycle in which a single arc begins and ends with the same vertex. In a loop, the end points of the line are the same.

Two vertices are said to be connected if there is a path between them. A graph is said to be connected if, suppressing direction, there is a path from any vertex to any other vertex. Furthermore, a directed graph is strongly connected if there is a path from each vertex to every other vertex in the digraph.
A directed graph is weakly connected if it is connected when direction is suppressed, but at least two of its vertices are not connected by a directed path. (A connected undirected graph would always be strongly connected, so the concept is not normally used with undirected graphs.) A graph is disjoint if it is not connected. Figure 7.2 contains a weakly connected graph (a), a strongly connected graph (b), and a disjoint graph (c).

(a) Directed graph (b) Undirected graph
Fig. 7.1: Directed and undirected graphs

The degree of a vertex is the number of lines incident to it. In Figure 7.2(a), the degree of vertex B is 3 and the degree of vertex E is 4. The outdegree of a vertex in a digraph is the number of arcs leaving the vertex; the indegree is the number of arcs entering the vertex. Again in Figure 7.2(a), the indegree of vertex B is 1 and its outdegree is 2; in Figure 7.2(b), the indegree of vertex E is 3 and its outdegree is 1.

Fig. 7.2: Connected and disjoint graphs

Self Assessment Questions
1. Define a graph with a neat diagram.
2. Explain directed and undirected graphs with a neat diagram.

Adjacency lists & Adjacency Matrix
Two main data structures for the representation of graphs are used in practice. The first is called an adjacency list, and is implemented by representing each node as a data structure that contains a list of all adjacent nodes. The second is an adjacency matrix, in which the rows and columns of a two-dimensional array represent source and destination vertices, and the entries in the array indicate whether an edge exists between the vertices. Adjacency lists are preferred for sparse graphs; otherwise, an adjacency matrix is a good choice. Finally, for very large graphs with some regularity in the placement of edges, a symbolic graph is a possible choice of representation.

Adjacency lists
In graph theory, an adjacency list is the representation of all edges or arcs in a graph as a list.
If the graph is undirected, every entry is a set of two nodes containing the two ends of the corresponding edge; if it is directed, every entry is a tuple of two nodes, one denoting the source node and the other denoting the destination node of the corresponding arc. Typically, adjacency lists are unordered.

In computer science, an adjacency list is a closely related data structure for representing graphs. In an adjacency list representation, we keep, for each vertex in the graph, a list of all other vertices to which it has an edge (that vertex's "adjacency list"). For instance, the representation suggested by van Rossum, in which a hash table is used to associate each vertex with an array of adjacent vertices, can be seen as an instance of this type of representation, as can the representation in Cormen et al., in which an array indexed by vertex numbers points to a singly-linked list of the neighbors of each vertex.

The adjacency list uses a two-dimensional ragged array to store the edges. An adjacency list is shown in Figure 7.4.

Fig. 7.4: Adjacency list

The vertex list is a singly linked list of the vertices in the graph. Depending on the application, it could also be implemented using doubly linked lists or circularly linked lists. The pointer at the left of the list links the vertex entries. The pointer at the right in the vertex is a head pointer to a linked list of edges from the vertex. Thus, in the nondirected graph on the left in Figure 7.4, there is a path from vertex B to vertices A, C, and E. To find these edges in the adjacency list, we start at B's entry in the vertex list and traverse its linked list to A, then to C, and finally to E.

Adjacency Matrix
The adjacency matrix uses a vector (one-dimensional array) for the vertices and a matrix (two-dimensional array) to store the edges (see Figure 7-3).
If two vertices are adjacent, that is, if there is an edge between them, the matrix intersect has a value of 1; if there is no edge between them, the intersect is set to 0. If the graph is directed, then the intersection in the adjacency matrix indicates the direction.

In addition to the limitation that the size of the graph must be known before the program starts, there is another serious limitation in the adjacency matrix: only one edge can be stored between any two vertices. Although this limitation does not prevent many graphs from using the matrix format, some network structures require multiple lines between vertices.

Self Assessment Questions
1. Explain adjacency lists with a suitable example.
2. Explain the adjacency matrix for a directed graph.

Depth – First Traversal
In the depth-first traversal, we process all of a vertex's descendants before we move to an adjacent vertex. This concept is most easily seen when the graph is a tree. In Figure 7.5, we show the preorder traversal, one of the standard depth-first traversals.

In a similar manner, the depth-first traversal of a graph starts by processing the first vertex of the graph. After processing the first vertex, we select any vertex adjacent to the first vertex and process it. As we process each vertex, we select an adjacent vertex until we reach a vertex with no adjacent entries. This is similar to reaching a leaf in a tree. We then back out of the structure, processing adjacent vertices as we go. This logic requires a stack (or recursion) to complete the traversal. The order in which the adjacent vertices are processed depends on how the graph is physically stored.

Fig. 7.5: Depth first traversal of a tree

Let's trace a depth-first traversal through the graph in Figure 7.6. The number in the box next to a vertex indicates the processing order. The stacks below the graph show the stack contents as we work our way down the graph and then as we back out.

Fig.
7.6: Depth first traversal of a graph

1. We begin by pushing the first vertex, A, into the stack.
2. We then loop: pop the stack and, after processing the vertex, push all of its adjacent vertices into the stack. To process X at Step 2, therefore, we pop X from the stack, process it, and then push G and H into the stack, giving the stack contents shown in Figure 7.6(b): H G.
3. When the stack is empty, the traversal is complete.

Self Assessment Questions
1. Explain the depth first traversal of a graph with a suitable example.

Breadth – First Traversal
In the breadth-first traversal of a graph, we process all adjacent vertices of a vertex before going to the next level. Looking at the tree in Figure 7.7, we see that its breadth-first traversal starts at level 1 and then processes all the vertices in level 2 before going on to process the vertices in level 3.

Fig. 7.7: Breadth-first traversal of a tree

The breadth-first traversal of a graph follows the same concept: we begin by picking a starting vertex; after processing it, we process all of its adjacent vertices. After we process all of the first vertex's adjacent vertices, we pick the first adjacent vertex and process all of its adjacent vertices, then the second adjacent vertex and all of its adjacent vertices, and so forth until we are finished.

The breadth-first traversal uses a queue rather than a stack. As we process each vertex, we place all of its adjacent vertices in the queue. Then, to select the next vertex to be processed, we delete a vertex from the queue and process it. Let's trace this logic through the graph in Figure 7.8.

Fig. 7.8: Breadth-first traversal of a graph

1. We begin by enqueuing vertex A in the queue.
2. We then loop, dequeuing the queue and processing the vertex from the front of the queue. After processing the vertex, we place all of its adjacent vertices into the queue. Thus, at Step 2 in Figure 7.8(b), we dequeue vertex X, process it, and then place vertices G and H in the queue.
We are then ready for Step 3, in which we process vertex G.
3. When the queue is empty, the traversal is complete.

Spanning Trees
A spanning tree of a graph is an undirected tree consisting of only those edges necessary to connect all the nodes in the original graph. For any pair of nodes there exists only one path between them, and the insertion of any edge into a spanning tree forms a unique cycle. Those edges left out of the spanning tree that were present in the original graph connect paths together in the tree.

If a DFS is used, those edges traversed by the algorithm form the edges of the tree, referred to as a depth first spanning tree. If a BFS is used, the spanning tree is formed from those edges traversed during the search, producing a breadth first spanning tree.

Network: A network is a graph that has weights or costs associated with its edges. It is also called a weighted graph.

Minimum spanning tree: This is a spanning tree that covers all vertices of a network such that the sum of the costs of its edges is minimum. There are two algorithms for finding the minimum spanning tree, given a network:
1) Kruskal's algorithm
2) Prim's algorithm

1. Kruskal's algorithm to find the minimum spanning tree
Step 1: Arrange all edges in ascending order of their cost to form an input set.
Step 2: From this input set, take each edge in turn and, if it does not form a cycle with the edges already chosen, include it in the output set, which forms the minimum spanning tree. The sum of the costs of the edges in the output set is the least.

Note: If a path leads from a vertex u to a vertex w and a different path leads from w back to u, the two paths together form a cycle.
Eg. in fig. 7.10 above, the set [(1, 2), (2, 3), (3, 4), (4, 1)] is a cycle and the set [(1, 2), (2, 5), (5, 6), (6, 4), (4, 1)] is also a cycle.

Eg. Applying Kruskal's algorithm to the network of fig.
7.10, we get
Step 1: Input set = { (1, 3), (4, 6), (2, 5), (3, 6), (2, 3), (3, 4), (1, 4), (1, 2), (3, 5), (5, 6) }
Step 2: Output set = { (1, 3), (4, 6), (2, 5), (3, 6), (2, 3) }
Minimum cost = 1 + 2 + 3 + 4 + 5 = 15.
Thus, the minimum spanning tree is as shown:

2. Prim's algorithm to find the minimum spanning tree
Step 1: Let S be the set of vertices in the spanning tree and A be the set of all vertices in the network. Let E be the set of all edges forming the spanning tree. Initially let E = { } and S = {V1}, where V1 is the starting vertex in A.
Step 2: While (S not equal to A) do
{
    · find (a, b) to be a lowest-cost edge such that a is in S and b is in (A – S)
    · E = E Union {(a, b)}
    · S = S Union {b}
}

Eg. For the network in fig. 7.10, we get the following outputs after applying the steps of Prim's algorithm:
Step 1: E = { }, S = { 1 }, A = { 1, 2, 3, 4, 5, 6 }
Step 2: Each pass of the loop adds one edge to E and one vertex to S, until S = { 1, 2, 3, 4, 5, 6 }.
Now since S = A, the while loop terminates and E holds the set of edges forming the minimum spanning tree (same as in fig. 7.11), whose cost is 15.

Note: Given a network (= weighted graph), many spanning trees can be formed. The minimum spanning tree is guaranteed to be unique only when all the edge weights of the network are distinct; with equal weights, several spanning trees may share the same minimum cost.

Exercise: Derive the minimum spanning tree for the graph below using both Prim's and Kruskal's methods.
Answer: E = { (a, b), (b, c), (b, d), (b, f), (d, e) }, Cost = 56

Optimal path algorithms:
If we have a graph representing the highway system of India, where the vertices represent cities and the edges indicate highways, a motorist who wants to travel from Mumbai to Delhi would like to know:
1) Is there a path from Mumbai to Delhi?
2) If there is more than one path from Mumbai to Delhi, which path is the shortest?

1) Dijkstra's algorithm to find the shortest path from a given starting point (= source vertex) in the graph to any destination point.
Step 1: Let S be the start/source node and T be the last node to be given a permanent label.
Assign a temporary label $l(i) = \infty$ to all nodes except S, whose label is set to 0 and is made permanent by setting $p = S$.
Step 2: For each node i with a temporary label, redefine $l(i)$ to be the smaller of $l(i)$ and $l(p) + d(p, i)$, where $d(p, i)$ = cost of edge (p, i). Find the node i with the smallest temporary label; set $p = i$ and make the label $l(p)$ permanent.
Step 3: If node T has a temporary label, then repeat step 2; else T has a permanent label, and this corresponds to the length of the shortest path from S to T.

Eg: Find the shortest paths from S to all nodes in fig. 7.13.
Note: If there is no direct path from vertex A to B, then the adjacency matrix entry [A, B] is set to $\infty$.

Applying the steps of Dijkstra's algorithm, we get the outputs as shown:
Thus, to go from S to 4, the shortest path is of cost 15 and goes through vertices 5 and 2. To move to 3 from S, the shortest path goes through nodes 5, 2, 4, and 1 and is of cost 20.

In pass 2 of step 2, the distance from S to node 1 has changed from $\infty$ to 18 because the least of the values $(l(i),\ d(p, i) + l(p))$ has to be chosen, according to step 2 of the algorithm. Here $l(i)$ is the label assigned to node 1 (= $\infty$), $l(p)$ is the label assigned to permanent node 5 (= 8) and $d(p, i)$ is the cost of the edge from p to i, i.e., the cost of the edge from node 5 to node 1 (= 10, as understood from the adjacency matrix, 5th row and 1st column intersection).
$\therefore \min(\infty,\ 10 + 8) = 18$ is assigned to node 1.

In pass 3 of step 2,
$l(i) = l(3) = 25$
$l(p) = l(4) = 15$
$d(p, i) = d(4, 3) = 5$
$\therefore \min(25,\ 15 + 5) = 20$ is assigned to node 3.

2) Floyd-Warshall's all pairs shortest path algorithm.
Given the initial cost adjacency matrix $A^{0}(i, j)$, calculate the matrix $A^{k}(i, j)$ to be the cost of the shortest path from i to j going through no intermediate vertex of index greater than k, thus:
$A^{k}(i, j) = \min\left(A^{k-1}(i, j),\ A^{k-1}(i, k) + A^{k-1}(k, j)\right)$, where $k \ge 1$.
Thus, the shortest distances between all pairs of vertices in the graph can be obtained by looking up the final values in the $A^{3}$ matrix.
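The recurrence above can be sketched in C roughly as follows. This is only a minimal illustration under stated assumptions, not the program from the text: the function name floyd_warshall, the fixed size N = 3 and the sentinel INF (standing in for "no edge") are all hypothetical choices made for the example. The matrix is updated in place, so after pass k it plays the role of the k-th matrix in the recurrence.

```c
#define N   3       /* number of vertices; assumed for this sketch */
#define INF 9999    /* assumed sentinel meaning "no direct edge" (infinity) */

/* Floyd-Warshall all pairs shortest paths.
   On entry a[][] is the cost adjacency matrix (the initial matrix);
   on return a[i][j] is the cost of the shortest path from i to j.
   Pass k allows vertex k to be used as an intermediate vertex. */
void floyd_warshall(int a[N][N])
{
    int i, j, k;

    for (k = 0; k < N; k++)
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                /* skip missing edges so INF is never added to anything */
                if (a[i][k] != INF && a[k][j] != INF &&
                    a[i][k] + a[k][j] < a[i][j])
                    a[i][j] = a[i][k] + a[k][j];
}
```

As a usage example with a made-up 3-vertex digraph, starting from the matrix {{0, 4, 11}, {6, 0, 2}, {3, INF, 0}} the call floyd_warshall(a) leaves {{0, 4, 6}, {5, 0, 2}, {3, 7, 0}} in a. Updating in place is safe because row k and column k do not change during pass k.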
Tip: Copy the kth row and column as they are from the $A^{k-1}$ matrix into the $A^{k}$ matrix, and derive the other values using the formula given in the algorithm.

Exercise: Apply Dijkstra's algorithm and Floyd-Warshall's all pairs algorithm to get the shortest path(s) [source vertex is 1] in the graph below:

Self Assessment Questions
1. Define Spanning Trees with a neat diagram.
2. Explain Prim's algorithm to find the minimum spanning tree with a suitable example.

Summary
A graph is a collection of nodes, called vertices, and line segments, called arcs or edges, that connect pairs of nodes. Graph data structures are non-hierarchical and therefore suitable for data sets where the individual elements are interconnected in complex ways. For example, a computer network can be simulated with a graph. In this unit we discussed the Overview of Graphs, Adjacency lists & Adjacency Matrix, Depth – First Traversal, Breadth – First Traversal and Spanning Trees.

Terminal Questions
1) A binary tree has 10 nodes. The preorder and inorder traversals of the tree are shown below. Draw the tree.
Preorder : J C B A D E F I G H
Inorder : A B C E D F J G I H
Hints: 1) The first node in the preorder sequence is the root. 2) Traverse left to right in the preorder sequence, finding each node's position (left or right) with respect to the previously-located node, in the inorder sequence.

2) Give the inorder, postorder and breadth first traversals for the following tree:
Answer:
Inorder = D B H E A F C G
Postorder = D H E B F G C A
Breadth first = A B C D E F G H

1) Show the BST and the B tree after the insertion of the following elements according to their respective insertion algorithms: 43, 64, 80, 96, 128, 150 and 250.
Note: BST means Binary Search Tree; the B tree should be of order 4.

2) Given the BST below:
a) Write the algorithm to find the smallest value in the tree.
b) Draw the tree obtained after separately deleting from the original tree i) node 13 ii) node 14 iii) node 23

3) Construct the binary tree if the result of traversing it is as below:
Postorder : I E J F C G K L H D B A
Inorder : E I C F J B G D K H L A
Hints: 1) The last node in the postorder sequence is the root. 2) Traverse right to left in the postorder sequence, finding each node's position (left or right) with respect to the previously-located node in the inorder sequence.

4) Suppose that keys 1, 2, 3, …, 19, 20 are inserted in that order into a B tree with m = 4. Show the B tree after each insertion. How many internal nodes does the final B tree have?

Unit 8 Searching Methods
This unit covers the Introduction of searching methods, Basic searching techniques, Linear search, Binary search, Hash search, Binary tree search, Algorithmic Notation, Average time, Worst-case time and best possible time, Sequential search and Binary search.

Introduction
Information retrieval is one of the most important applications of computers. It usually involves giving a piece of information, called the key, and asking to find a record that contains other associated information. This is achieved by first going through the list to find whether the given key exists or not, a process called searching. Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion. The process of searching for an item in a data structure can be quite straightforward or very complex. Searching can be done on internal data structures or on external data structures. Information retrieval in the required format is the central activity in all computer applications, and it involves searching. This unit deals with searching techniques. Searching methods are designed to take advantage of the file organisation and optimize the search for a particular record or to establish its absence.
The file organisation and searching method chosen can make a substantial difference to an application's performance.

Objectives
At the end of this unit, you will be able to understand the:
· Brief discussion on Basic Searching Techniques.
· Algorithmic Notation such as the average time, the worst-case time and the best possible time.
· Introduction of Sequential Search [Linear search]
· Introduction of Binary Search

Basic Searching Techniques
Consider a list of n elements (it can equally represent a file of n records), where each element is a key / number. The task is to find a particular key in the list in the shortest possible time. If you know you are going to search for an item in a set, you will need to think carefully about what type of data structure you will use for that set. At an introductory level, the only searches that get mentioned are those on sorted and unsorted arrays. However, these are not the only data types that are useful for searching.

· Linear search: Start at the beginning of the list and check every element of the list. Very slow [order O(n)] but works on an unsorted list.
· Binary search: This is used for searching in a sorted array. Test the middle element of the array. If it is too big, repeat the process in the left half of the array; if it is too small, repeat it in the right half. In this way, the amount of space that needs to be searched is halved every time, so the time is O(log n).
· Hash search: Searching a hash table is easy and extremely fast. Just find the hash value for the item you are looking for, then go to that index and start searching the array until you find what you are looking for or you hit a blank spot. The order is pretty close to O(1), depending on how full your hash table is.
· Binary tree search: Searching a binary tree is just as easy as searching a hash table, but it is usually slower (especially if the tree is badly unbalanced). Just start at the root. Then go down the left subtree if the root is too big and the right subtree if it is too small.
Repeat until you find what you want or the subtree you want isn't there. The running time is O(log n) on average and O(n) in the worst case.

Self Assessment Questions
1. Discuss the various types of searching techniques.

Algorithmic Notation
Let's examine how long it will take to find an item matching a key in a collection. We are interested in:
1. The average time
2. The worst-case time and
3. The best possible time.

However, we will generally be most concerned with the worst-case time, as calculations based on worst-case time can lead to guaranteed performance predictions. Conveniently, the worst-case time is generally easier to calculate than the average time.

If there are n items in our collection, whether it is stored as an array or as a linked list, then in the worst case, when there is no item in the collection with the desired key, n comparisons of the key with the keys of the items in the collection will have to be made.

To simplify the analysis and comparison of algorithms, we look for a dominant operation and count the number of times that dominant operation has to be performed. In the case of searching, the dominant operation is the comparison. Since the search requires n comparisons in the worst case, we say this is an O(n) (pronounced "big-Oh-n" or "Oh-n") algorithm. The best case, in which the first comparison returns a match, requires a single comparison and is O(1). The average time depends on the probability that the key will be found in the collection, which is something that we would not expect to know in the majority of cases. Thus in this case, as in most others, estimation of the average time is of little utility. If the performance of the system is vital, i.e. it is part of a life-critical system, then we must use the worst case in our design calculations, as it represents the best guaranteed performance.

We will now discuss two searching methods and analyze their performance.
These two methods are:
• The sequential search
• The binary search

8.2.1 Sequential Search [Linear search]
This is the most natural searching method. Simply put, it means going through a list or a file till the required record is found. It makes no demands on the ordering of records. The algorithm for a sequential search procedure is now presented.

Algorithm: Sequential Search
This algorithm searches a list of values to find the required one.
INPUT: List of size N, Target value T
OUTPUT: Position of T in the list = I
BEGIN
1. Set FOUND to false
   Set I to 1
2. While (I <= N) and (FOUND is false)
       If List[I] = T
           FOUND = true
       Else
           I = I + 1
   END
3. If FOUND is false
       T is not present in List.
END

This algorithm can easily be extended to search for a record with a matching key value.

Analysis of Sequential Search
Whether the sequential search is carried out on lists implemented as arrays or linked lists, or on files, the critical part in performance is the comparison loop of step 2. Obviously, the fewer the number of comparisons, the sooner the algorithm will terminate.

The fewest possible comparisons = 1, when the required item is the first item in the list. The maximum comparisons = N, when the required item is the last item in the list (or is not in the list at all). Thus if the required item is in position I in the list, I comparisons are required. Hence, assuming the item is present and equally likely to be at any position, the average number of comparisons done by sequential search is
$\frac{1 + 2 + \cdots + N}{N} = \frac{N + 1}{2}$

Sequential search is easy to write and efficient for short lists. It does not require sorted data. However, it is disastrous for long lists. There is no way of quickly establishing that the required item is not in the list, or of finding all occurrences of a required item at one place. We can overcome these deficiencies with the next searching method, namely the binary search.

Example : Program to search for an item using linear search.
#include <stdio.h>

/* Search for key in the table a[0..n-1]; return its position counted from 1, or 0 if absent */
int seq_search(int key, int a[], int n)
{
    int i;

    for (i = 0; i < n; i++)
    {
        if (a[i] == key)
            return i + 1;
    }
    return 0;
}

int main(void)
{
    int i, n, key, pos, a[20];

    printf("Enter the value of n\n");
    scanf("%d", &n);
    printf("Enter n values\n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    printf("Enter the item to be searched\n");
    scanf("%d", &key);
    pos = seq_search(key, a, n);
    if (pos == 0)
        printf("Search unsuccessful\n");
    else
        printf("key found at position = %d\n", pos);
    return 0;
}

Binary Search
The drawbacks of sequential search can be eliminated if it becomes possible to eliminate large portions of the list from consideration in subsequent iterations. The binary search method does just that: it halves the size of the list to search in each iteration.

Binary search can be explained simply by the analogy of searching for a page in a book. Suppose you were searching for page 90 in a book of 150 pages. You would first open it at random towards the latter half of the book. If the page is less than 90, you would open at a page to the right; if it is greater than 90, you would open at a page to the left, repeating the process till page 90 was found. As you can see, by the first instinctive search, you dramatically reduced the number of pages to search.

Binary search requires sorted data to operate on. Since the data may not be contiguous like the pages of a book, we cannot guess which quarter of the data the required item may be in. So we divide the list in the centre each time.

We will first illustrate binary search with an example before going on to formulate the algorithm and analyse it.

Example: Use the binary search method to find 'Scorpio' in the following list of 11 zodiac signs.

 1  Aquarius
 2  Aries
 3  Cancer
 4  Capricorn
 5  Gemini
 6  Leo            Comparison 1 (Leo < Scorpio)
 7  Libra
 8  Pisces
 9  Sagittarius    Comparison 2 (Sagittarius < Scorpio)
10  Scorpio        Comparison 3 (= Scorpio)
11  Taurus

This is a sorted list of size 11. The first comparison is with the middle element number 6 i.e.
Leo. This eliminates elements 1 to 6. The second comparison is with the middle element from 7 to 11, i.e. 9 Sagittarius. This eliminates 7 to 9. The third comparison is with the middle element from 10 to 11, i.e. 10 Scorpio. Thus we have found the target in 3 comparisons. Sequential search would have taken 10 comparisons. We will now formulate the algorithm for binary search.

Algorithm: Binary Search
This represents the binary search method to find a required item in a list sorted in increasing order.
INPUT: Sorted LIST of size N, Target Value T
OUTPUT: Position of T in the LIST = I
BEGIN
1. MAX = N
   MIN = 1
   FOUND = false
2. WHILE (FOUND is false) and (MAX >= MIN)
   2.1 MID = (MAX + MIN) DIV 2
   2.2 If T = LIST[MID]
           I = MID
           FOUND = true
       Else If T < LIST[MID]
           MAX = MID - 1
       Else
           MIN = MID + 1
END

It is recommended that the student apply this algorithm to some examples.

Analysis of Binary Search:
In general, the binary search method needs no more than $\lfloor \log_2 n \rfloor + 1$ comparisons. This implies that for an array of a million entries, only about twenty comparisons will be needed. Contrast this with the case of sequential search, which on the average will need $(n + 1)/2$ comparisons.

The condition (MAX >= MIN) is necessary to ensure that step 2 terminates even in the case that the required element is not present. Consider the example of zodiac signs. Suppose the 10th item was Solar (an imaginary zodiac sign) instead of Scorpio. Then after the second comparison we would have
MIN = 10, MAX = 11, MID = (11 + 10) DIV 2 = 10
and, since Scorpio < Solar, from 2.2 we get MAX = MID - 1 = 9.
Since MAX < MIN, the loop terminates. Since FOUND is false, we conclude that the target was not found.

In the binary search method just described, it is always the key in the middle of the list currently being examined that is used for comparison. The splitting of the list can be illustrated through a binary decision tree in which the value of a node is the index of the key being tested.
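The binary search algorithm given above can be sketched in C roughly as follows. This is a minimal illustration rather than the text's own program: the function name binary_search and the convention of returning a position counted from 1 (0 when the target is absent) are assumptions made for the example. Because C arrays are indexed from 0, MIN and MAX of the algorithm become low = 0 and high = n - 1.

```c
/* Iterative binary search over an int array sorted in increasing order.
   Returns the position of target counted from 1, or 0 if it is absent.
   (Name and return convention are assumptions for this sketch.) */
int binary_search(const int list[], int n, int target)
{
    int low = 0;          /* MIN of the algorithm, shifted to 0-based indexing */
    int high = n - 1;     /* MAX of the algorithm */

    while (low <= high) {                     /* loop ends once MAX < MIN */
        int mid = low + (high - low) / 2;     /* middle index, written to avoid int overflow */

        if (list[mid] == target)
            return mid + 1;                   /* found: report 1-based position */
        else if (target < list[mid])
            high = mid - 1;                   /* continue in the left half */
        else
            low = mid + 1;                    /* continue in the right half */
    }
    return 0;                                 /* target is not in the list */
}
```

The loop condition low <= high plays the role of (MAX >= MIN) in step 2, so the search also terminates when the target is missing or the list is empty.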
Suppose there are 31 records; then the first key compared is at location 16 of the list, since (1 + 31)/2 = 16. If the key is less than the key at location 16, then location 8 is tested, since (1 + 15)/2 = 8; if the key is greater than the key at location 16, then location 24 is tested, since (17 + 31)/2 = 24. The binary tree describing this process is shown in the figure below.

Illustrations of C Programmes
Example : Program to search for an item using Binary Search [interpolation search variant, in which the probe position is estimated from the key value instead of always taking the middle element].

#include <stdio.h>

/* Search the sorted table a[low..high] for item.
   Returns the position counted from 1, or -1 if the item is not found. */
int search(int item, int a[], int low, int high)
{
    int mid;    /* index of the element to probe */

    if (low > high || item < a[low] || item > a[high])    /* No item found */
        return -1;
    if (a[high] == a[low])    /* all remaining keys are equal; avoids division by zero */
        return (item == a[low]) ? low + 1 : -1;

    /* multiply before dividing so that integer division does not truncate to 0 */
    mid = low + (int)(((long)(high - low) * (item - a[low])) / (a[high] - a[low]));

    if (item == a[mid])
        return mid + 1;                          /* return the position found */
    if (item < a[mid])
        return search(item, a, low, mid - 1);    /* search left part */
    return search(item, a, mid + 1, high);       /* search right part */
}

int main(void)
{
    int n, i, a[20], item, pos;

    printf("Enter the number of elements\n");
    scanf("%d", &n);
    printf("Enter %d items in increasing order\n", n);
    for (i = 0; i < n; i++)
    {
        scanf("%d", &a[i]);
    }
    printf("Enter the item to be searched\n");
    scanf("%d", &item);
    pos = search(item, a, 0, n - 1);    /* 0 is the low index and n-1 is the high index */
    if (pos == -1)
        printf("Item not found\n");
    else
        printf("Item found at position %d\n", pos);
    return 0;
}

8.4 Summary
In this unit we discussed the Basic Searching Techniques, Algorithmic Notation and mainly two searching techniques: Sequential Search [Linear search] and Binary Search. Searching methods are designed to take advantage of the file organisation and optimize the search for a particular record or to establish its absence. The file organisation and searching method chosen can make a substantial difference to an application's performance.

Terminal Questions
1. Explain the types of Basic Searching Techniques.
2.
Explain the Sequential Search (Linear Search) with an appropriate algorithm.
3. Write a ‘C’ program to search for an item using linear search.
4. Write an algorithm with analysis steps for Binary Search.
5. Write a ‘C’ program to search for an item using Binary Search.

Unit 9 Sorting Methods

This unit covers the introduction to sorting methods, how to sort, and several performance criteria to be used in evaluating a sorting algorithm. It discusses internal sorting (Insertion sort, Bubble sort, Selection sort, Shell sort, Quick sort, Tree sort) and external sorting (Merge sort, 2-way merge sort). Examples of the various sorting methods are illustrated using C.

Introduction

Retrieval of information is made easier when it is stored in some predefined order. Sorting is, therefore, a very important computer application activity. Many sorting algorithms are available. Differing environments require differing sorting methods. Sorting algorithms can be characterized in the following two ways:
1. Simple algorithms, which require of the order of n^2 (written as O(n^2)) comparisons to sort n items.
2. Sophisticated algorithms, which require O(n log2 n) comparisons to sort n items.

The difference lies in the fact that the first kind of method moves data only over small distances in the process of sorting, whereas the second kind moves data over large distances, so that items settle into the proper order sooner, thus resulting in fewer comparisons. Performance of a sorting algorithm can also depend on the degree of order already present in the data.

There are two basic categories of sorting methods: ‘Internal Sorting’ and ‘External Sorting’. Internal sorting methods are applied when the entire collection of data to be sorted is small enough that the sorting can take place within main memory. The time required to read or write is not considered to be significant in evaluating the performance of internal sorting methods.
External sorting methods are applied to larger collections of data which reside on secondary devices; read and write access times are a major concern in determining sort performance. In this unit we will study some methods of internal sorting and external sorting.

Objectives

At the end of this unit, you will be able to understand:
· Overview of Sorting Methods
· Various sorting methods and their complexity
· Several performance criteria to be used in evaluating a sorting algorithm
· Internal and external sorting methods with various types of algorithms and their implementation in C.

Overview of Sorting Methods

In the last unit, the efficient routine for finding data in an array was based on the premise that the data being searched was already sorted. Indeed, computers are used so extensively to process data collections that in many installations, a great deal of their time is spent maintaining that data in sorted order in the first place. It turns out that methods of searching a sorted list have much in common with methods of achieving that sorted condition.

In order to concentrate on sorting abstractions without having to be concerned with the type of data being sorted, most of the code presented in the rest of this unit will be designed to sort only a single kind of data, namely arrays of cardinals. Only minor modifications in a few places are necessary to use the code presented in the next few sections for other kinds of data.

There is a surprisingly diverse collection of algorithms that have been developed to solve the apparently simple problem of “Sorting”. The general sorting problem is simple enough to describe: Given an initially unordered array of N records, with one field distinguished as the key, rearrange the records so they are sorted into increasing (or decreasing) order according to each record’s key. Sorting is the problem of taking an arbitrary permutation of n items and rearranging them into total order.

· Increasing or Decreasing Order?
The same algorithm can be used for both; all we need to do is change <= to >= in the comparison function as we desire.
· What about equal keys? Does the order matter or not? Maybe we need to sort on secondary keys, or leave equal keys in the same order as in the original permutation.
· What about non-numerical data? Alphabetizing is sorting text strings, and libraries have very complicated rules concerning punctuation etc. Is Brown-Williams before or after Brown America, and before or after Brown, John?

Sorting algorithms are used in all kinds of applications and are necessary, for instance, if we plan to use efficient searching algorithms like Binary Search or Interpolation Search, since these require their data to be sorted. There are dozens of algorithms, the choice of which depends on factors such as the number of items relative to working memory, knowledge of the orderliness of the items or the range of the keys, the cost of comparing keys vs. the cost of moving items, etc.

To choose an algorithm we attempt to characterize the performance of the algorithm with respect to an array of size N. We then determine which operations are critical for each type of problem. For example, in sorting we can characterize the performance of a sorting algorithm by:
1. the number of times it compares an element in the array to another value (comparisons), or
2. the number of times it moves an element from or to a position in the array (swaps).

Sorting arrays lets us pay particular attention to the mechanics of the sort algorithm. Arrays, though simple, have the benefits of being internal (memory resident), randomly accessible (using indices), and capable of allowing us to accomplish the sort ‘in place’. Of course there are sorts for trees, as well as special sorts for external data structures. But these often obscure the sort itself with lots of housekeeping overhead. The amount of extra memory used by a sort is important.
Extra memory is used by a sort in the following ways:
· sort in place, using no extra memory except for a small stack or table
· use a linked list, where each element requires a link pointer
· need enough extra memory to store a copy of the array to be sorted

Normally, when considering a sorting problem, we will assume that the number of records to be sorted is small enough that we can fit the entire data set in the computer’s memory (RAM) all at once. When this is true, we can make use of an internal sorting algorithm, which assumes that any key or record can be accessed or moved at any time. That is, we have “random access” to the data.

Sometimes, when sorting an extremely large data set such as census data, there are simply too many records for them to all fit in memory at once. In this case, we have to resort to external sorting algorithms that don’t assume we have random access to the data. Instead, these algorithms assume the data is stored on magnetic tapes or disks and only portions of the data will fit in memory. These algorithms use “sequential access” to the data and proceed by reading in, processing, and writing out blocks of records at a time. These partially sorted blocks need to be combined or merged in some manner to eventually sort the entire list.

One final issue to keep in mind when implementing a sorting algorithm is the size of the records themselves. Many sorting algorithms move and interchange records in memory several times before they are moved into their final sorted position. For large records, this can add up to lots of execution time spent simply copying data. A popular solution to this problem is called “indirect sorting”. The idea is to sort the indices of the records, rather than the records themselves.

Self Assessment Questions
1. What do you mean by a sorting method? Discuss in brief.
2. Explain the characteristics of the performance of a sorting algorithm.

How do you sort?
There are several different ideas which lead to sorting algorithms:
1. Insertion - putting an element in the appropriate place in a sorted list yields a larger sorted list.
2. Exchange - rearrange pairs of elements which are out of order, until no such pairs remain.
3. Selection - extract the largest element from the list, remove it, and repeat.
4. Distribution - separate into piles based on the first letter, then sort each pile.
5. Merging - two sorted lists can be easily combined to form a sorted list.

There are many different methods used for sorting. Quite frequently, a combination of these methods is used to perform a sort. We will cover four common sorting methods, which we term selection, insertion, comparison, and divide and conquer. These common sorting methods can be represented as algorithmic functions, which are step-by-step problem-solving procedures that have a finite number of steps. Sorting methods can be grouped into various subgroups that share common themes:
1) Priority queue sorting methods. Example: Selection Sort and Heap Sort
2) Divide-and-conquer methods. Example: MergeSort and Quicksort
3) Insertion-based sorts. Example: InsertionSort
4) Other methods. Example: BubbleSort and ShellSort

Self Assessment Questions
1. List the various methods involved in sorting algorithms.

Evaluating a Sorting Algorithm

There are several performance criteria to be used in evaluating a sorting algorithm:
1. Running time: Typically, an elementary sorting algorithm requires O(N^2) steps to sort N randomly arranged items. More sophisticated sorting algorithms require O(N log N) steps on average. Algorithms differ in the constant that appears in front of the N^2 or N log N. Furthermore, some sorting algorithms are more sensitive to the nature of the input than others. Quicksort, for example, requires O(N log N) time in the average case, but requires O(N^2) time in the worst case.
2.
Memory requirements: The amount of extra memory required by a sorting algorithm is also an important consideration. In-place sorting algorithms are the most memory efficient, since they require practically no additional memory. Linked list representations require an additional N words of memory for a list of pointers. Still other algorithms require sufficient memory for another copy of the input array. These are the most inefficient in terms of memory usage.
3. Stability: This is the ability of a sorting algorithm to preserve the relative order of equal keys in a file.

Stability of a Sorting Algorithm

When a sorting algorithm is applied to a set of records, some of which share the same key, there are several different orderings that are all correctly sorted. If the ordering of records with identical keys is always the same as in the original input, then we say that the sorting algorithm used is “stable”. This property can be useful. For instance, consider sorting a list of student records alphabetically by name, and then sorting the list again, but this time by letter grade in a particular course. If the sorting algorithm is stable, then all the students who got “A” will be listed alphabetically. Stability is a difficult property to achieve if we also want our sorting algorithm to be efficient.

Sorting algorithms are often subdivided into “elementary” algorithms that are simple to implement and more complex algorithms that, while more efficient, are also more difficult to understand, implement, and debug. It is not always true that the more complex algorithms are the preferred ones. Elementary algorithms are generally more appropriate in the following situations:
1) fewer than 100 values are to be sorted
2) the values will be sorted just once
3) special cases such as:
   a) the input data are “almost sorted”
   b) there are many equal keys.

In general, elementary sorting methods require O(N^2) steps for N random key values.
The more complex methods can often sort the same data in just O(N log N) steps. Although it is rather difficult to prove, it can be shown that roughly N log N comparisons are required in the general case. Examples of elementary sorting algorithms are: selection sort, insertion sort, shell sort and bubble sort. Examples of sophisticated sorting algorithms are quicksort, radix sort, heapsort and mergesort.

Self Assessment Questions
1. Discuss the various types of performance criteria to be used in evaluating a sorting algorithm.
2. Explain where elementary algorithms are more appropriate.

Internal Sorting

In internal sorting, all the data to be sorted is available in the high-speed main memory of the computer. We will study the following methods of internal sorting:
1. Insertion sort
2. Bubble sort
3. Quick sort
4. 2-Way Merge sort
5. Heap sort

Insertion Sort

An insertion sort has the advantage that it is simple to understand and simple to implement. Unfortunately, it is rather slow. Given an unsorted array of integer values, an insertion sort visits each element of the array in turn. As it visits a particular element, it scans the array from the beginning up to the current position and determines where in that segment of the array the current value belongs. It then inserts the current value in that location and shifts every element of the array to the right, up to the present location. It then goes on to the next location (value) in the array. Notice that one index is going from 0 to n, and for each such value another index is scanning the array from 0 to the value of the first index. The result of this is that this type of sort is O(n^2).

This is a naturally occurring sorting method, exemplified by a card player arranging the cards dealt to him. He picks up the cards as they are dealt and inserts them into the required position. Thus at every step, we insert an item into its proper place in an already ordered list.
We will illustrate insertion sort with the example given below and then present the formal algorithm.

Example 1: Sort the following list using the insertion sort method: 4, 1, 3, 2, 5
Step 1: 1 < 4. Therefore insert before 4.
Step 2: 3 > 1, 3 < 4. Insert between 1 & 4.
Step 3: 2 > 1, 2 < 3. Insert between 1 & 3.
Step 4: 5 > 1, 2, 3, 4. Insert after 4.
(Figure: Insertion sort)

Thus, to find the correct position, search the list till an item just greater than the target is found. Shift all the items from this point one position down the list. Insert the target in the vacated slot. We now present the algorithm for insertion sort.

ALGORITHM: INSERT SORT
INPUT: LIST[ ] of N items in random order.
OUTPUT: LIST[ ] of N items in sorted order.
1. BEGIN
2. FOR I = 2 TO N DO
3. BEGIN
4.   IF LIST[I] < LIST[I-1]
5.   THEN BEGIN
6.     J = I, FOUND = FALSE
7.     T = LIST[I]            /* STORE LIST[I] */
8.     REPEAT                 /* MOVE OTHER ITEMS DOWN THE LIST */
9.       LIST[J] = LIST[J-1]
10.      J = J-1
11.      IF (J = 1) OR (LIST[J-1] <= T) THEN   /* TEST J = 1 FIRST */
12.        FOUND = TRUE
13.    UNTIL (FOUND = TRUE)
14.    LIST[J] = T
15.  END
16. END
17. END

Example: Program to sort n numbers using insertion sort.

#include <stdio.h>

void insertion_sort(int a[], int n)
{
    int i, j, item;

    for (i = 1; i < n; i++) {
        item = a[i];        /* Item to be inserted */
        j = i - 1;          /* Try from (i-1)th position */
        while (j >= 0 && item < a[j]) {
            a[j + 1] = a[j];    /* Move the item to the next position */
            j--;                /* and update the position */
        }
        a[j + 1] = item;    /* Appropriate position found, so insert item */
    }
}

int main(void)
{
    int i, n, a[20];

    printf("Enter the no. of elements to sort \n");
    scanf("%d", &n);
    printf("Enter n elements \n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    insertion_sort(a, n);
    printf("The sorted array is \n");
    for (i = 0; i < n; i++)
        printf("%d\n", a[i]);
    return 0;
}

Bubble Sort

In a bubble sort we try to improve on performance by making more exchanges in each pass. Again we will make several passes over the array (n - 1 to be exact). For the first pass, we begin at element 0 and proceed forward to element n - 1.
Along the way we compare each element to the element that follows it. If the current element is greater than that in the next location, then they are in the wrong order, and we swap them. In this way, an element early in the array that has a very large value is able to percolate (bubble) upward. In the next pass we do exactly the same thing. But now the last element in the array is guaranteed to be where it belongs, so our pass need only proceed as far as element n - 2 of the array. This will move the second largest value in the array into the next-to-last position. This proceeds with each pass encompassing one less element of the array. The result, aside from a sorted array, is that we make n - 1 passes over at most n elements. This sort method, too, is O(n^2).

The bubble sort can be made to work in the opposite direction, moving the least value to the beginning of the array on each pass. This is sometimes referred to as a stone sort, though the names tend to get confused. In addition, there are some pretty obvious improvements that can be made rather readily to the bubble sort. For example, at a certain point in our multiple passes over the array it may be the case that the rest of the array is already sorted. As soon as we make a pass in which no exchanges are made, this must be the case. So we can keep a counter of exchanges that take place in each pass and quit immediately if the counter is 0 at the end of a pass.

Another variation that is often seen is called a shaker sort. In this you simply do one pass of a bubble sort upward, followed by a downward pass of a stone sort. This doesn’t gain you much in efficiency, but it is cute.

In this sorting algorithm, multiple swaps take place in one pass. Smaller elements move or ‘bubble’ up to the top of the list, hence the name given to the algorithm. In this method adjacent members of the list to be sorted are compared. If the item on top is greater than the item immediately below it, they are swapped.
This process is carried on till the list is sorted. The detailed algorithm follows:

Algorithm for Bubble Sort
INPUT: LIST[ ] of N items in random order
OUTPUT: LIST[ ] of N items sorted in ascending order.
1. SWAP = TRUE
   PASS = 0
2. WHILE SWAP = TRUE DO
   BEGIN
   2.1 SWAP = FALSE
   2.2 FOR I = 0 TO (N - PASS - 2) DO
       BEGIN
       2.2.1 IF A[I] > A[I+1]
             BEGIN
               TMP = A[I]
               A[I] = A[I+1]
               A[I+1] = TMP
               SWAP = TRUE
             END
       END
   2.3 PASS = PASS + 1
   END

Total number of comparisons in bubble sort = (N-1) + (N-2) + … + 2 + 1 = (N-1)*N/2 = O(N^2). This inefficiency is due to the fact that an item moves only to an adjacent position in each exchange.

Example: C program, function to arrange numbers in ascending order using the bubble sort technique.

#include <stdio.h>

void bubble_sort(int a[], int n)
{
    int i;      /* To access subsequent items while comparing */
    int j;      /* Keeps track of the passes */
    int temp;   /* Used to exchange items */
    int sum;    /* Holds the total number of exchanges */
    int pass;   /* Holds the number of passes required */
    int exchg;  /* Holds the number of exchanges in each pass */
    int flag;   /* Indicates whether any exchange has been done */

    sum = 0;
    pass = 0;
    for (j = 1; j < n; j++) {
        exchg = 0;  /* Number of exchanges just before the pass */
        flag = 0;   /* No exchange has been done */
        for (i = 0; i < n - j; i++) {
            if (a[i] > a[i + 1]) {
                /* Exchange and update the number of exchanges in the current pass */
                temp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = temp;
                exchg++;
                sum++;      /* Update the total number of exchanges */
                flag = 1;   /* Exchange has been done */
            }
        }
        pass++;     /* Update the number of passes */
        printf("Number of exchanges in pass %d = %d\n", j, exchg);
        if (flag == 0)
            break;  /* No exchange in this pass: the array is sorted */
    }
    printf("Number of passes = %d\n", pass);
    printf("Total number of exchanges = %d\n", sum);
}

int main(void)
{
    int i, n, a[20];

    printf("Enter the number of items to sort\n");
    scanf("%d", &n);
    printf("Enter the items to sort\n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    bubble_sort(a, n);
    printf("The sorted items are \n");
    for (i = 0; i < n; i++) {
        printf("%d\n", a[i]);
    }
    return 0;
}

Note: At least one pass is required to check whether the items are sorted.
So, the best case time complexity is O(n).

Selection Sort

A selection sort is slightly more complicated, but relatively easy to understand and to code. It is one of the slow sorting techniques. Given an unsorted array of integer values, a selection sort visits each element of the array in turn. As it visits an element, it employs a second index to scan the array from the present location to the end while it identifies the least (smallest) value in that segment of the array. It then swaps that least value with the current value. Because one index scans the array once while another index scans part of the array each time, this sort algorithm is also O(n^2).

Simple steps for selection sort [assume we are sorting the n items in array A]:

for i = 1 to n-1 do
    for j = i+1 to n do
        if A[i] > A[j] then swap(A[i], A[j])

Example: Program to illustrate the selection sort; assume we have an array ‘A’ with ‘n’ elements.

void selectionsort(int A[], int n)
{
    int minindex, j, p, tmp;

    for (p = 0; p < n - 1; p++) {
        minindex = p;
        for (j = p + 1; j < n; j++)
            if (A[j] < A[minindex])
                minindex = j;
        /* Exchange the pth item with the smallest remaining item */
        tmp = A[p];
        A[p] = A[minindex];
        A[minindex] = tmp;
    }
}

Example: C program, function to arrange numbers in ascending order using the selection sort technique.

#include <stdio.h>

void selection_sort(int a[], int n)
{
    int i, j, pos, small, temp;

    for (i = 0; i < n - 1; i++) {
        small = a[i];   /* Initial small number in ith pass */
        pos = i;        /* Position of smaller number */
        /* Find the minimum of remaining elements along with the position */
        for (j = i + 1; j < n; j++) {
            if (a[j] < small) {
                small = a[j];
                pos = j;
            }
        }
        /* Exchange ith item with least item */
        temp = a[pos];
        a[pos] = a[i];
        a[i] = temp;
    }
}

int main(void)
{
    int i, n, a[20];

    printf("Enter the number of elements to sort\n");
    scanf("%d", &n);
    printf("Enter %d elements to sort \n", n);
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    selection_sort(a, n);
    printf("The sorted elements are \n");
    for (i = 0; i < n; i++)
        printf("%d\n", a[i]);
    return 0;
}

Shell Sort

The shell sort is an improved version of the insertion sort.
It attempts to effect this improvement by finding and making exchanges that have a big impact on the eventual sorted order. That is, we want a value to make one long jump to near its eventual location, rather than lots of little exchanges that move it a little closer each time. To do this we select a gap, usually something slightly less than half the size of the array. We then make a pass along the array comparing elements that are a gap distance apart. If they need to be swapped, then we do so. On the next pass and subsequent passes we decrease the size of the gap by half. Thus on our last pass the gap is just one. But by then all values are quite near their final location. This sort can be improved on slightly using the kind of modifications that we have seen in earlier sorts. The best optimized versions of shell sort run in about O(n^1.2) time.

Simple steps for Shell sort [assume we have an array ‘a’ with ‘n’ elements; COMPARE(a, j, k) is true when a[j] and a[k] are out of order, and SWAP(a, j, k) exchanges them]:

for (gap = n/2; gap > 0; gap /= 2) {
    for (i = gap; i < n; i++)
        for (j = i - gap; j >= 0; j -= gap) {
            if (!COMPARE(a, j, j + gap))
                break;
            SWAP(a, j, j + gap);
        }
}

Example: Program to illustrate the Shell sort; assume we have an array ‘A’ with ‘n’ elements.

void shellsort(int A[], int n)
{
    int p, j, increment, tmp;

    /* Find the smallest power of 2 greater than n */
    for (increment = 1; increment <= n; increment *= 2)
        ;
    /* Use the increments 2^k - 1, ..., 7, 3, 1 */
    for (increment = (increment / 2) - 1; increment > 0; increment = (increment - 1) / 2) {
        for (p = increment; p < n; p++) {
            tmp = A[p];
            for (j = p; j >= increment && A[j - increment] > tmp; j -= increment)
                A[j] = A[j - increment];
            A[j] = tmp;
        }
    }
}

9.5.5 Quick Sort

Quick sort doesn’t look at all like any of the sorts we have examined up until now. It is a partition sort. It is one of the speediest of these ‘in place’ array sorts. It is also one of the most complex to code. We say that it is faster than its siblings, but in fact, it is also an O(n^2) algorithm. It gets its reputation for speed from the fact that it is O(n^2) only in the worst case, which almost never occurs. In practice quick sort is an O(n log n) algorithm.
Quick sort begins by picking an element to be the pivot value. There is much debate about how to best pick this initial element. In terms of understanding how quick sort works, it really doesn’t matter. We can just pick the first element in the array. With the pivot to work with, the array is divided into three parts: all values less than the pivot, the pivot, and all values greater than the pivot. When this is finished, the pivot value is in its proper position in the sorted array. Quick sort then recursively applies this same process to the first partition, containing low values, and to the second partition, which contains high values. Each time at least one element (the pivot) finds its final resting place and the partitions get smaller. Eventually the partitions are of size one and the recursion ends.

Quick sort is fast. It is also difficult to code correctly the first time. Most designers seem to believe that implementing quick sort for data set sizes less than 300 to 500 is wasted effort.

This is the most widely used internal sorting algorithm. In its basic form, it was invented by C.A.R. Hoare in 1960. Its popularity lies in the ease of implementation, moderate use of resources and acceptable behaviour for a variety of sorting cases. The basis of quick sort is the ‘divide and conquer’ strategy, i.e. divide the problem [list to be sorted] into sub-problems [sub-lists], until solved sub-problems [sorted sub-lists] are found. This is implemented as:
1. Choose one item A[I] from the list A[ ].
2. Rearrange the list so that this item is in the proper position, i.e. all preceding items have a lesser value and all succeeding items have a greater value than this item:
   a. A[0], A[1] … A[I-1] in sublist 1
   b. A[I]
   c. A[I+1], A[I+2] … A[N] in sublist 2
3. Repeat steps 1 & 2 for sublist 1 & sublist 2 till A[ ] is a sorted list.

As can be seen, this algorithm has a recursive structure.
Step 2, or the ‘divide’ procedure, is of utmost importance in this algorithm. This is usually implemented as follows:
1. Choose A[I], the dividing element.
2. From the left end of the list (A[0] onwards) scan till an item A[L] is found whose value is greater than A[I].
3. From the right end of the list (A[N] backwards) scan till an item A[R] is found whose value is less than A[I].
4. Swap A[L] & A[R].
5. Continue steps 2, 3 & 4 till the scan pointers cross. Stop at this stage.
6. At this point sublist1 & sublist2 are ready.
7. Now do the same for each of sublist1 & sublist2.

We will now give the implementation of Quicksort and illustrate it by an example.

Quicksort (int A[], int X, int I)
{
    int L, R, V;
1.  If (I > X)
    {
2.      V = A[I], L = X-1, R = I;
3.      For (;;)
        {
4.          While (A[++L] < V);
5.          While (A[--R] > V);
6.          If (L >= R)      /* left & right ptrs. have crossed */
7.              break;
8.          Swap (A, L, R)   /* Swap A[L] & A[R] */
        }
9.      Swap (A, L, I)
10.     Quicksort (A, X, L-1)
11.     Quicksort (A, L+1, I)
    }
}

Quicksort is called with A, 1, N to sort the whole file. Step 5 relies on A[X-1] being no greater than V; for the first call this means A[0] must hold a sentinel value no larger than any key.

Example: Consider the following list to be sorted in ascending order: ‘ADD YOUR MAN’ (ignore blanks). N = 10
A[ ] = A D D Y O U R M A N
Quicksort (A, 1, 10)
1. 10 > 1
2. V = A[10] = ‘N’, L = X-1 = 0, R = I = 10
4. A[4] = ‘Y’ > V; therefore L = 4
5. A[9] = ‘A’ < V; therefore R = 9
6. L < R
8. SWAP (A, 4, 9) to get A D D A O U R M Y N
4. A[5] = ‘O’ > V; therefore L = 5
5. A[8] = ‘M’ < V; therefore R = 8
6. L < R
8. SWAP (A, 5, 8) to get A D D A M U R O Y N
4. A[6] = ‘U’ > V; therefore L = 6
5. A[5] = ‘M’ < V; therefore R = 5
6. L > R; therefore break
9. SWAP (A, 6, 10) to get A D D A M N R O Y U
At this point ‘N’ is in its correct place, A[6]. A[1] to A[5] constitutes sublist1. A[7] to A[10] constitutes sublist2. Now
10. Quicksort (A, 1, 5)
11. Quicksort (A, 7, 10)

The Quicksort algorithm uses O(N log2 N) comparisons on average. The performance can be improved by keeping in mind the following points.
1. Switch to a faster sorting scheme like insertion sort when the sublist size becomes comparatively small.
2.
Use a better dividing element in the implementation. We have always used A[N] as the dividing element. A useful method for the selection of a dividing element is the median-of-three method: select any 3 elements from the list and use the median of these as the dividing element.

Example: C program to sort the numbers in ascending order using quick sort.

#include <stdio.h>

/* Function to partition the array for quick sort.
   Returns the final position of the key a[low]. */
int partition(int a[], int low, int high)
{
    int i, j, temp, key;

    key = a[low];
    i = low + 1;
    j = high;
    while (1) {
        while (i < high && key >= a[i])
            i++;
        while (key < a[j])
            j--;
        if (i < j) {
            temp = a[i];
            a[i] = a[j];
            a[j] = temp;
        } else {
            temp = a[low];
            a[low] = a[j];
            a[j] = temp;
            return j;   /* The key is now in its final position */
        }
    }
}

/* Function to sort the numbers in ascending order using quick sort */
void quicksort(int a[], int low, int high)
{
    int j;

    if (low < high) {
        j = partition(a, low, high);    /* Partition the array into 2 subtables */
        quicksort(a, low, j - 1);       /* Sort the left part of the array */
        quicksort(a, j + 1, high);      /* Sort the right part of the array */
    }
}

int main(void)
{
    int i, n, a[20];

    printf("Enter the value for n \n");
    scanf("%d", &n);
    printf("Enter the numbers to be sorted \n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    quicksort(a, 0, n - 1);
    printf("The sorted array is \n");
    for (i = 0; i < n; i++)
        printf("%d\n", a[i]);
    return 0;
}

Tree Sort

The use of a tree to sort an array is also a departure from the sorts that we have looked at so far. While all of those earlier sorts did their work ‘in place’, a tree sort requires additional space for a tree to be constructed. Our goal, in every case, has been to end up with the original array in sorted order. To do that with a tree sort we must copy the data from the tree back into the array by doing an in-order traversal of the completed tree. A tree sort proceeds by visiting each element of the array and adding it to an ordered binary tree. The obvious disadvantage to this is the additional space required to hold the tree.
When all of the elements of the array have been added to the tree, we walk the tree and repopulate the array in sorted order. In the worst case, the original array is already in sorted order. If this happens, then for each element of the array we will end up adding that element as a leaf on a tree that is really a linked list. In this worst case a tree sort is O(n^2). In practice this seldom happens, so tree sort is presumed to be O(n log n). This is a good example of a typical space/time trade-off.

Heap Sort

A heap sort is an efficient sort. It is also, perhaps, the most difficult to understand. A heap sort can be done ‘in place’. That is, it can be done in an array without creating any additional data structure. Recall that a tree sort has to build a tree as it proceeds. A heap sort does not have to create an actual heap (tree) representation. It can simply rearrange the array into a heap representation.

The representation of a heap using an array depends on the fact that a tree representing a heap is a complete tree. There are no gaps between the leaves on the bottom level. The first element of the array is the root of the heap. The second and third elements of the array are level 1 of the heap. The fourth, fifth, sixth, and seventh elements are level 2 of the heap, and so on. If the heap is not a full tree, there will be some elements missing from the part of the array that represents the final level of the heap; but these will be on the end of the array. There will be no gaps.

A heap sort can be accomplished by heapifying the original, unsorted array. Then select the root of the heap, which must be the largest element. Swap this value with the last element in the heap and reheapify. Notice that the heap portion of the array shrinks as you do this. The array is partitioned into the sorted portion, at the end, and the heap portion, at the beginning. The sorted portion grows and the heap portion shrinks until the entire array is sorted.
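The heap sort procedure described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not a program from the text: the names sift_down and heap_sort are hypothetical, the array is 0-based (so the children of element i are at 2i + 1 and 2i + 2), and a max-heap is used so that the root is the largest element.

```c
/* Hypothetical helper: restore the max-heap property for the subtree rooted
   at index root, considering only the first size elements of a[]. */
static void sift_down(int a[], int root, int size)
{
    for (;;) {
        int child = 2 * root + 1;           /* left child in a 0-based array */
        int tmp;

        if (child >= size)
            break;                          /* root is a leaf */
        if (child + 1 < size && a[child + 1] > a[child])
            child++;                        /* pick the larger child */
        if (a[root] >= a[child])
            break;                          /* heap property already holds */
        tmp = a[root];
        a[root] = a[child];
        a[child] = tmp;
        root = child;                       /* continue one level down */
    }
}

void heap_sort(int a[], int n)
{
    int i, tmp;

    /* Heapify: sift down every non-leaf node, last parent first. */
    for (i = n / 2 - 1; i >= 0; i--)
        sift_down(a, i, n);

    /* Swap the root (largest) with the last heap element, then reheapify
       the shrunken heap portion a[0..i-1]. */
    for (i = n - 1; i > 0; i--) {
        tmp = a[0];
        a[0] = a[i];
        a[i] = tmp;
        sift_down(a, 0, i);
    }
}
```

Calling heap_sort(a, n) sorts a[0..n-1] in ascending order without any second array; the sorted portion grows from the end of the array exactly as the description says.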
Heap sort is an O(n log n) sort. It is especially useful for very large data sets of specialized data.

Self Assessment Questions
1. Discuss the Insertion sort with a suitable example.
2. Write a C program to arrange the numbers in ascending order using the bubble sort technique.
3. Write an algorithm for Selection sort.
4. Write the advantages of Quick Sort.

External Sorts

Merge Sort

Like the other sorts we have seen, we will look at merge sort (and later at radix sort) in terms of a simple internal array. These sorts, though, are really only useful when they are applied to large external data structures. Merge sort is a very old sort. It was used extensively when the primary external data store was magnetic tape. It actually takes its name from the action of merging 2 or more sorted tapes to make a single sorted tape. In fact, there are a variety of sorts known as merge sorts. They have names such as balanced merge, natural merge, and polyphase merge, and are all variations on the same theme.

A true merge sort is a two-part process. The first phase is called the distribution phase and the second is called the merge phase. These two phases may be alternated several times before the sort is complete. In the course of the distribution phase, data elements must be written to a new array (or a new file). As a consequence, merge sorts (all of them) require at least 2n space.

In a distribution phase, elements from the original (or current) array are written into a new array (or arrays) such that the new array(s) are individually sorted. This can be done in several ways. One way is to select items from the original array and place them into bucket arrays. There might be a bucket to hold values 0-10, another for values 11-20, and so on. The buckets can be sorted independently, then passed to a merge phase to recombine them into a final sorted file. In another approach to the distribution phase, values are taken from the original array until there is a drop down (i.e.
an out of order value is encountered). The new partial array is then merged with what remains of the original array. In this fashion, each distribution/merge phase guarantees that at least one more element is sorted. This approach, however, may require up to n distribution/merge phases. Nonetheless, merge sorts are O(n log n), with the drawback that they need extra space to operate.

2-Way Merge Sort

Merge sort is also one of the 'divide and conquer' class of algorithms. The basic idea is to divide the list into a number of sublists, sort each of these sublists and merge them to get a single sorted list. The recursive implementation of 2-way merge sort divides the list into two, sorts the sublists and then merges them to get the sorted list. The iterative implementation of 2-way merge sort sees the input initially as n lists of size 1. These are merged to get n/2 lists of size 2. These n/2 lists are merged pairwise, and so on until a single list is obtained. This is also called CONCATENATE SORT.

We give here the recursive implementation of 2-Way Merge Sort. FINAL is an auxiliary array of the same size as LIST; it is assumed here to be declared outside the two functions.

    void Merge(int low, int mid, int high, int LIST[], int FINAL[])
    {
        int a, b, c, d;
        a = low;                        /* next element of the left sublist  */
        b = low;                        /* next free position in FINAL       */
        c = mid + 1;                    /* next element of the right sublist */
        while (a <= mid && c <= high) {
            if (LIST[a] <= LIST[c]) {
                FINAL[b] = LIST[a];
                a = a + 1;
            } else {
                FINAL[b] = LIST[c];
                c = c + 1;
            }
            b = b + 1;
        }
        if (a > mid)                    /* left sublist used up: copy the rest of the right */
            for (d = c; d <= high; d++) { FINAL[b] = LIST[d]; b = b + 1; }
        else                            /* right sublist used up: copy the rest of the left */
            for (d = a; d <= mid; d++) { FINAL[b] = LIST[d]; b = b + 1; }
        for (d = low; d <= high; d++)   /* copy the merged result back into LIST */
            LIST[d] = FINAL[d];
    }

    void Mergesort(int LIST[], int low, int high)
    {
        int mid;
        if (low < high) {               /* a list of one element is already sorted */
            mid = (low + high) / 2;
            Mergesort(LIST, low, mid);
            Mergesort(LIST, mid + 1, high);
            Merge(low, mid, high, LIST, FINAL);
        }
    }

To sort the entire list, Mergesort should be called with LIST, 1, N (or with LIST, 0, N - 1 when the array is indexed from 0, as is usual in C). Mergesort is also well suited to sorting linked lists in random order. The total computing time is O(n log2 n). The disadvantage of using mergesort is that it requires two arrays of the same size and type for the merge phase.
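The iterative scheme, which treats the input as n lists of size 1 and merges them pairwise into longer and longer lists, can be sketched in C along the following lines. This is only an illustrative sketch: the names merge_runs and merge_sort_bottom_up are chosen for this example, the array is indexed from 0, and the auxiliary array is allocated with malloc rather than passed in by the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs src[low..mid-1] and src[mid..high-1]
   into dst[low..high-1]. Either run may be empty. */
static void merge_runs(const int src[], int dst[],
                       size_t low, size_t mid, size_t high)
{
    size_t a = low;   /* next element of the left run  */
    size_t c = mid;   /* next element of the right run */
    size_t b = low;   /* next free position in dst     */

    while (a < mid && c < high)
        dst[b++] = (src[a] <= src[c]) ? src[a++] : src[c++];
    while (a < mid)
        dst[b++] = src[a++];
    while (c < high)
        dst[b++] = src[c++];
}

/* Sort list[0..n-1] into ascending order.
   Returns 0 on success, -1 if the auxiliary array cannot be allocated. */
int merge_sort_bottom_up(int list[], size_t n)
{
    int *final;
    size_t width, low;

    if (n < 2)
        return 0;

    final = malloc(n * sizeof *final);   /* the extra n elements of space */
    if (final == NULL)
        return -1;

    /* width is the length of the already sorted runs: 1, 2, 4, ... */
    for (width = 1; width < n; width *= 2) {
        for (low = 0; low < n; low += 2 * width) {
            size_t mid = (low + width < n) ? low + width : n;
            size_t high = (low + 2 * width < n) ? low + 2 * width : n;
            merge_runs(list, final, low, mid, high);
        }
        memcpy(list, final, n * sizeof *list);   /* next pass reads from list */
    }

    free(final);
    return 0;
}
```

For the list 5, 2, 4, 1 the first pass (width 1) produces 2, 5, 1, 4 and the second pass (width 2) produces 1, 2, 4, 5. Each pass doubles the length of the sorted runs, so about log2 n passes are needed, each touching all n elements.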
In other words, to sort a list of size n, merge sort needs space for 2n elements.

Summary

Sorting is the process of arranging a series of elements in either ascending or descending order of their values or keys. Sorting a set of data assumes, first of all, that each element in the set has a value that we can use as the sort key, the field on which we will sort. In this context, we discussed the criteria for evaluating sorting algorithms and covered the internal and external sorting techniques: Insertion Sort, Bubble Sort, Selection Sort, Shell Sort, Quick Sort, Tree Sort, Heap Sort, Merge Sort and 2-Way Merge Sort.

9.8 Terminal Questions
1. Discuss the different ideas which lead to Sorting Algorithms.
2. Explain the criteria to be used in evaluating a Sorting Algorithm.
3. Explain any two Internal Sorting techniques with their algorithms.
4. Write a 'C' program to sort 'N' numbers using insertion sort.
5. Write the algorithm and a 'C' program for sorting numbers in ascending order using the quick sort technique.
6. Write a note on: a) Heap sort b) Merge sort