VIEWS: 73 PAGES: 20 CATEGORY: Software POSTED ON: 5/16/2010
Data Structures Data Structures: An aggregation of atomic and composite data types into a set with defined relationships. Space complexity of a program is the amount of memory it needs to run to completion. Time complexity of a program is the amount of computer time it needs to run to completion. Binary search: First=1 Last=end Loop(first<=last) mid=(first+last)/2 if(target>list[mid]) 5/15/2010 11:02:34 PM 1 Data Structures First=mid+1 Else if target <list[mid] Last=mid-1 Else First=last+1 Locn=mid If target = list[mid] Found=true Else Found=false Return found 5/15/2010 11:02:34 PM 2 Data Structures A hash search is a search in which the key through an algorithmic function, determines the location of the data. A linked list is an ordered collection of data in which each element contains the location of the next element:data and link. The nodes in a ll are called self – referential structures. In the circularly – linked list, the last node’s link points to the first node of the list. A doubly ll is a data structure in which the node has a pointer to both its successor and its predecessor. A stack is a linear list in which all additions and deletions are restricted to one end called the top of the stack.{LIFO structure}. Eg., coins and dishes(Balancing Symbols,Postfix…) 5/15/2010 11:02:34 PM 3 Data Structures Common operations: Push Pop Stack top Infix a + b Postfix a b + Prefix + a b 5/15/2010 11:02:34 PM 4 Data Structures Algorithm Queens8 Createstack(stack) Set row=1 and col =0 Loop (row<=boardsize) Loop(col<=boardsize and row<=boardsize) add 1 to col if (not guarded(row,col)) place queen at board [row][col] pushstack(stack,[row][col]) add 1 to row set col = 0 5/15/2010 11:02:34 PM 5 Data Structures Loop(col>=boardsize) Popstack(stack,[row,col]) Remove queen at board[row][col] Printboard(stack) A Queue is a linear list in which data can only be inserted at one end, called the rear and deleted from the other end called the front.FIFO. Operations: Enq Deq Q front,rear. 5/15/2010 11:02:34 PM 6 Data Structures Recursion is a repetitive process in which an algorithm calls itself. Towers of Hanoi: Void towers(int n,char src,char dst,char aux){ Static int step=0 If n==1 Printf(“Step %3d:move from %c to %c”,++step,src,dst) Else {towers(n-1,src,aux,dst) Printf(“Step %3d”,++step,src,dst) }return;} 5/15/2010 11:02:34 PM 7 Data Structures A tree consists of a finite set of elements called nodes and a finite set of directed lines called branches that connect the nodes. The level of a node is its distance from the root. The height of the tree is the level of the leaf in the longest path from the root plus one. A binary tree is a tree in which no node can have more than two subtrees. The balance factor of a binary tree is the difference in height between its left and right subtrees. 5/15/2010 11:02:34 PM 8 Data Structures Preorder: - Root,left,right Inorder – left,root, right Post order – left,right,root General algorithm: If(root is not null) Inorder(root->leftsubtree) Inorder(root) Inorder(root->rightsubtree) 5/15/2010 11:02:34 PM 9 Data Structures In the depth – first traversal, the processing proceeds along a path from the root through one child to the most distant descendent of that child before processing a second child. In the breadth – first traversal, the processing proceeds horizontally from the root to all of its children, then to its children’s children and so forth until all nodes have been processed. A general tree in which each node can have an unlimited outdegree. Binary search tree:properties: All items in the left subtree are less than the root. All items in the right subtree are greater than or equal to the root. Each subtree is itself a binary search tree. 5/15/2010 11:02:34 PM 10 Data Structures An AVL(Adelson Velskii Landis) tree is a search tree in which the heights of the subtrees differ by no more than one.It is a height- balanced binary tree.(Depth – O(log N)). A heap is a complete or nearly complete binary tree in which the key value in a node is greater than the key values in all of its subtrees and the subtrees are in turn heaps. Two operations:Insert and delete a node. Reheap up operation repairs the structure so that it is a heap by floating the last element up the tree until that element is in its correct position in the tree. Applications of Heaps: Priority Queues. 5/15/2010 11:02:34 PM 11 Data Structures An m-way tree is a search tree in which each node can have from zero to m subtrees , where m is defined as the order of the tree. A B – Tree is an m – way search tree with the properties: The root is either a leaf or it has 2..m subtrees. All internal nodes have at least [m/2] non null subtrees and at most m nonnull subtrees. All leaf nodes are at the same level;perfectly balanced. A leaf node has at least [m/2]-1 and at the most m-1 entries. Grows from bottom up. 5/15/2010 11:02:34 PM 12 Data Structures B – tree and B+ tree: Each data entry must be represented at the leaf level. Each leaf node has one additional pointer, which is used to move to the next leaf node in sequence.(B+) Sorting (internal – primary and external – secondary storage) Insertion sort,heap,quick sort(look in the book) Graph is a collection of nodes,called vertices and a collection of line segments connecting pairs of vertices called lines. 5/15/2010 11:02:34 PM 13 Data Structures A directed graph or digraph is a graph in which each line has a direction to its successor. An undirected graph is a graph in which there is no direction on the lines known as edges. A path is a sequence of vertices in which each vertex is adjacent to the next one. A cycle is a path consisting of at least three vertices that starts and ends with the same vertex. The outdegree of a vertex is the number of arcs leaving the vertex The indegree of a vertex is the number of arcs entering the vertex. 5/15/2010 11:02:34 PM 14 Data Structures A network is a graph whose lines are weighted. Unsigned int gcd(int m,int n) { Int rem; While(n>0) { Rem=m%n; M=n;n=rem;} Return m; } 5/15/2010 11:02:34 PM 15 Data Structures Solution to collision: Linear Probing Quadratic probing(i2) Double Hashing(R-(X Mod R), R – prime<X) Rehashing(create a new table, twice as large as the old table size) Extendible hashing(00,01,10,11) Algorithm(run time – worst case) Insertion sort ( O(N) – presorted - O(N2)) Shell sort(O(N2)) Heap sort(2N log N – O(N)) 5/15/2010 11:02:34 PM 16 Data Structures Merge Sort(O(N log N)) Quick sort(O(N log N) – O(N2)) External sort Polyphase Merge External selection Multiway merge NP stands for nondeterministic polynomial time. A nondeterministic machine has a choice of next steps. It is free to chose any that it wishes, and if one of these steps leads to a solution, it will always choose the correct one. 5/15/2010 11:02:34 PM 17 Data Structures A simple way to check if a problem is in NP is to phrase the problem as a y/n problem? An NP – complete problem has the property that any problem in NP can be polynomially reduced to it. A problem p1 can be reduced to p2 as follows. Provide a mapping so that nay instance of p1 can be transformed to an instance of p2. Solve p2, and then map the answer back to the original. Given a function to compute on n inputs the divide – and – conquer strategy suggest splitting the inputs into k distinct subsets 1<k<=n, yielding k subproblems. 5/15/2010 11:02:34 PM 18 Data Structures Any subset that satisfies these constraints is called a feasible solution. We need to find a feasible solution that either maximizes or minimizes a given objective function. Dynamic programming is an algorithm design method that can be used when the solution to a problem can be viewed as the result of a sequence of decisions. A problem that is np – complete has the property that it can be solved in polynomial time iff all other np – complete problems can also be solved in polynomial time. If an np – hard problem can be solved in polynomial time, then all 5/15/2010 11:02:34 PM 19 Data Structures Np – complete problems can be solved inpolynomial time. All np – complete problems are np – hard but some np –hard problems are not known to be np – complete. 5/15/2010 11:02:34 PM 20