The Efficiency of Algorithms
If two algorithms solve the same problem, which one should we choose?

Outline
• Attributes of Algorithms
• A Choice of Algorithms
• Insertion Sort
• Measuring Efficiency
• Designing Algorithms
  – Divide-and-conquer algorithms

Attributes of Algorithms

Attributes of Algorithms (I)
• Correctness
  – An algorithm may provide correct results, but to the wrong problem
  – It must provide correct results for all possible input values
    • Do we know the answer ahead of time? Probably not, but there may be a standard against which we can check the result for reasonableness
    • In some cases, the correct result may be an error message
  – There is also the issue of how accurate a result we are willing to accept as correct
    • For π: 3.14? 3.14159? 3.1416?
• Practically usable
  – Think about building a road STRAIGHT UP a mountain

Attributes of Algorithms (II)
• Ease of understanding / ease of handling
  – A correct algorithm is usually used many times to solve different instances of the same problem
    • Ex.: sequential search on different lists
  – The problem itself does not usually "stand still"
  – Program maintenance: after a program (algorithm) is written, it will need to be maintained, both to fix newly uncovered errors and to extend it to meet new requirements
• Elegance
  – See the example on the next slide

Elegance vs. Ease of Understanding
First algorithm:
1. Set the value of sum to 0
2. Set the value of x to 1
3. While x <= 100 do steps 4 & 5
4. Add x to sum
5. Add 1 to the value of x
6. Print the value of sum
7. Stop

Second algorithm:
1. Print the value of 100/2 * (100+1)

• What are the two algorithms for?
  – Calculating 1+2+3+4+…+100
• Which one is elegant?
• Which one is easy to understand?
• What if we want to calculate the sum of 1 to 1000?
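The seven-step loop and the one-line formula can be compared directly. The sketch below is only an illustration: the function names `sum_by_loop` and `sum_by_formula` are hypothetical, and the upper limit is turned into a parameter `n` so that the "1 to 1000" question can be tried without rewriting either algorithm.

```python
def sum_by_loop(n):
    """Add 1, 2, ..., n one term at a time (the seven-step algorithm)."""
    total = 0
    x = 1
    while x <= n:
        total += x   # step 4: add x to sum
        x += 1       # step 5: add 1 to x
    return total


def sum_by_formula(n):
    """Closed form n/2 * (n + 1), written with integer arithmetic."""
    return n * (n + 1) // 2


print(sum_by_loop(100), sum_by_formula(100))    # 5050 5050
print(sum_by_loop(1000), sum_by_formula(1000))  # 500500 500500
```

The loop does work proportional to n, while the formula does a fixed amount of work for any n; the loop, however, makes it obvious what is being computed.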
Attributes of Algorithms (III)
• Space efficiency
  – Judged by the amount of information the algorithm must store in the computer's memory in order to do the job, in addition to the initial data on which the algorithm is operating
• Time efficiency
  – An indication of the amount of work required by the nature of the algorithm itself and the approach it uses
  – A measure of the inherent efficiency of the method, independent of the machine speed or the specific working data
  – Count the fundamental unit or units of work of an algorithm
    • It is the number of steps each algorithm requires, not the time the algorithm takes on a particular machine, that is important for comparing two algorithms that do the same task

A Choice of Algorithms
Develop algorithms for the "data cleanup" problem as an example

Data Cleanup Problem
• Problem definition: given a list, remove the 0 entries from the list
  – Application: list = age data; compute the average age
Before: 0 24 16 0 36 42 23 21 0 27
After:  24 16 36 42 23 21 27 NIL NIL NIL (legit = 7)

The Shuffle-Left Algorithm (I)
• Solve the problem as we might solve it using a pencil and paper (and an eraser) to modify the list
  – Proceed through the list from left to right and pass over nonzero values
    • Point with a finger of the left hand to keep our place
  – On encountering a zero, squeeze it out by copying each remaining item one cell to the left
  – legit keeps count of the nonzero (legitimate) items
List states:
  legit = 10: 0 24 16 0 36 42 23 21 0 27
  legit = 9:  24 24 16 0 36 42 23 21 0 27
  legit = 9:  24 16 16 0 36 42 23 21 0 27

The Shuffle-Left Algorithm (II)
List states:
  legit = 9: 24 16 0 0 36 42 23 21 0 27
  legit = 9: 24 16 0 36 42 23 21 0 27 27
  legit = 8: 24 16 36 42 23 21 0 27 27 27

The Shuffle-Left Algorithm (III)
List states:
  legit = 8: 24 16 36 42 23 21 0 27 27 27
  legit = 7: 24 16 36 42 23 21 27 27 27 27
• The algorithm stops when the left-hand finger is past the number of legitimate data items (legit = 7)
• On this list, the algorithm examines all 10 data items to see which ones are 0, and copies 9 + 7 + 3 = 19 items
• In addition to the memory for the list, the algorithm requires four memory locations to store n, legit, left, and right

[Figure: The Shuffle-Left Algorithm (IV)]

The Copy-Over Algorithm (I)
• Algorithm outline
  – Proceed by scanning the list from left to right
  – Every nonzero value is copied into a new list
• When finished, the original list still exists, but so does a new list in the desired form
Input list: 0 24 16 0 36 42 23 21 0 27
New list:   24 16 36 42 23 21 27
• Every item gets examined to see if it is 0 (as in the shuffle-left algorithm), and every nonzero item gets copied once into the new list (seven copies for the example)
• A lot of extra memory space is required because an almost complete second copy of the list is stored

[Figure: The Copy-Over Algorithm (II)]

The Converging-Pointers Algorithm (I)
• We move one finger along the list from left to right and another finger from right to left
• The left finger slides to the right over nonzero values
• Whenever the left finger encounters a 0 item, we reduce the value of legit by one, copy whatever item is at the right finger into the left-finger position, and slide the right finger one cell left
List states:
  legit = 10: 0 24 16 0 36 42 23 21 0 27
  legit = 9:  27 24 16 0 36 42 23 21 0 27

The Converging-Pointers Algorithm (II)
List states:
  legit = 8: 27 24 16 0 36 42 23 21 0 27
  legit = 7: 27 24 16 21 36 42 23 21 0 27
• The algorithm stops when the left finger meets the right finger, which is pointing to a nonzero element
• On this list, the algorithm examines all 10 data items and does a total of three copies
• The algorithm requires no more memory space than the first algorithm (shuffle-left) and does fewer copies than the second (copy-over)

[Figure: The Converging-Pointers Algorithm (III)]

Insertion Sort

Introduction
• Solve the sorting problem
  – Input: a sequence of n numbers $\langle a_1, a_2, \ldots, a_n \rangle$
  – Output: a permutation (reordering) $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$
  – An instance of the sorting problem
    • (31, 41, 59, 26, 41, 58) → (26, 31, 41, 41, 58, 59)
• Insertion sort
  – Efficient for sorting a small number of elements
  – How do you sort a hand of playing cards?
  – Input: an array A[1..n] containing a sequence that is to be sorted
  – The input numbers are sorted in place: the numbers are rearranged within the array

[Figure: Sorting a Hand of Cards Using Insertion Sort]

[Figure: Insertion Sort Algorithm]

Loop Invariants
• Help you understand why an algorithm is correct
• Loop invariant of insertion sort
  – At the start of each iteration of the for loop of lines 1 – 8, the subarray A[1..j-1] consists of the elements originally in A[1..j-1], but in sorted order
• Must show three things about a loop invariant (similar to mathematical induction)
  – Initialization: it is true prior to the first iteration of the loop
  – Maintenance: if it is true before an iteration of the loop, it remains true before the next iteration
  – Termination: when the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct

Measuring Efficiency
Analysis of algorithms: the study of the efficiency of various algorithms
Use sequential search as an example
Then introduce order of magnitude

Time Analysis of Insertion Sort
• How to measure the running time of an algorithm?
  – The running time usually depends on the input size
    • The running time may also depend on which input of that size is given
    • For insertion sort, the input size is the array length
  – The running time of an algorithm on a particular input is the number of primitive operations, or "steps", executed
    • A constant amount of time is required to execute each step (a line of pseudocode)
    • Time analysis of insertion sort: see the next slide
  – Best case: the array is already sorted; the running time is a linear function
    $T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_5(n-1) + c_8(n-1) = (c_1 + c_2 + c_4 + c_5 + c_8)\,n - (c_2 + c_4 + c_5 + c_8) = an + b$
  – Worst case: the array is in reverse sorted order; the running time is a quadratic function (homework!)

[Figure: Running Time of Insertion Sort]

[Figure: Sequential Algorithm (Review)]

Time Analysis of Sequential Search
• Step 4 is the central unit of work
  – Steps 3 & 7 add a little extra work
    • This adds a constant factor for each comparison
  – The constant factor can be ignored. Why?
• How many times is step 4 executed?
  – It depends on how many times the loop is executed
  – Best case: the name being searched for is the first one in the list (only one comparison is needed)
  – Worst case: the name being searched for is the last one, or is not found (n comparisons are needed)
  – Average case: n/2 comparisons are needed

Worst-case and Average-case Analysis
• Concentrate on worst-case analysis
  – The worst-case running time is an upper bound on the running time for any input
  – For some algorithms, the worst case occurs fairly often
    • For example, when sequential search looks for absent information
  – The "average case" is often roughly as bad as the worst case
    • How long does it take to determine where in subarray A[1..j-1] to insert A[j]?
      – About half of the subarray is scanned: $t_j = j/2$, which still gives a quadratic function

Why Is Analysis of Algorithms Important?
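One way to see why the step count matters is to count the comparisons sequential search actually makes. The sketch below is an illustration only: the function `sequential_search`, its return shape, and the five-name list are hypothetical and are not taken from the pseudocode on the slides.

```python
def sequential_search(names, target):
    """Scan names left to right; return (found, number_of_comparisons)."""
    comparisons = 0
    for name in names:
        comparisons += 1              # the central unit of work: one comparison
        if name == target:
            return True, comparisons
    return False, comparisons


names = ["Ann", "Bob", "Cy", "Dee", "Eve"]   # hypothetical list, n = 5

print(sequential_search(names, "Ann"))   # best case: (True, 1)
print(sequential_search(names, "Eve"))   # worst case, last item: (True, 5)
print(sequential_search(names, "Zed"))   # worst case, absent: (False, 5)

# Average over all successful searches: (1 + 2 + ... + n) / n = (n + 1) / 2
average = sum(sequential_search(names, t)[1] for t in names) / len(names)
print(average)   # 3.0
```

For large n the exact average (n + 1)/2 is close to the n/2 figure quoted for the average case.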
• Analysis of algorithms is critical when the input size is very large
  – In New York City, n > 20,000,000
    • At 50,000 comparisons per second, an average-case sequential search takes
      $\frac{20{,}000{,}000}{2}\ \text{comparisons} \times \frac{1\ \text{second}}{50{,}000\ \text{comparisons}} = 200\ \text{seconds}$

Space Analysis of Sequential Search
• Space efficiency
  – Judged by the amount of information the algorithm must store in the computer's memory in order to do the job, in addition to the initial data on which the algorithm is operating
• Sequential search: very space-efficient
  – Initial data: list of names and the target NAME
  – Additional information: i and Found

Order of Magnitude
• Order of growth
• Why can we ignore the constants?
  – The worst-case behavior of sequential search: cn
    • See the next slide for different constant factors (c = 2, 1, ½)
  – The functions increase at different rates, but have the same basic straight-line shape
• Anything that varies as a constant times n (and whose graph follows the basic shape of n) is said to be of order of magnitude n, written Θ(n)
• Sequential search is a Θ(n) algorithm in both the worst and average cases
• Consider only the leading term of a formula
  – an² + bn + c is Θ(n²)

[Figure: The same straight-line shape]

See Another Example
       1    2    3    4
  1  243  187  314  244
  2  215  420  345  172
  3  197  352  385  261
  4  238  764  125  552
Write an algorithm to print each cell
  – For each of rows 1 through 4 do the following
      For each of columns 1 through 4 do the following
        Write the entry in this row and column
What's the order of magnitude of this algorithm? Θ(n²)
Think about having n rows and n columns
Can you convert the above algorithm into the while or repeat formats?

[Figure: The same n² shape]

Now We Have Two Shape Classifications…
• The value of the constants does not affect the classification, which is why we can generally ignore them
• Is it important to distinguish the two different orders of magnitude n and n²?
  – Yes.
    Because cn² grows at a much faster rate than c'n
    • This holds even with a small c and a large c'
    • n² > n when n > 1; a Θ(n²) algorithm eventually does more work than a Θ(n) algorithm

[Figure: For large enough n (here, n > 40), 0.25n² has larger values than 10n. Θ(n²) eventually has larger values than Θ(n)]

A Comparison of Two Extreme Θ(n²) and Θ(n) Algorithms
Input sizes of 1,000,000 are not uncommon: think of the New York City telephone list

Brief Summary: Order of Magnitude
• If a Θ(n²) algorithm and a Θ(n) algorithm exist for the same task, then for large enough n, the Θ(n²) one does more work and takes longer to execute, regardless of the constant factors for peripheral work
  – This is the rationale for ignoring the constant factors and concentrating on the basic order of magnitude of algorithms
• It is for large values of input n that we need to be concerned about the time resources being used
  – We need an algorithm with a smaller order of magnitude
• For small n, the value of the constant factor is significant
• Don't make assumptions about input size

Designing Algorithms

Divide-and-Conquer Approach
• Divide-and-conquer approach
  – To solve a given problem, the algorithm calls itself recursively one or more times to handle closely related sub-problems
  – Break the problem into several sub-problems that are similar to the original problem but smaller in size, solve the sub-problems recursively, and then combine these solutions to create a solution to the original problem
• Three steps in a divide-and-conquer approach
  – Divide the problem into a number of sub-problems
  – Conquer the sub-problems by solving them recursively.
    If the sub-problem sizes are small enough, just solve them in a straightforward manner
  – Combine the solutions to the sub-problems into the solution for the original problem

Merge Sort
• Divide: divide the n-element sequence to be sorted into two subsequences of n/2 elements each
• Conquer: sort the two subsequences recursively using merge sort
• Combine: merge the two sorted subsequences to produce the sorted answer

[Figure: merge sort pseudocode]
Initial call: MERGE-SORT(A, 1, length[A])
If p ≥ r, the subarray has at most one element and is therefore already sorted.

Merge
• MERGE(A, p, q, r)
  – A is an array and p, q, r are indices numbering elements of A such that p ≤ q < r
  – A[p..q] and A[q+1..r] are in sorted order
  – Merge A[p..q] and A[q+1..r] to form a single sorted subarray that replaces the current subarray A[p..r]
  – Running time: Θ(n), where n = r – p + 1

[Figure: MERGE pseudocode, with these annotations]
// Compute # of elements in L
// Compute # of elements in R
// Copy A[p..q] to L
// Copy A[q+1..r] to R
// Put a sentinel card at the end of L
// Put a sentinel card at the end of R
// Put the smaller of L[i] and R[j] into A[k]

[Figures: Merge]

Loop Invariant of Merge
• At the start of each iteration of the for loop of lines 12 – 17, the subarray A[p..k-1] contains the k-p smallest elements of L[1..n1+1] and R[1..n2+1], in sorted order. Moreover, L[i] and R[j] are the smallest elements of their arrays that have not been copied back into A.
• Proof: see pp. 30 – 31

Analyzing Divide-and-Conquer Algorithms
• Describe the running time of a divide-and-conquer algorithm by a recurrence equation
  $T(n) = \begin{cases} \Theta(1) & \text{if } n \le c \\ a\,T(n/b) + D(n) + C(n) & \text{otherwise} \end{cases}$
  – T(n): the running time on a problem of size n
  – If the problem size is small enough (n ≤ c), the solution takes constant time (Θ(1))
  – Divide the problem into a sub-problems, each of which is 1/b the size of the original
  – D(n) is the time to divide the problem and C(n) is the time to combine the solutions
• For merge sort, the worst-case running time is Θ(n lg n)
  $T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 2\,T(n/2) + \Theta(n) & \text{if } n > 1 \end{cases}$
  where lg n = log₂ n

Another View of the Analysis of Merge Sort
• Let n = 2^m. The recursion bottoms out when n/2^i = 1, that is, when i = m = lg n
• The recursion tree therefore has lg n + 1 levels

Example
Recursion trees, cost per level:
  n = 8:  8c
          4c 4c
          2c 2c 2c 2c
          c c c c c c c c
  n = 4:  4c
          2c 2c
          c c c c
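The divide, conquer, and combine steps of merge sort can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the pseudocode from the slides: it uses 0-based inclusive indices instead of A[1..n], assumes numeric input so that `math.inf` can play the role of the sentinel card, and the names `merge` and `merge_sort` are chosen here for illustration.

```python
import math


def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive, 0-based) in place."""
    left = a[p:q + 1] + [math.inf]        # copy A[p..q] to L, add sentinel
    right = a[q + 1:r + 1] + [math.inf]   # copy A[q+1..r] to R, add sentinel
    i = j = 0
    for k in range(p, r + 1):
        # Put the smaller of left[i] and right[j] into a[k]
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    """Sort a[p..r] in place by divide and conquer."""
    if p >= r:                # at most one element: already sorted
        return
    q = (p + r) // 2          # divide
    merge_sort(a, p, q)       # conquer the left half
    merge_sort(a, q + 1, r)   # conquer the right half
    merge(a, p, q, r)         # combine


data = [31, 41, 59, 26, 41, 58]
merge_sort(data, 0, len(data) - 1)
print(data)   # [26, 31, 41, 41, 58, 59]
```

Each call to `merge` touches r - p + 1 cells once, which is the Θ(n) combine cost that appears in the recurrence T(n) = 2T(n/2) + Θ(n).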

DOCUMENT INFO

Posted: 7/23/2011
Language: English
Pages: 53
