The Efficiency of Algorithms


If two algorithms solve the same problem,
     which one should we choose…?




          Outline

•   Attributes of Algorithms
•   A Choice of Algorithms
•   Insertion Sort
•   Measuring Efficiency
•   Designing Algorithms – divide and conquer algorithms




Attributes of Algorithms




          Attributes of Algorithms (I)

• Correctness
   – It may be providing correct results – but to the wrong problem?
   – Must provide correct results for all possible input values
       • Do we know the answer ahead of time?
           – Probably not, but there may be a certain standard against which
             we can check the result for reasonableness
       • In some cases, the correct result may be an error message
   – There is also the question of how accurate a result we are willing to accept as correct
       • For π: 3.14? 3.14159? 3.1416?
• Practically usable
   – Think about building a road straight up a mountain


          Attributes of Algorithms (II)

• Ease of understanding/Ease of handling
   – A correct algorithm is usually used many times for solving different
     instances of the same problem
       • Ex. sequential search for different lists
   – The problem itself does not usually “stand still”
   – Program maintenance – after a program (algorithm) is written, it will
     therefore need to be maintained, both to fix any newly uncovered
     errors and to extend it to meet new requirements
• Elegance
   – See the example next page




            Elegance vs. Ease of Understanding

Algorithm 1:
1. Set the value of sum to 0
2. Set the value of x to 1
3. While x <= 100 do steps 4 and 5
4.   Add x to sum
5.   Add 1 to the value of x
6. Print the value of sum
7. Stop

Algorithm 2:
1. Print the value of 100/2 * (100+1)

            • What are the two algorithms for?
                – Both calculate 1 + 2 + 3 + 4 + … + 100
            • Which one is elegant?
            • Which one is easier to understand?
            • What if we want to calculate the sum of 1 to 1000? (see the sketch below)
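
As a hedged illustration (the slides give only pseudocode), the two approaches might look like this in Python:

```python
# Loop version: easy to follow, but performs n additions.
def sum_loop(n=100):
    total = 0
    x = 1
    while x <= n:
        total += x
        x += 1
    return total

# "Elegant" version: one line, using the closed-form formula n/2 * (n + 1).
def sum_formula(n=100):
    return n * (n + 1) // 2

# Changing the problem to 1..1000 only means changing the argument.
assert sum_loop(1000) == sum_formula(1000) == 500500
```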

          Attributes of Algorithms (III)
• Space Efficiency
   – Be judged by the amount of information the algorithm must store in
     the computer’s memory in order to do the job, in addition to the initial
     data on which the algorithm is operating
• Time Efficiency
   – An indication of the amount of work required by the nature of the
     algorithm itself and the approach it uses
   – Measure of the inherent efficiency of the method, independent of the
     machine speed or the specific working data
   – Count the fundamental unit or units of work of an algorithm
       • It is the number of steps each algorithm requires, not the time the
         algorithm takes on a particular machine, that is important for
         comparing two algorithms that do the same task


  A Choice of Algorithms

Develop algorithms for the “data cleanup”
        problem as an example




          Data CleanUp Problem

• Problem Definition: Given a list, remove the 0 entries from
  the list
   – Application: List = age data; compute the average age


      0   24 16   0   36 42 23 21       0   27




      24 16 36 42 23 21 27 NIL NIL NIL           legit = 7




             The Shuffle-Left Algorithm (I)
• Solve the problem as we might solve it using a pencil and paper (and an
  eraser) to modify the list
   – Proceed through the list from left to right and pass over nonzero values
       • Point with a finger of the left hand to keep our place
   – When we encounter a zero, squeeze it out by taking each remaining item
     and copying it one cell to the left
   – legit keeps a count of the nonzero items

The first zero being squeezed out:
  legit = 10:   0 24 16  0 36 42 23 21  0 27
  legit = 9:   24 24 16  0 36 42 23 21  0 27
  legit = 9:   24 16 16  0 36 42 23 21  0 27


             The Shuffle-Left Algorithm (II)
  legit = 9:   24 16  0  0 36 42 23 21  0 27
  legit = 9:   24 16  0 36 42 23 21  0 27 27
  (the left finger now passes over 24 and 16 until it reaches the 0 in position 3)
  legit = 8:   24 16 36 42 23 21  0 27 27 27




             The Shuffle-Left Algorithm (III)
  legit = 8:   24 16 36 42 23 21  0 27 27 27
  legit = 7:   24 16 36 42 23 21 27 27 27 27

The algorithm stops, as the left-hand finger has moved past the number of
legitimate data items (legit = 7).

• This algorithm (on this list) requires examining all 10 data items, to see which
  ones are 0, and copying 9 + 7 + 3 = 19 items
• In addition to the memory for the list, the algorithm requires four memory
  locations to store n, legit, left, and right

The Shuffle-Left Algorithm (IV)
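
The pseudocode for this slide is not reproduced here; the following Python sketch is an illustrative reconstruction of the shuffle-left idea, reusing the names legit and left from the walkthrough (0-based indices, whereas the slides count positions from 1):

```python
def shuffle_left(a):
    """Squeeze out zeros by copying the remaining items one cell to the left.

    Returns legit, the count of legitimate (nonzero) entries; only a[0:legit]
    is meaningful afterwards.
    """
    n = len(a)
    legit = n      # number of legitimate (nonzero) items
    left = 0       # the "left-hand finger"
    while left < legit:
        if a[left] == 0:
            legit -= 1
            # Squeeze the zero out: copy every later item one cell to the left;
            # the last cell keeps a duplicate of its old value.
            for i in range(left, n - 1):
                a[i] = a[i + 1]
        else:
            left += 1
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = shuffle_left(data)
print(legit, data[:legit])   # 7 [24, 16, 36, 42, 23, 21, 27]
```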




                 The Copy-Over Algorithm (I)
• Algorithm outline
   – Proceed by scanning the list from left to right
   – Every nonzero value is copied into a new list
• When finished, the original list still exists, but so does a new list in the
  desired form

  Input list:   0 24 16  0 36 42 23 21  0 27
  New list:    24 16 36 42 23 21 27

• Every item gets examined to see if it is 0 (as in the shuffle-left algorithm), and
  every nonzero item gets copied once into the new list (seven copies for this example)
• A lot of extra memory space is required because an almost complete second
  copy of the list is stored

The Copy-Over Algorithm (II)
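
A minimal Python sketch of the copy-over idea (an illustrative rendering, not the slide's pseudocode):

```python
def copy_over(a):
    """Copy every nonzero item into a brand-new list and return it."""
    new_list = []
    for item in a:                    # every item is examined once
        if item != 0:
            new_list.append(item)     # each nonzero item is copied exactly once
    return new_list

print(copy_over([0, 24, 16, 0, 36, 42, 23, 21, 0, 27]))
# [24, 16, 36, 42, 23, 21, 27]
```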




             The Converging-Pointers
             Algorithm (I)
• We move one finger along the list from left to right and another finger from
  right to left
• The left finger slides to the right over nonzero values
• Whenever the left finger encounters a 0 item, we reduce the value of legit by
  one, copy whatever item is at the right finger into the left-finger position,
  and slide the right finger one cell to the left

  legit = 10:   0 24 16  0 36 42 23 21  0 27
  legit = 9:   27 24 16  0 36 42 23 21  0 27
  (the left finger now slides right over 27, 24, and 16)



             The Converging-Pointers
             Algorithm (II)
  legit = 8:   27 24 16  0 36 42 23 21  0 27
  legit = 7:   27 24 16 21 36 42 23 21  0 27

• This algorithm stops when the left finger meets the right finger, which here is
  pointing to a nonzero element
• This algorithm (on this list) requires examining all 10 data items, and a total of
  three copies are done
• This algorithm requires no more memory space than the shuffle-left algorithm,
  and does fewer copies than the copy-over algorithm

The Converging-Pointers
Algorithm (III)
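
A minimal Python sketch of the converging-pointers idea, again with assumed 0-based indices and the names legit, left, and right from the discussion:

```python
def converging_pointers(a):
    """Data cleanup with two 'fingers' moving toward each other.

    Returns legit; only a[0:legit] is meaningful afterwards.
    """
    legit = len(a)
    left, right = 0, len(a) - 1
    while left < right:
        if a[left] != 0:
            left += 1                 # slide the left finger over nonzero values
        else:
            legit -= 1
            a[left] = a[right]        # copy the right-finger item over the zero
            right -= 1                # slide the right finger one cell left
    if a[left] == 0:                  # the meeting cell may itself hold a zero
        legit -= 1
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = converging_pointers(data)
print(legit, data[:legit])   # 7 [27, 24, 16, 21, 36, 42, 23]
```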




Insertion Sort




           Introduction
• Solve the sorting problem
   – Input: a sequence of n numbers ⟨a1, a2, ..., an⟩
   – Output: a permutation (reordering) ⟨a'1, a'2, ..., a'n⟩ of the input
     sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n
   – An instance of the sorting problem
       • (31, 41, 59, 26, 41, 58) → (26, 31, 41, 41, 58, 59)
• Insertion sort
   –   Efficient for sorting a small number of elements
   –   How do you sort a hand of playing cards?
   –   Input: an array A[1..n] containing a sequence that is to be sorted
   –   the input numbers are sorted in place: the numbers are rearranged
       within the array


Sorting A Hand of Cards
Using Insertion Sort




Insertion Sort Algorithm
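
The numbered pseudocode this slide showed (referred to later as "lines 1–8") is not reproduced; the Python sketch below is an assumed rendering of the same in-place logic:

```python
def insertion_sort(a):
    """Sort the list a in place, the way a hand of cards is sorted."""
    for j in range(1, len(a)):        # a[0..j-1] is already sorted
        key = a[j]                    # the next "card" to insert
        i = j - 1
        while i >= 0 and a[i] > key:  # shift larger elements one slot right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                # drop the key into its place
    return a

print(insertion_sort([31, 41, 59, 26, 41, 58]))
# [26, 31, 41, 41, 58, 59]
```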




          Loop Invariants

• Help you to understand why an algorithm is correct
• Loop invariant of insertion sort
   – At the start of each iteration of the for loop of lines 1 – 8, the
     subarray A[1..j-1] consists of the elements originally in A[1..j-1] but in
     sorted order
• Must show three things about a loop invariant – similar to
  mathematical induction
   – Initialization: it is true prior to the first iteration of the loop
   – Maintenance: if it is true before an iteration of the loop, it remains
     true before the next iteration
   – Termination: when the loop terminates, the invariant gives us a
     useful property that helps show that the algorithm is correct

     Measuring Efficiency

• Analysis of algorithms: the study of the efficiency of various algorithms
• Use sequential search as an example
• Then introduce order of magnitude


            Time Analysis of Insertion
            Sort
• How to measure the running time of an algorithm?
    – The running time usually depends on input size
        • The running time may depend on which input of that size is given
        • For insertion sort, the input size is the array length
    – The running time of an algorithm on a particular input is the number of
      primitive operations or "steps" executed
        • A constant amount of time is required to execute each step (a line of
          pseudocode)
• Time analysis of insertion sort → next slide
    – Best case: if the array is already sorted – a linear function

        T(n) = c1·n + c2(n−1) + c4(n−1) + c5(n−1) + c8(n−1)
             = (c1 + c2 + c4 + c5 + c8)·n − (c2 + c4 + c5 + c8) = a·n + b

    – Worst case: if the array is in reverse sorted order – a quadratic function
      (homework!!!)


Running Time of Insertion
Sort




Sequential Search Algorithm (Review)
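
The numbered pseudocode reviewed on this slide (whose steps 3, 4, and 7 are discussed next) is not shown; this Python sketch is an assumed reconstruction using the variables i and Found mentioned later:

```python
def sequential_search(names, target):
    """Return the position of target in names, or -1 if it is absent."""
    i = 0
    found = False
    while i < len(names) and not found:   # loop test: a little extra work per pass
        if names[i] == target:            # the central unit of work: one comparison
            found = True
        else:
            i += 1
    return i if found else -1

print(sequential_search(["Ann", "Bob", "Chen"], "Bob"))   # 1
```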




          Time Analysis of Sequential Search

• Step 4 is the central unit of work
   – Steps 3 & 7 add a little extra work
       • They add a constant factor for each comparison
           – The constant factor can be ignored. Why???
• How many times is step 4 executed?
   – It depends on how many times the loop is executed
   – Best case: the target name is the first one in the list (only 1 comparison needed)
   – Worst case: the target name is the last one in the list, or is not in the list
     (n comparisons needed)
   – Average case: about n/2 comparisons




          Worst-case and Average-
          case Analysis
• Concentrate on worst-case analysis
   – The worst-case running time is an upper bound on the running time
     for any input
   – For some algorithms, the worst case occurs fairly often
       • For sequential search: searching for absent information
   – The "average case" is often roughly as bad as the worst case
       • How long does it take to determine where in subarray A[1..j-1] to
         insert A[j]?
           – About half the subarray, on average → tj ≈ j/2 → a quadratic function




          Why Is Analysis of Algorithms Important?

• Analysis of algorithms is critical when the input size is very large
   – In New York City, n > 20,000,000
       • At 50,000 comparisons per second, the average case of sequential
         search takes

         (20,000,000 / 2) comparisons × (1 second / 50,000 comparisons) = 200 seconds




          Space Analysis of
          Sequential Search
• Space Efficiency
   – Be judged by the amount of information the algorithm must store in
     the computer’s memory in order to do the job, in addition to the initial
     data on which the algorithm is operating
• Sequential search: very space-efficient
   – Initial data: list of names and the target NAME
   – Additional information: i and Found




          Order of Magnitude
• Order of growth
• Why can we ignore the constants?
   – The worst-case behavior of sequential search: c·n
       • See the next slide for different constant factors (c = 2, 1, ½)
           – They increase at different rates, but all have the same basic
             straight-line shape
       • Anything that varies as a constant times n (and whose graph follows
         the basic shape of n) is said to be of order of magnitude n, written Θ(n)
       • Sequential search is a Θ(n) algorithm in both the worst and the average case
• Consider only the leading term of a formula
   – a·n² + b·n + c → Θ(n²)




     The same straight-line shape




         See Another Example
             Col 1   Col 2   Col 3   Col 4
   Row 1      243     187     314     244
   Row 2      215     420     345     172
   Row 3      197     352     385     261
   Row 4      238     764     125     552

Write an algorithm to print each cell –
For each of rows 1 through 4 do the following
 For each of columns 1 through 4 do the following
  write the entry in this row and column

• What is the order of magnitude of this algorithm? Θ(n²)
   – Think about having n rows and n columns
• Can you convert the above algorithm into the while or repeat format?
  (a sketch follows below)
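
A hedged Python sketch of the same Θ(n²) table-printing algorithm, first with the nested for loops described above and then rewritten in the while format the last question asks about (the table values are the ones shown on this slide):

```python
table = [[243, 187, 314, 244],
         [215, 420, 345, 172],
         [197, 352, 385, 261],
         [238, 764, 125, 552]]
n = len(table)

# Nested for loops: the print statement runs n * n times -> Theta(n^2).
for row in range(n):
    for col in range(n):
        print(table[row][col], end=" ")
    print()

# The equivalent while-loop version.
row = 0
while row < n:
    col = 0
    while col < n:
        print(table[row][col], end=" ")
        col += 1
    print()
    row += 1
```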

     The same n² shape




          Now We Have Two Shape
          Classifications…
• The value of the constants does not affect the classification, which is why we
  can generally ignore them
• Is it important to distinguish between the two different orders of magnitude,
  n and n²?
   – Yes. Because c·n² grows at a much faster rate than c′·n
       • Even with a small c and a large c′

                                             n² > n when n > 1 → a Θ(n²)
                                             algorithm does more work
                                             than a Θ(n) algorithm


For large enough n, 0.25n² has larger values than 10n

                  Θ(n²) eventually has larger values than Θ(n)




          A Comparison of Two Extreme Θ(n²) and Θ(n) Algorithms




Input sizes of 1,000,000 are not uncommon – think of the New York City
telephone list
          Brief Summary – Order of
          Magnitude
• If a Θ(n²) algorithm and a Θ(n) algorithm exist for the same task, then for
  large enough n, the Θ(n²) one does more work and takes longer to execute,
  regardless of the constant factors for peripheral work.
   – This is the rationale for ignoring the constant factors and
     concentrating on the basic order of magnitude of algorithms
• It is for large values of input n that we need to be concerned
  about the time resources being used
   – We need an algorithm with a smaller order of magnitude
• For small n, the value of the constant factor is significant
• Don’t make assumptions about input size

Designing Algorithms




          Divide-and-Conquer
          Approach
• Divide-and-conquer approach
   – To solve a given problem, divide-and-conquer algorithms call themselves
     recursively one or more times to handle closely related sub-problems.
   – Break a problem into several sub-problems that are similar to the
     original problem but smaller in size, solve the sub-problems
     recursively, and then combine these solutions to create a solution to
     the original problem.
• Three steps in a divide-and-conquer approach
   – Divide the problem into a number of sub-problems
   – Conquer the sub-problems by solving them recursively. If the sub-
     problem sizes are small enough, just solve them in a straightforward
     manner
   – Combine the solutions to the sub-problems into the solution for the
     original problem

         Merge Sort

• Divide: divide the n-element sequence to be sorted into two
  subsequences of n/2 elements each
• Conquer: sort the two subsequences recursively using
  merge sort
• Combine: merge the two sorted subsequences to produce
  the sorted answer




                     Initial Call:
                     MERGE-SORT(A, 1, length[A])




If p ≥ r, the subarray has at most one
element and is therefore already sorted.
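
The MERGE-SORT pseudocode on this slide is not reproduced; the sketch below is an assumed Python rendering that keeps the same structure, including the p ≥ r base case (0-based indices, with a simple merge helper standing in for the textbook MERGE shown later):

```python
def merge_sort(a, p, r):
    """Sort a[p..r] in place by divide and conquer."""
    if p >= r:                  # base case: at most one element, already sorted
        return
    q = (p + r) // 2            # divide
    merge_sort(a, p, q)         # conquer: sort the left half
    merge_sort(a, q + 1, r)     # conquer: sort the right half
    merge(a, p, q, r)           # combine the two sorted halves

def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (simple, sentinel-free)."""
    merged, i, j = [], p, q + 1
    while i <= q and j <= r:
        if a[i] <= a[j]:
            merged.append(a[i]); i += 1
        else:
            merged.append(a[j]); j += 1
    merged.extend(a[i:q + 1])   # whatever remains of the left run
    merged.extend(a[j:r + 1])   # whatever remains of the right run
    a[p:r + 1] = merged

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)   # cf. the initial call MERGE-SORT(A, 1, length[A])
print(data)                          # [1, 2, 2, 3, 4, 5, 6, 7]
```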




            Merge

• MERGE(A, p, q, r)
   – A is an array and p, q, r are indices numbering elements of A such
     that p ≤ q < r
   – A[p..q] and A[q+1..r] are in sorted order
   – Merge A[p..q] and A[q+1..r] to form a single sorted subarray that
     replaces the current subarray A[p..r]
   – Runs in Θ(n) time, where n = r − p + 1




Outline of the MERGE procedure (from the annotations on its pseudocode):
• Compute the number of elements in L and the number of elements in R
• Copy A[p..q] into L and A[q+1..r] into R
• Put a sentinel card (∞) at the end of L and at the end of R
• For k = p to r, put the smaller of L[i] and R[j] into A[k]
          Loop Invariant of Merge

• At the start of each iteration of the for loop of lines 12 – 17,
  the subarray A[p..k-1] contains the k-p smallest elements of
  L[1..n1+1] and R[1..n2+1], in sorted order. Moreover, L[i] and
  R[j] are the smallest elements of their arrays that have not
  been copied back into A.
• Proof: see pp. 30 – 31




          Analyzing Divide-and-
          Conquer Algorithms
• Describe the running time of a divide-and-conquer algorithm by a
  recurrence equation:

      T(n) = Θ(1)                       if n ≤ c
      T(n) = a·T(n/b) + D(n) + C(n)     otherwise

   – T(n): the running time on a problem of size n
   – If the problem size is small enough (n ≤ c), the solution takes constant time, Θ(1)
   – Divide the problem into a sub-problems, each of which is 1/b the size
     of the original
   – D(n): time to divide the problem; C(n): time to combine the sub-problem solutions
• For merge sort, the worst-case running time is Θ(n lg n):

      T(n) = Θ(1)                if n = 1
      T(n) = 2·T(n/2) + Θ(n)     if n > 1

  (lg n = log₂ n)


          Another View of the Analysis of Merge Sort

[Recursion-tree figure] With n = 2^m, the subproblem size at level i is n/2^i,
which reaches 1 when i = m = lg n, so the tree has lg n + 1 levels.
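
A short sketch of the sum the recursion tree represents (assuming, as the figure indicates, that each of the lg n + 1 levels contributes a total cost of cn):

```latex
% Each of the \lg n + 1 levels of the recursion tree costs cn in total, so
\[
  T(n) \;=\; cn\,(\lg n + 1) \;=\; cn\lg n + cn \;=\; \Theta(n\lg n).
\]
```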




                    Example

[Recursion-tree figure for n = 8] Every level contributes the same total cost of 8c:
the root costs 8c, the next level costs 4c + 4c, then four nodes of 2c each, and
finally eight leaves of c each.

				