# The Efficiency of Algorithms


The Efficiency of Algorithms

If two algorithms solve the same problem,
which one should we choose…?

1
Outline

•   Attributes of Algorithms
•   A Choice of Algorithms
•   Insertion Sort
•   Measuring Efficiency
•   Designing Algorithms – divide and conquer algorithms

2
Attributes of Algorithms

3
Attributes of Algorithms (I)

• Correctness
  – It may be producing correct results – but to the wrong problem?
  – It must provide correct results for all possible input values
  – Can we check that for every possible input? Probably not, but there may be a certain standard against which we can check the result for reasonableness
    • In some cases, the correct result may be an error message
  – There is also the issue of how much accuracy in the result we are willing to accept as correct
    • For π: 3.14? 3.14159? 3.1416?
• Practically usable

4
Attributes of Algorithms (II)

• Ease of understanding / ease of handling
  – A correct algorithm is usually used many times, for solving different instances of the same problem
    • Ex.: sequential search on different lists
  – The problem itself does not usually “stand still”
  – Program maintenance – after a program (algorithm) is written, it will therefore need to be maintained, both to fix any newly uncovered errors and to extend it to meet new requirements
• Elegance
  – See the example on the next slide

5
Elegance vs. Ease of Understanding

Algorithm 1:
1.  Set the value of sum to 0
2.  Set the value of x to 1
3.  While x <= 100, do steps 4 and 5
4.      Add the value of x to the value of sum
5.      Add 1 to the value of x
6.  Print the value of sum
7.  Stop

Algorithm 2:
1.  Print the value of 100/2 * (100+1)

• What are the two algorithms for?
  • They calculate 1+2+3+4+…+100
• Which one is elegant?
• Which one is easy to understand?
• What if we want to calculate the sum of 1 to 1000? (a sketch of both versions follows)
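Here is that sketch: a minimal Python version of both algorithms (not part of the original slides), parameterized by n so that answering the 1-to-1000 question is a one-argument change.

```python
def sum_by_loop(n=100):
    """Algorithm 1: accumulate 1 + 2 + ... + n with an explicit loop."""
    total = 0
    x = 1
    while x <= n:
        total += x
        x += 1
    return total

def sum_by_formula(n=100):
    """Algorithm 2: Gauss's closed form, n/2 * (n + 1)."""
    return n * (n + 1) // 2

print(sum_by_loop(), sum_by_formula())          # 5050 5050
print(sum_by_loop(1000), sum_by_formula(1000))  # 500500 500500
```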

6
Attributes of Algorithms (III)
• Space Efficiency
  – Judged by the amount of information the algorithm must store in the computer’s memory in order to do its job, in addition to the initial data on which the algorithm is operating
• Time Efficiency
  – An indication of the amount of work required by the nature of the algorithm itself and the approach it uses
  – A measure of the inherent efficiency of the method, independent of machine speed or the specific working data
  – Count the fundamental unit or units of work of an algorithm
• It is the number of steps each algorithm requires, not the time the
algorithm takes on a particular machine, that is important for
comparing two algorithms that do the same task

7
A Choice of Algorithms

Develop algorithms for the “data cleanup”
problem as an example

8
Data Cleanup Problem

• Problem definition: given a list, remove the 0 entries from the list
  – Application: the list holds age data; compute the average age

Before:   0  24  16   0  36  42  23  21   0  27
After:   24  16  36  42  23  21  27  NIL NIL NIL     legit = 7

9
The Shuffle-Left Algorithm (I)
legit = 10:   0  24  16   0  36  42  23  21   0  27

• Solve the problem as we might solve it using a pencil and paper (and an eraser) to modify the list
  – Proceed through the list from left to right and pass over nonzero values
    • Point with a finger on the left hand to keep our place
  – When we encounter a zero, squeeze it out by taking each remaining item and copying it over, one cell to the left
  – legit keeps the count of nonzero items

legit = 9:   24  24  16   0  36  42  23  21   0  27
legit = 9:   24  16  16   0  36  42  23  21   0  27

10
The Shuffle-Left Algorithm (II)
legit = 9:   24  16   0   0  36  42  23  21   0  27
legit = 9:   24  16   0  36  42  23  21   0  27  27     (the first zero has been squeezed out)
legit = 8:   24  16  36  42  23  21   0  27  27  27     (the left finger slid right to the next 0, which has now been squeezed out)

11
The Shuffle-Left Algorithm (III)
legit = 8:   24  16  36  42  23  21   0  27  27  27
legit = 7:   24  16  36  42  23  21  27  27  27  27
legit = 7:   24  16  36  42  23  21  27  27  27  27

The algorithm stops, as the left-hand finger is past the number of legitimate data items (legit = 7).

• This algorithm (on this list) requires examining all 10 data items, to see which ones are 0, and copying 9 + 7 + 3 = 19 items
• In addition to the memory for the list, the algorithm requires four memory locations, to store n, legit, left, and right

12
The Shuffle-Left Algorithm (IV)
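The original slide shows the shuffle-left pseudocode as a figure. As a rough, non-authoritative sketch of the same idea in Python (0-based indices; the names legit and left echo the slides):

```python
def shuffle_left(items):
    """Shuffle-left data cleanup: whenever a zero is found, squeeze it out
    by copying every remaining item one cell to the left.  Returns legit,
    the number of nonzero entries; cells beyond legit hold leftover copies,
    exactly as in the traces on the slides."""
    n = len(items)
    legit = n
    left = 0                      # index of the cell currently being examined
    while left < legit:
        if items[left] == 0:
            legit -= 1
            # copy each remaining item one cell to the left
            for right in range(left + 1, n):
                items[right - 1] = items[right]
        else:
            left += 1
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
print(shuffle_left(data), data)
# 7 [24, 16, 36, 42, 23, 21, 27, 27, 27, 27]  -- 19 copies in total, as on the slide
```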

13
The Copy-Over Algorithm (I)
• Algorithm outline
  – Proceed by scanning the list from left to right
  – Every nonzero value is copied into a new list
• When we finish, the original list still exists, but so does a new list in the desired form

Input list:  0  24  16   0  36  42  23  21   0  27
        (copy over)
New list:   24  16  36  42  23  21  27

• Every item gets examined to see if it is 0 (as in the shuffle-left algorithm), and every nonzero item gets copied once into the new list (seven copies for this example)
• A lot of extra memory space is required, because an almost complete second copy of the list is stored

14
The Copy-Over Algorithm (II)
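The copy-over pseudocode appears as a figure on the original slide. A minimal Python sketch of the same idea (not the slide's own pseudocode):

```python
def copy_over(items):
    """Copy-over data cleanup: build a new list containing only the
    nonzero values; the original list is left untouched."""
    new_list = []
    for value in items:
        if value != 0:
            new_list.append(value)   # one copy per nonzero item
    return new_list

print(copy_over([0, 24, 16, 0, 36, 42, 23, 21, 0, 27]))
# [24, 16, 36, 42, 23, 21, 27]
```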

15
The Converging-Pointers
Algorithm (I)
legit = 10:   0  24  16   0  36  42  23  21   0  27

• We move one finger along the list from left to right and another finger from right to left
• The left finger slides to the right over nonzero values
• Whenever the left finger encounters a 0 item, we reduce the value of legit by one, copy whatever item is at the right finger into the left-finger position, and slide the right finger one cell left

legit = 9:   27  24  16   0  36  42  23  21   0  27
legit = 9:   27  24  16   0  36  42  23  21   0  27     (the left finger then slides right until it reaches the next 0)

16
The Converging-Pointers
Algorithm (II)
legit = 8:   27  24  16   0  36  42  23  21   0  27     (the right finger’s 0 was copied over the left finger’s 0, so that cell is checked again)
legit = 7:   27  24  16  21  36  42  23  21   0  27

• This algorithm stops when the left finger meets the right finger, which is pointing to a nonzero element
• This algorithm (on this list) requires examining all 10 data items, and a total of three copies are done
• This algorithm requires no more memory space than the shuffle-left algorithm, and it does fewer copies than the copy-over algorithm

17
The Converging-Pointers
Algorithm (III)
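The converging-pointers pseudocode appears as a figure on the original slide. A hedged Python sketch of the same idea (0-based "fingers"; names chosen to echo the slides):

```python
def converging_pointers(items):
    """Converging-pointers data cleanup: when the left finger finds a zero,
    copy the value at the right finger over it and move the right finger one
    cell left.  Returns legit, the count of nonzero entries now held in the
    first legit positions."""
    legit = len(items)
    left, right = 0, len(items) - 1
    while left < right:
        if items[left] == 0:
            legit -= 1
            items[left] = items[right]   # may itself copy a zero; it is re-checked
            right -= 1
        else:
            left += 1
    # when the fingers meet, the one remaining cell still has to be checked
    if items[left] == 0:
        legit -= 1
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
print(converging_pointers(data), data)
# 7 [27, 24, 16, 21, 36, 42, 23, 21, 0, 27]  -- only three copies, as on the slide
```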

18
Insertion Sort

19
Introduction
• Solve the sorting problem
– Input: A sequence of n numbers ⟨a1, a2, …, an⟩
– Output: A permutation (reordering) ⟨a′1, a′2, …, a′n⟩ of the input sequence such that a′1 ≤ a′2 ≤ … ≤ a′n
– An instance of the sorting problem:
  • (31, 41, 59, 26, 41, 58)  →  (26, 31, 41, 41, 58, 59)
• Insertion sort
–   Efficient for sorting a small number of elements
–   How do you sort a hand of playing cards?
–   Input: an array A[1..n] containing a sequence that is to be sorted
–   the input numbers are sorted in place: the numbers are rearranged
within the array

20
Sorting A Hand of Cards
Using Insertion Sort

21
Insertion Sort Algorithm
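The insertion-sort pseudocode appears as a figure on this slide (it is the numbered procedure whose lines 1–8 are referenced on the following slides). A minimal Python sketch of the same in-place procedure, using 0-based indices rather than the textbook's 1-based ones:

```python
def insertion_sort(A):
    """In-place insertion sort: A[0..j-1] is kept sorted while A[j]
    (the 'key') is inserted into its proper place, like a playing card."""
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        # shift the larger sorted elements one position to the right
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    return A

print(insertion_sort([31, 41, 59, 26, 41, 58]))   # [26, 31, 41, 41, 58, 59]
```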

22
23
Loop Invariants

• Loop invariant of insertion sort
– At the start of each iteration of the for loop of lines 1 – 8, the
subarray A[1..j-1] consists of the elements originally in A[1..j-1] but in
sorted order
• Must show three things about a loop invariant – similar to
mathematical induction
– Initialization: it is true prior to the first iteration of the loop
– Maintenance: if it is true before an iteration of the loop, it remains
true before the next iteration
– Termination: when the loop terminates, the invariant gives us a
useful property that helps show that the algorithm is correct
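The invariant above can also be checked mechanically on small inputs. A hedged sketch (not from the slides), asserting the invariant at the start of each iteration of the Python version shown earlier:

```python
def insertion_sort_checked(A):
    """Insertion sort with the loop invariant asserted: at the start of each
    iteration, A[0..j-1] is a sorted arrangement of the original first j elements."""
    original = list(A)
    for j in range(1, len(A)):
        # Initialization/Maintenance: invariant holds here for every j
        assert A[:j] == sorted(original[:j])
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    # Termination: with j = len(A), the invariant says the whole array is sorted
    assert A == sorted(original)
    return A

print(insertion_sort_checked([31, 41, 59, 26, 41, 58]))
```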

24
Measuring Efficiency

• Analysis of algorithms: the study of the efficiency of various algorithms
• Use sequential search as an example
• Then introduce order of magnitude

25
Time Analysis of Insertion
Sort
• How do we measure the running time of an algorithm?
  – The running time usually depends on the input size
    • The running time may also depend on which input of that size is given
    • For insertion sort, the input size is the array length
  – The running time of an algorithm on a particular input is the number of primitive operations, or “steps”, executed
    • A constant amount of time is required to execute each step (a line of pseudocode)
• Time analysis of insertion sort → next slide
  – Best case: if the array is already sorted – a linear function
      T(n) = c1·n + c2(n−1) + c4(n−1) + c5(n−1) + c8(n−1)
           = (c1 + c2 + c4 + c5 + c8)·n − (c2 + c4 + c5 + c8)  =  an + b
  – Worst case: if the array is in reverse sorted order – a quadratic function (homework!!!)

26
Running Time of Insertion
Sort

27
Sequential Search Algorithm (Review)
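The sequential search pseudocode appears as a figure on this slide; the analysis on the next slide refers to its steps 3, 4, and 7 and to the variables i and Found. A rough Python sketch of the usual procedure (the list of names below is made up for illustration):

```python
def sequential_search(names, target):
    """Sequential search: scan the list, comparing each entry to the target
    (that comparison is the central unit of work).  Returns the index of a
    match, or None if the target is not in the list."""
    i = 0
    found = False
    while not found and i < len(names):
        if names[i] == target:      # the comparison step
            found = True
        else:
            i += 1
    return i if found else None

names = ["Adams", "Baker", "Chen", "Davis"]
print(sequential_search(names, "Chen"))    # 2
print(sequential_search(names, "Smith"))   # None
```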

28
Time Analysis of Sequential Search
• Step 4 is the central unit of work
  – Steps 3 & 7 add a little extra work
    • Add a constant factor for each comparison
    • The constant factor can be ignored. Why???
• How many times is step 4 executed?
  – It depends on how many times the loop is executed
  – Best case: the target name is the first one in the list (only 1 comparison needed)
  – Worst case: the target name is the last one in the list, or is not there at all (n comparisons needed)
  – Average case: about n/2 comparisons are needed

29
Worst-Case and Average-Case Analysis
• Concentrate on worst-case analysis
  – The worst-case running time is an upper bound on the running time for any input
  – For some algorithms, the worst case occurs fairly often
    • For sequential search → searching for information that is absent from the list
  – The “average case” is often roughly as bad as the worst case
    • How long does it take to determine where in subarray A[1..j-1] to insert A[j]?
    • About half of the subarray, on average → tj ≈ j/2 → still a quadratic function

30
Why Is Analysis of Algorithms Important?
• Analysis of algorithms is critical when the input size is very, very large
  – In New York City, n > 20,000,000
    • At 50,000 comparisons per second, the average sequential search takes

      (20,000,000 / 2) comparisons × (1 second / 50,000 comparisons) = 200 seconds

31
Space Analysis of
Sequential Search
• Space Efficiency
  – Judged by the amount of information the algorithm must store in the computer’s memory in order to do its job, in addition to the initial data on which the algorithm is operating
• Sequential search: very space-efficient
– Initial data: list of names and the target NAME
– Additional information: i and Found

32
Order of Magnitude
• Order of growth
• Why can we ignore the constants?
  – The worst-case behavior of sequential search: c·n
    • See the next slide for different constant factors (c = 2, 1, ½)
  – They increase at different rates, but have the same basic straight-line shape
• Anything that varies as a constant times n (and whose graph follows the basic shape of n) is said to be of order of magnitude n, written Θ(n)
• Sequential search is a Θ(n) algorithm in both the worst and the average case
• Consider only the leading term of a formula
  – an² + bn + c  →  Θ(n²)

33
The same straight-line shape

34
See Another Example
        1     2     3     4
1     243   187   314   244
2     215   420   345   172
3     197   352   385   261
4     238   764   125   552

Write an algorithm to print each cell:
    For each of rows 1 through 4 do the following
        For each of columns 1 through 4 do the following
            Write the entry in this row and column

• What is the order of magnitude of this algorithm?  Θ(n²)
• Think about having n rows and n columns
• Can you convert the above algorithm into the while or repeat formats? (a sketch follows below)
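As suggested above, the nested for loops can be rewritten with while loops. A small Python sketch using the slide's 4 × 4 table; for an n × n table the inner print executes n·n times, which is what makes the algorithm Θ(n²):

```python
# While-loop version of the nested "print every cell" algorithm above.
table = [
    [243, 187, 314, 244],
    [215, 420, 345, 172],
    [197, 352, 385, 261],
    [238, 764, 125, 552],
]

row = 0
while row < len(table):            # n iterations over rows
    col = 0
    while col < len(table[row]):   # n iterations per row -> n*n writes in total
        print(table[row][col], end=" ")
        col += 1
    print()
    row += 1
```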

35
The same n² shape

36
Now We Have Two Shape
Classifications…
• The value of the constants does not affect the classification, which is why we can generally ignore it
• Is it important to distinguish between the two different orders of magnitude, n and n²?
  – Yes, because cn² grows at a much faster rate than c′n
    • Even with a small c and a large c′

n² > n when n > 1  →  a Θ(n²) algorithm does more work than a Θ(n) algorithm

37
For large enough n, 0.25n² has larger values than 10n (the two curves cross at n = 40)

A Θ(n²) function eventually has larger values than a Θ(n) function

38
A Comparison of Two Extreme Θ(n²) and Θ(n) Algorithms

Input sizes of 1,000,000 are not uncommon – think of the New York City telephone list
39
Brief Summary – Order of
Magnitude
• If a Θ(n²) algorithm and a Θ(n) algorithm exist for the same task, then for large enough n, the Θ(n²) one does more work and takes longer to execute, regardless of the constant factors for peripheral work.
  – This is the rationale for ignoring the constant factors and concentrating on the basic order of magnitude of algorithms
• It is for large values of the input size n that we need to be concerned about the time resources being used
  – We want the algorithm with the smaller order of magnitude
• For small n, the value of the constant factor can be significant
• Don’t make assumptions about the input size

40
Designing Algorithms

41
Divide-and-Conquer
Approach
• Divide-and-conquer approach
  – To solve a given problem, the algorithm calls itself recursively one or more times to handle closely related sub-problems.
  – Break a problem into several sub-problems that are similar to the original problem but smaller in size, solve the sub-problems recursively, and then combine these solutions to create a solution to the original problem.
• Three steps in a divide-and-conquer approach
– Divide the problem into a number of sub-problems
– Conquer the sub-problems by solving them recursively. If the sub-
problem sizes are small enough, just solve them in a straightforward
manner
– Combine the solutions to the sub-problems into the solution for the
original problem

42
Merge Sort

• Divide: divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer: sort the two subsequences recursively using
merge sort
• Combine: merge the two sorted subsequences to produce the sorted answer

43
Initial Call:
MERGE-SORT(A, 1, length[A])

If p ≥ r, the subarray has at most one element and is therefore already sorted.
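The MERGE-SORT pseudocode itself is shown as a figure on this slide. A compact Python sketch of it (translated to 0-based, inclusive indices; sorted_merge here is a simple stand-in for the MERGE procedure detailed on the following slides):

```python
def merge_sort(A, p, r):
    """Recursive MERGE-SORT on A[p..r], following the divide / conquer /
    combine outline above; the initial call is merge_sort(A, 0, len(A) - 1)."""
    if p < r:                         # if p >= r, the subarray has <= 1 element
        q = (p + r) // 2              # divide
        merge_sort(A, p, q)           # conquer the left half
        merge_sort(A, q + 1, r)       # conquer the right half
        # combine: merge the two sorted halves back into A[p..r]
        A[p:r + 1] = sorted_merge(A[p:q + 1], A[q + 1:r + 1])

def sorted_merge(left, right):
    """Return the merge of two already-sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

data = [31, 41, 59, 26, 41, 58]
merge_sort(data, 0, len(data) - 1)
print(data)   # [26, 31, 41, 41, 58, 59]
```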

44
45
Merge

• MERGE(A, p, q, r)
  – A is an array and p, q, r are indices numbering elements of A such that p ≤ q < r
  – A[p..q] and A[q+1..r] are in sorted order
  – Merge A[p..q] and A[q+1..r] to form a single sorted subarray that replaces the current subarray A[p..r]
  – Runs in Θ(n) time, where n = r − p + 1

46
(The MERGE pseudocode appears as a figure on this slide; its comments read:)
// Compute # of elements in L
// Compute # of elements in R
// Copy A[p..q] to L
// Copy A[q+1..r] to R
// Put a sentinel card at the end of L
// Put a sentinel card at the end of R
// Put the smaller of L[i] and R[j] into A[k]
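Putting those comments together, here is a hedged Python reconstruction of the sentinel-based MERGE (0-based indices; math.inf plays the role of the sentinel card):

```python
import math

def merge_sentinel(A, p, q, r):
    """Sentinel-based MERGE, reconstructed from the comments above: copy
    A[p..q] into L and A[q+1..r] into R, append an 'infinity' sentinel to
    each, then repeatedly move the smaller of L[i] and R[j] back into A[k].
    Runs in Theta(n) time, where n = r - p + 1."""
    n1 = q - p + 1                            # number of elements in L
    n2 = r - q                                # number of elements in R
    L = [A[p + i] for i in range(n1)]         # copy A[p..q] to L
    R = [A[q + 1 + j] for j in range(n2)]     # copy A[q+1..r] to R
    L.append(math.inf)                        # sentinel at the end of L
    R.append(math.inf)                        # sentinel at the end of R
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:                      # smaller of L[i] and R[j] goes to A[k]
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

A = [26, 41, 59, 31, 41, 58]   # A[0..2] and A[3..5] are each sorted
merge_sentinel(A, 0, 2, 5)
print(A)                       # [26, 31, 41, 41, 58, 59]
```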

47
48
Merge
49
Merge
Loop Invariant of Merge

• At the start of each iteration of the for loop of lines 12 – 17,
the subarray A[p..k-1] contains the k-p smallest elements of
L[1..n1+1] and R[1..n2+1], in sorted order. Moreover, L[i] and
R[j] are the smallest elements of their arrays that have not
been copied back into A.
• Proof: see pp. 30 – 31

50
Analyzing Divide-and-
Conquer Algorithms
• Describe the running time of a divide-and-conquer algorithm by a recurrence equation:

      T(n) = Θ(1)                        if n ≤ c
      T(n) = a·T(n/b) + D(n) + C(n)      otherwise

  – T(n): the running time on a problem of size n
  – If the problem size is small enough (n ≤ c), it takes constant time, Θ(1)
  – The divide step splits the problem into a sub-problems, each 1/b the size of the original; D(n) is the time to divide and C(n) the time to combine the solutions
• For merge sort, the worst-case running time is Θ(n lg n), where lg n = log₂ n:

      T(n) = Θ(1)                if n = 1
      T(n) = 2·T(n/2) + Θ(n)     if n > 1
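As a quick check that this recurrence really gives Θ(n lg n), the recursion-tree unrolling pictured on the next two slides can be written out (c below is just an illustrative constant standing in for the Θ(n) divide-and-combine cost):

```latex
% Unrolling T(n) = 2T(n/2) + cn, assuming n = 2^m as on the next slide.
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &\;\;\vdots \\
     &= 2^{i}\,T(n/2^{i}) + i\,cn   && \text{after unrolling $i$ levels} \\
     &= n\,T(1) + cn\lg n           && \text{when } n/2^{i} = 1,\ \text{i.e. } i = m = \lg n \\
     &= \Theta(n\lg n).
\end{align*}
```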

51
Another View of the Analysis of Merge Sort

(Figure: the recursion tree for merge sort. With n = 2^m, the recursion bottoms out when n/2^i = 1, i.e. at i = m = lg n, so the tree has lg n + 1 levels.)
52
Example: n = 8

(Figure: the recursion tree for n = 8 — a root costing 8c, two nodes costing 4c, four nodes costing 2c, and eight leaves costing c, so every level contributes a total of 8c.)
53
