# Computer Programming II — Complexity Analysis

January 26, 2010
## Programming Objectives

- a correct solution for the correct problem;
- an easy-to-maintain solution (readable, modifiable, evolvable);
- an efficient solution (in terms of time and resource, or space, requirements):
  - compared with other possible solutions;
  - compared with the difficulty of the problem itself, i.e., with the best possible solution.

## Program efficiency

- Program efficiency is usually used to compare algorithms and decide which one is faster. Sometimes it is also used to judge how close we are to an optimal algorithm for a particular problem.

- There are two ways to measure the time efficiency of a program:

  - timing the executable on a variety of datasets: impractical, because the time depends heavily on the type of system and its associated devices;

  - theoretical analysis of the algorithm or source code: difficult, so we need to simplify a bit and make some assumptions.

## Steps of Theoretical Analysis

1. Define the problem size.

2. Pick the most representative operation of the algorithm.

3. Count the number of times the picked operation is performed, and use that count to represent the efficiency of the algorithm. The count is usually a function of the problem size (input size).

4. Decide the order of the algorithm (i.e., keep only the most dominant term in the function).

## Step 2 — Commonly picked operations

- the number of function calls invoked
- the number of multiplication operations carried out
- the number of assignment operations
- for sorting/searching algorithms, the number of comparisons
- for array algorithms, the number of memory accesses
- for linked-list algorithms, the number of nodes visited

## Step 4 — Orders of efficiency

- System loads, different processors and coprocessors, and I/O capabilities can easily affect program speed by an order of magnitude, and these things change all the time regardless of our specific choice of solution for a problem.

- If we want to compare possible algorithms before we code them up, and possibly before we know exactly what kind of hardware they will run on, we need a general approach that can ignore "minor" levels of detail.

- When comparing algorithms on a theoretical level we only consider major differences of scale: we talk about the order of an algorithm as a representation of its efficiency at a very general scale.

## Step 4 — Orders of efficiency (cont.)

- For a data set of size N (e.g. an array of N elements), algorithms running with the following time efficiencies are said to have the same ORDER:

  3N², 9N², 77N², 12N² + 99

- To describe all the functions of the same order, we determine the core underlying function (e.g. f(N) = N²) and use the notation O(f(N)) for the set of all functions in that order. So all of the functions above are in the family O(N²).

- In fact, when the core underlying function is a polynomial, the only term we are interested in is the highest-degree term.

## Step 4 — Orders of efficiency (cont.)

To see why this simplification is generally accepted, observe the following:

| N      | N²   | N² + 120N + 10000 |
|--------|------|-------------------|
| 100    | 10⁴  | 3.200 × 10⁴       |
| 1000   | 10⁶  | 1.130 × 10⁶       |
| 10000  | 10⁸  | 1.012 × 10⁸       |
| 100000 | 10¹⁰ | 1.001 × 10¹⁰      |

Note that for large values of N, the difference the constants make (and the difference the "lesser-degree" terms make) is trivial when comparing algorithms of different orders.

## Some common orders

- The big-O notation (e.g. O(N²)) is used to describe the order of an algorithm, i.e. an approximation of its running time given a data set of size N.

- We write the orders in simplified form, i.e. with scalar constants and lesser terms dropped.

- Some of the common orders, and their values for a few values of N, are:

| N     | log₂(N) | N log₂(N) | N²        | N³           |
|-------|---------|-----------|-----------|--------------|
| 2     | 1       | 2         | 4         | 8            |
| 8     | 3       | 24        | 64        | 512          |
| 32    | 5       | 160       | 1024      | 32768        |
| 128   | 7       | 896       | 16384     | 2097152      |
| 512   | 9       | 4608      | 262144    | 134217728    |
| 1024  | 10      | 10240     | 1048576   | 1073741824   |
| 16384 | 14      | 229376    | 268435456 | 4.398 × 10¹² |

## Big-O for searching and sorting

- Here are the performance orders for the best-case, average-case, and worst-case behaviour of some sorting and searching algorithms.

- All assume a list of N data items:

| Algorithm         | Best case  | Average case | Worst case |
|-------------------|------------|--------------|------------|
| Sequential search | O(1)       | O(N)         | O(N)       |
| Binary search     | O(1)       | O(log N)     | O(log N)   |
| Selection sort    | O(N²)      | O(N²)        | O(N²)      |
| Bubblesort        | O(N)       | O(N²)        | O(N²)      |
| Insertion sort    | O(N)       | O(N²)        | O(N²)      |
| Quicksort         | O(N log N) | O(N log N)   | O(N²)      |

- O(1) means at most some constant number of operations are performed, e.g. solving the problem in at most 3 steps.

- It can be proven that no general (comparison-based) sorting algorithm can have worst-case behaviour better than O(N log N).

## Some examples

- We try to measure algorithm complexity relative to the inputs and parameters; e.g., suppose M and N are input values below:

```cpp
for (int m = 0; m < M; m++) {
    cout << "m is " << m;
    for (int n = 0; n < N; n++) {
        sum = sum + n;
        foo = foo * n + m;
    }
    cout << " sum and foo are ";
    cout << sum << " " << foo << endl;
}
```

- The code fragment performs O(M * N) operations:

  - it goes through the outer loop M times;

  - for each outer-loop pass it goes through the inner loop N times (and performs a couple of extra operations);

  - for each inner-loop pass it performs a fixed number of operations.

- In total: M * (c1 + (N * c2)) operations, i.e. O(M * N).
## Example calculations

- Show the following is O(M):

```cpp
for (int i = 6; i < (2*M); i++) {
    foo[i] = 3 * i * i;
    foo[i-6] = i;
}
```

- Show the following is O(M + N):

```cpp
for (int i = 0; i < M; i++) {
    cout << i;
}
for (int j = 0; j < N; j++) {
    cin >> arr[j];
}
```

- Show the following is O(M):

```cpp
for (int i = 0; i < 600; i++) {
    count = 0;
    while (count < M) {
        cout << i * M << endl;
        count++;
    }
}
```

- Show the following is O(M * N):

```cpp
for (int i = 0; i < (M/2); i++) {
    for (int j = 0; j < N; j += 3) {
        cout << i * j << endl;
    }
}
```
## Pick the optimal solution

- One thing to observe is that different algorithms might be better for solving a problem at different sizes of N. Suppose:

  - we have one bubblesort-based algorithm whose efficiency we calculated at 2N² + 7 operations;

  - we have a merge-sort-based algorithm whose efficiency we calculated at 50N log₂(N) operations.

| N    | bubble sort | merge sort |
|------|-------------|------------|
| 64   | 8199        | 19200      |
| 1024 | 2097159     | 512000     |

- Thus whenever you choose an algorithm based on efficiency analysis it is important to know something about the expected size