# CS221 Algorithms and Data Structures, Lecture #1: Complexity Theory and Asymptotic Analysis

CS221: Algorithms and Data Structures
Lecture #1: Complexity Theory and Asymptotic Analysis
Steve Wolfman, 2009W1

## Today's Outline

- Programming Project #1 and Forming Teams
- Brief Proof Reminder
- Asymptotic Analysis, Briefly
- Silicon Downs and the SD Cheat Sheet
- Asymptotic Analysis, Proofs and Programs
- Examples and Exercises
## Learning Goals

By the end of this unit, you will be able to...

- Define which program operations we measure in an algorithm in order to approximate its efficiency.
- Define "input size" and determine the effect (in terms of performance) that input size has on an algorithm.
- Give examples of common practical limits of problem size for each complexity class.
- Give examples of tractable, intractable, and undecidable problems.
- Given code, write a formula which measures the number of steps executed as a function of the size of the input (N).
- Compute the worst-case asymptotic complexity of an algorithm (e.g., the worst possible running time based on the size of the input (N)).
- Categorize an algorithm into one of the common complexity classes.
- Explain the differences between best-, worst-, and average-case analysis.
- Describe why best-case analysis is rarely relevant and how worst-case analysis may never be encountered in practice.
- Given two or more algorithms, rank them in terms of their time and space complexity.

## Prog Proj #1 & Teams

## Proof by...

- Counterexample
  - show an example which does not fit with the theorem
  - QED (the theorem is disproven)
- Contradiction
  - assume the opposite of the theorem
  - derive a contradiction
  - QED (the theorem is proven)
- Induction
  - prove for a base case (e.g., n = 1)
  - assume for an arbitrary value (n)
  - prove for the next value (n + 1)
  - QED
## Example Proof by Induction

Claim: a number is divisible by 3 iff the sum of its digits is divisible by 3.

(Base case and induction step worked below.)

## Example Proof by Induction (Worked)

"A number is divisible by 3 iff the sum of its digits is divisible by 3."

First, some definitions: consider a positive integer x to be made up of its n digits: x1 x2 x3 ... xn. For convenience, let's define SD(x) = x1 + x2 + ... + xn, the sum of the digits of x.

There are many ways to solve this; here's one. We'll prove a somewhat stronger property: for a non-negative integer x with any positive integral number of digits n, SD(x) mod 3 = x mod 3.

Base case: consider any number x with one digit (0-9). Then SD(x) = x1 = x, so it's trivially true that SD(x) mod 3 = x mod 3.

Induction hypothesis: assume for an arbitrary integer n > 0 that for any non-negative integer x with n digits, SD(x) mod 3 = x mod 3.
Inductive step: consider an arbitrary number y with n + 1 digits. We can think of y as being made up of its digits y1 y2 ... yn y_{n+1}. Clearly y1 y2 ... yn (which we'll call z) is itself an n-digit number, so the induction hypothesis applies: SD(z) mod 3 = z mod 3.

Now, note that y = z*10 + y_{n+1}. So:

    y mod 3 = (z*10 + y_{n+1}) mod 3
            = (z*9 + z + y_{n+1}) mod 3
            = (z + y_{n+1}) mod 3

since z*9 is divisible by 3 and so has no impact on the remainder of the quantity when divided by 3.

Inductive step continued: by the IH, we know z mod 3 = SD(z) mod 3. So:

    y mod 3 = (z + y_{n+1}) mod 3
            = (SD(z) + y_{n+1}) mod 3
            = (y1 + y2 + ... + yn + y_{n+1}) mod 3
            = SD(y) mod 3

QED!
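The proved property is easy to spot-check mechanically. Here's a minimal sketch in Python (the helper name `sd` is ours, not the slides'):

```python
def sd(x):
    """Sum of the decimal digits of a non-negative integer x."""
    return sum(int(d) for d in str(x))

# The property proved above: SD(x) mod 3 == x mod 3, hence
# 3 divides x exactly when 3 divides SD(x). Check exhaustively for small x.
assert all(sd(x) % 3 == x % 3 for x in range(100_000))
assert sd(123456) == 21 and 123456 % 3 == 0
```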

## A Task to Solve and Analyze

Find a student's name in a class given her student ID.

## Analysis of Algorithms

- Analysis of an algorithm gives insight into how long the program runs and how much memory it uses
  - time complexity
  - space complexity
- Analysis can provide insight into alternative algorithms
- Input size is indicated by a number n (sometimes there are multiple inputs)
- Running time is a function of n (from non-negative integers to non-negative reals) such as:
  - T(n) = 4n + 5
  - T(n) = 0.5 n log n - 2n + 7
  - T(n) = 2^n + n^3 + 3n
- But...
## Asymptotic Analysis Hacks

- Eliminate low-order terms
  - 4n + 5 ⇒ 4n
  - 0.5 n log n - 2n + 7 ⇒ 0.5 n log n
  - 2^n + n^3 + 3n ⇒ 2^n
- Eliminate coefficients
  - 4n ⇒ n
  - 0.5 n log n ⇒ n log n
  - n log (n^2) = 2 n log n ⇒ n log n

## Rates of Growth

Suppose a computer executes 10^12 ops per second:

| n =     | 10       | 100      | 1,000    | 10,000   | 10^12   |
|---------|----------|----------|----------|----------|---------|
| n       | 10^-11 s | 10^-10 s | 10^-9 s  | 10^-8 s  | 1 s     |
| n log n | 10^-11 s | 10^-9 s  | 10^-8 s  | 10^-7 s  | 40 s    |
| n^2     | 10^-10 s | 10^-8 s  | 10^-6 s  | 10^-4 s  | 10^12 s |
| n^3     | 10^-9 s  | 10^-6 s  | 10^-3 s  | 1 s      | 10^24 s |
| 2^n     | 10^-9 s  | 10^18 s  | 10^289 s |          |         |

(10^4 s ≈ 2.8 hrs; 10^18 s ≈ 30 billion years)

## Order Notation

- T(n) ∈ O(f(n)) if there are constants c and n0 such that T(n) ≤ c f(n) for all n ≥ n0
- T(n) ∈ Ω(f(n)) if there are constants c and n0 such that T(n) ≥ c f(n) for all n ≥ n0
- T(n) ∈ θ(f(n)) if T(n) ∈ O(f(n)) and T(n) ∈ Ω(f(n))
- T(n) ∈ o(f(n)) if T(n) ∈ O(f(n)) and T(n) ∉ θ(f(n))
- T(n) ∈ ω(f(n)) if T(n) ∈ Ω(f(n)) and T(n) ∉ θ(f(n))

## Examples

- 10,000 n^2 + 25 n ∈ θ(n^2)
- 10^-10 n^2 ∈ θ(n^2)
- n log n ∈ O(n^2)
- n log n ∈ Ω(n)
- n^3 + 4 ∈ o(n^4)
- n^3 + 4 ∈ ω(n^2)
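To make the first example concrete, here is a sketch in Python of finding witness constants (the particular c and n0 are our own choice, not the slides'): since 25n ≤ n^2 whenever n ≥ 25, the pair c = 10,001 and n0 = 25 shows 10,000n^2 + 25n ∈ O(n^2).

```python
def T(n):
    """The running-time function from the first example above."""
    return 10_000 * n * n + 25 * n

# Witnesses for T(n) in O(n^2): c = 10_001, n0 = 25.
# For n >= 25 we have 25n <= n*n, hence T(n) <= 10_001 * n^2.
c, n0 = 10_001, 25
assert all(T(n) <= c * n * n for n in range(n0, 10_000))
```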
## Silicon Downs

For each race, decide which "horse" is "faster". Note that faster means smaller, not larger!

| Race | Post #1          | Post #2       |
|------|------------------|---------------|
| I    | n^3 + 2n^2       | 100n^2 + 1000 |
| II   | n^0.1            | log n         |
| III  | n + 100n^0.1     | 2n + 10 log n |
| IV   | 5n^5             | n!            |
| V    | n^-15 · 2^n/100  | 1000n^15      |
| VI   | 8^(2 log n)      | 3n^7 + 7n     |
| VII  | mn^3             | 2^(mn)        |

Answer choices for each race: (a) Left, (b) Right, (c) Tied, (d) It depends, (e) I am opposed to algorithm racing.

## Mounties Find Silicon Downs Fixed

| Post #1         | Post #2       | Winner     |
|-----------------|---------------|------------|
| n^3 + 2n^2      | 100n^2 + 1000 | O(n^2)     |
| n^0.1           | log n         | O(log n)   |
| n + 100n^0.1    | 2n + 10 log n | TIE: O(n)  |
| 5n^5            | n!            | O(n^5)     |
| n^-15 · 2^n/100 | 1000n^15      | O(n^15)    |
| 8^(2 log n)     | 3n^7 + 7n     | O(n^6)     |
| mn^3            | 2^(mn)        | IT DEPENDS |

The fix sheet (typical growth rates in order):

- constant: O(1)
- logarithmic: O(log n)  (log_k n, log n^2 ∈ O(log n))
- poly-log: O(log^k n)
- linear: O(n)
- log-linear: O(n log n)
- superlinear: O(n^(1+c))  (c is a constant > 0)
- cubic: O(n^3)
- polynomial: O(n^k)  (k is a constant) "tractable"
- exponential: O(c^n)  (c is a constant > 1) "intractable"

## Terminology

Given an algorithm whose running time is T(n):

- T(n) ∈ O(f(n)) if there are constants c and n0 such that T(n) ≤ c f(n) for all n ≥ n0
  - 1, log n, n, 100n ∈ O(n)
- T(n) ∈ Ω(f(n)) if there are constants c and n0 such that T(n) ≥ c f(n) for all n ≥ n0
  - n, n^2, 100·2^n, n^3 log n ∈ Ω(n)
- T(n) ∈ θ(f(n)) if T(n) ∈ O(f(n)) and T(n) ∈ Ω(f(n))
  - n, 2n, 100n, 0.01n + log n ∈ θ(n)
- T(n) ∈ o(f(n)) if T(n) ∈ O(f(n)) and T(n) ∉ θ(f(n))
  - 1, log n, n^0.99 ∈ o(n)
- T(n) ∈ ω(f(n)) if T(n) ∈ Ω(f(n)) and T(n) ∉ θ(f(n))
  - n^1.01, n^2, 100·2^n, n^3 log n ∈ ω(n)
## Types of Analysis

Orthogonal axes:

- bound flavor
  - upper bound (O, o)
  - lower bound (Ω, ω)
  - asymptotically tight (θ)
- analysis case
  - worst case
  - average case
  - best case
  - "common" case
- analysis quality
  - tight bound (no better bound which is asymptotically different)

## Analyzing Code

- C++ operations: constant time
- consecutive stmts: sum of times
- conditionals: condition plus sum of branches
- loops: sum of iterations
- function calls: cost of function body

## Analyzing Code: Linear Search

    // Linear search
    find(key, array)
      for i = 1 to length(array) - 1 do
        if array[i] == key
          return i
      return -1

- Step 1: What's the input size n?
- Step 2: What kind of analysis should we perform? Worst-case? Best-case? Average-case? Expected-case, amortized, ...
- Step 3: How much does each line cost? (Are lines the right unit?)
- Step 4: What's T(n) in its raw form?
- Step 5: Simplify T(n) and convert to order notation. (Also, which order notation: O, o, Θ, Ω, ω?)
- Step 6: Casually name-drop the appropriate complexity class in order to sound bracingly cool to colleagues: "Oh, linear search? That's tractable, polynomial time. What polynomial? Linear, duh. See the name?! I hear it's sub-linear on quantum computers, though. Wild, eh?"
- Step 7: Prove your complexity class by finding constants c and n0 such that for all n ≥ n0, T(n) ≤ cn. (You usually won't do this in practice.)

## More Examples Than You Can Shake a Stick At (#0)

Here's a whack-load of examples for us to:

1. find a function T(n) describing its runtime
2. find T(n)'s complexity class
3. find c and n0 to prove the complexity class

Starting again with linear search:

    // Linear search
    find(key, array)
      for i = 1 to length(array) - 1 do
        if array[i] == key
          return i
      return -1

## METYCSSA (#1)

    for i = 1 to n do
      for j = 1 to n do
        sum = sum + 1

Time complexity: (a) O(n), (b) O(n lg n), (c) O(n^2), (d) O(n^2 lg n), (e) None of these
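A quick empirical check for #1 (a Python sketch of the pseudocode; the counter is ours): the doubly nested loop executes its body exactly n^2 times.

```python
def metycssa_1(n):
    """Count body executions of the doubly nested loop in example #1."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            count += 1
    return count

assert metycssa_1(10) == 100      # exactly n^2 executions: Theta(n^2)
assert metycssa_1(50) == 2500
```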
## METYCSSA (#2)

    i = 1
    while i < n do
      for j = i to n do
        sum = sum + 1
      i++

Time complexity: (a) O(n), (b) O(n lg n), (c) O(n^2), (d) O(n^2 lg n), (e) None of these

## METYCSSA (#3)

    i = 1
    while i < n do
      for j = 1 to i do
        sum = sum + 1
      i += i

Time complexity: (a) O(n), (b) O(n lg n), (c) O(n^2), (d) O(n^2 lg n), (e) None of these
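These two look similar but grow very differently, which a counting sketch in Python makes visible (names and closed forms are ours): #2 sums n-i+1 over i, a triangular number, while in #3 the variable i doubles each pass, so the inner-loop totals 1 + 2 + 4 + ... stay below 2n.

```python
def metycssa_2(n):
    """Body-execution count for example #2 (inner loop j = i..n)."""
    count, i = 0, 1
    while i < n:
        for j in range(i, n + 1):
            count += 1
        i += 1
    return count

def metycssa_3(n):
    """Body-execution count for example #3 (i doubles: i += i)."""
    count, i = 0, 1
    while i < n:
        for j in range(1, i + 1):
            count += 1
        i += i
    return count

n = 1024
assert metycssa_2(n) == n * (n + 1) // 2 - 1   # triangular: Theta(n^2)
assert metycssa_3(n) == n - 1                  # 1+2+...+512 = 1023: Theta(n)
```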

## METYCSSA (#4)

- Conditional
  - if C then S1 else S2
- Loops
  - while C do S

## METYCSSA (#5)

- Recursion almost always yields a recurrence
- Recursive max:

      if length == 1: return arr[0]
      else: return larger of arr[0] and max(arr[1..length-1])

- T(1) <= b
  T(n) <= c + T(n - 1)   if n > 1
- Analysis:

      T(n) <= c + c + T(n - 2)                   (by substitution)
      T(n) <= c + c + c + T(n - 3)               (by substitution, again)
      T(n) <= kc + T(n - k)                      (extrapolating 0 < k ≤ n)
      T(n) <= (n - 1)c + T(1) = (n - 1)c + b     (for k = n - 1)

- T(n) ∈ O(n)
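The recurrence above can be watched in action. Here is a Python sketch of the recursive max (the comparison counter is our instrumentation, not part of the slides' pseudocode); it makes exactly n - 1 comparisons, matching T(n) = (n - 1)c + b:

```python
def rec_max(arr):
    """Recursive max from example #5; returns (max value, comparisons made)."""
    if len(arr) == 1:
        return arr[0], 0
    rest_max, comps = rec_max(arr[1:])
    return (arr[0] if arr[0] > rest_max else rest_max), comps + 1

value, comps = rec_max([3, 1, 4, 1, 5, 9, 2, 6])
assert value == 9
assert comps == 7      # n - 1 comparisons for n = 8: Theta(n)
```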

## METYCSSA (#6): Mergesort

- Mergesort algorithm: split list in half, sort first half, sort second half, merge together
- T(1) <= b
  T(n) <= 2T(n/2) + cn   if n > 1
- Analysis:

      T(n) <= 2T(n/2) + cn
           <= 2(2T(n/4) + c(n/2)) + cn
            = 4T(n/4) + cn + cn
           <= 4(2T(n/8) + c(n/4)) + cn + cn
            = 8T(n/8) + cn + cn + cn
           <= 2^k T(n/2^k) + kcn        (extrapolating 1 < k ≤ n)
           <= nT(1) + cn lg n           (for 2^k = n, i.e. k = lg n)

- T(n) ∈ O(n lg n)

## METYCSSA (#7): Fibonacci

- Recursive Fibonacci:

      int Fib(n)
        if (n == 0 or n == 1) return 1
        else return Fib(n - 1) + Fib(n - 2)

- Lower bound analysis
- T(0), T(1) >= b
  T(n) >= T(n - 1) + T(n - 2) + c   if n > 1
- Analysis: let φ be (1 + √5)/2, which satisfies φ^2 = φ + 1; show by induction on n that T(n) >= bφ^(n-1)
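The mergesort recurrence can be evaluated directly to confirm the closed form. A Python sketch (with b = c = 1, our choice): for n a power of two, T(n) = bn + cn lg n, which is Θ(n lg n).

```python
def T(n, b=1, c=1):
    """Evaluate the mergesort recurrence T(1)=b, T(n)=2T(n/2)+cn
    exactly, for n a power of two."""
    if n == 1:
        return b
    return 2 * T(n // 2, b, c) + c * n

# Closed form for powers of two with b = c = 1: T(n) = n + n*lg(n).
for k in range(1, 11):
    n = 2 ** k
    assert T(n) == n + n * k        # lg n = k
```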
## Example #7 Continued

- Basis: T(0) ≥ b > bφ^(-1) and T(1) ≥ b = bφ^0
- Inductive step: assume T(m) ≥ bφ^(m-1) for all m < n. Then:

      T(n) ≥ T(n - 1) + T(n - 2) + c
           ≥ bφ^(n-2) + bφ^(n-3) + c
           ≥ bφ^(n-3)(φ + 1) + c
           = bφ^(n-3)φ^2 + c
           ≥ bφ^(n-1)

- T(n) ∈ Ω(φ^n): exponential time
- Why? The same recursive call is made numerous times.

## Example #7: Learning from Analysis

- To avoid recursive calls:
  - store all basis values in a table
  - each time you calculate an answer, store it in the table
  - before performing any calculation for a value n:
    - check if a valid answer for n is in the table
    - if so, return it
- This strategy is called "memoization" and is closely related to "dynamic programming"
- How much time does this version take?
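The memoization strategy described above can be sketched in Python (using the standard-library `lru_cache` as the "table"; the call counter is our instrumentation). The naive version's call count grows like φ^n, while the memoized version computes each value once:

```python
from functools import lru_cache

calls = 0

def fib_naive(n):
    """The slides' recursive Fibonacci: Omega(phi^n) calls."""
    global calls
    calls += 1
    if n <= 1:
        return 1
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoized version: each value computed once, so O(n) time."""
    if n <= 1:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)

assert fib_naive(20) == fib_memo(20) == 10946
assert calls == 21891            # exponentially many calls for n = 20
```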

## Final Concrete Example (#8): Longest Common Subsequence

- Problem: given two strings (of lengths m and n), find the longest sequence of characters which appears in order in both strings
  - lots of applications: DNA sequencing, blah, blah, blah
- Example: "search me" and "insane method" → "same"

## Abstract Example (#9): It's Log!

Problem: find a tight bound on T(n) = lg(n!)

Time complexity: (a) O(n), (b) O(n lg n), (c) O(n^2), (d) O(n^2 lg n), (e) None of these
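The standard sandwich argument for this bound can be checked numerically. A Python sketch (helper name is ours): each factor of n! is at most n, so lg(n!) ≤ n lg n; the top n/2 factors are each at least n/2, so lg(n!) ≥ (n/2) lg(n/2). Together these give Θ(n lg n).

```python
import math

def lg_factorial(n):
    """lg(n!) as a sum of logs (avoids huge intermediate factorials)."""
    return sum(math.log2(i) for i in range(1, n + 1))

n = 1000
lgf = lg_factorial(n)
# Upper bound: each of the n factors is at most n.
assert lgf <= n * math.log2(n)
# Lower bound: the top n/2 factors are each at least n/2.
assert lgf >= (n / 2) * math.log2(n / 2)
```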

## Log Aside

- log_a b means "the exponent that turns a into b"
- lg x means "log_2 x" (our usual log in CS)
- log x means "log_10 x" (the common log)
- ln x means "log_e x" (the natural log)

But... O(lg n) = O(log n) = O(ln n), because log_a b = log_c b / log_c a (for c > 1), so there's just a constant factor between log bases.

## Asymptotic Analysis Summary

- Determine what characterizes a problem's size
- Express how much resources (time, memory, etc.) an algorithm requires as a function of input size using O(•), Ω(•), θ(•)
  - worst case
  - best case
  - average case
  - common case
  - overall
## Aside: Who Cares About Ω(lg(n!))? Can You Beat O(n lg n) Sort?

Chew these over:

1. How many values can you represent with n bits?
2. Comparing two values (x < y) gives you one bit of information.
3. There are n! possible ways to reorder a list. We could number them: 0, 1, 2, ..., n! - 1.
4. Sorting basically means choosing which of those reorderings/numbers you'll apply to your input.
5. How many comparisons does it take to pick among n! numbers?

## Some Well-Known Horses from the Downs

For general problems (not particular algorithms):

- We can prove lower bounds on any solution.
- We can give example algorithms to establish "upper bounds" for the best possible solution.

Searching an unsorted list using comparisons: provably Ω(n); linear search is O(n).

Sorting a list using comparisons: provably Ω(n lg n); mergesort is O(n lg n).

## Some Well-Known Horses from the Downs

- Searching and sorting numbers: polynomial time (P), tractable
- Traveling Salesman Problem: non-deterministic polynomial (NP)... we can check a guess in polynomial time, but it maybe takes exponential time to solve. Intractable.
- Kolmogorov Complexity: uncomputable

Are problems in NP really in P? $1M prize to prove yea or nay.

The Kolmogorov Complexity of a string is its shortest representation in a given language. In other words, it tells us how much the string can possibly be compressed. It can't be computed. Pithy but hand-wavy proof: what's "the smallest positive integer that cannot be described in fewer than fourteen words"?

## To Do

- Sign up as a pair on Vista... with someone from your lab!
- Start HW 1 (posted soon!)
- Start Project 1 (posted soon!)
- Read Epp 9.2-9.3 and Koffman 2.6
- Prepare for Lab 1 (Read and Ponder)

## Coming Up

- Recursion and Induction
- Loop Invariants and Proving Program Correctness
- Call Stacks and Tail Recursion
- First Written Homework due (Sep 23)
- First Programming Project due (Sep 30)
