Divide-and-Conquer
• a technique for designing algorithms
– decomposing instance to be solved into
subinstances of the same problem
– solving each subinstance
– combining subsolutions to obtain the
solution to the original instance
Example: 981 × 1234 ⇒ 0981 × 1234 (n = 4 figures)

Multiply    Shift        Shifted result
09 × 12     4 (= n)      1080000
09 × 34     2 (= n/2)      30600
81 × 12     2 (= n/2)      97200
81 × 34     0               2754
Sum                      1210554
Multiplying Large Integers
• Multiplying two n-figure integers
– classic algorithm takes Θ(n^2) time
– naive divide-and-conquer with four half-size multiplications: no improvement
• 4 × Θ(n^2 / 4) = Θ(n^2)
• reducing the original multiplication to
three half-size multiplications
– pad the shorter operand
• 981 ⇒ 0981
– split each operand into two halves
• 0981 ⇒ w = 09, x = 81
• 1234 ⇒ y = 12, z = 34
• 981 × 1234 = (10^2 w + x) × (10^2 y + z)
= 10^4 wy + 10^2 (wz + xy) + xz
• let p = wy, q = xz, r = (w + x) × (y + z) = wy +
(wz + xy) + xz
• 981 × 1234 = 10^4 p + 10^2 (r - p - q) + q
⇒ three half-size multiplications
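The identity above can be turned into a short recursive routine. The following is a minimal Python sketch, not the slides' own code: the function name `karatsuba` and the choice to stop recursing on single-figure operands are assumptions made for illustration, and splitting with `divmod` plays the role of padding the shorter operand.

```python
def karatsuba(u: int, v: int) -> int:
    """Multiply non-negative integers u and v using three half-size products."""
    # Basic subalgorithm (assumed cut-off): a single-figure operand is
    # multiplied directly.
    if u < 10 or v < 10:
        return u * v
    # Split at half the length of the longer operand; a shorter operand
    # simply gets a zero upper half, which is the same as padding it.
    n = max(len(str(u)), len(str(v)))
    base = 10 ** (n // 2)
    w, x = divmod(u, base)   # u = base*w + x
    y, z = divmod(v, base)   # v = base*y + z
    p = karatsuba(w, y)
    q = karatsuba(x, z)
    r = karatsuba(w + x, y + z)   # r = wy + (wz + xy) + xz
    return p * base * base + (r - p - q) * base + q


print(karatsuba(981, 1234))  # 1210554
```

The cut-off at one figure keeps the sketch short; a practical version would switch to the classic algorithm at a larger threshold.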
Three Half-Sized Multiplications
• Is it worth it?
– we perform four more additions to
save one multiplication
– it is worthwhile when the numbers
to be multiplied are large
• the time needed
– classic algorithm takes h(n) = n^2 time
– if each half-size multiplication is carried out
by the classic algorithm, with g(n) the time for additions,
shifts, etc.
• 3h(n/2) + g(n) = 3/4 h(n) + g(n)
– using our new algorithm recursively
to solve subinstances
• t(n) = 3t(n/2) + g(n) ⇒ t(n) ∈ Θ(n^lg 3 | n is a power of 2), where lg 3 ≈ 1.585
– in fact the new algorithm is slower
than the classic one on instances
that are too small
The General Template
function DC(x)
    if x is sufficiently small or simple then
        return adhoc(x)
    decompose x into smaller instances x_1, x_2, …, x_l
    for i ← 1 to l do y_i ← DC(x_i)
    recombine the y_i's to obtain a solution y for x
    return y
• The basic subalgorithm
– adhoc: a simple algorithm that can
solve small instances efficiently
• l is usually small
– and independent of the particular
instance to be solved
– l = 1⇒ simplification algorithm
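The template can be written once as a higher-order function. This is a minimal Python sketch; the parameter names `is_small`, `decompose` and `recombine`, and the example `largest`, are hypothetical and chosen only to mirror the pseudocode.

```python
def divide_and_conquer(x, is_small, adhoc, decompose, recombine):
    """Generic DC(x): the four callables supply the problem-specific parts."""
    if is_small(x):
        return adhoc(x)
    subsolutions = [
        divide_and_conquer(xi, is_small, adhoc, decompose, recombine)
        for xi in decompose(x)
    ]
    return recombine(subsolutions)


def largest(values):
    """Example instance with l = 2: the maximum of a non-empty list."""
    return divide_and_conquer(
        values,
        is_small=lambda v: len(v) == 1,
        adhoc=lambda v: v[0],
        decompose=lambda v: [v[: len(v) // 2], v[len(v) // 2 :]],
        recombine=max,
    )


print(largest([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```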
More on Divide-and-Conquer
• Three conditions must be met
– the decision when to use the basic
subalgorithm rather than recurse must be taken carefully
– it must be possible to decompose an
instance into subinstances
– it must be possible to recombine subsolutions
fairly efficiently
• Running-time analysis
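– the size of each subinstance is about n/b
– t(n) = l t(n ÷ b) + g(n)
– if there exists an integer k s.t. g(n) ∈ Θ(n^k), then (Eq. 7.1)
$$
t(n) \in \begin{cases}
\Theta(n^k) & \text{if } l < b^k \\
\Theta(n^k \log n) & \text{if } l = b^k \\
\Theta(n^{\log_b l}) & \text{if } l > b^k
\end{cases}
$$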
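The case analysis for t(n) = l t(n ÷ b) + g(n) with g(n) ∈ Θ(n^k) is mechanical enough to script. A minimal Python sketch, where the function name `dc_growth` and its return convention (exponent of n, plus a flag for an extra log n factor) are assumptions made for illustration:

```python
import math


def dc_growth(l: int, b: int, k: int):
    """Order of t(n) = l*t(n/b) + g(n) when g(n) is in Theta(n^k).

    Returns (exponent, has_log): t(n) is Theta(n^exponent), times log n
    when has_log is True.
    """
    if l < b ** k:
        return (float(k), False)
    if l == b ** k:
        return (float(k), True)
    return (math.log(l, b), False)


print(dc_growth(3, 2, 1))  # large-integer multiplication: exponent lg 3
print(dc_growth(1, 2, 0))  # binary search: (0.0, True), i.e. Theta(log n)
```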
When to Use the Basic Subalgorithm?
• Usually
– when instance size does not exceed
a threshold n0
• The running time for multiplying
large integers
$$
t(n) = \begin{cases}
h(n) & \text{if } n \le n_0 \\
3\,t(n/2) + g(n) & \text{otherwise}
\end{cases}
$$
– where h(n) ∈ Θ(n^2) and g(n) ∈ Θ(n)
• Given 5000-figure numbers
– the classic algorithm takes 25 sec
– with n0 = 1, the divide-and-conquer algorithm takes more than 41 sec
– with n0 = 64, it takes just over 6 sec
– the threshold can be determined
empirically
Binary Search
• An application of simplification
function binsearch(T[1..n], x)
    if n = 0 or x > T[n] then return n + 1
    else return binrec(T[1..n], x)

function binrec(T[i..j], x)
    if i = j then return i
    k ← (i + j) ÷ 2
    if x ≤ T[k] then return binrec(T[i..k], x)
    else return binrec(T[k+1..j], x)
Example: x = 12
index   1   2   3   4   5   6   7   8   9  10  11
T      -5  -2   0   3   8   8   9  12  12  26  31

i   k   j    x ≤ T[k]?
1   6   11   no
7   9   11   yes
7   8   9    yes
7   7   8    no
8       8    i = j: stop
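The two functions translate almost line for line into Python. This is a minimal sketch using 0-based indices, so the value returned is one less than in the 1-based pseudocode; passing `i` and `j` explicitly instead of a subarray is an implementation choice, not something the slides prescribe.

```python
def binsearch(T, x):
    """Return the smallest 0-based index i with T[i] >= x, or len(T) if none."""
    n = len(T)
    if n == 0 or x > T[-1]:
        return n
    return binrec(T, 0, n - 1, x)


def binrec(T, i, j, x):
    """Search T[i..j] (inclusive), knowing that T[j] >= x."""
    if i == j:
        return i
    k = (i + j) // 2
    if x <= T[k]:
        return binrec(T, i, k, x)
    return binrec(T, k + 1, j, x)


T = [-5, -2, 0, 3, 8, 8, 9, 12, 12, 26, 31]
print(binsearch(T, 12))  # 7, i.e. position 8 in 1-based numbering
```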
Binary Search: Time Analysis
• Let t(m) be the time required for a
call on binrec(T[i..j], x)
– m=j-i+1
• When m > 1
– t(m) = t(m / 2) + g(m)
– g(m) ∈ Θ(1) = Θ(m^0)
– Eq. 7.1: for t(n) = l t(n ÷ b) + g(n) with g(n) ∈ Θ(n^k),
t(n) ∈ Θ(n^k log n) if l = b^k
– l = 1, b = 2, k = 0
⇒ t(m) = Θ(log m)
Sorting by Merging
• To sort elements in T[1..n] into
ascending order
• A divide-and-conquer approach
– separating T into two parts
– sorting these parts by recursive calls
– merging the solutions for each part
• need an efficient algorithm for
merging sorted arrays U and V
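procedure merge(U[1..m+1], V[1..n+1], T[1..m+n])
    i, j ← 1
    U[m+1], V[n+1] ← ∞
    for k ← 1 to m + n do
        if U[i] < V[j]
            then T[k] ← U[i]; i ← i + 1
            else T[k] ← V[j]; j ← j + 1

procedure pivot(T[i..j]; var l)
    p ← T[i]; k ← i; l ← j + 1
    repeat k ← k + 1 until T[k] > p or k ≥ j
    repeat l ← l - 1 until T[l] ≤ p
    while k < l do
        swap T[k] and T[l]
        repeat k ← k + 1 until T[k] > p
        repeat l ← l - 1 until T[l] ≤ p
    swap T[i] and T[l]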
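Example: pivoting around p = T[1] = 3
3 1 4 1 5 9 2 6 5 3 5 8 9    initial array
3 1 4 1 5 9 2 6 5 3 5 8 9    find k = 3, l = 10
3 1 3 1 5 9 2 6 5 4 5 8 9    swap T[k] and T[l]
3 1 3 1 5 9 2 6 5 4 5 8 9    find k = 5, l = 7
3 1 3 1 2 9 5 6 5 4 5 8 9    swap T[k] and T[l]
3 1 3 1 2 9 5 6 5 4 5 8 9    find k = 6, l = 5: k > l, loop ends
2 1 3 1 3 9 5 6 5 4 5 8 9    swap T[i] and T[l]: the pivot ends in position 5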
The Quicksort Algorithm
procedure quicksort(T[i..j])
    if j - i is sufficiently small then insert(T[i..j])
    else
        pivot(T[i..j], l)
        quicksort(T[i..l - 1])
        quicksort(T[l + 1..j])
• If T is already sorted
– we get l = i each time
– quicksort takes a time in Ω(n^2)
⇒ use the median element as the pivot
• If T is initially in random order
– assume that
• all elements of T are distinct
• each of the n! possible permutations of
the elements is equally likely
– average time: O(n log n)
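Quicksort with a first-element pivot can be sketched in Python as follows. This is an illustrative reading of the pseudocode, with 0-based inclusive bounds; the cut-off `SMALL = 4` and the insertion sort used below it are assumptions, since the slides only say "sufficiently small".

```python
SMALL = 4  # assumed threshold for switching to insertion sort


def insert(T, i, j):
    """Insertion sort of T[i..j] (inclusive), in place."""
    for a in range(i + 1, j + 1):
        v, b = T[a], a - 1
        while b >= i and T[b] > v:
            T[b + 1] = T[b]
            b -= 1
        T[b + 1] = v


def pivot(T, i, j):
    """Partition T[i..j] around p = T[i]; return the final index of p."""
    p = T[i]
    k, l = i, j + 1
    while True:                      # repeat k <- k+1 until T[k] > p or k >= j
        k += 1
        if k >= j or T[k] > p:
            break
    while True:                      # repeat l <- l-1 until T[l] <= p
        l -= 1
        if T[l] <= p:
            break
    while k < l:
        T[k], T[l] = T[l], T[k]
        while True:
            k += 1
            if T[k] > p:
                break
        while True:
            l -= 1
            if T[l] <= p:
                break
    T[i], T[l] = T[l], T[i]
    return l


def quicksort(T, i=0, j=None):
    """Sort T[i..j] in place."""
    if j is None:
        j = len(T) - 1
    if j - i < SMALL:
        insert(T, i, j)
    else:
        l = pivot(T, i, j)
        quicksort(T, i, l - 1)
        quicksort(T, l + 1, j)
```

On an already sorted input this version shows the quadratic behaviour described above, because `pivot` returns `i` at every level.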
Algorithm Pivotbis
• Quicksort takes quadratic time in
the worst case
– even if the median is chosen as pivot
– occurs if all elements of T are equal
• procedure pivotbis(T[i..j], p; var k, l)
– partitions T into three sections
T[i..k] < p (to be sorted) | T[k+1..l-1] = p | T[l..j] > p (to be sorted)
– sorting an array of equal elements
takes linear time
– worst-case time: O(n log n)
Finding the Median
• The s-th smallest element of T
– is in the s-th position if T were sorted
– median: the ⌈n/2⌉-th smallest element
• selection problem
– finding the s-th smallest element of T
– an algorithm for selection problem can
be used to find the median
– takes Θ(n log n) time if we (1) sort T and
(2) extract its s-th entry
• finding the s-th smallest element d
– Let p be median(T[1..n])
– pivot T around p using pivotbis(T, p, k, l)
– we are done if k < s < l: then d = p
– if s ≤ k, d is the s-th smallest element of
T[1..k]
– if s ≥ l, d is the (s - l + 1)-th smallest
element of T[l..n]
Selection Algorithm
function selection(T[1..n], s)
    i ← 1; j ← n
    repeat
        p ← median(T[i..j])
        pivotbis(T[i..j], p, k, l)
        if s ≤ k then j ← k
        else if s ≥ l then i ← l
        else return p

Example: s = 4
3 1 4 1 5 9 2 6 5 3 5 8 9    p = 5
3 1 4 1 2 3 5 5 5 9 6 8 9    after pivotbis: k = 6, l = 10; s ≤ k ⇒ j ← 6
3 1 4 1 2 3 • • • • • • •    p = 2
1 1 2 3 4 3 • • • • • • •    after pivotbis: k = 2, l = 4; s ≥ l ⇒ i ← 4
• • • 3 4 3 • • • • • • •    p = 3
• • • 3 3 4 • • • • • • •    after pivotbis: k = 3, l = 6; k < s < l ⇒ return 3
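The selection loop can be sketched in Python as below. Two assumptions are made for illustration: the pivot is simply the first element of the current range rather than its median, and `pivotbis` is written with list comprehensions instead of in-place swaps. Indices are 0-based, while the rank `s` stays 1-based as in the pseudocode.

```python
def pivotbis(T, i, j, p):
    """Rearrange T[i..j] into (< p), (== p), (> p) sections.

    Returns 0-based (k, l) with T[i..k] < p, T[k+1..l-1] == p, T[l..j] > p.
    """
    segment = T[i : j + 1]
    smaller = [v for v in segment if v < p]
    equal = [v for v in segment if v == p]
    larger = [v for v in segment if v > p]
    T[i : j + 1] = smaller + equal + larger
    k = i + len(smaller) - 1
    l = k + len(equal) + 1
    return k, l


def selection(T, s):
    """Return the s-th smallest element of T (s is 1-based)."""
    T = list(T)            # work on a copy
    i, j = 0, len(T) - 1
    target = s - 1         # 0-based position of the answer in sorted order
    while True:
        p = T[i]           # assumed pivot choice: first element of the range
        k, l = pivotbis(T, i, j, p)
        if target <= k:
            j = k
        elif target >= l:
            i = l
        else:
            return p


print(selection([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9], 4))  # 3
```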
Need to Choose the Median as Pivot?
• Not necessary
– selection works regardless of which
element is chosen as pivot
– using the median is only for efficiency
• the number of elements still under consideration is at least halved each time round the loop
– simply choosing T[i] as pivot?
• quadratic time in the worst case
• linear time on the average
• good approximation to the median
– divide n elements into ⌊n/5⌋ groups of
5 elements each
– find m_i, the median of each group i
– find mm, the median of {m_i | 1 ≤ i ≤ ⌊n/5⌋}
– mm is an approximation of the median
Median Approximation
function pseudomed(T[1..n])
    if n ≤ 5 then return adhocmed(T)
    z ← ⌊n / 5⌋
    array Z[1..z]
    for i ← 1 to z do Z[i] ← adhocmed(T[5i-4..5i])
    return selection(Z, ⌈z/2⌉)
• Z[i]: the median of T[5i-4..5i]
– at least 3 elements of T[5i-4..5i] are less than or equal to it
• at least z/2 elements of Z are less
than or equal to mm
⇒ at least 3z/2 elements of T are
less than or equal to mm
• z = ⌊n/5⌋ ≥ (n - 4) / 5
⇒ at least (3n - 12)/10 elements of T
are less than or equal to mm
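The bound can be checked numerically. In this minimal Python sketch, `adhocmed` sorts its small group, and the call to `selection(Z, ⌈z/2⌉)` is replaced by sorting `Z`; both are stand-ins assumed for illustration, and `T` must be non-empty.

```python
import math


def adhocmed(group):
    """Median (the ceil(len/2)-th smallest) of a small non-empty group."""
    return sorted(group)[(len(group) - 1) // 2]


def pseudomed(T):
    """Approximate median of T: the median of the medians of groups of 5."""
    n = len(T)
    if n <= 5:
        return adhocmed(T)
    z = n // 5
    Z = [adhocmed(T[5 * i : 5 * i + 5]) for i in range(z)]
    # Stand-in for selection(Z, ceil(z/2)).
    return sorted(Z)[math.ceil(z / 2) - 1]


T = [(i * 37) % 101 for i in range(43)]
mm = pseudomed(T)
not_larger = sum(v <= mm for v in T)
print(not_larger >= (3 * len(T) - 12) / 10)  # True
```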
Matrix Multiplication
• Let A and B be two n×n matrices,
and let C be their product
• classic matrix multiplication
$$
C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}
$$
(entry C_ij combines row i of A, A_i1 … A_in, with column j of B, B_1j … B_nj)
• Assuming scalar addition and
multiplication are elementary
⇒ Θ(n^3) time
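The classic algorithm is three nested loops, one scalar multiplication per (i, j, k) triple, hence n^3 multiplications. A minimal Python sketch on lists of lists (the function name `matmul` is chosen here for illustration):

```python
def matmul(A, B):
    """Classic product of two n x n matrices given as lists of rows."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```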
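Strassen's Algorithm
• Let
$$
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
$$
• Consider
$$
\begin{aligned}
m_1 &= (a_{21} + a_{22} - a_{11})(b_{22} - b_{12} + b_{11}) \\
m_2 &= a_{11} b_{11} \\
m_3 &= a_{12} b_{21} \\
m_4 &= (a_{11} - a_{21})(b_{22} - b_{12}) \\
m_5 &= (a_{21} + a_{22})(b_{12} - b_{11}) \\
m_6 &= (a_{12} - a_{21} + a_{11} - a_{22})\, b_{22} \\
m_7 &= a_{22}(b_{11} + b_{22} - b_{12} - b_{21})
\end{aligned}
$$
• We have
$$
C = AB = \begin{pmatrix}
m_2 + m_3 & m_1 + m_2 + m_5 + m_6 \\
m_1 + m_2 + m_4 - m_7 & m_1 + m_2 + m_4 + m_5
\end{pmatrix}
$$
• uses only 7 scalar multiplications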
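The seven products are easy to check on a concrete 2×2 case. A minimal Python sketch, taking m2 = a11·b11 (the value that makes the top-left entry m2 + m3 equal a11·b11 + a12·b21); the function name `strassen_2x2` is hypothetical.

```python
def strassen_2x2(A, B):
    """Product of two 2 x 2 matrices using 7 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a21 + a22 - a11) * (b22 - b12 + b11)
    m2 = a11 * b11
    m3 = a12 * b21
    m4 = (a11 - a21) * (b22 - b12)
    m5 = (a21 + a22) * (b12 - b11)
    m6 = (a12 - a21 + a11 - a22) * b22
    m7 = a22 * (b11 + b22 - b12 - b21)
    return [
        [m2 + m3, m1 + m2 + m5 + m6],
        [m1 + m2 + m4 - m7, m1 + m2 + m4 + m5],
    ]


print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```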
Strassen's Algorithm: Time Analysis
• A divide-and-conquer algorithm
– replacing each entry of A and B by
an n×n matrix
⇒multiply two 2n×2n matrices by only
7 multiplications of n×n matrices
• t(n): time needed to multiply two
n×n matrices (n is a power of 2)
– t(n) = 7t(n/2) + g(n)
• g(n) ∈ Θ(n^2): the time needed for matrix
addition and subtraction
– Eq. 7.1 applies with l = 7, b = 2, and k = 2
⇒ t(n) ∈ Θ(n^lg 7) ⊂ O(n^2.81)
• asymptotically faster algorithms are known,
e.g. O(n^2.376)
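Replacing scalars by submatrices gives the recursive algorithm. The sketch below is a minimal Python version on lists of lists, assuming n is a power of 2 and recursing all the way down to 1×1 blocks; the helper names `add`, `sub` and `quadrants` are hypothetical. Because matrix products do not commute, each product keeps its A-factor on the left.

```python
def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def quadrants(M):
    """Split a square matrix into its four half-size blocks."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Product of two n x n matrices, n a power of 2, with 7 recursive calls."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = quadrants(A)
    b11, b12, b21, b22 = quadrants(B)
    m1 = strassen(sub(add(a21, a22), a11), add(sub(b22, b12), b11))
    m2 = strassen(a11, b11)
    m3 = strassen(a12, b21)
    m4 = strassen(sub(a11, a21), sub(b22, b12))
    m5 = strassen(add(a21, a22), sub(b12, b11))
    m6 = strassen(sub(add(sub(a12, a21), a11), a22), b22)
    m7 = strassen(a22, sub(sub(add(b11, b22), b12), b21))
    c11 = add(m2, m3)
    c12 = add(add(add(m1, m2), m5), m6)
    c21 = sub(add(add(m1, m2), m4), m7)
    c22 = add(add(add(m1, m2), m4), m5)
    top = [r + s for r, s in zip(c11, c12)]
    bottom = [r + s for r, s in zip(c21, c22)]
    return top + bottom
```

A practical implementation would fall back to the classic algorithm below some threshold size, for the same reason as in large-integer multiplication.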
Exponentiation
• Compute the exponentiation x = a^n
function exposeq(a, n)
    r ← a
    for i ← 1 to n - 1 do r ← a × r
    return r
• this algorithm takes Θ(n) time
– provided the multiplications are
counted as elementary operations
• However, even small values of n
and a cause integer overflow
– 15^17 does not fit in a 64-bit integer
• we must take account of the time
required for each multiplication
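Python integers have arbitrary precision, which makes the overflow claim easy to check. A minimal sketch of `exposeq` for n ≥ 1, with the 64-bit limits written out explicitly:

```python
def exposeq(a: int, n: int) -> int:
    """Compute a**n with n - 1 successive multiplications (n >= 1)."""
    r = a
    for _ in range(n - 1):
        r = a * r
    return r


x = exposeq(15, 17)
print(x > 2**64 - 1)             # True: too large for an unsigned 64-bit integer
print(exposeq(15, 16) < 2**63)   # True: 15^16 still fits in a signed 64-bit integer
```

The operands grow at every iteration, so each multiplication costs more than the previous one.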
Time for Each Multiplication
• Notation
– M(q, s): the time needed to multiply
two integers of sizes q and s
• in decimals, in bits, or in any other basis
– m: the size of a
– r_i and m_i: value and size of r at the
beginning of the i-th loop iteration
– T(m, n): total time spent multiplying when computing a^n with exposeq
• the product of two integers of sizes i
and j is of size at least i + j - 1 and
at most i + j
• r_1 = a ⇒ m_1 = m
• r_{i+1} = a r_i ⇒ m + m_i - 1 ≤ m_{i+1} ≤ m + m_i
⇒ im - i + 1 ≤ m_i ≤ im for all i ⇒ …
⇒ Σ_{i=1}^{n-1} M(m, im - i + 1) ≤ T(m, n) ≤ Σ_{i=1}^{n-1} M(m, im)

multiplication         M(q, s), q ≤ s       T(m, n)
classic                Θ(qs)                Θ(m^2 n^2)
divide-and-conquer     Θ(s q^lg(3/2))       Θ(m^lg 3 n^2)
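For the classic algorithm, the Θ(m^2 n^2) entry follows from the two sums bounding T(m, n). A sketch of the calculation, assuming constants c_1, c_2 > 0 (not named in the slides) with c_1 qs ≤ M(q, s) ≤ c_2 qs, and m ≥ 2 so that im - i + 1 ≥ im/2:

$$
\frac{c_1 m^2\, n(n-1)}{4}
= \sum_{i=1}^{n-1} c_1 m \cdot \frac{im}{2}
\le \sum_{i=1}^{n-1} M(m,\, im - i + 1)
\le T(m, n)
\le \sum_{i=1}^{n-1} c_2 m \cdot im
= \frac{c_2 m^2\, n(n-1)}{2}
$$

Both ends are of order m^2 n^2, so T(m, n) ∈ Θ(m^2 n^2).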
Improving exposeq
• Key observation
– a^n = (a^(n/2))^2 when n is even
• a^(n/2) can be computed about four
times faster than a^n with exposeq
• plus a single squaring
(multiplication)
$$
a^n = \begin{cases}
a & \text{if } n = 1 \\
(a^{n/2})^2 & \text{if } n \text{ is even} \\
a \times (a^{\lfloor n/2 \rfloor})^2 & \text{otherwise}
\end{cases}
$$
– a^29 = a a^28 = a (a^14)^2 = a ((a^7)^2)^2 = …
function expoDC(a, n)
    if n = 1 then return a
    if n is even then return [expoDC(a, n/2)]^2
    return a × expoDC(a, n - 1)
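A direct Python transcription of expoDC, as a minimal sketch for integers n ≥ 1 (the snake-case name `expo_dc` is a stylistic choice):

```python
def expo_dc(a: int, n: int) -> int:
    """Compute a**n (n >= 1) by repeated squaring."""
    if n == 1:
        return a
    if n % 2 == 0:
        half = expo_dc(a, n // 2)
        return half * half
    return a * expo_dc(a, n - 1)


print(expo_dc(15, 17) == 15 ** 17)  # True
```

Storing the result of the recursive call in `half` matters: calling `expo_dc(a, n // 2)` twice would lose the saving.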
Time Analysis: expoDC
• N(n): the number of multiplications performed by expoDC(a, n)
$$
N(n) = \begin{cases}
0 & \text{if } n = 1 \\
N(n/2) + 1 & \text{if } n \text{ is even} \\
N(n-1) + 1 & \text{otherwise}
\end{cases}
$$
⇒ N(n) is Θ(log n)
• T(m, n): time spent multiplying by
a call on expoDC(a, n)

              multiplication
              classic         D&C
exposeq       Θ(m^2 n^2)      Θ(m^lg 3 n^2)
expoDC        Θ(m^2 n^2)      Θ(m^lg 3 n^lg 3)
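The recurrence for N(n) can be evaluated directly to see the logarithmic growth. A minimal Python sketch; the function name `num_mults` is hypothetical, and the bounds ⌊lg n⌋ ≤ N(n) ≤ 2⌊lg n⌋ checked at the end are a consequence of the recurrence rather than something stated on the slide.

```python
def num_mults(n: int) -> int:
    """N(n): number of multiplications performed by expoDC(a, n), n >= 1."""
    if n == 1:
        return 0
    if n % 2 == 0:
        return num_mults(n // 2) + 1
    return num_mults(n - 1) + 1


print(num_mults(29))  # 7, against 28 multiplications for exposeq

# floor(lg n) is n.bit_length() - 1 for a positive Python int.
for n in range(1, 1025):
    lg = n.bit_length() - 1
    assert lg <= num_mults(n) <= 2 * lg
```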