
Analysis of Algorithms

Complexity Analysis

The number of CPU cycles it takes to run an algorithm depends on the computer on which the algorithm is run. The count of how many instructions are executed depends on the programming language used to implement the algorithm and on the way the programmer writes the program. We want a measure of algorithm efficiency that is independent of the computer, the programming language, the programmer, and all the incidental details of the algorithm, such as the incrementing of loop indices and the setting of pointers.

In general, the running time of an algorithm increases with the size of the input, and the total running time is roughly proportional to how many times some basic operation (such as a comparison instruction) is done. We therefore analyze an algorithm's efficiency by determining the number of times some basic operation is done as a function of the input size. This is called a time complexity analysis of the algorithm. The basic operation may be a single instruction or a group of instructions; in some cases we may want to consider more than one basic operation. The input size may be easy to determine, such as the size of an array for Sequential or Binary Search, or it may be more difficult.

In some algorithms, the basic operation is always done the same number of times for every instance of size n. When this is the case, the every-case time complexity of the algorithm, T(n), is defined as the number of times the algorithm does the basic operation for an instance of size n.

Example Algorithm 1: Adding up the elements of an array.

    for (i = 0, sum = 0; i < n; i++)
        sum += array[i];

Every-Case Time Complexity Analysis

Other than control instructions, the only instruction in the loop is the one that adds an item in the array to sum. Therefore, we will call that instruction the basic operation.

Basic operation: the addition of an item in the array to sum.
Input size: n, the number of items in the array.

Regardless of the values of the numbers in the array, there are n passes through the for loop. Therefore, the basic operation is always done n times and T(n) = n.

In some cases, the time complexity analysis may depend not only on the input size but also on the input's values:

Example Algorithm 2: Sequential search of an unsorted array.

    for (i = 0; i < n; i++)
        if (searchkey == array[i])
            break;
    if (i == n)
        cout << "Unsuccessful search\n";
    else
        cout << "Found " << array[i] << endl;

In this algorithm, the basic operation is the comparison of searchkey and an array element, but it is not done the same number of times for all instances of size n, so this algorithm does not have an every-case time complexity. However, we can still analyze such an algorithm, because there are three other analysis techniques that can be tried.

The first is to consider the maximum number of times the basic operation is done. For a given algorithm, W(n) is defined as the maximum number of times the algorithm will ever do its basic operation for an input size of n. W(n) is called the worst-case time complexity of the algorithm. If T(n) exists, then clearly W(n) = T(n). The following is an analysis of W(n) in a case in which T(n) does not exist.

Worst-Case Time Complexity Analysis

Basic operation: the comparison of an item in the array with searchkey.
Input size: n, the number of items in the array.

The basic operation is done at most n times, which is the case if searchkey is the last item in the array or if searchkey is not in the array. Therefore, W(n) = n.
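To make these counts concrete, the following is a minimal runnable sketch of the two fragments above, instrumented to report how many times each basic operation executes. The array contents, the value n = 8, the choice of searchkey, and the counter variables are illustrative additions, not part of the original algorithms.

    #include <iostream>
    using namespace std;

    int main() {
        const int n = 8;
        int array[n] = {4, 7, 1, 9, 3, 8, 2, 6};

        // Example Algorithm 1: the basic operation is the addition to sum.
        long addCount = 0;
        int sum = 0;
        for (int j = 0; j < n; j++) {
            sum += array[j];          // basic operation
            addCount++;
        }
        cout << "sum = " << sum << ", additions = " << addCount << endl;  // always n

        // Example Algorithm 2: the basic operation is the comparison with searchkey.
        // Choosing the last element makes this a worst-case instance: W(n) = n.
        int searchkey = 6;
        long cmpCount = 0;
        int i;
        for (i = 0; i < n; i++) {
            cmpCount++;
            if (searchkey == array[i])  // basic operation
                break;
        }
        if (i == n)
            cout << "Unsuccessful search, comparisons = " << cmpCount << endl;
        else
            cout << "Found " << array[i] << ", comparisons = " << cmpCount << endl;
        return 0;
    }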
Second, we may be interested in how an algorithm performs on average. For a given algorithm, A(n) is defined as the average (expected) number of times the algorithm does the basic operation for an input size of n. A(n) is called the average-case time complexity of the algorithm. If T(n) exists, then A(n) = T(n). If not, to compute A(n) we must assign probabilities to all possible inputs of size n.

Average-Case Time Complexity Analysis

Basic operation: the comparison of an item in the array with searchkey.
Input size: n, the number of items in the array.

We first analyze the case in which it is known that searchkey is in the array, the items in the array are all distinct, and we have no reason to believe that searchkey is more likely to be in one array slot than in another. Based on this information, for 1 ≤ k ≤ n, the probability that searchkey is in the kth array slot is 1/n. If searchkey is in the kth array slot, the number of times the basic operation is done to locate searchkey (and therefore to exit the loop) is k. This means that the average time complexity is given by

    A(n) = Σ_{k=1}^{n} k · (1/n) = (1/n) · n(n + 1)/2 = (n + 1)/2

Next we analyze the case in which searchkey may not be in the array. To analyze this case we must assign some probability p to the event that searchkey is in the array. If searchkey is in the array, we again assume that it is equally likely to be in any of the slots 1 through n. The probability that searchkey is in the kth slot is then p · (1/n) = p/n, and the probability that it is not in the array is 1 − p. Recall that there are k passes through the loop if searchkey is found in the kth slot, and n passes through the loop if searchkey is not in the array. The average time complexity is therefore given by

    A(n) = Σ_{k=1}^{n} k · (p/n) + n(1 − p) = (p/n) · n(n + 1)/2 + n(1 − p) = n(1 − p/2) + p/2

If p = 1, A(n) = (n + 1)/2, as before, whereas if p = 1/2, A(n) = 3n/4 + 1/4. This means that, on average, about three-quarters of the array is searched.

A final type of time complexity analysis is the determination of the smallest number of times the basic operation is done. For a given algorithm, B(n) is defined as the minimum number of times the algorithm will ever do its basic operation for an input size of n. B(n) is called the best-case time complexity of the algorithm. If T(n) exists, then B(n) = T(n).

B(n) for Sequential Search:

Best-Case Time Complexity Analysis

Basic operation: the comparison of an item in the array with searchkey.
Input size: n, the number of items in the array.

Because n ≥ 1, there must be at least one pass through the loop. If searchkey = array[0], there will be one pass through the loop regardless of the size of n. Therefore, B(n) = 1.

For algorithms that do not have every-case time complexities, we do worst-case and average-case analyses much more often than best-case analyses.
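As a sanity check on the formula A(n) = (n + 1)/2, the following sketch runs sequential search with searchkey placed in each of the n slots in turn (averaging over the n positions is the same as weighting each position by probability 1/n) and averages the comparison counts. The helper function name and the array of distinct values 0 through n − 1 are illustrative choices.

    #include <iostream>
    using namespace std;

    // Count the comparisons sequential search makes to find key in arr.
    int comparisons(const int arr[], int n, int key) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            count++;
            if (key == arr[i]) break;
        }
        return count;
    }

    int main() {
        const int n = 100;
        int arr[n];
        for (int i = 0; i < n; i++) arr[i] = i;  // n distinct values

        // Average over the n equally likely positions of searchkey.
        double total = 0;
        for (int k = 0; k < n; k++)
            total += comparisons(arr, n, arr[k]);  // k+1 comparisons for slot k
        cout << "measured average = " << total / n
             << ", (n + 1)/2 = " << (n + 1) / 2.0 << endl;  // both print 50.5
        return 0;
    }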
Order

In general, a complexity function can be any function that maps the positive integers to the nonnegative reals. When not referring to the time complexity of some particular algorithm, we usually use standard function notation, such as f(n) and g(n), to represent complexity functions.

Example: The functions

    f(n) = n        f(n) = n²        f(n) = lg n        f(n) = 3n² + 4n

are all complexity functions because they all map the positive integers to the nonnegative reals.

When applying the theory of algorithm analysis, one must sometimes be aware of the time that it takes to execute the basic operation, the overhead instructions, and the control instructions on the actual computer on which the algorithm is implemented. "Overhead instructions" are instructions such as initialization instructions before a loop; the number of times these instructions execute does not increase with input size. "Control instructions" are instructions such as incrementing an index to control a loop; the number of times these instructions execute increases with input size. The basic operation, overhead instructions, and control instructions are all properties of an algorithm and its implementation; they are not properties of a problem. This means they are usually different for two different algorithms for the same problem.

Example: Suppose we have two algorithms for the same problem with the following every-case time complexities: n for the first algorithm and n² for the second. The first algorithm appears more efficient. Suppose, however, that a given computer takes 1,000 times as long to process the basic operation once in the first algorithm as it takes to process the basic operation once in the second. By "process" we mean that we include the time it takes to execute the control instructions. If t is the time required to process the basic operation once in the second algorithm, then 1,000t is the time required to process the basic operation once in the first algorithm. For simplicity, assume that the time it takes to execute the overhead instructions is negligible in both algorithms. When does the first algorithm become more efficient? The first algorithm takes n · 1,000t and the second takes n² · t, and

    n² · t > n · 1,000t    exactly when    n > 1,000.

Algorithms with time complexities such as n and 100n are called linear-time algorithms, because their time complexities are linear in the input size n. Algorithms with time complexities such as n² and 0.01n² are called quadratic-time algorithms, because their time complexities are quadratic in the input size n. A fundamental principle: any linear-time algorithm is eventually more efficient than any quadratic-time algorithm. In the theoretical analysis of an algorithm, we are interested in this eventual behavior. Algorithms can be grouped into orders according to their eventual behavior.

"Big O"

Definition: For a given complexity function f(n), O(f(n)) is the set of complexity functions g(n) for which there exists some positive real constant c and some nonnegative integer N such that for all n ≥ N, g(n) ≤ c · f(n). If g(n) ∈ O(f(n)), we say that g(n) is big O of f(n).

[Figure: plots of 2n² and n² + 10n for 0 ≤ n ≤ 15, illustrating that n² + 10n ∈ O(n²) using c = 2 and N = 10.]

We say that big O puts an asymptotic upper bound on a function.

"Omega"

Definition: For a given complexity function f(n), Ω(f(n)) is the set of complexity functions g(n) for which there exists some positive real constant c and some nonnegative integer N such that for all n ≥ N, g(n) ≥ c · f(n). If g(n) ∈ Ω(f(n)), we say that g(n) is omega of f(n).

[Figure: plots of n² and n² + 10n for 0 ≤ n ≤ 15, illustrating that n² + 10n ∈ Ω(n²) using c = 1 and N = 0.]

We say that omega puts an asymptotic lower bound on a function.

"Theta"

Definition: For a given complexity function f(n), Θ(f(n)) = O(f(n)) ∩ Ω(f(n)). This means that Θ(f(n)) is the set of complexity functions g(n) for which there exist positive real constants c and d and some nonnegative integer N such that for all n ≥ N, c · f(n) ≤ g(n) ≤ d · f(n). If g(n) ∈ Θ(f(n)), we say that g(n) is order of f(n). In other words, the function g(n) has essentially the same rate of growth as f(n).
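These definitions can be spot-checked mechanically. The sketch below verifies, for n up to an arbitrary bound of 1,000, that n² + 10n ≤ 2n² once n ≥ 10 (the big O witnesses c = 2 and N = 10) and that n² + 10n ≥ 1 · n² for all n ≥ 0 (the omega witnesses c = 1 and N = 0); together these place n² + 10n in Θ(n²).

    #include <iostream>
    using namespace std;

    int main() {
        bool upperHolds = true, lowerHolds = true;
        for (long n = 0; n <= 1000; n++) {
            long g = n * n + 10 * n;
            if (n >= 10 && g > 2 * n * n)  // big O claim: g(n) <= 2n^2 for n >= 10
                upperHolds = false;
            if (g < n * n)                 // omega claim: g(n) >= 1 * n^2 for n >= 0
                lowerHolds = false;
        }
        cout << "n^2 + 10n <= 2n^2 for 10 <= n <= 1000: "
             << (upperHolds ? "yes" : "no") << endl;
        cout << "n^2 + 10n >= n^2  for  0 <= n <= 1000: "
             << (lowerHolds ? "yes" : "no") << endl;
        return 0;
    }

A finite loop only spot-checks the inequalities, of course; the algebra (10n ≤ n² exactly when n ≥ 10) is what establishes them for all n.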
[Figure: Venn diagram of the sets O(n²) and Ω(n²), whose intersection is Θ(n²).
    In O(n²) but not Ω(n²): 3 lg n + 8, 5n + 7, lg n, n
    In Θ(n²): 4n², 4n² + 9, 5n² + 2n
    In Ω(n²) but not O(n²): 4n³ + 3n², 6n⁶ + n⁴, 2ⁿ, 2ⁿ + 4n]

Complexity Categories

When determining the order of a complexity function, several mathematical properties simplify the analysis:

1. You can ignore the low-order terms in an algorithm's complexity function. For example, if an algorithm is Θ(n² + 2n), it is also Θ(n²).
2. You can ignore a multiplicative constant in the high-order term of an algorithm's complexity function. For example, if an algorithm is Θ(5n²), it is also Θ(n²).
3. Θ(f(n)) + Θ(g(n)) = Θ(f(n) + g(n)), so you can combine complexity functions. For example, if an algorithm is Θ(n²) + Θ(n), it is also Θ(n² + n), which you write simply as Θ(n²) by applying property 1. Analogous rules hold for multiplication.

    Order                              Name
    Θ(1) or Θ(c)                       Constant
    Θ(lg n)                            Logarithmic
    Θ(n)                               Linear
    Θ(n lg n)                          Linear-log
    Θ(n²)                              Quadratic
    Θ(nᵏ), where k > 2                 Cubic (or worse)
    Θ(2ⁿ) and Θ(aⁿ), where a > 2       Exponential
    Θ(n!)                              Factorial

Example: Comparing Sequential Search and Binary Search on concrete machines.

    Sequential Search
        Overhead: 2 instructions per item examined
        CPU speed: 1,000,000 instructions / second
        Algorithm complexity: O(N)

    Binary Search
        Overhead: 50 instructions per item examined
        CPU speed: 10,000 instructions / second
        Algorithm complexity: O(log₂ N)

N = 500:
    Sequential search: T(N) = (2 × 500 instructions) / (1,000,000 instructions per second) = 0.001 seconds
    Binary search:     T(N) = (50 × log₂ 500 instructions) / (10,000 instructions per second) ≈ 0.0448 seconds

N = 1,000,000:
    Sequential search: T(N) = (2 × 1,000,000 instructions) / (1,000,000 instructions per second) = 2 seconds
    Binary search:     T(N) = (50 × log₂ 1,000,000 instructions) / (10,000 instructions per second) ≈ 0.0997 seconds
    Binary search on the faster CPU: T(N) = (50 × log₂ 1,000,000 instructions) / (1,000,000 instructions per second) ≈ 0.0010 seconds
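The numbers in this comparison follow directly from the stated formulas: estimated time = (instructions per item examined × number of items examined) / CPU speed. The following sketch reproduces them, with the overheads and machine speeds from the example hard-coded as assumptions:

    #include <cmath>
    #include <iostream>
    using namespace std;

    // Estimated running time in seconds: instructions executed / CPU speed.
    double seqTime(double n, double cpuSpeed) { return 2.0 * n / cpuSpeed; }
    double binTime(double n, double cpuSpeed) { return 50.0 * log2(n) / cpuSpeed; }

    int main() {
        cout << "N = 500:" << endl;
        cout << "  sequential: " << seqTime(500, 1e6) << " s" << endl;   // 0.001
        cout << "  binary:     " << binTime(500, 1e4) << " s" << endl;   // ~0.0448

        cout << "N = 1,000,000:" << endl;
        cout << "  sequential: " << seqTime(1e6, 1e6) << " s" << endl;   // 2
        cout << "  binary (10,000 instr/s):    "
             << binTime(1e6, 1e4) << " s" << endl;                       // ~0.0997
        cout << "  binary (1,000,000 instr/s): "
             << binTime(1e6, 1e6) << " s" << endl;                       // ~0.0010
        return 0;
    }

The lesson survives the machine differences: once N is large enough, the O(log₂ N) algorithm wins even on a CPU that is 100 times slower.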
