Posted on: 2/14/2012. Public Domain.
Computational Theory Quick Notes

1. Complexity

When picking an algorithm there are two competing goals:
1. Ease of understanding, debugging and coding.
2. Efficiency.

Run Time

The running time of a program depends on:
1. The input to the program.
2. The quality of the code generated by the compiler.
3. The nature and speed of the machine code.
4. The time complexity of the algorithm.

RUNNING TIME: T(n). This is a function not of the input itself but of the size of the input.
SPACE REQUIREMENTS: S(n).

Linear search, worst case: T(n) = n. Binary search on sorted input, worst case: T(n) ≈ log2 n.

Big O-Notation

This places an upper bound on the running time of an algorithm, stated as a function of the size of its input. When doing this, always ask what happens as n approaches infinity. T(n) is O(f(n)) if there exist constants M and n0 such that T(n) ≤ M·f(n) for all n > n0. M is a constant factor; n0 is the threshold. The most interesting part is f(n), as it governs growth: the lower the order, the better. If A is O(f(n)) and B is O(g(n)), and f grows more slowly than g, then A is considered to be of lower order. The order will not tell you exactly how much time or space is needed to solve a problem, but it shows that, for problems over a certain size, the lower-order algorithm will take less time or space than a higher-order one.

Be careful: this is an upper bound, and the quantity in question might be much lower; input that causes the worst case may be unlikely to occur in practice; and M and n0 are unknown and need not be small.

Linear search: T(n) = n, so T(n) is O(n) since T(n) ≤ M·n for all n > n0. We ignore M and n0 because constant factors are of no interest: a larger constant factor may shift the graph but does not change the character of the curve. In some cases algorithms are sensitive to the permutations of the input ordering.

The statement that the running time of an algorithm is O(f(n)) does not imply that the algorithm ever takes that long: it only says that the analyst has been able to show that it never takes longer. The actual running time might always be much lower. Even if the worst-case input is known, it can be the case that the inputs encountered in practice lead to much lower running times.
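To make the O(n) versus O(log2 n) contrast concrete, here is a small Python sketch (the function names are my own, not from the notes) that counts worst-case comparisons for both searches:

```python
import math

def linear_search_count(items, key):
    """Count comparisons made by a linear search for key."""
    for count, value in enumerate(items, start=1):
        if value == key:
            return count
    return len(items)

def binary_search_count(sorted_items, key):
    """Count comparisons made by a binary search for key."""
    lo, hi, count = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if sorted_items[mid] == key:
            return count
        if sorted_items[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return count

for n in (16, 1024, 65536):
    data = list(range(n))
    # Worst case (absent key): n comparisons versus at most log2(n) + 1.
    assert linear_search_count(data, -1) == n
    assert binary_search_count(data, -1) <= math.log2(n) + 1
```

Doubling n doubles the linear count but adds only one comparison to the binary count, which is the growth-rate claim the O-notation captures.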
Many extremely useful algorithms have a bad worst case. For example, Quicksort has a worst-case running time of O(n^2), but it is possible to arrange things so that the running time on inputs encountered in practice is closer to O(n log n). The constants n0 and M, implicit in the O-notation, often hide implementation details that are important in practice. A running time of O(f(n)) says nothing about the running time if n < n0, and M might be hiding a large amount of overhead designed to avoid a bad worst case. We would prefer an algorithm using N^2 nanoseconds to one using log N centuries, but we cannot make that choice based on O-notation.

Analysing an Algorithm

Parts of an algorithm differ in their frequency of instruction execution:
1. Executed frequently (loops).
2. Executed once (initialisation).
3. Possibly never executed (branching, if/else).

Definition of a critical operation:
(i) It is central to the function of the algorithm, and its behaviour typifies the overall behaviour of the algorithm.
(ii) It is inside the main loops of the algorithm, and is executed as often as any other part of the algorithm.

Sequential search: the algorithm is O(n), and so is of linear complexity. As the input size doubles, so does the worst case.

Binary search: the critical operation is the comparison (L[mid] == key). In the worst case, the maximum number of comparisons will be k, where k is the first integer such that 2^k ≥ n; in other words k = ⌈log2 n⌉, i.e. the complexity of binary search is O(log2 n). It has logarithmic complexity. Since log2 n grows more slowly than n, we know that above a certain size of list binary search will always run faster. For binary search, the data must be sorted for it to work.

Comparison of O(n) and O(log2 n)

A constant-time algorithm is written O(1).

Bubble Sort

Selection Sort

First find the smallest element, then swap it with the element in the first position. Then repeat, swapping the smallest remaining element with that in the next lowest position (1st pass fills the 1st position, then the 2nd, 3rd, ...).
void selectionSort(int arr[], int n)
{
    int i, j, minIndex, tmp;
    for (i = 0; i < n - 1; i++) {
        minIndex = i;
        for (j = i + 1; j < n; j++) {
            if (arr[j] < arr[minIndex]) {
                minIndex = j;
            }
        }
        if (minIndex != i) {
            tmp = arr[i];
            arr[i] = arr[minIndex];
            arr[minIndex] = tmp;
        }
    }
}

Outer loop: n − 1 passes. Inner loop: n − i comparisons on pass i. In the worst case we can expect every element will need to be swapped, which is (n − 1) swaps.

To analyse an algorithm, first identify the loops and their bounds, then derive a mathematical estimate for the work done in each of these loops. Derived T(n) estimate (actual work):

T(n) = ∑ (n − i) for i = 1 to n = n²/2 − n/2

Hamiltonian Cycle Problem (HCP) in G

Each node may be visited only once. Enumerate all possible permutations of the nodes of G, and evaluate each permutation to see if a Hamiltonian cycle exists along the path V1-V2-V3- and so on. By definition a Hamiltonian cycle in G visits each node and returns to the starting node, so each node must have at least one edge. A complete graph with n nodes is one in which each node is connected to every other node. If there are only two routes into each node and you can only move along an unused route, you have only one usable route out of each node; as you have only one route each time, you must eventually return to the root node.

Travelling Salesman Problem (TSP)

Enumerate all possible paths in G and examine each path against the optimisation criterion (max length / min length / lowest cost).

Permutation problem: find all possible permutations of the elements in a set. Complexity of n!.

Simplifications of the complexity:
1. The graph is symmetric in nature, e.g. cost(b-a) = cost(a-b).
2. The starting node is not important when considering a given path in G, e.g. 123 = 312.

Models of Computation

Not all problems can be solved by computers, and not all computations are equal in complexity.

Finite State Machines

M = [S, I, O, Fs, Fo]
S = set of states
I = input alphabet
O = output alphabet
Fs = next-state (transition) function
Fo = output function

By default the machine boots into state 0. Ignore the first output, as it is the default output of the boot.
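The quintuple M = [S, I, O, Fs, Fo] can be sketched directly in Python; the example machine below (outputting the running parity of 1s seen so far) and its function names are my own illustration, not from the notes:

```python
# The FSM quintuple M = [S, I, O, Fs, Fo] as a Python sketch.

def run_fsm(fs, fo, inputs, state=0):
    """Drive the machine from its boot state (0), collecting outputs."""
    outputs = []
    for symbol in inputs:
        outputs.append(fo[(state, symbol)])   # Fo: output function
        state = fs[(state, symbol)]           # Fs: next-state function
    return outputs

# S = {0, 1} (even/odd count of 1s so far), I = O = {'0', '1'}
fs = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0}
fo = {(0, '0'): '0', (0, '1'): '1', (1, '0'): '1', (1, '1'): '0'}
print(run_fsm(fs, fo, '1101'))  # ['1', '0', '0', '1']
```

Each output symbol reports whether the number of 1s read so far is odd, so the machine's state set S is doing exactly the "memory" work described above.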
There must be no unlabelled transitions, and there can be no two or more transitions from a state Sn with the same input. We look to minimise the number of states in a machine. FSMs mostly contain cycles; an FSM without a cycle is limited to input sequences with length equal to its longest path. A regular set is a compact way to describe such sets.

Kleene's Theorem

Any set recognised by an FSM is regular, and any regular set can be recognised by some FSM. Regular sets are exactly the sets FSMs are capable of recognising. It follows that if a set is not regular, there is no FSM which can recognise it.

Example: the set S = {0^n 1^n | n ≥ 0}, where 0^n is the string containing n 0's. This is the set of strings which have a specific number of 0's followed by the same number of 1's. The set is infinite, so an FSM recognising it would need a looping structure. If we identify a test for regularity we would be able to test whether a given set can be accepted by some FSM. FSMs fail to accept sets which are not regular; the FSM model of computation is therefore deficient. We need some test to determine regularity.

A Deterministic Finite Automaton (DFA) is denoted by a quintuple A = (Q, Σ, δ, s, F) where:
Q is a finite set of states;
Σ is a finite input alphabet;
δ is a transition function from Q × Σ to Q;
s ∈ Q is the initial state of the automaton;
F ⊆ Q is the set of favourable states.

If the automaton, being in state q, has read the symbol a ∈ Σ, it enters the state q′ = δ(q, a). The new state is completely determined by the content of the cell and the internal state of the automaton. At some moment the reading head reaches the end of the input word; if at this moment the automaton is in a favourable state q ∈ F, the input word is said to be accepted by the automaton. Otherwise the input word is not accepted. The set of all input words accepted by the automaton A is called the language accepted by A; we denote this language L(A). As we have seen, a convenient way to represent the finite automaton is a finite state diagram.
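A DFA A = (Q, Σ, δ, s, F) can be sketched as a table-driven loop; the example machine (accepting strings with an even number of 0s) is my own illustration, not from the notes:

```python
# A DFA as a Python structure: delta is the transition function,
# start the initial state s, and favourable the accepting set F.

def dfa_accepts(delta, start, favourable, word):
    """The new state depends only on the current state and the symbol read."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in favourable

# Q = {'even', 'odd'}, Σ = {'0', '1'}, s = 'even', F = {'even'}
delta = {
    ('even', '0'): 'odd',  ('even', '1'): 'even',
    ('odd', '0'): 'even',  ('odd', '1'): 'odd',
}
print(dfa_accepts(delta, 'even', {'even'}, '0101'))  # True: two 0s
print(dfa_accepts(delta, 'even', {'even'}, '000'))   # False: three 0s
```

The word is accepted exactly when the run ends in a favourable state, which is the definition of L(A) above.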
Favourable states are circled. The transition diagram shown in Fig. 1 accepts the language containing all the strings a^n b^m for n = 0, 1, 2, ... and m = 1, 2, ....

The Pumping Lemma

Some languages are not regular and are not amenable to solution via an FSM. How do we determine the regularity of a language? Using the pumping lemma we can demonstrate that if a regular language contains long strings, it must contain an infinite set of strings of a particular form. If a language can be shown not to possess strings of this form, we thereby demonstrate that the language is not regular.

Pigeonhole principle: if n items are put into m pigeonholes with n > m, then at least one pigeonhole must contain more than one item.

Let L be a regular language. There exists an integer n > 0 such that any string w ∈ L with length |w| ≥ n can be represented as the concatenation xyz such that:
1. the substring y is non-empty;
2. |xy| ≤ n; and
3. xy^i z ∈ L for each i ≥ 0.

If L is finite, then choose any n longer than the longest word in L and the theorem follows, since there are no words of length at least n. Suppose then that L is infinite. Let A be a finite automaton that accepts L, and let n be the number of states in A. Consider any string w = w_1 w_2 ... w_m ∈ L that has length m ≥ n, and consider a computation of A on the initial segment w_1 w_2 ... w_n of the string w. For any k, 0 ≤ k ≤ n, let (q_k, w_{k+1} w_{k+2} ... w_n) be the configuration of A after k steps of the computation. Since A only has n states, and there are n + 1 configurations in the above fragment, by the pigeonhole principle there exist r and j, 0 ≤ r < j ≤ n, such that q_r = q_j. This means that the string y = w_{r+1} w_{r+2} ... w_j brings the automaton A from state q_r back to the same state. Note that the string is non-empty, since r < j. Now, if we remove the string y from the original string w, or insert y^i instead of y for any i, we get a string that will still be accepted by A.
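The repetition argument can be traced on a small machine. Below is a sketch (my own encoding, not from the notes) of a DFA for the a^n b^m language of Fig. 1: we run it on a string at least as long as its state count, use the pigeonhole principle to find the repeated state, and check that pumping y keeps the string in the language:

```python
def run(delta, start, word):
    """Return the sequence of states the DFA passes through."""
    states = [start]
    for symbol in word:
        states.append(delta[(states[-1], symbol)])
    return states

# DFA for a^n b^m (m >= 1): q0 reads a's, q1 reads b's, 'dead' rejects.
delta = {('q0', 'a'): 'q0', ('q0', 'b'): 'q1',
         ('q1', 'a'): 'dead', ('q1', 'b'): 'q1',
         ('dead', 'a'): 'dead', ('dead', 'b'): 'dead'}
favourable = {'q1'}

w = 'aaab'                       # long enough that a state must repeat
trace = run(delta, 'q0', w)
seen = {}
for k, state in enumerate(trace):  # pigeonhole: find the first repeat
    if state in seen:
        r, j = seen[state], k
        break
    seen[state] = k

x, y, z = w[:r], w[r:j], w[j:]   # y is non-empty and loops q_r back to q_r
for i in range(5):               # xy^i z stays accepted for every i >= 0
    assert run(delta, 'q0', x + y * i + z)[-1] in favourable
print(repr(x), repr(y), repr(z))  # '' 'a' 'aab'
```

Here the pump is y = 'a': inserting or deleting copies of it only changes n, which the language allows, exactly as the proof predicts.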
Thus, any string xy^i z, where x = w_1 w_2 ... w_r, y = w_{r+1} w_{r+2} ... w_j, and z = w_{j+1} w_{j+2} ... w_m, is accepted. Moreover, the total length of the prefix x and the substring y does not exceed n.

Turing Machines

FSMs are insufficient models of computation: some algorithms cannot be implemented using an FSM, for example recognising strings in L = {0^n 1^n | n ≥ 0}. To solve the problem of accepting strings in L, the computing model would need to store at least the first half of each string, and FSMs cannot store their input. Turing concluded that FSMs were inadequate, and furthermore that any model of computation must:
A. be able to erase and write to some storage medium;
B. decide which character is read or written based on the configuration of the machine.

This leads to two main differences between FSMs and TMs:
A. A TM can write and overwrite its input contents.
B. A TM can reread inputs already seen (rewind or back up).

In designing a machine we normally move the head right or left; there is no concept of staying where you are.

All programming languages support three key features:
A. Sequences of instructions.
B. Selection (if/else).
C. Iteration (loops).

A TM transition is a quintuple, for example (0, 0, 1, 0, R) and (0, 1, 1, 0, R); here the "1, 0" is the "then" part (the next state and the symbol written). To iterate, just set the next state to the same as the current state.

As Function Computers

0 is encoded as 1, 1 is encoded as 11, 2 as 111, and so on.

Church-Turing Thesis

Any algorithmic procedure that can be carried out by a human being or by a computer can be carried out by a Turing machine. It is conceivable that the thesis could be discarded in the future if someone identifies a legitimate algorithm that cannot be implemented by a Turing machine; in the 70 or so years since the thesis was postulated, this has not happened.

Implication 1:
1. According to the principle, if we have managed to derive even an informal or verbal description of a computational procedure, this procedure can be implemented as a Turing machine, and thus as a computer program.
2.
Thus, to show that some function is computable, we need not write a full set of transitions for our Turing machine; it merely suffices to derive a clear and unambiguous description of the algorithm acceptable to the theoretical computer science community.

Implication 2:
1. Another implication of the thesis is perhaps even more important.
2. A Turing machine is a mathematically precise model.
3. This opens the possibility of showing that some problems cannot be solved by any algorithm.
4. What we have to show is that there is no Turing machine that can solve the problem in question.

Universal Turing Machine

A universal Turing machine (UTM) is a Turing machine that can simulate an arbitrary Turing machine on arbitrary input. The universal machine essentially achieves this by reading both the description of the machine to be simulated and that machine's input from its own tape. A good example of a UTM is a computer, as it is the universal machine in which other TMs (applications and programs) run.

The Halting Problem

Can we design an algorithm that determines whether a program will halt or not on an input x? No. If such an algorithm did exist, it could be implemented as a program, let's say HALT, written in the same programming language as any other program P. This program HALT(P, X) would take two parameters, P and X, and output 1 if P halts on X and 0 otherwise. We can now slightly modify HALT to build another program, which we call DIAGONAL(P). DIAGONAL(P) takes one parameter, the text of a program P, and then computes HALT(P, P); that is, DIAGONAL calls HALT to see if the program P halts on input of the program P's own text. Finally, we build another machine, which we call CONTR(P), that calls DIAGONAL as a module and does not halt if and only if DIAGONAL(P) = 1 (that is, if HALT determines that P halts on the input being its own text).

Proof by contradiction: suppose CONTR(CONTR) halts.
Notice that then the result of the computation is HALT(CONTR, CONTR) = 0; that is, HALT answers that CONTR does not halt on CONTR. Now suppose that CONTR(CONTR) does not halt. Then the result of the computation is HALT(CONTR, CONTR) = 1; that is, HALT answers that CONTR halts on CONTR. Either way we reach a contradiction, so no such HALT can exist. What caused this contradiction was our assumption that HALT could be programmed in the same programming language as any other program. However, this assumption is a consequence of the more general Church-Turing thesis, which states that some programming languages (for example, Turing machines) are universal mathematical models representing all conceivable algorithms.
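The universal-machine idea can be sketched as a simulator that takes a machine description (the quintuples from the Turing-machine section) as data. The simulator below and its successor machine, which computes n → n + 1 in the unary encoding described earlier (n encoded as n + 1 ones), are my own illustrations:

```python
# A minimal Turing machine simulator using quintuples
# (state, read, next_state, write, move) -- the format used above.

def run_tm(quintuples, tape, state=0, head=0, halt_state='halt'):
    """Simulate a TM on a string tape; 'B' is the blank symbol."""
    rules = {(q, s): (q2, w, m) for q, s, q2, w, m in quintuples}
    cells = dict(enumerate(tape))         # tape as a sparse, two-way array
    while state != halt_state:
        symbol = cells.get(head, 'B')
        state, write, move = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == 'R' else -1
    return ''.join(cells[i] for i in sorted(cells)).strip('B')

# Successor machine: step left off the leftmost 1, write one more 1, halt.
successor = [
    (0, '1', 1, '1', 'L'),
    (1, 'B', 'halt', '1', 'R'),
]
print(run_tm(successor, '111'))   # 2 -> 3, i.e. '1111'
```

Because the machine description is just input data, the same `run_tm` runs any quintuple machine, which is precisely what makes a universal machine (or a computer running programs) possible.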