POSTED ON: 3/6/2010 Public Domain
Integer factorization

1 Prime decomposition
2 Practical applications
3 Current state of the art

In number theory, the integer factorization problem is the problem of finding a non-trivial divisor of a composite number; for example, given a number like 91, the challenge is to find a number such as 7 which divides it. When the numbers are very large, no efficient algorithm is known; a recent effort which factored a 200-digit number (RSA-200) took eighteen months and used over half a century of computer time. The presumed difficulty of this problem is at the heart of certain algorithms in cryptography such as RSA. Many areas of mathematics and computer science have been brought to bear on the problem, including elliptic curves, algebraic number theory, and quantum computing.

Prime decomposition

By the fundamental theorem of arithmetic, every integer greater than 1 has a unique prime factorization. Given an algorithm for integer factorization, one can factor any integer down to its constituent primes by repeated application of this algorithm.

Practical applications

The hardness of this problem is at the heart of several important cryptographic systems. A fast integer factorization algorithm would mean that the RSA public-key algorithm was insecure. Some cryptographic systems, such as the Rabin public-key algorithm and the Blum Blum Shub pseudo-random number generator, can make a stronger guarantee: any means of breaking them can be used to build a fast integer factorization algorithm, so if integer factorization is hard then they are strong. In contrast, it may turn out that there are attacks on the RSA problem more efficient than integer factorization, though none are currently known. A similar hard problem with cryptographic applications is the discrete logarithm problem.

Current state of the art

As of 2005, the largest semiprime factored using general-purpose methods as part of public research is RSA-200, which is 663 bits long.
This was done using a network of computers running in parallel between Christmas 2003 and May 2005, and used roughly 75 years of total computer time. The method used was the general number field sieve.

Difficulty and complexity

If a large, n-bit number is the product of two primes of roughly the same size, then no algorithm is known that can factor it in polynomial time; that is, no known algorithm factors it in time $O(n^k)$ for any constant k. There are algorithms, however, that are faster than $\Theta(e^n)$. In other words, the best known algorithms are sub-exponential but super-polynomial. In particular, the best known asymptotic running time is that of the general number field sieve (GNFS) algorithm, which is:

$O\left(\exp\left(\left(\tfrac{64}{9}\,n\right)^{1/3} (\log n)^{2/3}\right)\right)$

For an ordinary computer, GNFS is the best known algorithm for large n. For a quantum computer, however, Peter Shor discovered an algorithm in 1994 that solves the problem in polynomial time. This will have significant implications for cryptography if a large quantum computer is ever built. For an n-bit number, Shor's algorithm takes only $O(n^3)$ time and $O(n)$ space. In 2001, the first 7-qubit quantum computer became the first to run Shor's algorithm. It factored the number 15.

Shor's algorithm

1 Procedure
    1.1 Classical part
    1.2 Quantum part: Period-finding subroutine
2 Explanation of the algorithm
    2.1 I. Obtaining factors from period
    2.2 II. Finding the period
3 Modifications to Shor's Algorithm

Shor's algorithm is a quantum algorithm for factoring a number N in $O((\log N)^3)$ time and $O(\log N)$ space, named after Peter Shor. The algorithm is significant because it implies that public key cryptography might be easily broken, given a sufficiently large quantum computer. RSA, for example, uses a public key N which is the product of two large prime numbers.
One way to crack RSA encryption is by factoring N, but with classical algorithms, factoring becomes increasingly time-consuming as N grows large; more specifically, no classical algorithm is known that can factor in time $O((\log N)^k)$ for any k. By contrast, Shor's algorithm can crack RSA in polynomial time. It has also been extended to attack many other public key cryptosystems.

Like many quantum algorithms, Shor's algorithm is probabilistic: it gives the correct answer with high probability, and the probability of failure can be decreased by repeating the algorithm. Shor's algorithm was demonstrated in 2001 by a group at IBM, which factored 15 into 3 and 5, using a quantum computer with 7 qubits.

Shor's algorithm reduces the factorization problem to the problem of finding the period of a function. It uses quantum parallelism to prepare a superposition of all values of the function in one step. It then computes the quantum Fourier transform (QFT) of the function, which concentrates the amplitudes at multiples of the fundamental frequency, the reciprocal of the period.

[Diagram: integer factoring, order finding, phase estimation, quantum Fourier transform]

Procedure

The problem we are trying to solve is: given an integer N, find another integer p between 1 and N that divides N. Shor's algorithm consists of two parts:
1. A reduction of the factoring problem to the problem of order-finding, which can be done on a classical computer.
2. A quantum algorithm to solve the order-finding problem.

Classical part

1. Pick a pseudo-random number a with 1 < a < N.
2. Compute gcd(a, N). This may be done using the Euclidean algorithm.
3. If gcd(a, N) ≠ 1, then there is a nontrivial factor of N, so we are done.
4. Otherwise, use the period-finding subroutine (below) to find r, the period of the function $f(x) = a^x \bmod N$, i.e. the smallest positive integer r for which f(x + r) = f(x).
5. If r is odd, go back to step 1.
6. If $a^{r/2} \equiv -1 \pmod{N}$, go back to step 1.
7. The factors of N are $\gcd(a^{r/2} \pm 1, N)$.
We are done.

Euclidean algorithm

The Euclidean algorithm (also called Euclid's algorithm) is an algorithm to determine the greatest common divisor (gcd) of two integers. It is one of the oldest algorithms known, since it appeared in Euclid's Elements around 300 BC. However, the algorithm probably was not discovered by Euclid and it may have been known up to 200 years earlier. It was almost certainly known by Eudoxus of Cnidus (about 375 BC), and Aristotle (about 330 BC) hinted at it in his Topics, 158b, 29-35. The algorithm does not require factoring the two integers.

Algorithm and implementation

Given two natural numbers a and b, check if b is zero. If yes, a is the gcd. If not, repeat the process using b and the remainder after integer division of a by b (written a modulo b below). The algorithm can be naturally expressed using tail recursion.

By keeping track of the quotients occurring during the algorithm, one can also determine integers p and q with ap + bq = gcd(a, b). This is known as the extended Euclidean algorithm.

Euclid originally formulated the problem geometrically, as the problem of finding a common "measure" for two line lengths, and his algorithm proceeded by repeated subtraction of the shorter from the longer segment. This is equivalent to the following implementation, which is considerably less efficient than the method explained above:

    function gcd(a, b)
        while a ≠ b
            if a > b
                a := a - b
            else
                b := b - a
        return a

Proof of correctness

Suppose a and b are the numbers whose gcd has to be determined, and suppose the remainder of the division of a by b is t. Then a = qb + t, where q is the quotient of the division. Any common divisor of a and b also divides t (since t can be written as t = a − qb); similarly, any common divisor of b and t also divides a. Thus the greatest common divisor of a and b is the same as the greatest common divisor of b and t. Therefore it is enough to continue the process with the numbers b and t.
Since t is smaller in absolute value than b, we will reach t = 0 after finitely many steps.

Running time

[Figure: plot of the running time for gcd(x, y)]

When analyzing the running time of Euclid's algorithm, it turns out that the inputs requiring the most divisions are two successive Fibonacci numbers, and the worst case requires O(n) divisions, where n is the number of digits in the input. However, the divisions themselves are not atomic operations when the numbers exceed the natural size of the computer's arithmetic operations, since the operands can be as large as n digits. The actual running time is therefore $O(n^2)$.

This is, nevertheless, considerably better than Euclid's original algorithm, in which the modulus operation is effectively performed using repeated subtraction in $O(2^n)$ steps. Consequently, the subtraction-based version of the algorithm requires $O(n 2^n)$ time for n-digit numbers, or $O(m \log m)$ time for the number m.

Euclid's algorithm is widely used in practice, especially for small numbers, due to its simplicity. An alternative algorithm, the binary GCD algorithm, exploits the binary representation used by computers to avoid divisions and thereby increase efficiency, although it too is $O(n^2)$; it merely shrinks the constant hidden by the big-O notation on many real machines.

Continued fractions

The quotients that appear when the Euclidean algorithm is applied to the inputs a and b are precisely the numbers occurring in the continued fraction representation of a/b. Take for instance a = 1071 and b = 1029. Here is the calculation, in which the quotients are 1, 24 and 2:

    1071 = 1029 × 1 + 42
    1029 = 42 × 24 + 21
    42 = 21 × 2 + 0

From this, one can read off that

$\frac{1071}{1029} = 1 + \cfrac{1}{24 + \cfrac{1}{2}}$

This method can even be used for real inputs a and b; if a/b is irrational, then the Euclidean algorithm will not terminate, but the computed sequence of quotients still represents the (now infinite) continued fraction representation of a/b.
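As a rough illustration of the correspondence between Euclidean quotients and continued fractions, the following Python sketch records the quotients produced by the division-based algorithm and then rebuilds a/b from them. The function names euclid_quotients and from_continued_fraction are invented for this example and do not come from the text; the sketch assumes positive integer inputs.

```python
from fractions import Fraction


def euclid_quotients(a, b):
    """Run the division-based Euclidean algorithm on positive integers.

    Returns (gcd, quotients), where quotients holds the successive values
    of a // b. These are the continued fraction terms of a/b.
    """
    quotients = []
    while b != 0:
        quotients.append(a // b)
        a, b = b, a % b
    return a, quotients


def from_continued_fraction(terms):
    """Rebuild the exact fraction represented by a finite continued fraction."""
    value = Fraction(terms[-1])
    # Work from the innermost term outwards: value = term + 1 / value.
    for term in reversed(terms[:-1]):
        value = term + 1 / value
    return value


g, qs = euclid_quotients(1071, 1029)
print(g, qs)                        # 21 [1, 24, 2]
print(from_continued_fraction(qs))  # 51/49, i.e. 1071/1029 in lowest terms
```

Using exact Fraction arithmetic rather than floating point keeps the rebuilt value identical to a/b in lowest terms, which makes the link to the gcd visible: 1071/21 = 51 and 1029/21 = 49.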
C/C++ code

    /* Recursive Euclidean algorithm: gcd(a, 0) = a, otherwise gcd(b, a mod b). */
    int gcd(int a, int b) {
        if (b == 0)
            return a;
        else
            return gcd(b, a % b);
    }

This can be rewritten iteratively as the following pseudocode (Pascal-style notation):

    function gcd(a, b)
        while b ≠ 0
            var t := b
            b := a modulo b
            a := t
        return a

For example, the gcd of 1071 and 1029 is computed by this algorithm to be 21 with the following steps (the column t lists the remainder a modulo b at each step):

    a       b       t
    1071    1029    42
    1029    42      21
    42      21      0
    21      0

Quantum part: Period-finding subroutine

1. Start with a pair of input and output qubit registers with $\log_2 N$ qubits each, and initialize them to

   $N^{-1/2} \sum_{x} |x\rangle |0\rangle$

   where x runs from 0 to N − 1.
2. Construct f(x) as a quantum function and apply it to the above state, to obtain

   $N^{-1/2} \sum_{x} |x\rangle |f(x)\rangle$

3. Apply the quantum Fourier transform on the input register. The quantum Fourier transform on N points is defined by:

   $U_{QFT} |x\rangle = N^{-1/2} \sum_{y} e^{2\pi i x y / N} |y\rangle$

   This leaves us in the following state:

   $N^{-1} \sum_{x} \sum_{y} e^{2\pi i x y / N} |y\rangle |f(x)\rangle$

4. Perform a measurement. We obtain some outcome y in the input register and $f(x_0)$ in the output register. Since f is periodic, the probability of measuring a given y is

   $\left| N^{-1} \sum_{x:\, f(x) = f(x_0)} e^{2\pi i x y / N} \right|^2 = N^{-2} \left| \sum_{b} e^{2\pi i (x_0 + r b) y / N} \right|^2$

   Analysis shows that this probability is higher the closer yr/N is to an integer.
5. Turn y/N into an irreducible fraction, and extract the denominator r′, which is a candidate for r.
6. Check if f(x) = f(x + r′). If so, we are done.
7. Otherwise, obtain more candidates for r by using values near y, or multiples of r′. If any candidate works, we are done.
8. Otherwise, go back to step 1 of the subroutine.

Explanation of the algorithm

The algorithm is composed of two parts. The first part of the algorithm turns the factoring problem into the problem of finding the period of a function, and may be implemented classically. The second part finds the period using the quantum Fourier transform, and is responsible for the quantum speedup.

I. Obtaining factors from period

The integers less than N and coprime with N form a finite group under multiplication modulo N, which is typically denoted (Z/NZ)×. By the end of step 3 of the classical part, we have an integer a in this group.
Since the group is finite, a must have a finite order r, the smallest positive integer such that

$a^r \equiv 1 \pmod{N}$

Therefore, $N \mid (a^r - 1)$. Suppose we are able to obtain r, and it is even. Then

$a^r - 1 = (a^{r/2} - 1)(a^{r/2} + 1) \equiv 0 \pmod{N}$

so N divides the product $(a^{r/2} - 1)(a^{r/2} + 1)$. Since r is the smallest positive integer such that $a^r \equiv 1$, N cannot divide $(a^{r/2} - 1)$. If N also does not divide $(a^{r/2} + 1)$, then N must have a nontrivial common factor with each of $(a^{r/2} - 1)$ and $(a^{r/2} + 1)$.

Proof: For simplicity, denote $(a^{r/2} - 1)$ and $(a^{r/2} + 1)$ by u and v respectively. N | uv, so kN = uv for some integer k. Suppose gcd(u, N) = 1; then mu + nN = 1 for some integers m and n (this is a property of the greatest common divisor). Multiplying both sides by v, we find that mkN + nvN = v, so N | v. This contradicts the assumption that N does not divide v, so gcd(u, N) ≠ 1. By a similar argument, gcd(v, N) ≠ 1. Since N divides neither u nor v, both common factors are proper divisors of N.

This supplies us with a factorization of N. If N is the product of two primes, this is the only possible factorization.

II. Finding the period

Shor's period-finding algorithm relies heavily on the ability of a quantum computer to be in many states simultaneously. Physicists call this behavior a "superposition" of states. To compute the period of a function f, we evaluate the function at all points simultaneously.

Quantum physics does not allow us to access all this information directly, though. A measurement will yield only one of all possible values, destroying all others. Therefore we have to carefully transform the superposition to another state that will return the correct answer with high probability. This is achieved by the quantum Fourier transform.

Shor thus had to solve three "implementation" problems. All of them had to be implemented "fast", which means that they can be implemented with a number of quantum gates that is polynomial in log N.
1. Create a superposition of states. This can be done by applying Hadamard gates to all qubits in the input register. Another approach would be to use the quantum Fourier transform (see below).
2. Implement the function f as a quantum transform.
To achieve this, Shor used repeated squaring for his modular exponentiation transformation.
3. Perform a quantum Fourier transform. By using controlled NOT gates and single-qubit rotation gates, Shor designed a circuit for the quantum Fourier transform that uses just $O((\log N)^2)$ gates.

After all these transformations a measurement will yield an approximation to the period r. For simplicity assume that there is a y such that yr/N is an integer. Then the probability of measuring some such y is 1. To see this, notice that $e^{2\pi i b y r / N} = 1$ for all integers b. Therefore the sum that determines the probability of measuring y has magnitude N/r, since b takes roughly N/r values. With the normalization factor $N^{-2}$, each pairing of y with one of the r possible output values has probability $(N/r)^2 / N^2 = 1/r^2$, and summing over the r output values gives a probability of 1/r for y. There are r values of y such that yr/N is an integer, so the probabilities sum to 1.

Note: another way to explain Shor's algorithm is by noting that it is just the quantum phase estimation algorithm in disguise.

Modifications to Shor's Algorithm

There have been many modifications to Shor's algorithm. For example, Shor's original algorithm, like some of the other modifications, requires on the order of twenty to thirty runs on a quantum computer, whereas the modification by David McAnally at the University of Queensland requires on the order of only four to eight runs.

Realizations

7-qubit NMR (nuclear magnetic resonance) quantum computer, IBM: 15 = 3 × 5 (Shor's algorithm).

[Slide figures: initial state; state after the Hadamard transform on all bits of a; state after modular exponentiation, $a^b = x \pmod{N}$; state after the Fourier transform, which produces a periodic register in the target (register labels: target, ancilla, period); efficient quantum Fourier transform.]
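The classical reduction from the Procedure section can be tried out on an ordinary computer for the same number, 15, that the IBM experiment factored. The sketch below is only an illustration under a loud assumption: the quantum period-finding subroutine is replaced by a brute-force search, so nothing here is quantum and the running time is exponential in the bit length of N. The names order and factor_via_order are invented for this example, and the input is assumed to be an odd composite that is not a prime power.

```python
import math
import random


def order(a, n):
    """Smallest positive r with a**r % n == 1, found by brute force.

    Assumption: this stands in for the quantum period-finding subroutine.
    Requires gcd(a, n) == 1, otherwise the loop never reaches 1.
    """
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_via_order(n, rng=random):
    """Follow steps 1 to 7 of the classical part and return one nontrivial factor.

    Assumes n is an odd composite that is not a prime power; for a prime n
    the loop would never terminate.
    """
    while True:
        a = rng.randrange(2, n)        # step 1: pick 1 < a < n
        g = math.gcd(a, n)             # step 2
        if g != 1:                     # step 3: a already shares a factor
            return g
        r = order(a, n)                # step 4 (classical stand-in)
        if r % 2 == 1:                 # step 5: odd period, retry
            continue
        half = pow(a, r // 2, n)       # a^(r/2) mod n by fast exponentiation
        if half == n - 1:              # step 6: a^(r/2) ≡ -1 (mod n), retry
            continue
        return math.gcd(half - 1, n)   # step 7


print(order(7, 15))  # 4, since 7^4 = 2401 = 160 * 15 + 1
f = factor_via_order(15)
print(f, 15 // f)    # the two factors 3 and 5, in either order
```

With a = 7 the period is r = 4, so a^(r/2) = 49 ≡ 4 (mod 15), and gcd(4 − 1, 15) = 3 and gcd(4 + 1, 15) = 5 recover the factors.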