POSTED ON: 11/25/2011 Public Domain
Public Key Cryptosystem
Sheng Zhong

Recall Definition
• A public key cryptosystem is a tuple (M, C, K, G, E, D):
– M: cleartext message space
– C: ciphertext space
– K: key space
– G: generates an encryption/decryption key pair from the key length
– E: encrypts a cleartext given the encryption key
– D: decrypts a ciphertext given the decryption key

RSA Cryptosystem
• The best-known and most widely used public-key cryptosystem.
• Named after its inventors: Rivest, Shamir, and Adleman.
– They received the Turing Award for RSA.
• Based on the hardness of factoring large numbers.
– In fact, its security relies on more than that.

RSA Key Generation (1)
• Generate two large random primes p, q.
– p, q are usually of the same length.
• Compute $n = pq$.
• Choose an appropriate exponent e (of the same length).
• Compute d such that for all m, $m^{ed-1} \equiv 1 \pmod{n}$.

RSA Key Generation (2)
• Public Key: (n, e)
• Private Key: (n, d)
• Discard p, q.
– This is very important for the security of RSA.

RSA Encryption/Decryption
• Cleartext space: (an appropriate subset of) {1, …, n-1}
• Ciphertext space: (an appropriate subset of) {1, …, n-1}
• $E((n,e), m) = m^e \bmod n$.
• $D((n,d), c) = c^d \bmod n$.

Why does the decryption work?
$$
\begin{aligned}
c^d \bmod n &= (m^e)^d \bmod n \\
&= m^{ed} \bmod n \\
&= m^{(ed-1)+1} \bmod n \\
&= (m^{ed-1} \bmod n)(m \bmod n) \bmod n \\
&= 1 \cdot (m \bmod n) \\
&= m \bmod n \\
&= m
\end{aligned}
$$

Exponentiation Algorithm
• Both encryption and decryption need to compute modular exponentiation.
– What algorithm do we use to do this?
• By definition, $a^e$ is the product of e copies of a.
– Computed this way, even something like $7^{54238}$ takes a lot of time.
– How about computing $a^e$ for a=249734797411111243243243224324324234324234234320043543570 and e=…?

Fast Exponentiation (1)
• Let's write e in binary, for example: e=1010000000110
• Then
$$a^e = a^{1010000000110_b} = a^{1000000000000_b} \cdot a^{10000000000_b} \cdot a^{100_b} \cdot a^{10_b} = a^{2^{12}} \cdot a^{2^{10}} \cdot a^{2^{2}} \cdot a^{2^{1}}$$
• It is enough if we can quickly compute $a^{2^{12}}, a^{2^{10}}, a^{2^{2}}, a^{2^{1}}$.

Fast Exponentiation (2)
• But how can we quickly compute $a^{2^{12}}, a^{2^{10}}, a^{2^{2}}, a^{2^{1}}$?
– Start from a;
– Keep squaring to get $a^{2^1}, a^{2^2}, a^{2^3}, a^{2^4}, \ldots$
– Until we have all the items we need.

Fast Exponentiation Algorithm
• Input: a, e
• Output: y=a^e

    int b = a, y = 1, ee = e;
    while (ee != 0) {      // invariant: (b^ee) * y == a^e
        if (ee & 1) {      // ee is odd
            y *= b;        // multiply result by the current power
        }
        b *= b;            // compute next power
        ee >>= 1;
    }

• For RSA, reduce y and b mod n after every multiplication so that the intermediate values stay small.

Computing d?
• Now we have seen how the encryption/decryption algorithms work.
– But how does the key generation algorithm work?
– In particular, how do we find d?
– Recall that we need, for all m, $m^{ed-1} \equiv 1 \pmod{n}$.
– This requires a little number theory.

Residue Class (1)
• For any modulus n and any integer a, we can define $A = \{a' : a' \equiv a \pmod{n}\}$. This is called a residue class.
• For any modulus n, the residue classes mod n constitute a partition of the integers.
– Any two distinct residue classes mod n are disjoint.
– Every integer is in a residue class mod n.

Residue Class (2)
• Two integers in the same residue class are called modular equivalent mod n. This is an equivalence relation.
– Any integer is modular equivalent to itself.
– If a and b are modular equivalent, then b and a are modular equivalent.
– If a and b are modular equivalent, and b and c are modular equivalent, then a and c are modular equivalent.

Residue Class (3)
• Residue classes preserve basic arithmetic.
– If $a \equiv a' \pmod{n}$ and $b \equiv b' \pmod{n}$, then $a+b \equiv a'+b' \pmod{n}$.
– Similar properties hold for subtraction and multiplication, and for division when the divisor is coprime to n.
• So we usually use one element of a residue class to represent it.
– For example, we use 1 to represent {1, n+1, -n+1, 2n+1, -2n+1, …}

Complete Residue System
• There are n residue classes in total with respect to modulus n.
– We use 0, 1, 2, …, n-1 to represent each of them, respectively.
– Then we have a complete system of residue classes.
• Formally, we define $Z_n = \{0, 1, \ldots, n-1\}$.

Reduced Residue System
• We are particularly interested in the residue classes that are coprime to n.
– Note that if a number in a residue class is coprime to n, then so are all other numbers in this class.
– The set of these classes is called a reduced system of residue classes.
• Formally, we define $Z_n^* = \{a : \gcd(a,n)=1 \text{ and } a \in Z_n\}$.

Multiplication in Zn*
• Under multiplication mod n (we often omit "mod n" in what follows), $Z_n^*$ is a group. This means the following properties hold:
– Closure: if a and b are in $Z_n^*$, then ab is also in $Z_n^*$.
– Associativity: if a, b, c are in $Z_n^*$, then (ab)c=a(bc).
– Identity: 1 is in $Z_n^*$, and for all a, 1a=a1=a.
– Inverse: for all a in $Z_n^*$, 1/a is also in $Z_n^*$.
• Note that 1/a (mod n) is defined as the b such that $ab \equiv 1 \pmod{n}$.

Euler Totient Function
• We define $\Phi(n) = |Z_n^*|$.
– This is called the Euler totient function.
– Intuitively, this is the number of integers in {0,1,…, n-1} that are coprime to n.
• Suppose that n=ab.
– If gcd(a,b)=1, then $\Phi(n) = \Phi(a)\Phi(b)$.

Computing Φ(n) (1)
• Suppose $n = p_1^{n_1} \cdots p_k^{n_k}$ with distinct primes $p_1, \ldots, p_k$. Then
$$\Phi(n) = \Phi(p_1^{n_1}) \cdots \Phi(p_k^{n_k})$$

Computing Φ(n) (2)
• What is $\Phi(p^n)$ then, for a prime p and exponent n?
– $\Phi(p^n)$ is the number of integers in $\{0, 1, \ldots, p^n-1\}$ that are coprime to $p^n$.
– This is equal to the number of integers in $\{0, 1, \ldots, p^n-1\}$ that are coprime to p.
– This range contains $p^{n-1}$ multiples of p.
– So $\Phi(p^n) = p^n - p^{n-1} = p^{n-1}(p-1)$.
• Therefore,
$$\Phi(n) = p_1^{n_1-1}(p_1-1) \cdots p_k^{n_k-1}(p_k-1)$$

(Fermat-)Euler Theorem
• Another property of the Euler totient function: for all a in $Z_n^*$,
$$a^{\Phi(n)} \equiv 1 \pmod{n}$$
• Why is this true?
– Because $Z_n^*$ is a group and $\Phi(n)$ is its size…

Proof of Euler Theorem (1)
• In general, we show that for any finite commutative group G (such as $Z_n^*$) and any a in G, $a^{|G|} = 1$.
• Suppose the elements of G are $b_1, \ldots, b_{|G|}$. Consider $ab_1, \ldots, ab_{|G|}$.
– Clearly, no two elements $ab_i$ and $ab_k$ with $i \neq k$ in this sequence are equal. (Otherwise, multiplying both by 1/a shows that $b_i$ and $b_k$ are also equal.)
– So this is a permutation of $b_1, \ldots, b_{|G|}$.
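
The permutation step and the totient formula can be checked numerically on a small modulus. The following is only a minimal sketch under stated assumptions: the modulus 315, the base 2, and the helper names `units` and `phi_from_factorization` are illustrative choices that do not come from the slides, and trial division is used purely because the numbers are tiny.

```python
from math import gcd

def units(n):
    """Elements of Z_n^*: integers in {0, ..., n-1} that are coprime to n."""
    return [b for b in range(n) if gcd(b, n) == 1]

def phi_from_factorization(n):
    """Phi(n) as the product of p^(k-1) * (p - 1) over prime powers p^k dividing n."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            prime_power = 1
            while n % p == 0:          # strip every copy of p from n
                n //= p
                prime_power *= p
            result *= (prime_power // p) * (p - 1)
        p += 1
    if n > 1:                          # one prime factor above sqrt(n) may remain
        result *= n - 1
    return result

n = 315                                # 3^2 * 5 * 7, an arbitrary toy modulus
group = units(n)
a = 2                                  # any element of Z_n^*

# Multiplying every element by a only reorders the group (a permutation).
shifted = sorted(a * b % n for b in group)
print(shifted == group)

# The group size agrees with the product formula, and a^Phi(n) = 1 (mod n).
print(len(group) == phi_from_factorization(n))
print(pow(a, len(group), n))
```

Counting the units directly and using the prime-power formula give the same value, which is the content of the two "Computing Φ(n)" slides; the last line is Euler's theorem for one particular a.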

Proof of Euler Theorem (2)
• Since $ab_1, \ldots, ab_{|G|}$ is a permutation of $b_1, \ldots, b_{|G|}$, we have
$$ab_1 \cdots ab_{|G|} = b_1 \cdots b_{|G|}$$
• The above is equivalent to
$$a^{|G|} \, b_1 \cdots b_{|G|} = b_1 \cdots b_{|G|}$$
which means $a^{|G|} = 1$.

Returning to Key Generation
• Recall that we need to compute d such that for all m (in $Z_n^*$), $m^{ed-1} \equiv 1 \pmod{n}$.
• By Euler's theorem, this is implied by $\Phi(n) \mid ed-1$, which is equivalent to $ed \equiv 1 \pmod{\Phi(n)}$.

Returning to Computing d
• Strictly speaking, $m^{ed-1} \equiv 1 \pmod{n}$ holds for all m in $Z_n^*$ exactly when lcm(p-1, q-1) divides ed-1. Since lcm(p-1, q-1) divides $\Phi(n)$, the condition $ed \equiv 1 \pmod{\Phi(n)}$ is sufficient.
– So to compute d for key generation, we only need to find d such that $ed \equiv 1 \pmod{\Phi(n)}$.
– Let $u = \Phi(n)$. Now the question is to find an algorithm that, given e and u, computes d such that $ed \equiv 1 \pmod{u}$.

Compute Modular Inverse
• Recall the definition of 1/e modulo some number.
– Here we want exactly d = 1/e mod u.
– This is called the modular inverse.
• To compute d such that $ed \equiv 1 \pmod{u}$, we note that this is equivalent to the existence of v such that ed+uv=1.
• Now the problem becomes: given e, u, find d (and v) such that ed+uv=1.

Euclidean Algorithm
• Suppose e is coprime to u.
– Then gcd(e,u)=1.
– The Euclidean algorithm finds gcd(e,u) given e, u.
– In its extended form, this algorithm also gives d, v such that ed+uv=gcd(e,u).
– So we can apply it here to find the d we want!

How Does the Euclidean Algorithm Work?
• Question: given e, u, we need to compute gcd(e,u).
• We note that gcd(e,u) divides e mod u and u mod e, since it divides both e and u.
– So we start from e and u.
– Repeat dividing:
  • If the smaller number divides the greater number, stop and output the smaller.
  • Otherwise, replace the greater number with (greater mod smaller).

Why Does the Euclidean Algorithm Work?
• It must stop.
– The sum of the two numbers strictly decreases in every iteration.
– But the sum is always a positive integer.
– So the algorithm cannot run forever.
• When it stops, it gives gcd(e,u).
– Clearly, gcd(e,u) always divides the two variables in the algorithm, so it also divides the output.
– Then we only need to show that the output divides gcd(e,u).

Output Divides gcd(e,u)
• Since all common divisors of e and u divide gcd(e,u), we only need to show that the output is a common divisor of e and u.
– In the last iteration, the output is a common divisor of the two variables.
– When we go back one step, the same property holds, because the replaced number equals a multiple of the smaller number plus the remainder.
– Keep going back until we reach e, u.

Extended Euclidean Algorithm
• We still need to extend the Euclidean algorithm to find d, v such that ed+uv=gcd(e,u).
• Think of the first iteration of the Euclidean algorithm:
– We start from e=1e+0u and u=0e+1u.
– Suppose u>e; then we compute u mod e and use it to replace u.
– Here u mod e = 1·u - (u div e)·e.
– Note that we always know the coefficients of u and e for both variables.
– Keep going this way; finally we have the coefficients of e and u for the output, which are d and v.
• Done with computing d! And done with key generation.

RSA Assumption
• The security of the RSA cryptosystem is based on the RSA assumption: for all efficient algorithms A, for uniformly random k-bit primes p, q, for n=pq, for e uniformly picked from $Z_{\Phi(n)}^*$, for m uniformly picked from $Z_n^*$,
$$\Pr[A(n, e, m^e \bmod n) = m]$$
is negligible.

What does RSA Assumption imply? (1)
• Clearly, it implies that for a uniformly random message m, the adversary has only a negligible chance of finding m when given the ciphertext.
• Does it imply that finding d is hard?
– Yes, clearly: given d, it is easy to compute m.

What does RSA Assumption imply? (2)
• It implies that computing $\Phi(n)$ from n is hard.
– Otherwise, we can compute $d = 1/e \pmod{\Phi(n)}$ from $\Phi(n)$ and e using the Extended Euclidean Algorithm.
• It implies that factoring n is hard.
– Otherwise, we can factor n to get p and q; then we get $\Phi(n) = (p-1)(q-1)$.

What does RSA Assumption imply? (3)
• It implies that finding (a, b) with $a \not\equiv \pm b \pmod{n}$ and $a, b \not\equiv 0 \pmod{n}$ such that $a^2 \equiv b^2 \pmod{n}$ is hard.
– Otherwise, we have $a^2 - b^2 \equiv 0 \pmod{n}$, which means n divides (a-b)(a+b).
– Without loss of generality, suppose 0<b<a<n. Then 0<a-b<n and 0<a+b<2n.
– Since $a \not\equiv -b \pmod{n}$, we have a+b ≠ n. So n divides neither a-b nor a+b alone, which means a+b and a-b each contain exactly one of the two prime factors.
– Using the Euclidean algorithm, we can find gcd(a+b,n) and gcd(a-b,n), which are the two primes.

Factoring vs. RSA
• We have seen that the RSA assumption implies factoring is hard. But does the hardness of factoring imply the RSA assumption?
– This is an open question.
– Some results suggest that it probably does not.

Security of RSA
• Under the RSA assumption, RSA is "secure" not only against a passive adversary (as we have explained).
– It is also "secure" against a Chosen Plaintext Attack.
– Here "secure" only means that it is hard to find the cleartext (which is a weak notion).
• Chosen Plaintext Attack (CPA): a stronger model than the passive adversary.
– An active adversary can obtain the ciphertexts corresponding to the cleartexts he chooses.
– Any public key cryptosystem MUST be secure against CPA: by Kerckhoffs's principle the algorithm is known, and the encryption key is public, so the adversary can always encrypt cleartexts of his choice.

More on Active Adversary
• Chosen Ciphertext Attack (CCA):
– The adversary can obtain the cleartexts corresponding to the ciphertexts he chooses.
– However, this help is only available before the challenge ciphertext is given to the adversary.
• Adaptive Chosen Ciphertext Attack (CCA2):
– Unlike CCA, the help with decryption is always available.
– However, the adversary can't use this help to decrypt the ciphertext he is given. (Otherwise, no encryption algorithm can be secure.)

Attack on RSA: Short Message
• RSA is not randomized: each cleartext has only one ciphertext.
– If we know the message is only a few bits (say, 3 bits), we can test all possible cleartexts.
– We encrypt each of them and compare it with the ciphertext.
– The above attack is also valid whenever the message is known to belong to any small set.
– Imagine encrypting your vote in a presidential election using RSA!
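
The short-message attack above can be sketched in a few lines. This is a toy illustration, not a real-world attack tool: the public key (n, e) = (3233, 17), built from the small primes 61 and 53, and the function names `encrypt` and `guess_cleartext` are assumptions made for this example only.

```python
def encrypt(n, e, m):
    """Textbook RSA encryption: m^e mod n (deterministic, no padding)."""
    return pow(m, e, n)

def guess_cleartext(n, e, c, candidates):
    """Encrypt every candidate cleartext and compare it with the ciphertext c."""
    for m in candidates:
        if encrypt(n, e, m) == c:
            return m
    return None                        # c does not encrypt any candidate

n, e = 3233, 17                        # toy public key (n = 61 * 53); only the public key is used
vote = 2                               # secret cleartext known to fit in 3 bits
c = encrypt(n, e, vote)

# The adversary sees only (n, e, c) and tries all 3-bit cleartexts.
recovered = guess_cleartext(n, e, c, range(1, 8))
print(recovered)
```

The attacker never needs d: because encryption is deterministic and the key is public, the work is proportional to the number of candidate cleartexts, not to the key size.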

Attack on RSA: Meet in the Middle
• The attack on short messages can be enhanced by a meet-in-the-middle attack.
– With good probability, a short message can be factored into two even shorter messages: $m = m_1 m_2$.
– With RSA, $E(m) = m^e \bmod n = m_1^e m_2^e \bmod n = E(m_1)E(m_2) \bmod n$.
– Then we can encrypt the even shorter messages and compare their products with the ciphertext: tabulate $E(m_1)$ for every candidate $m_1$, then look for an $m_2$ such that $c \cdot E(m_2)^{-1} \bmod n$ appears in the table.

Attack in Real Life
• Such an attack is especially effective when RSA is used to encrypt a DES key.
– A DES key is only 56 bits long.
– Thus finding the DES key is easy with a meet-in-the-middle attack.
• We can also use it to attack RSA-encrypted passwords.
• So do NOT simply use RSA to encrypt passwords and DES keys.

Attack on Short Message and RSA Security
• Why is the above attack possible while RSA is "secure"?
– Because RSA is secure only in the sense that it is hard to find the cleartext.
– Not in the sense that it is hard to find any partial information about the cleartext (i.e., it is not semantically secure).
– With a short message, we essentially know everything about the message except a small part.
– This remaining small part can't be protected by RSA.

Attack on RSA: CCA2
• Consider an active adversary in the CCA2 model.
– The adversary is given ciphertext c, but he can't use the decryption help to get the cleartext of c directly.
– However, he can choose m' (invertible mod n, with m' ≠ 1) and compute $c' = c \cdot (m')^e \bmod n$.
– Then he asks for the decryption of c'.
– He gets $(c')^d = (c \cdot (m')^e)^d = c^d (m')^{ed} = c^d m' \pmod{n}$.
– So he can compute $c^d$ from $(c')^d$ by dividing by m' mod n.
• Thus RSA is not secure against CCA2.

Common Misuse of RSA
• In network security, to speed up computation, people often use a small exponent e.
– Usually e=3; sometimes e=11.
– Many protocols/systems include such examples.
– Is this secure?
• Unfortunately, a small e opens the door to attacks, for example on short or related messages.
– Efficient algorithms have been published for such e (e.g., Coppersmith's algorithm).
– Similarly, using a small d is also subject to attack.

Rabin Cryptosystem
• Another popular public key cryptosystem.
• Invented by Michael Rabin (Turing Award winner).
• Based on the hardness of computing square roots mod n.

Rabin: Key Generation
• Choose two primes p, q of the same length.
• Compute n=pq.
• Pick an integer b in $Z_n^*$.
• Public Key: (n, b)
• Private Key: (p, q)

Rabin: Encryption & Decryption
• For cleartext message m, the ciphertext is $c = m(m+b) \bmod n$.
• For ciphertext c, the cleartext is
$$m = \frac{-b + \sqrt{b^2+4c}}{2} \bmod n$$
where $\sqrt{b^2+4c}$ is a square root of $b^2+4c$ mod n that is less than n and greater than 0.

Computing Square Root Mod n
• For decryption, we need $\sqrt{b^2+4c}$, a square root of $b^2+4c$ mod n.
– First, we compute a square root of $b^2+4c$ mod p. Suppose this is $r_1$.
– Second, we compute a square root of $b^2+4c$ mod q. Suppose this is $r_2$.
– Third, we compute r such that $r \equiv r_1 \pmod{p}$ and $r \equiv r_2 \pmod{q}$.
– r is what we want.

Chinese Remainder Theorem
• Why is r what we want? This is based on the Chinese Remainder Theorem: for $n = n_1 \cdots n_k$ (where $n_1, \ldots, n_k$ are pairwise coprime), $r \equiv x \pmod{n}$ if and only if $r \equiv x \pmod{n_1}, \ldots, r \equiv x \pmod{n_k}$.
– Here $r^2 \equiv b^2+4c$ holds both mod p and mod q, so $r^2 \equiv b^2+4c \pmod{n}$.

Computing Square Root Mod Prime
• But we still need an algorithm to compute a square root of $b^2+4c$ modulo a prime (where the prime is p and q, respectively).
– When the prime p=4k+3, it is easy to compute square roots mod p:
$$r_1 = \pm (b^2+4c)^{(p+1)/4} \pmod{p}$$
– For p=4k+1 we also have algorithms. But we often choose both p and q ≡ 3 (mod 4); in this case, n is called a Blum integer.

Proof for Square Root Mod Prime (1)
• Why is $r_1$ a square root?
– It is easy to see that $(r_1)^2 = (b^2+4c)^{(p+1)/2} \pmod{p}$.
– Since $b^2+4c$ does have square roots, $(b^2+4c)^{1/2}$ is meaningful. So we can rewrite the above as
$$(r_1)^2 = \left((b^2+4c)^{1/2}\right)^{p+1} \pmod{p}$$

Proof for Square Root Mod Prime (2)
• Recall that $\Phi(p) = p-1$. So we have
$$
\begin{aligned}
(r_1)^2 &= \left((b^2+4c)^{1/2}\right)^{p+1} \pmod{p} \\
&= \left((b^2+4c)^{1/2}\right)^{\Phi(p)+2} \pmod{p} \\
&= \left((b^2+4c)^{1/2}\right)^{2} \pmod{p} \quad \text{(by Euler's theorem)} \\
&= b^2+4c \pmod{p}
\end{aligned}
$$

Number of Roots
• Note that we have two square roots mod each prime.
– So there are four square roots mod n.
• Which of these four gives the original cleartext?
– We should add redundancy to the cleartext, so that it can be recognized.

Security of Rabin
• It is hard to find a cleartext of the Rabin cryptosystem under a CPA attack.
– This is true under the assumption that factoring n is hard.
• Why?
– Suppose there is an efficient algorithm A that computes m from n, b, c.
– Then we can construct an efficient algorithm A' that factors n.

Reduction of Factorization to Rabin Decryption
• Choose m at random.
• Compute $c = m(m+b) \bmod n$.
• Use A to decrypt c; suppose we get m'.
• Claim: gcd(m-m', n) is a prime factor of n with probability 1/2.
– But why? We need a closer look at the cleartext.

Closer Look at Cleartext
• What is m'? Is it necessarily equal to m?
– Not really.
– Recall that the decryption is $(-b+\sqrt{b^2+4c})/2$.
– There are four possible roots of $b^2+4c$.
– So there are four possible cleartexts, one of which is equal to m, and one of which is equal to m'.
• No algorithm can distinguish m from the other three cleartexts, given n, b, c.
– This is because you get the same c when you start from n, b, and any of the other three cleartexts.

Four Possible Cleartexts
• What are these four possible cleartexts?
– Let $r_1$ be a square root of $b^2+4c$ mod p.
– Let $r_2$ be a square root of $b^2+4c$ mod q.
– By the Chinese Remainder Theorem:
  • $m_1$ corresponds to a root $R_1$ such that $R_1 \equiv r_1 \pmod{p}$, $R_1 \equiv r_2 \pmod{q}$
  • $m_2$: $R_2 \equiv -r_1 \pmod{p}$, $R_2 \equiv r_2 \pmod{q}$
  • $m_3$: $R_3 \equiv r_1 \pmod{p}$, $R_3 \equiv -r_2 \pmod{q}$
  • $m_4$: $R_4 \equiv -r_1 \pmod{p}$, $R_4 \equiv -r_2 \pmod{q}$

Grouping Cleartexts
• We put the four cleartexts (and the corresponding roots) on the corners of a rectangle; the two signs are the signs of $r_1$ and $r_2$:

    m1 (R1): + +      m3 (R3): + -
    m2 (R2): - +      m4 (R4): - -

– If we are lucky enough to get two cleartexts that are neighbors, we can factor n.

Factoring n based on Two Neighbor Cleartexts
• Note that any two neighbor roots are
– Modular equivalent mod one prime factor;
– Modular negatives of each other mod the other prime factor.
• So their difference is
– 0 mod the first prime factor;
– Non-zero mod the second prime factor.
• Thus gcd(difference, n) is the first prime factor.

Back to m and m'
• Since each root is R = 2m+b, the difference between two roots is 2 times the difference between the corresponding cleartexts. As n is odd,
$$\gcd(R_1-R_2, n) = \gcd(2(m_1-m_2), n) = \gcd(m_1-m_2, n)$$
• So we succeed in factoring if m and m' are neighbors on the rectangle.

Probability of Success
• What is the probability of success?
– Given c, m is uniformly distributed over the four possible cleartexts and independent of m'.
– Two of the four are neighbors of m', so the probability is 1/2.
• Done with the security of the Rabin cryptosystem.

Semantic Insecurity
• Although the Rabin cryptosystem makes it hard to find the cleartext, it does not provide a stronger security guarantee.
• Just like RSA, the Rabin cryptosystem is deterministic.
– Each cleartext has a unique encryption.
– Thus it can't be semantically secure.

Insecurity against CCA
• Recall that the security is with respect to CPA.
– It no longer holds with respect to CCA.
• With CCA, the adversary can ask for decryptions.
– So he picks a message at random, encrypts it, and asks for the decryption.
– Then, with probability 1/2, he factors n using his originally picked cleartext and the decryption he gets; he repeats until he succeeds.
– The Rabin cryptosystem is totally broken.

Optional Topic: Formal Treatment of Public Key Cryptosystem
• Public key cryptosystems like RSA and Rabin can be modeled as trapdoor one-way functions.
• A trapdoor one-way function is actually a family of functions $\{f_i\}$ with efficient algorithms I, D, F and two security properties.
– For convenience, in the following we assume all $f_i$ are bijections.

Trapdoor One-way Function (1)
• The three efficient algorithms:
– F: computes $f_i(x)$ from i and x.
– D: samples x from the domain of $f_i$.
– I: takes the security parameter k as input and outputs an index i and the corresponding trapdoor t.

Trapdoor One-way Function (2)
• Security property: Hardness to Invert.
For index i (distributed as the output of I) and x (distributed as the output of D), for all efficient algorithms A, for all polynomials p(), for all sufficiently large k,
$$\Pr[A(i, f_i(x)) = x] < 1/p(k)$$
(This definition only works for $f_i$ with large domains.)

Trapdoor One-way Function (3)
• Security property: Easiness to Invert with Trapdoor.
There exists an efficient algorithm A such that, for all (i, t) in the range of I and for all x,
$$A(f_i(x), t) = x$$

Trapdoor OWF and PKC
• Index: Encryption key.
• Trapdoor: Decryption key.
• I: Key generation algorithm.
• D: Cleartext sampling algorithm.
• F: Encryption algorithm.
• Hardness to invert: Security guarantee of PKC.
• Easiness to invert with trapdoor: Existence of decryption algorithm.
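
The correspondence above can be made concrete with textbook RSA as the function family. The sketch below is a toy under loudly stated assumptions: the function names (`extended_gcd`, `gen_index`, `sample`, `evaluate`, `invert`) are hypothetical, and `gen_index` takes fixed small primes and an exponent instead of a security parameter k, so it is not a secure instantiation of I.

```python
import random
from math import gcd

def extended_gcd(e, u):
    """Return (g, d, v) with e*d + u*v == g == gcd(e, u)."""
    r0, r1 = e, u
    d0, d1 = 1, 0                      # coefficients of e for r0 and r1
    v0, v1 = 0, 1                      # coefficients of u for r0 and r1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        d0, d1 = d1, d0 - q * d1
        v0, v1 = v1, v0 - q * v1
    return r0, d0, v0

def gen_index(p, q, e):
    """Plays the role of I: outputs index i = (n, e) and trapdoor t = d."""
    n, u = p * q, (p - 1) * (q - 1)    # u = Phi(n)
    g, d, _ = extended_gcd(e, u)
    if g != 1:
        raise ValueError("e must be coprime to Phi(n)")
    return (n, e), d % u               # normalize d into {0, ..., u-1}

def sample(i, rng):
    """Plays the role of D: samples x uniformly from Z_n^*."""
    n, _ = i
    while True:
        x = rng.randrange(1, n)
        if gcd(x, n) == 1:
            return x

def evaluate(i, x):
    """Plays the role of F: computes f_i(x) = x^e mod n."""
    n, e = i
    return pow(x, e, n)

def invert(i, t, y):
    """Inversion with the trapdoor: y^d mod n (the index supplies n)."""
    n, _ = i
    return pow(y, t, n)

index, trapdoor = gen_index(61, 53, 17)    # toy primes, for illustration only
x = sample(index, random.Random(2011))
y = evaluate(index, x)
print(invert(index, trapdoor, y) == x)
```

In this sketch the inverter receives n through the index as well as the trapdoor d, which matches the private key (n, d) from the key generation slides.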