									Public Key Cryptosystem

      Sheng Zhong

           Recall Definition
• A public key cryptosystem is (M, C, K, G,
  E, D):
  – M: cleartext message space
  – C: ciphertext space
  – K: key space
  – G: generate encryption/decryption key pair
    from key length
  – E: encrypt cleartext given encryption key
  – D: decrypt ciphertext given decryption key

          RSA Cryptosystem
• The best-known and most widely used
  public-key cryptosystem.
• Named after its inventors: Rivest, Shamir,
  and Adleman.
  – They received the Turing Award for RSA.
• Based on the hardness of factoring large numbers.
  – In fact, it relies on more than that (see the RSA assumption).

      RSA Key Generation (1)
• Generate two large random primes p, q
  – p, q are usually of the same length.
• n=pq
• Choose appropriate exponent e (of the same
• Compute d such that for all m,

           $m^{ed-1} \equiv 1 \pmod{n}$
     RSA Key Generation (2)
• Public Key: (n,e)
• Private Key: (n,d)
• Discard p,q
  – Very important for security of RSA.

   RSA Encryption/Decryption
• Cleartext space: (an appropriate subset
  of) {1, …, n-1}
• Ciphertext space: (an appropriate subset
  of) {1, …, n-1}
• $E((n,e), m) = m^e \bmod n$.
• $D((n,d), c) = c^d \bmod n$.
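
The two formulas above can be tried on a toy key. The following C sketch is only an illustration: the function names mod_pow, rsa_encrypt, and rsa_decrypt are assumptions made here, and the key n=3233 (=61·53), e=17, d=2753 is a textbook-sized example that is far too small for real use. The code assumes n < 2^32 so that every product fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (base^exp) mod n by repeated squaring.
   Assumes 0 < n < 2^32 so that every product fits in 64 bits. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) {                      /* lowest bit of exp is set */
            result = (result * base) % n;
        }
        base = (base * base) % n;           /* next power of two */
        exp >>= 1;
    }
    return result;
}

/* E((n,e), m) = m^e mod n */
uint64_t rsa_encrypt(uint64_t n, uint64_t e, uint64_t m) {
    assert(m < n);
    return mod_pow(m, e, n);
}

/* D((n,d), c) = c^d mod n */
uint64_t rsa_decrypt(uint64_t n, uint64_t d, uint64_t c) {
    assert(c < n);
    return mod_pow(c, d, n);
}
```

With this toy key, rsa_encrypt(3233, 17, 65) gives 2790, and rsa_decrypt(3233, 2753, 2790) gives back 65.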

Why does the decryption work?
$c^d \bmod n = (m^e)^d \bmod n$
         $= m^{ed} \bmod n$
         $= m^{(ed-1)+1} \bmod n$
         $= ((m^{ed-1} \bmod n) \cdot (m \bmod n)) \bmod n$
         $= (1 \cdot (m \bmod n)) \bmod n$
         $= m \bmod n$
     Exponentiation Algorithm
• Both encryption and decryption need to
  compute modular exponentiation.
  – What algorithm do we use to do this?
• By definition, $a^e$ is the product of e copies of a.
  – Then even computing something like 754238
    takes a lot of time.
  – How about computing $a^e$ for
    a=34324234234320043543570 and e=…?
        Fast Exponentiation(1)
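• Let’s write e in binary, for example e = 1010000000110b.
• Then
       $a^e = a^{1010000000110\text{b}}$
           $= a^{1000000000000\text{b}} \cdot a^{10000000000\text{b}} \cdot a^{100\text{b}} \cdot a^{10\text{b}}$
           $= a^{2^{12}} \cdot a^{2^{10}} \cdot a^{2^{2}} \cdot a^{2^{1}}$
• It is enough if we can quickly compute $a^{2^{12}}, a^{2^{10}}, a^{2^{2}}, a^{2^{1}}$.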
      Fast Exponentiation (2)
• But how can we quickly compute
               $a^{2^{12}}, a^{2^{10}}, a^{2^{2}}, a^{2^{1}}$?
  – Start from a.
  – Keep squaring; we get
       $a^{2^1}, a^{2^2}, a^{2^3}, a^{2^4}, \ldots$
  – Continue until we have all the items we need.

 Fast Exponentiation Algorithm
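• Input: a, e (with e >= 0)
• Output: y=a^e
int b = a, y = 1, ee = e;
while (ee != 0) {      // invariant: (b^ee) * y == a^e
  if (ee & 1) {        // lowest bit of ee is set (ee is odd)
    y *= b;            // multiply result by the current power
  }
  b *= b;              // compute next power: a^(2^i) -> a^(2^(i+1))
  ee >>= 1;            // drop the lowest bit of ee
}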
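
With plain int variables the loop above overflows almost immediately, so for RSA the same idea is normally applied with a reduction mod n after every multiplication. The following C sketch shows one way to do that; the name mod_pow and the restriction n < 2^32 (so that products fit in 64 bits) are assumptions of this sketch, and real RSA moduli need a big-integer library instead.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (a^e) mod n with the square-and-multiply loop.
   Assumes 0 < n < 2^32 so that every product fits in 64 bits. */
uint64_t mod_pow(uint64_t a, uint64_t e, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t b = a % n;       /* current power a^(2^i) mod n */
    uint64_t y = 1 % n;       /* result accumulated so far */
    while (e != 0) {          /* invariant: (b^e) * y == a^e (mod n) */
        if (e & 1) {
            y = (y * b) % n;  /* this bit of e contributes b */
        }
        b = (b * b) % n;      /* square to get the next power */
        e >>= 1;
    }
    return y;
}
```

For example, mod_pow(4, 13, 497) returns 445, using only a handful of multiplications instead of twelve.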

            Computing d?
• Now we see how the encryption/decryption
  algorithms work.
  – But how does the key generation algorithm work?
  – In particular, how do we find d?
  – Recall that we need, for all m, $m^{ed-1} \equiv 1 \pmod{n}$.
  – This requires a little number theory.

          Residue Class (1)
• For any modulus n and any integer a, we can
  define the set
              A={a’: a’=a mod n}
  This is called a residue class.
• For any modulus n, the residue classes
  mod n constitute a partition of integers.
  – Any two residue classes mod n are disjoint.
  – Every integer is in a residue class mod n.
          Residue Class (2)
• Two integers in the same residue class are called
  modular equivalent mod n. This is an
  equivalence relation.
  – Any integer is modular equivalent to itself.
  – If a and b are modular equivalent, then b and
    a are modular equivalent.
  – If a and b are modular equivalent, and if b and
    c are modular equivalent, then a and c are
    modular equivalent.
           Residue Class (3)
• Residue classes preserve basic arithmetic.
  – If a=a’ (mod n), b=b’ (mod n), then a+b=a’+b’
    (mod n)
  – Similar properties hold for subtraction and
    multiplication (and for division by numbers coprime to n).
• So we usually use one element of a residue
  class to represent it.
  – For example, we use 1 to represent {1, n+1,
    -n+1, 2n+1, -2n+1, …}
    Complete Residue System
• There are n residue classes in total with
  respect to modulus n.
  – We use 0, 1, 2, …, n-1 to represent each of
    them, respectively.
  – Then we have a complete system of residues.
• Formally, we define $Z_n = \{0, 1, \ldots, n-1\}$.

    Reduced Residue System
• We are particularly interested in the
  residue classes that are coprime to n.
  – Note that if a number in a residue class is
    coprime to n, then so are all other numbers in
    this class.
  – The representatives of these classes form a reduced
    system of residues.
• Formally, we define $Z_n^* = \{a : a \in Z_n \text{ and } \gcd(a,n)=1\}$.
          Multiplication in $Z_n^*$
• For multiplication (mod n; we often omit
  mod n in what follows), $Z_n^*$ is a group. This
  means the following properties hold:
  – It’s closed: if a and b are in $Z_n^*$, then ab is
    also in $Z_n^*$.
  – If a,b,c are in $Z_n^*$, then (ab)c=a(bc).
  – 1 is in $Z_n^*$. Note that for all a, 1a=a1=a.
  – For all a in $Z_n^*$, 1/a is also in $Z_n^*$.
     • Note 1/a (mod n) is defined as b such that ab=1
       (mod n)
       Euler Totient Function
• We define $\Phi(n) = |Z_n^*|$.
  – This is called the Euler totient function.
  – Intuitively, this is the number of integers in
    {0,1,…, n-1} that are coprime to n.
• Suppose that n=ab.
  – If gcd(a,b)=1, then $\Phi(n) = \Phi(a)\,\Phi(b)$.

         Computing Φ(n) (1)
• Suppose
            $n = p_1^{n_1} \cdots p_k^{n_k}$
  where $p_1, \ldots, p_k$ are distinct primes. Then
        $\Phi(n) = \Phi(p_1^{n_1}) \cdots \Phi(p_k^{n_k})$

          Computing Φ(n) (2)
• What is $\Phi(p^n)$ then?
  – $\Phi(p^n)$ is the number of integers in $\{0, 1, \ldots, p^n-1\}$
    that are coprime to $p^n$.
  – This is equal to the number of integers in $\{0, 1, \ldots, p^n-1\}$
    that are coprime to p.
  – There are $p^{n-1}$ multiples of p in this range.
  – So $\Phi(p^n) = p^n - p^{n-1} = p^{n-1}(p-1)$.
• Therefore,
        $\Phi(n) = p_1^{n_1-1}(p_1-1) \cdots p_k^{n_k-1}(p_k-1)$
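
The product formula above translates directly into code once n is factored. The C sketch below factors n by trial division, which is only practical for small n; the name euler_phi is an assumption of this sketch, not something defined in the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Euler totient of n, computed from the prime factorization of n.
   Each prime power p^k dividing n contributes p^(k-1) * (p - 1). */
uint64_t euler_phi(uint64_t n) {
    assert(n > 0);
    uint64_t phi = 1;
    for (uint64_t p = 2; p <= n / p; p++) {   /* p*p <= n without overflow */
        if (n % p == 0) {
            n /= p;
            phi *= p - 1;                     /* the factor (p - 1) */
            while (n % p == 0) {
                n /= p;
                phi *= p;                     /* the factor p^(k-1) */
            }
        }
    }
    if (n > 1) {
        phi *= n - 1;   /* what is left is a single prime with exponent 1 */
    }
    return phi;
}
```

For example, euler_phi(36) is 12 and euler_phi(3233) is 3120 = (61-1)(53-1).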
      (Fermat-)Euler Theorem
• Another property of the Euler totient function: for
  all a in $Z_n^*$,

                $a^{\Phi(n)} \equiv 1 \pmod{n}$
• Why is this true?
  – Because $Z_n^*$ is a group and Φ(n) is its size…

   Proof of Euler Theorem (1)
• In general, we show that, for any finite commutative
  group G and any a in G, $a^{|G|} = 1$.
• Suppose the elements of G are $b_1, \ldots, b_{|G|}$.
  Consider $ab_1, \ldots, ab_{|G|}$.
  – Clearly, any two elements $ab_i$ and $ab_k$ (i ≠ k) in this
    sequence are not equal. (Otherwise $b_i$ and $b_k$
    would also be equal.)
  – So this is a permutation of $b_1, \ldots, b_{|G|}$.

    Proof of Euler Theorem (2)
• Since $ab_1, \ldots, ab_{|G|}$ is a permutation of
  $b_1, \ldots, b_{|G|}$, we have
            $ab_1 \cdots ab_{|G|} = b_1 \cdots b_{|G|}$
• The above is equivalent to
           $a^{|G|} \, b_1 \cdots b_{|G|} = b_1 \cdots b_{|G|}$
  which means $a^{|G|} = 1$.
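
For a small modulus, the Euler theorem can be checked exhaustively. The C sketch below counts the elements of the reduced residue system and then tests every one of them; all function names are assumptions of this sketch, and n < 2^32 is assumed so that products fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by repeated remainders. */
uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* (base^exp) mod n by repeated squaring; assumes 0 < n < 2^32. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) {
            result = (result * base) % n;
        }
        base = (base * base) % n;
        exp >>= 1;
    }
    return result;
}

/* Size of the reduced residue system: the number of a in {0..n-1}
   with gcd(a, n) = 1. */
uint64_t group_order(uint64_t n) {
    uint64_t count = 0;
    for (uint64_t a = 0; a < n; a++) {
        if (gcd_u64(a, n) == 1) {
            count++;
        }
    }
    return count;
}

/* Returns 1 if a^order == 1 (mod n) for every a coprime to n, else 0. */
int euler_theorem_holds(uint64_t n) {
    uint64_t order = group_order(n);
    for (uint64_t a = 0; a < n; a++) {
        if (gcd_u64(a, n) == 1 && mod_pow(a, order, n) != 1 % n) {
            return 0;
        }
    }
    return 1;
}
```

For n=15 the group has 8 elements and the check succeeds. The coprimality condition matters: 3 is not coprime to 15, and mod_pow(3, 8, 15) is 6, not 1.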

  Returning to Key Generation
• Recall we need to compute d such that for
  all m (in $Z_n^*$),
                $m^{ed-1} \equiv 1 \pmod{n}$
• Due to the Euler theorem, this is implied by
                  $\Phi(n) \mid ed-1$
  which is equivalent to
               $ed \equiv 1 \pmod{\Phi(n)}$
    Returning to Computing d
• So (for all m in $Z_n^*$) $m^{ed-1} \equiv 1 \pmod{n}$ is implied by
  $ed \equiv 1 \pmod{\Phi(n)}$. (Strictly, the exact condition is
  $ed \equiv 1 \pmod{\lambda(n)}$, where the Carmichael function
  λ(n) divides Φ(n).)
  – So to compute d for key generation, it suffices
    to find d such that $ed \equiv 1 \pmod{\Phi(n)}$.
  – Let u= Φ(n). Now the question is to find an
    algorithm that computes d such that ed=1
    (mod u) when given e and u.

    Compute Modular Inverse
• Recall the definition of 1/e mod something.
  – Here we want to find exactly d=1/e mod u.
  – This is called the modular inverse.
• To compute d such that ed=1 (mod u), we
  note that this is equivalent to: there exists v
  such that
                  ed+uv=1
• Now the problem becomes: given e, u, find
  d (and v) such that ed+uv=1.
        Euclidean Algorithm
• Suppose e is coprime to u.
  – Then gcd(e,u)=1.
  – The Euclidean algorithm finds
    gcd(e,u) given e,u.
  – Its extended version also gives d,v such that
    ed+uv=gcd(e,u).
  – So we can apply it here to find the d we want!

 How Does the Euclidean Algorithm Work?
• Question: Given e, u, we need to compute
  gcd(e,u).
• We note that gcd(e,u) divides e mod u and
  u mod e, since it divides both e and u.
  – So we start from e and u.
  – Repeat dividing:
    • If the smaller number divides the greater number,
      stop and output the smaller.
    • Otherwise, replace the greater number with
      (greater mod smaller).
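
The procedure above can be written almost word for word in C. This is a sketch for positive inputs; the name euclid_gcd is an assumption made here.

```c
#include <assert.h>
#include <stdint.h>

/* Euclidean algorithm, following the two rules on the slide:
   stop when the smaller number divides the greater one, otherwise
   replace the greater number with (greater mod smaller). */
uint64_t euclid_gcd(uint64_t e, uint64_t u) {
    assert(e > 0 && u > 0);
    for (;;) {
        uint64_t smaller = e < u ? e : u;
        uint64_t greater = e < u ? u : e;
        if (greater % smaller == 0) {
            return smaller;              /* smaller divides greater */
        }
        e = smaller;
        u = greater % smaller;           /* nonzero, and less than smaller */
    }
}
```

For example, euclid_gcd(1071, 462) goes through the pairs (1071, 462), (462, 147), (147, 21) and returns 21.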

 Why Does the Euclidean Algorithm Work?
• It must stop.
   – Otherwise, the sum of the two numbers is always
     decreasing.
   – But the sum is always a positive integer.
   – This is impossible.
• When it stops, it gives gcd(e,u).
   – Clearly, gcd(e,u) always divides the two variables in
     the algorithm; so it also divides the output.
   – Then we only need to show the output divides
     gcd(e,u).
      Output Divides gcd(e,u)
• Since all common divisors of e and u
  divide gcd(e,u), we only need to show the
  output is a common divisor.
  – In the last iteration, the output is a common
    divisor of the two variables.
  – When we go back one step, we have the
    same property.
  – Keep going until we reach e, u.

 Extended Euclidean Algorithm
• We still need to extend the Euclidean algorithm to find
  d, v such that ed+uv=gcd(e,u).
• Think of the first iteration of the Euclidean Algorithm:
  – We start from e=1e+0u and u=0e+1u.
  – Suppose u>e; then we compute u mod e and use it to
    replace u.
  – Here u mod e = 1 u - (u div e) e.
  – Note we always know the coefficients of u and e.
  – Keep going this way; finally we have the coefficients
    of e and u for the output: d and v.
• Done with computing d! And done with key
  generation.
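
The bookkeeping of coefficients described above can be sketched in C as follows. The names ext_gcd and mod_inverse are assumptions of this sketch, and it uses 64-bit signed integers, so it only suits small toy parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Extended Euclidean algorithm.
   Returns gcd(e, u) and stores d, v with e*d + u*v == gcd(e, u). */
int64_t ext_gcd(int64_t e, int64_t u, int64_t *d, int64_t *v) {
    assert(e > 0 && u > 0);
    /* Invariants: r0 == e*s0 + u*t0 and r1 == e*s1 + u*t1. */
    int64_t r0 = e, s0 = 1, t0 = 0;
    int64_t r1 = u, s1 = 0, t1 = 1;
    while (r1 != 0) {
        int64_t q = r0 / r1;
        int64_t r2 = r0 - q * r1;   /* r0 mod r1 */
        int64_t s2 = s0 - q * s1;   /* the same step on the coefficients */
        int64_t t2 = t0 - q * t1;
        r0 = r1; s0 = s1; t0 = t1;
        r1 = r2; s1 = s2; t1 = t2;
    }
    *d = s0;
    *v = t0;
    return r0;
}

/* Returns d in {0..u-1} with e*d == 1 (mod u); requires gcd(e, u) == 1. */
int64_t mod_inverse(int64_t e, int64_t u) {
    int64_t d, v;
    int64_t g = ext_gcd(e, u, &d, &v);
    assert(g == 1);
    d %= u;
    if (d < 0) {
        d += u;     /* normalize a negative coefficient into 0..u-1 */
    }
    return d;
}
```

For e=17 and u=3120 the algorithm finds 17·(-367) + 3120·2 = 1, so mod_inverse(17, 3120) returns -367 + 3120 = 2753.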
           RSA Assumption
• The security of the RSA cryptosystem is
  based on the RSA assumption:
  For every efficient algorithm A, for uniformly
  random k-bit primes p, q, for n=pq, for e
  uniformly picked from $Z_{\Phi(n)}^*$, for m
  uniformly picked from $Z_n^*$,
             $\Pr[A(n, e, m^e \bmod n) = m]$
  is negligible.
What does RSA Assumption imply?
• Clearly, it implies that for a uniformly
  random message m, the adversary can’t
  get any advantage for finding m when
  given the ciphertext.
• Does it imply finding d is hard?
  – Yes, also clearly: given d, it is easy to
    compute m.

What does RSA Assumption imply?
• It implies computing Φ(n) from n is hard.
  – Otherwise, we can compute d=1/e (mod Φ(n) )
    from Φ(n) and e using the Extended Euclidean
    Algorithm.
• It implies factoring n is hard.
  – Otherwise, we can factor n to get p and q; then
    we get Φ(n) =(p-1)(q-1).

What does RSA Assumption imply?
• It implies finding (a, b) with $a \not\equiv \pm b \pmod{n}$ and
  $a, b \not\equiv 0 \pmod{n}$ such that $a^2 \equiv b^2 \pmod{n}$ is hard.
   – Otherwise, we have $a^2 - b^2 \equiv 0 \pmod{n}$, which means n
     divides (a-b)(a+b).
   – Without loss of generality, suppose 0<b<a<n. Then
     0< a-b <n, 0<a+b<2n.
   – Since $a \not\equiv -b \pmod{n}$, a+b ≠ n, which means a+b and
     a-b each contain exactly one of the two prime factors.
   – Using the Euclidean Algorithm, we can find gcd(a+b,n) and
     gcd(a-b,n), which are the two primes.
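
The argument above can be sketched in C for small numbers. The function name factor_from_squares is an assumption of this sketch, and it assumes n < 2^32 so that the squares fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by repeated remainders. */
uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Given 0 < b < a < n with a^2 == b^2 (mod n) and a + b != n,
   returns gcd(a - b, n). For n = pq this is one of the two primes;
   the other one is gcd(a + b, n). */
uint64_t factor_from_squares(uint64_t n, uint64_t a, uint64_t b) {
    assert(n < (1ULL << 32));
    assert(0 < b && b < a && a < n);
    assert((a * a) % n == (b * b) % n);
    assert(a + b != n);
    return gcd_u64(a - b, n);
}
```

For n=77 we have 9^2 = 81 = 4 = 2^2 (mod 77), and factor_from_squares(77, 9, 2) returns gcd(7, 77) = 7, while gcd(9+2, 77) = 11 is the other prime.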

          Factoring vs. RSA
• We have seen that the RSA assumption implies
  factoring is hard. But does the hardness of
  factoring imply the RSA assumption?
  – This is an open question.
  – Some results suggest the answer is probably no.

             Security of RSA
• Under the RSA assumption, RSA is not only “secure”
  against a passive adversary (as we have seen).
  – It is also “secure” against a Chosen Plaintext Attack.
  – Here “secure” means hard to find cleartext (which is
• Chosen Plaintext Attack (CPA): Stronger model
  than passive adversary.
  – Active adversary can obtain the ciphertexts
    corresponding to the cleartexts he chooses.
  – Any public key cryptosystem MUST be secure against
    CPA: by Kerckhoffs's principle the adversary knows the
    algorithm and the public key, so he can encrypt anything himself.
    More on Active Adversary
• Chosen Ciphertext Attack (CCA):
  – Adversary can obtain the cleartexts
    corresponding to the ciphertexts he chooses.
  – However, the above help is only available
    before the ciphertext is given to the adversary.
• Adaptive Chosen Ciphertext Attack (CCA2):
  – Unlike CCA, the help with decryption is always
    available.
  – However, the adversary can’t use the above help
    to decrypt the ciphertext he is given.
    (Otherwise, no encryption algorithm can be
    secure.)
Attack on RSA: Short Message
• RSA is not randomized: each cleartext has
  only one ciphertext.
  – If we know the message is only a few bits (say, 3
    bits), we can test all possible cleartexts.
  – We encrypt each of them and compare it with the
    ciphertext.
  – The above attack is also valid in case the
    message is known to belong to any small set.
  – Imagine if you encrypt your vote in a presidential
    election using RSA!
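
The exhaustive test described above needs only the public key. The C sketch below tries every candidate cleartext up to a bound; the function names and the toy key n=3233, e=17 are assumptions of this sketch, and n < 2^32 is assumed so that products fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* (base^exp) mod n by repeated squaring; assumes 0 < n < 2^32. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) {
            result = (result * base) % n;
        }
        base = (base * base) % n;
        exp >>= 1;
    }
    return result;
}

/* Encrypts every candidate cleartext 0..max_m with the public key
   (n, e) and compares it with the ciphertext c.
   Returns the matching cleartext, or -1 if none matches. */
int64_t recover_short_message(uint64_t n, uint64_t e, uint64_t c,
                              uint64_t max_m) {
    for (uint64_t m = 0; m <= max_m && m < n; m++) {
        if (mod_pow(m, e, n) == c) {
            return (int64_t)m;
        }
    }
    return -1;
}
```

With the toy key, a 3-bit message such as 5 is recovered after at most 8 trial encryptions: recover_short_message(3233, 17, mod_pow(5, 17, 3233), 7) returns 5.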
Attack on RSA: Meet in the Middle
• The attack on short messages can be
  enhanced by a meet-in-the-middle attack.
  – With a good probability, a short message can
    be factored into two even shorter messages:
    $m = m_1 m_2$.
  – With RSA, $E(m) = m^e \bmod n = m_1^e \, m_2^e \bmod n$.
  – Then we can encrypt the even shorter
    messages and compare their product with the
    ciphertext.
          Attack in Real Life
• Such attack is especially effective when
  RSA is used to encrypt DES key.
  – DES key is only 56-bit.
  – Thus finding DES key is easy with meet-in-
    the-middle attack.
• We can also use it to attack RSA-
  encrypted passwords.
• So do NOT simply use RSA to encrypt
  passwords and DES keys.
Attack on Short Message and RSA
• Why is the above attack possible while RSA is
  secure?
  – Because RSA is secure only in the sense that it is
    hard to find the cleartext.
  – Not in the sense that it is hard to find any partial
    information about the cleartext (i.e., not semantically
    secure).
  – With a short message, we essentially already know all
    except a small part of the message.
  – This remaining small part can’t be protected by RSA.

         Attack on RSA: CCA2
• Consider an active adversary in the CCA2 model.
  – The adversary is given ciphertext c, but he can’t
    use the help of decryption to get the cleartext of
    c.
  – However, he can choose m’ and compute
    $c' = c\,(m')^e \bmod n$.
  – Then he asks for the decryption of c’.
  – He gets $(c')^d = (c\,(m')^e)^d = c^d (m')^{ed} = c^d m' \pmod{n}$.
  – So he can compute $c^d$ from $(c')^d$.
• Thus RSA is not secure against CCA2 attack.
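
The attack above can be simulated on a toy key. In the C sketch below, all names are assumptions made for illustration: decryption_oracle plays the role of the decryption help (it refuses the challenge ciphertext itself), and the private exponent d is passed in only so that the sketch can simulate that oracle. It assumes n < 2^32 and that m’ is coprime to n.

```c
#include <assert.h>
#include <stdint.h>

/* (base^exp) mod n by repeated squaring; assumes 0 < n < 2^32. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) {
            result = (result * base) % n;
        }
        base = (base * base) % n;
        exp >>= 1;
    }
    return result;
}

/* Inverse of a mod n by the extended Euclidean algorithm;
   requires gcd(a, n) == 1 and n < 2^32. */
uint64_t mod_inverse(uint64_t a, uint64_t n) {
    int64_t r0 = (int64_t)n, r1 = (int64_t)(a % n);
    int64_t t0 = 0, t1 = 1;       /* invariant: r_i == t_i * a (mod n) */
    while (r1 != 0) {
        int64_t q = r0 / r1;
        int64_t r2 = r0 - q * r1, t2 = t0 - q * t1;
        r0 = r1; t0 = t1;
        r1 = r2; t1 = t2;
    }
    assert(r0 == 1);
    return (uint64_t)(t0 < 0 ? t0 + (int64_t)n : t0);
}

/* Simulated decryption help: decrypts any query except the challenge. */
uint64_t decryption_oracle(uint64_t n, uint64_t d, uint64_t challenge,
                           uint64_t query) {
    assert(query != challenge);
    return mod_pow(query, d, n);
}

/* Recovers c^d mod n without ever submitting c to the oracle. */
uint64_t cca2_attack(uint64_t n, uint64_t e, uint64_t d, uint64_t c,
                     uint64_t m_prime) {
    uint64_t c_prime = (c * mod_pow(m_prime, e, n)) % n;     /* c' = c (m')^e */
    uint64_t blinded = decryption_oracle(n, d, c, c_prime);  /* c^d * m' */
    return (blinded * mod_inverse(m_prime, n)) % n;          /* divide by m' */
}
```

With n=3233, e=17, d=2753 and the challenge c=2790, calling cca2_attack(3233, 17, 2753, 2790, 2) returns the cleartext 65.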
      Common Misuse of RSA
• In network security, to accelerate computing,
  people often use a small exponent e.
   – Usually e=3; sometimes e=11.
   – Many protocols/systems include such examples.
   – Is this secure?
• Unfortunately, with a small e it can be easy to recover the
  cleartext in some settings (for example, short or partially
  known messages).
   – Efficient algorithms have been published for such e
     (e.g., the Coppersmith algorithm).
   – Similarly, using a small d is also subject to attack.

        Rabin Cryptosystem
• Another popular public key cryptosystem.
• Invented by Mike Rabin (Turing Award
  winner).
• Based on the hardness of computing
  square roots mod n.

      Rabin: Key Generation
• Choose two primes p, q of the same length.
• Compute n=pq.
• Pick integer b in Zn*.
• Public Key: (n,b)
• Private Key: (p,q)

Rabin: Encryption & Decryption
• For cleartext message m, the ciphertext is
              $c = m(m+b) \bmod n$
• For ciphertext c, the cleartext is
              $m = (-b + \sqrt{b^2+4c})/2 \bmod n$
where $\sqrt{b^2+4c}$ is a square root of $b^2+4c$
  mod n that is less than n and greater than

Computing Square Root Mod n
• For decryption, we need $\sqrt{b^2+4c}$, a
  square root of $b^2+4c$ mod n.
  – First, we compute a square root of $b^2+4c$ mod
    p. Suppose this is $r_1$.
  – Second, we compute a square root of $b^2+4c$
    mod q. Suppose this is $r_2$.
  – Third, we compute r such that $r \equiv r_1 \pmod{p}$
    and $r \equiv r_2 \pmod{q}$.
  – r is what we want.
 Chinese Remainder Theorem
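• Why is r what we want? This is based on
  the Chinese Remainder Theorem:
  For $n = n_1 \cdots n_k$ (where $n_1, \ldots, n_k$ are pairwise
  coprime), $r \equiv x \pmod{n}$ if and only if
  $r \equiv x \pmod{n_1}$, …, $r \equiv x \pmod{n_k}$.
• Here $r^2 \equiv b^2+4c$ holds mod p and mod q, so it
  also holds mod n=pq.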
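
The step "compute r such that r = r1 (mod p) and r = r2 (mod q)" can be sketched in C for two coprime moduli. The names crt_combine and mod_inverse are assumptions of this sketch, and p, q < 2^31 is assumed so that all products fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse of a mod n by the extended Euclidean algorithm;
   requires gcd(a, n) == 1 and n < 2^32. */
uint64_t mod_inverse(uint64_t a, uint64_t n) {
    int64_t r0 = (int64_t)n, r1 = (int64_t)(a % n);
    int64_t t0 = 0, t1 = 1;       /* invariant: r_i == t_i * a (mod n) */
    while (r1 != 0) {
        int64_t q = r0 / r1;
        int64_t r2 = r0 - q * r1, t2 = t0 - q * t1;
        r0 = r1; t0 = t1;
        r1 = r2; t1 = t2;
    }
    assert(r0 == 1);
    return (uint64_t)(t0 < 0 ? t0 + (int64_t)n : t0);
}

/* Returns the unique r in {0..pq-1} with r == r1 (mod p) and
   r == r2 (mod q), for coprime p and q.
   Uses r = a + p*k with a = r1 mod p and k = (r2 - a) / p mod q. */
uint64_t crt_combine(uint64_t r1, uint64_t p, uint64_t r2, uint64_t q) {
    assert(p > 1 && q > 1 && p < (1ULL << 31) && q < (1ULL << 31));
    uint64_t a = r1 % p;
    uint64_t p_inv = mod_inverse(p % q, q);       /* 1/p mod q */
    uint64_t diff = (r2 % q + q - a % q) % q;     /* (r2 - a) mod q */
    uint64_t k = (diff * p_inv) % q;
    return a + p * k;
}
```

For example, crt_combine(2, 3, 3, 5) returns 8, the only number below 15 that is 2 mod 3 and 3 mod 5.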

    Computing Square Root Mod Prime
• But we still need an algorithm to compute
  a square root of $b^2+4c$ mod a prime (where
  the prime is p and q, respectively).
  – When prime p=4k+3, it is easy to compute
    square roots mod p:
           $r_1 = \pm (b^2+4c)^{(p+1)/4} \pmod{p}$
  – For p=4k+1 we also have algorithms. But we
    often have both p and q =3 (mod 4); in this
    case, n is called a Blum integer.
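
For a prime p=4k+3 the formula above is a single modular exponentiation. The C sketch below computes the candidate root and checks it by squaring, since the formula only yields a root when the input actually is a square mod p. The function names are assumptions of this sketch; p is assumed to be prime (this is not checked) and below 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* (base^exp) mod n by repeated squaring; assumes 0 < n < 2^32. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n) {
    assert(n > 0 && n < (1ULL << 32));
    uint64_t result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) {
            result = (result * base) % n;
        }
        base = (base * base) % n;
        exp >>= 1;
    }
    return result;
}

/* For a prime p with p % 4 == 3: if a is a square mod p, stores the
   root a^((p+1)/4) mod p in *root and returns 1 (the other root is
   p - *root); otherwise returns 0 and leaves *root untouched. */
int sqrt_mod_prime(uint64_t a, uint64_t p, uint64_t *root) {
    assert(p % 4 == 3);
    uint64_t r = mod_pow(a, (p + 1) / 4, p);
    if ((r * r) % p != a % p) {
        return 0;                 /* a has no square root mod p */
    }
    *root = r;
    return 1;
}
```

For p=7, the call sqrt_mod_prime(2, 7, &r) succeeds with r=4 (and 4·4 = 16 = 2 mod 7), while sqrt_mod_prime(3, 7, &r) returns 0 because 3 is not a square mod 7.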
 Proof for Square Root Mod Prime
• Why is $r_1$ a square root?
  – It is easy to see
               $(r_1)^2 = (b^2+4c)^{(p+1)/2} \pmod{p}$
  – Since $b^2+4c$ does have square roots,
    $(b^2+4c)^{1/2}$ is meaningful. So we can rewrite the
    above as
             $(r_1)^2 = ((b^2+4c)^{1/2})^{p+1} \pmod{p}$

 Proof for Square Root Mod Prime
• Recall Φ(p)=p-1. So we have
           $(r_1)^2 = ((b^2+4c)^{1/2})^{p+1} \pmod{p}$
                $= ((b^2+4c)^{1/2})^{\Phi(p)+2} \pmod{p}$
  By the Euler theorem,
                $= ((b^2+4c)^{1/2})^{2} \pmod{p}$
                $= b^2+4c \pmod{p}$

           Number of Roots
• Note that we have two square roots mod
  each prime.
  – So there are four square roots mod n.
• Which of these four is the original
  cleartext?
  – We should add redundancy into the cleartext, so
    that it can be recognized.

           Security of Rabin
• It is hard to find a cleartext of Rabin
  Cryptosystem under CPA attack.
  – This is true under the assumption that
    factoring n is hard.
• Why?
  – Suppose there is an efficient algorithm A that
    computes m from n, b, c.
  – Then we can construct an efficient algorithm
    A’ that factors n.
      Reduction of Factorization to
          Rabin Decryption
•   Choose m at random.
•   Compute c=m(m+b) mod n.
•   Use A to decrypt c---suppose we get m’.
•   Claim: gcd(m-m’, n) is a prime factor of n
    with probability 1/2.
    – But why? We need a closer look at the
      cleartext.

        Closer Look at Cleartext
• What is m’? Is it necessarily equal to m?
  –   Not really.
  –   Recall the decryption is $(-b+\sqrt{b^2+4c})/2$.
  –   There are four possible roots of $b^2+4c$.
  –   So there are four possible cleartexts, one of which is
      equal to m, and one of which is equal to m’.
• No algorithm can distinguish m from the other three
  cleartexts, given n, b, c.
  – Since you get the same c when you start from n, b,
    and any of the other three cleartexts.
     Four Possible Cleartexts
• What are these four possible cleartexts?
  – Let $r_1$ be a square root of $b^2+4c$ mod p.
  – Let $r_2$ be a square root of $b^2+4c$ mod q.
  – Due to the Chinese remainder theorem:
     • $m_1$ corresponds to a root $R_1$ such that $R_1 \equiv r_1 \pmod{p}$,
       $R_1 \equiv r_2 \pmod{q}$.
     • $m_2$: $R_2 \equiv -r_1 \pmod{p}$, $R_2 \equiv r_2 \pmod{q}$
     • $m_3$: $R_3 \equiv r_1 \pmod{p}$, $R_3 \equiv -r_2 \pmod{q}$
     • $m_4$: $R_4 \equiv -r_1 \pmod{p}$, $R_4 \equiv -r_2 \pmod{q}$

           Grouping Cleartexts
• We put the four cleartexts (and the
  corresponding roots) on the corners of a
  rectangle:
    m1 (R1): + +                        m3 (R3): + -

    m2 (R2): - +                        m4 (R4): - -

  – If we are lucky enough to get two cleartexts
    that are neighbors, we can factor n.

      Factoring n based on Two
        Neighbor Cleartexts
• Note that any two neighbor roots are
  – Modular equivalent mod one prime factor;
  – Modular negative mod the other prime factor.
• So their difference is
  – 0 mod the first prime factor;
  – Non-zero mod the second prime factor.
• Thus gcd(difference, n) is the first prime
  factor.
          Back to m and m’
• The difference between two roots is 2
  times the difference between the
  corresponding cleartexts (since R = 2m+b), and n is odd, so
   $\gcd(R_1-R_2, n) = \gcd(2(m_1-m_2), n)$
                  $= \gcd(m_1-m_2, n)$

• So we succeed in factoring if m and m’ are
  neighbors on the rectangle.
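
A small numeric case makes the rectangle concrete. Take the toy parameters p=7, q=11, n=77, b=2 (assumptions chosen here for illustration): the four cleartexts 5, 49, 26, and 70 all encrypt to c=35, and for m=5 the neighbors are 49 and 26 while 70 is the opposite corner. The C sketch below, with assumed function names and n < 2^31, checks this.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by repeated remainders. */
uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Rabin encryption c = m(m+b) mod n; assumes n < 2^31. */
uint64_t rabin_encrypt(uint64_t n, uint64_t b, uint64_t m) {
    assert(n < (1ULL << 31) && m < n && b < n);
    return (m * ((m + b) % n)) % n;
}

/* Returns gcd(m - m', n). This is a prime factor of n when m and m'
   are neighbors on the rectangle, 1 when they are opposite corners,
   and n when m == m'. */
uint64_t factor_from_cleartexts(uint64_t n, uint64_t m, uint64_t m_prime) {
    uint64_t diff = m > m_prime ? m - m_prime : m_prime - m;
    return gcd_u64(diff, n);
}
```

Here factor_from_cleartexts(77, 5, 49) returns 11 and factor_from_cleartexts(77, 5, 26) returns 7, but factor_from_cleartexts(77, 5, 70) returns 1, so the opposite corner does not help.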

       Probability of Success
• What is the probability of success?
  – m is uniform among the four cleartexts and independent of m’.
  – Two of the four cleartexts are neighbors of m’, so the
    probability is ½.
• Done with security of Rabin cryptosystem.

        Semantic Insecurity.
• Although the Rabin cryptosystem makes it
  hard to find the cleartext, it does not provide
  a stronger security guarantee.
• Just like RSA, the Rabin cryptosystem is
  not randomized.
  – Each cleartext has a unique encryption.
  – Thus it can’t be semantically secure.

      Insecurity against CCA
• Recall the security is w.r.t. CPA.
  – It no longer holds w.r.t. CCA.
• With CCA, the adversary can ask for
  decryption.
  – So he picks a message at random, encrypts it,
    and asks for decryption.
  – Then he factors n using his originally picked
    cleartext and the decryption he gets.
  – The Rabin cryptosystem is totally broken.

 Optional Topic: Formal Treatment
   of Public Key Cryptosystem

• For public key cryptosystems like RSA and
  Rabin, we can model them as trapdoor one-way
  functions.
• A trapdoor one-way function is actually a family
  of functions $\{f_i\}$ with efficient algorithms I, D, F
  and two security properties.
   – For convenience, in the following we assume all fi are

Trapdoor One-way Function (1)
• The three efficient algorithms:
  – F: computes $f_i(x)$ from i and x.
  – D: samples x from the domain of $f_i$.
  – I: takes security parameter k as input and
    outputs an index i and the corresponding
    trapdoor t.

Trapdoor One-way Function (2)
• Security property: Hardness to invert
  For index i (distributed as output of I)
  and x (distributed as output of D), for
  every efficient algorithm A, for every
  polynomial p(), for all sufficiently large k,
            $\mathrm{Prob}[A(f_i(x)) = x] < 1/p(k)$
(This definition only works for fi() with large
Trapdoor One-way Function (3)
• Security Property: Easiness to Invert with
  Trapdoor
  There exists an efficient algorithm A
  such that, for all (i, t) in the range of I,
  for all x,
                 $A(f_i(x), t) = x$

       Trapdoor OWF and PKC
•   Index: Encryption key.
•   Trapdoor: Decryption key.
•   I: Key generation algorithm.
•   D: Cleartext sampling algorithm.
•   F: Encryption algorithm.
•   Hardness to invert: Security guarantee of PKC.
•   Easiness to invert with trapdoor: Existence of
    decryption algorithm.

