CS GZ03 / M030 Distributed Systems

   User Authentication and Cryptographic Primitives
         Brad Karp
    UCL Computer Science




       CS GZ03 / M030
     11th November, 2008
                       Outline
• Authenticating users
  – Local users: hashed passwords
  – Remote users: s/key
  – Unexpected covert channel: the Tenex password-guessing
    attack
• Symmetric-key cryptography
• Public-key cryptography usage model
• RSA algorithm for public-key cryptography
  – Number theory background
  – Algorithm definition
  – Cryptographic strength of RSA
  – Ease of misusing RSA
    Authentication of Local Users
• Goal: only file’s owner can access file
• UNIX authentication policy:
   – Each file has an owner principal: an integer user ID
   – Each file has associated owner permissions (read,
     write, execute, &c.)
   – Each process runs with an integer user ID; it can access a
     file as owner only if that user ID matches file’s owner user ID
   – OS assigns user ID to user’s shell process at login
     time, authenticated by username and password
   – Shell process creates new child processes with same
     user ID
• How does UNIX know the correspondence
  among <username, user ID, password>, for all
  users?
               Straw Man:
      Plaintext Password Database
• Keep password database in a file, e.g.:
  bkarp:3715:secretpw
  mjh:4212:multicast
• Passwords stored in file in plaintext
• Make file readable only by privileged superuser
  (root)
• /bin/login program prompts for usernames
  and passwords on console; runs as root, so can
  read password database
• How well does this scheme meet original goal?
        Cryptographic Primitive:
      Cryptographic Hash Function
• Don’t want someone who sees the password
  database to learn users’ passwords
• Cryptographic hash function, y=H(x) such that:
  – H() is preimage-resistant: given y, and with
    knowledge of H(), computationally infeasible to
    recover x
  – H() is second-preimage-resistant: given x (and so
    y=H(x)), computationally infeasible to find x’ ≠ x s.t.
    H(x’)=H(x)=y
• Widely used cryptographic hash functions:
  – MD5: output is 128 bits; broken
  – SHA-1: output is 160 bits; on verge of being broken as of
    2008 (a practical collision was demonstrated in 2017)
  – SHA-256: output is 256 bits; best current practice
             Better Plan:
      Hashed Password Database
• Keep password database in a file:
  bkarp:3715:Xc8zOP0ZHJkp
  mjh:4212:p6FsAtQl4cwi
• Instead of password plaintext x, store
  H(x)
• Make file readable by all (!)
• One-wayness of H() means no one can
  recover x from H(x), right?
  – WRONG! Users choose memorable
    passwords…
 Insight: Counting Possible Passwords

• If users pick random n-character passwords
  using c possible characters, how many guesses
  expected to guess one password?
             c^n / 2
  e.g., 8 characters, each ~90 possibilities: 2.15 × 10^15
• Do users pick random passwords?
  – Of course not; very hard to remember
  – Common choice: word in native language
• How many words in common use in modern
  English?
  – 50,000-70,000 (or far fewer, if you read Metro)
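The counting argument can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name expected_guesses is ours, not from the slides.

```python
def expected_guesses(c: int, n: int) -> int:
    """Average number of guesses needed to hit one uniformly random
    n-character password over an alphabet of c characters: c^n / 2."""
    return c ** n // 2

random_pw = expected_guesses(90, 8)   # 8 random characters, ~90 choices each
dictionary_size = 50_000              # low end of the slide's word-count estimate

print(f"random 8-char password: about {random_pw:.2e} guesses on average")
print(f"dictionary word: at most {dictionary_size} guesses")
```

The gap between the two numbers is the reason the dictionary attack is practical.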
Dictionary Attack on Hashed Password
              Databases
• Suppose hacker obtains copy of password file
  (until recently, world-readable on UNIX)
• Compute H(x) for 50K common words
• String compare resulting hashed words against
  passwords in file
• Learn all users’ passwords that are
  common English words after only 50K
  computations of H(x)!
• Same hashed dictionary works on all
  password files in world!
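A minimal sketch of the dictionary attack, assuming H() is SHA-256 and using a tiny made-up password file and word list (the users, passwords, and words are illustrative only):

```python
import hashlib

def H(x: str) -> str:
    """Unsalted hash, standing in for the H() of the slides."""
    return hashlib.sha256(x.encode()).hexdigest()

# Hypothetical unsalted password file: username -> H(password)
password_file = {
    "bkarp": H("secretpw"),
    "mjh": H("multicast"),
    "eve": H("x9$Lq!7v"),      # not a dictionary word
}

# Hash the dictionary once; the result is reusable against ANY unsalted file.
dictionary = ["password", "multicast", "secretpw", "network"]
hashed_dictionary = {H(word): word for word in dictionary}

# "String compare" each stored hash against the hashed dictionary.
cracked = {
    user: hashed_dictionary[h]
    for user, h in password_file.items()
    if h in hashed_dictionary
}
print(cracked)
```

Only the user whose password is not in the word list survives.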
        Salted Password Hashes

• Generate a random string of bytes, r
• For user password x, store [H(r,x), r] in
   password file
• Result: same password produces different result
  on every machine
   – So must see password file before can hash dictionary
   – …and single hashed dictionary won’t work for multiple
     hosts
• Modern UNIX: password hashes salted; hashed
  password database readable only by root
• Dictionary attack still possible after attacker
  sees password file!
• Users should pick passwords that aren’t close
  to dictionary words.
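A minimal sketch of salted storage, assuming H(r, x) is SHA-256 over the salt followed by the password. The helper names make_entry and check are ours, and production systems use deliberately slow password hashes rather than one SHA-256 call.

```python
import hashlib
import hmac
import os

def make_entry(password: str):
    """Return the pair [H(r, x), r] to store in the password file."""
    r = os.urandom(16)                                  # fresh random salt
    return hashlib.sha256(r + password.encode()).digest(), r

def check(password: str, entry) -> bool:
    """Recompute H(r, x) with the stored salt and compare."""
    digest, r = entry
    candidate = hashlib.sha256(r + password.encode()).digest()
    return hmac.compare_digest(candidate, digest)

# Same password on two hosts: different salts, so different stored hashes.
entry_host_a = make_entry("multicast")
entry_host_b = make_entry("multicast")
print(entry_host_a[0] != entry_host_b[0])
print(check("multicast", entry_host_a), check("unicast", entry_host_a))
```

An attacker who obtains one entry can still hash a dictionary with that entry's salt, which is the residual risk the slide points out.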
           Tenex Password Attack:
            An Information Leak
• Tenex OS stored directory passwords in plaintext
• OS supported system call:
   – pw_validate(directory, pw)
• Implementation simply compared pw to stored
  password in directory, char by char
• Clever attack:
   – Make pw span two VM pages, put 1st char of guess in
     first page, rest of guess in second page
   – See whether get a page fault: if not, try next value
     for 1st char, &c.; if so, first char correct!
   – Now position 2nd char of guess at end of 1st page, &c.
   – Result: guess password in time linear in length!
• Lessons:
   – Don’t store passwords in cleartext.
   – Information leaks are real, and can be
     extremely difficult to find and eliminate.
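The attack can be simulated without real virtual memory. In this sketch (our model, not Tenex code) pw_validate takes a boundary argument saying how many characters of the guess sit on the first page, and reports whether the char-by-char compare touched the second page. The secret and the alphabet are assumptions.

```python
import string

SECRET = "multicast"                 # plaintext password held by the simulated OS
ALPHABET = string.ascii_lowercase    # assumed character set

def pw_validate(guess: str, boundary: int):
    """Char-by-char compare. Returns (ok, page_fault), where page_fault is
    True iff the compare read guess[boundary], i.e. every character on the
    first page matched and the compare moved on to the second page."""
    page_fault = False
    for i, ch in enumerate(SECRET):
        if i >= boundary:
            page_fault = True        # this read crosses the page boundary
        if i >= len(guess) or guess[i] != ch:
            return False, page_fault
    return len(guess) == len(SECRET), page_fault

def recover_password():
    """Recover SECRET one position at a time using only the fault signal."""
    known, calls = "", 0
    while True:
        for ch in ALPHABET:
            calls += 1
            guess = known + ch
            ok, fault = pw_validate(guess, boundary=len(guess))
            if ok:
                return guess, calls
            if fault:                # last character on first page was right
                known = guess
                break
        else:
            raise RuntimeError("no character in ALPHABET matched")

recovered, calls = recover_password()
print(recovered, calls)
```

The number of calls is bounded by (password length) × (alphabet size), instead of (alphabet size) raised to the password length.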
    Remote User Authentication

• Consider the case where Alice wants to log in
  remotely to a server, across a LAN or WAN
• Suppose network links can be eavesdropped by
  adversary, Eve
• Want scheme immune to replay: if Eve
  overhears messages, shouldn’t be able to log in
  as Alice by repeating them to server
• Clear non-solutions:
  – Alice logs in by sending {alice, password}
  – Alice logs in by sending {alice, H(password)}
  Remote User Authentication (2)

• Desirable properties:
  – Message from Alice must change
    unpredictably at each login
  – Message from Alice must be verifiable at
    server as matching secret value known only
    to Alice
• Can we achieve these properties using
  only a cryptographic hash function?

  Remote User Authentication: s/key

• Denote by H^n(x) n successive applications of
  cryptographic hash function H() to x
   – e.g., H^3(x) = H(H(H(x)))
• Store in server’s user database:
   alice:99:H^99(password)
• At first login, Alice sends:
   {alice, H^98(password)}
• Server hashes received value once, checks result against
  stored H^99(password), then updates its database to contain:
   alice:98:H^98(password)
• At next login, Alice sends:
   {alice, H^97(password)}
   – and so on…
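A minimal sketch of the s/key exchange, assuming H() is SHA-256. The Server class and its method names are illustrative, not part of any s/key implementation.

```python
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def H_n(x: bytes, n: int) -> bytes:
    """n successive applications of H() to x."""
    for _ in range(n):
        x = H(x)
    return x

class Server:
    def __init__(self, user: str, n: int, top: bytes):
        self.db = {user: (n, top)}           # e.g. alice:99:H^99(password)

    def login(self, user: str, token: bytes) -> bool:
        n, stored = self.db[user]
        if n == 0 or H(token) != stored:     # one hash of token must match
            return False
        self.db[user] = (n - 1, token)       # e.g. alice:98:H^98(password)
        return True

password = b"alice's secret"
server = Server("alice", 99, H_n(password, 99))

first = H_n(password, 98)
ok_first = server.login("alice", first)                  # accepted
replayed = server.login("alice", first)                  # Eve replays: rejected
ok_second = server.login("alice", H_n(password, 97))     # next login: accepted
print(ok_first, replayed, ok_second)
```

The replay fails because the server now expects a preimage of H^98(password), which Eve cannot compute from what she overheard.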
          Properties of s/key

• Just as with any hashed password
  database, Alice must store her secret on
  the server securely (best if physically at
  server’s console)
• Alice must choose total number of logins
  at time of storing secret
• When logins all “used”, must store new
  secret on server securely again
Secrecy through Symmetric Encryption

• Two functions: E() encrypts, D() decrypts
• Parties share secret key K
• For message M:
  – E(K, M) → C
  – D(K, C) → M
• M is plaintext; C is ciphertext
• Goal: attacker cannot derive M from C
  without K
   Idealized Symmetric Encryption:
            One-Time Pad
• Secretly share a truly random bit string P
  at sender and receiver
• Define ⊕ as bit-wise XOR
• C = E(M) = M ⊕ P
• M = D(C) = C ⊕ P
• Use bits of P only once; never use them
  again!
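A minimal sketch of the one-time pad, with the pad P drawn from the operating system's random source. The helper name xor and the message are ours.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    """Bit-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))

M = b"attack at dawn"
P = secrets.token_bytes(len(M))   # shared secretly in advance, used once
C = xor(M, P)                     # C = E(M) = M xor P
recovered = xor(C, P)             # M = D(C) = C xor P
print(recovered)
```

Decryption works because XOR with the same pad twice cancels out.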
                Stream Ciphers:
              Pseudorandom Pads
• Generate pseudorandom bit sequence (stream)
  at sender and receiver from short key
• Encrypt and decrypt by XOR’ing message with
  sequence, as with one-time pad
• Most widely used stream cipher: RC4 (as of 2008; since
  prohibited in TLS because of keystream biases)
• Again, never, ever re-use bits from
  pseudorandom sequence!
• What’s wrong with reusing the stream?
  – Alice → Server: c1 = E(s, “Visa card number”)
  – Server → Alice: c2 = E(s, “Transaction confirmed”)
  – Suppose Eve hears both messages
  – Eve can compute:
      m = c1 ⊕ c2 ⊕ “Transaction confirmed”
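A minimal sketch of the keystream-reuse problem. The keystream here is a toy built from SHA-256 in counter mode, standing in for a real stream cipher such as RC4; the key and card number are made up.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy pseudorandom pad: SHA-256(key || counter) blocks. Illustration only."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def E(key: bytes, m: bytes) -> bytes:
    """XOR the message with the start of the keystream (the mistake: always
    the SAME start, so the same pad bits are reused for every message)."""
    return bytes(a ^ b for a, b in zip(m, keystream(key, len(m))))

s = b"key shared by Alice and Server"
m1 = b"Visa 4111111111111111"     # Alice -> Server (secret)
m2 = b"Transaction confirmed"     # Server -> Alice (guessable)
c1, c2 = E(s, m1), E(s, m2)

# Eve knows neither s nor the keystream, only c1, c2, and a guess at m2.
leaked = bytes(a ^ b ^ c for a, b, c in zip(c1, c2, m2))
print(leaked)
```

The pad cancels in c1 ⊕ c2, so knowing one plaintext reveals the other.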
 Symmetric Encryption: Block Ciphers

• Divide plaintext into fixed-size blocks
  (typically 64 or 128 bits)
• Block cipher maps each plaintext block to
  same-length ciphertext block
• Best today to use AES (others include
  Blowfish, DES, …)
• Of course, message of arbitrary length;
  how to encrypt message of more than one
  block?

  Using Block Ciphers: ECB Mode

• Electronic Code Book method
• Divide message M into blocks of cipher’s
  block size
• Simply encrypt each block individually
  using the cipher
• Send each encrypted block to receiver
• Presume cipher provides secrecy, so
  attacker cannot decrypt any block
• Does ECB mode provide secrecy?
             Avoid ECB Mode!
• ECB mode does not provide robust secrecy!
• What if there are repeated blocks in the
  plaintext? Repeated as-is in ciphertext!
• What if sending sparse file, with long runs of
  zeroes? Non-zero regions obvious!
• WW II U-Boat example (Bob Morris):
  – Each day at same time, when no news, send
    encrypted message: “Nichts zu melden.” (“Nothing to report.”)
  – When there’s news, send the news at that time.
  – Obvious when there’s news
  – Many, many ciphertexts of same known plaintext
    made available to adversary for cryptanalysis; a
    worry even if encryptions of same plaintext produce
    different ciphertexts!
  Using Block Ciphers: CBC Mode




• Better plan: make encryption of each block depend on
  the previous ciphertext block, and on an initialization
  vector (IV) known to receiver
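A sketch contrasting ECB with CBC, where C_i = E(K, M_i ⊕ C_{i-1}) and C_0 is the IV. The block cipher is a toy 4-round Feistel network built from SHA-256, used only so the example needs no third-party library; it is not AES and not secure. Messages are assumed to be a whole number of 16-byte blocks (no padding).

```python
import hashlib
import os

BLOCK = 16  # block size in bytes

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):                       # Feistel: (L, R) -> (R, L xor F(R))
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):             # undo the rounds in reverse order
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, msg: bytes) -> bytes:
    return b"".join(encrypt_block(key, b) for b in blocks(msg))

def cbc_encrypt(key: bytes, iv: bytes, msg: bytes) -> bytes:
    out, prev = [], iv
    for b in blocks(msg):
        prev = encrypt_block(key, _xor(b, prev))   # chain in previous ciphertext
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ct: bytes) -> bytes:
    out, prev = [], iv
    for c in blocks(ct):
        out.append(_xor(decrypt_block(key, c), prev))
        prev = c
    return b"".join(out)

key, iv = os.urandom(16), os.urandom(16)
msg = b"Nichts zu melden" * 3                # three identical 16-byte blocks
ecb = ecb_encrypt(key, msg)
cbc = cbc_encrypt(key, iv, msg)
print(len(set(blocks(ecb))), len(set(blocks(cbc))))
```

ECB yields one distinct ciphertext block for the three repeated plaintext blocks; CBC yields three.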
   Integrity with Symmetric Crypto:
    Message Authentication Codes
• How does receiver know if message modified en
  route?
• Message Authentication Code:
  – Sender and receiver share secret key K
  – On message M, v = MAC(K, M)
  – Attacker cannot produce valid {M, v} without K
• Append MAC to message for tamper-resistance:
  – Sender sends {M, MAC(K, M)}
  – M could be ciphertext, M = E(K’, m)
  – Receiver of {M, v} can verify that v = MAC(K, M)
• Beware replay attacks—replay of prior {M, v} by
  Eve!
HMAC: A MAC Based on Cryptographic
         Hash Functions
• HMAC(K, M) =
         H((K ⊕ opad) . H((K ⊕ ipad) . M))
• where:
  – . denotes string concatenation
  – ⊕ denotes bit-wise XOR
  – K is zero-padded to 64 bytes (the hash’s block size)
  – ipad = 64 repetitions of 0x36
  – opad = 64 repetitions of 0x5c
  – H() is a cryptographic hash function, like SHA-256
• Fixed-size output, even for long messages
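A minimal sketch of the HMAC construction with H() = SHA-256, written out by hand so the pads are visible. The function name hmac_sha256 and the sample key and message are ours; in practice one would call the standard library's hmac module, which the last line uses as a cross-check.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    block_size = 64                               # SHA-256 block size in bytes
    if len(key) > block_size:                     # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")          # zero-pad K to 64 bytes
    ipad = bytes(k ^ 0x36 for k in key)           # K xor ipad
    opad = bytes(k ^ 0x5C for k in key)           # K xor opad
    inner = hashlib.sha256(ipad + msg).digest()   # H((K xor ipad) . M)
    return hashlib.sha256(opad + inner).digest()  # H((K xor opad) . inner)

K = b"shared secret key"
M = b"transfer 100 GBP to bob"
v = hmac_sha256(K, M)
print(v == hmac.new(K, M, hashlib.sha256).digest())
```

The receiver recomputes v from the received M and compares, ideally with hmac.compare_digest to avoid a timing leak of the kind seen in the Tenex attack.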
 Public-Key Encryption: Interface

• Two keys:
  – Public key: K, published for all to see
  – Private (or secret) key: K^-1, kept secret
• Encryption: E(K, M) → {M}_K
• Decryption: D(K^-1, {M}_K) → M
• Provides secrecy, like symmetric encryption:
  – Can’t derive M from {M}_K without knowing K^-1
• Same public key used by all to encrypt all
  messages to same recipient
  – Can’t derive K^-1 from K
     Number Theory Background:
     Modular Arithmetic Primer (1)
• Recall the “mod” operator: returns
  remainder left after dividing one integer
  by another, the modulus
  – e.g., 15 mod 6 = 3
• That is:
  a mod n = r
    which just means
  a = kn + r    for some integers k and r
• Note that 0 <= r < n
   Modular Arithmetic Primer (2)

• In modular arithmetic, constrain range of
  integers to be only the residues [0, n-1], for
  modulus n
   – e.g., (12 + 13) mod 24 = 1
   – We may also write 12 + 13 ≡ 1 (mod 24)
• Modular arithmetic retains familiar properties:
  commutative, associative, distributive
• Same results whether mod taken at each
  arithmetic operation, or only at end, e.g.:
   (a + b) mod n = ((a mod n) + (b mod n)) mod n
   (ab) mod n = ((a mod n)(b mod n)) mod n
  Modular Arithmetic: Advantages

• Limits precision required: working mod n,
  where n is k bits long, any single
  arithmetic operation yields at most 2k bits
   – …so results of even seemingly expensive ops,
     like exponentiation (a^x), fit in same number of
     bits as original operand(s)
   – Lower precision means faster arithmetic
• Some operations in modular arithmetic are
  computationally very difficult:
   – e.g., computing discrete logarithms:
     find integer x s.t. a^x ≡ b (mod n)
• Cryptography leverages “difficult”
  operations; want reversing encryption
  without key to be computationally
  intractable!
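A small sketch of the asymmetry: modular exponentiation is fast, while the only general way shown here to undo it is exhaustive search. The modulus, base, and exponent are arbitrary small values picked so the search finishes quickly; the function name discrete_log is ours.

```python
def discrete_log(a: int, b: int, n: int):
    """Brute force: smallest x in [0, n) with a^x mod n == b, or None."""
    value = 1
    for x in range(n):
        if value == b % n:
            return x
        value = (value * a) % n
    return None

n = 1_000_003                      # small modulus, for illustration only
a, x = 2, 123_456
b = pow(a, x, n)                   # forward direction: about log2(x) squarings
found = discrete_log(a, b, n)      # reverse direction: up to n multiplications
print(found, pow(a, found, n) == b)
```

With a modulus of realistic cryptographic size the forward step stays cheap, while this search becomes infeasible.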
 Modular Arithmetic: Inverses (1)

• In real arithmetic, every non-zero integer has a
  multiplicative inverse (its reciprocal), and
  their product is 1
  – e.g., 7x = 1 ⇒ x = 1/7
• What does an inverse in modular
  arithmetic (say, mod 11) look like?
  – e.g., find x s.t. 7x ≡ 1 (mod 11)
  – that is, 7x = 11k + 1 for some x and k
  – so x = 8 (where k = 5)
        Aside: Prime Numbers

• Recall: prime number is integer > 1 that is
  evenly divisible only by 1 and itself
• Two integers a and b are relatively prime
  if they share no common factors but 1;
  i.e., if gcd(a, b) = 1
• There are infinitely many primes
• Large primes (512 bits and longer) figure
  prominently in public-key cryptography

 Modular Arithmetic: Inverses (2)

• In general, finding modular inverse means
  finding x s.t. ax ≡ 1 (mod n)
• Does modular inverse always exist?
  – No! Consider, for instance, 2x ≡ 1 (mod 14):
    2x is always even, so no x works
• In general, when a and n are relatively prime,
  modular inverse x exists and is unique
• When a and n not relatively prime, x doesn’t
  exist
• When n prime, all of [1…n-1] relatively prime to
  n, and have an inverse in that range
• Algorithm to find modular inverse: extended
  Euclidean Algorithm. Tractable; requires
  O(log n) divisions.
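A minimal sketch of the extended Euclidean algorithm used to find a modular inverse. The function names extended_gcd and mod_inverse are ours.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                   # one division per iteration
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def mod_inverse(a: int, n: int):
    """Return x in [0, n) with a*x ≡ 1 (mod n), or None if none exists."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:                           # a and n not relatively prime
        return None
    return x % n

print(mod_inverse(7, 11))    # inverse of 7 mod 11
print(mod_inverse(2, 14))    # no inverse: gcd(2, 14) = 2
```

The same routine is what RSA key generation uses to derive d from e.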
Euler’s Phi Function: Efficient Modular
     Inverses on Relative Primes
• φ(n) = number of positive integers < n that are
  relatively prime to n
• If n prime, φ(n) = n-1
• If n=pq, where p and q distinct primes:
  φ(n) = (p-1)(q-1)
• If a and n relatively prime, Euler’s generalization
  of Fermat’s little theorem:
     a^φ(n) mod n = 1
• and thus, to find inverse x s.t. x = a^-1 mod n:
     x = a^(φ(n)-1) mod n
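A small numeric check of these identities, using a brute-force φ that is only practical for tiny n. The primes 61 and 53 and the value a = 17 are arbitrary illustrative choices.

```python
from math import gcd

def phi(n: int) -> int:
    """Count positive integers < n that are relatively prime to n."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

p, q = 61, 53
n = p * q
a = 17                                  # relatively prime to n

print(phi(11) == 10)                    # n prime: phi(n) = n - 1
print(phi(n) == (p - 1) * (q - 1))      # n = pq: phi(n) = (p-1)(q-1)
print(pow(a, phi(n), n) == 1)           # Euler: a^phi(n) mod n = 1

inverse = pow(a, phi(n) - 1, n)         # x = a^(phi(n)-1) mod n
print((a * inverse) % n == 1)
```

Note that computing φ(n) this way, or via (p-1)(q-1), requires knowing the factors of n; that dependence is what RSA relies on.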
           RSA Algorithm (1)

• [Rivest, Shamir, Adleman, 1978]
• Recall that public-key cryptosystems use
  two keys per user:
  – K, the public key, made available to all
  – K^-1, the private key, kept secret by user




          RSA Algorithm (2)

• Choose two random, large primes, p and
  q, of equal length, and compute n=pq
• Randomly choose encryption key e, s.t. e
  and (p-1)(q-1) are relatively prime
• Use extended Euclidean algorithm to
  compute d, s.t. d = e^-1 mod ((p-1)(q-1))
• Public key: K = (e, n)
• Private key: K^-1 = d
• Discard p and q
           RSA Algorithm (3)

• Encryption:
  – Divide message M into blocks m_i, each shorter
    than n
  – Compute ciphertext blocks c_i with:
    c_i = m_i^e mod n
• Decryption
  – Recover plaintext blocks m_i with:
    m_i = c_i^d mod n
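A toy walk-through of key generation, encryption, and decryption with tiny primes. The values p = 61, q = 53, e = 17 are illustrative assumptions; real keys use random primes hundreds of digits long plus padding, and this sketch uses neither.

```python
from math import gcd

# Key generation (toy sizes)
p, q = 61, 53
n = p * q                         # modulus: 3233
phi = (p - 1) * (q - 1)           # 3120
e = 17                            # must be relatively prime to (p-1)(q-1)
assert gcd(e, phi) == 1
d = pow(e, -1, phi)               # modular inverse (Python 3.8+): 2753

public_key = (e, n)
private_key = d

def encrypt(m: int) -> int:
    """c = m^e mod n, for a block m < n."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """m = c^d mod n."""
    return pow(c, d, n)

m = 65
c = encrypt(m)
print(c, decrypt(c))
```

pow with a negative exponent and a modulus computes the modular inverse, the same result the extended Euclidean algorithm would give.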
   Why Does RSA Decryption Recover
         Original Plaintext?
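• Observe that c_i^d = (m_i^e)^d = m_i^(ed)
• Note that ed ≡ 1 (mod (p-1)(q-1))
   because e and d are inverses mod (p-1)(q-1)
• So:
     ed ≡ 1 (mod p-1), and thus ed = k(p-1)+1
     ed ≡ 1 (mod q-1), and thus ed = h(q-1)+1
• Consider case where m_i and p are relatively prime:
     m_i^(p-1) ≡ 1 (mod p) by Euler’s generalization of Fermat’s
     little theorem
   – so m_i^(ed) = m_i^(k(p-1)+1) = (m_i^(p-1))^k · m_i ≡ m_i (mod p)
• And case where m_i a multiple of p:
     m_i^(ed) ≡ 0 ≡ m_i (mod p)
• Thus in all cases, m_i^(ed) ≡ m_i (mod p)
Why Does RSA Decryption Recover
     Original Plaintext? (2)
• Similarly, m_i^(ed) ≡ m_i (mod q)
• Now:
     p divides m_i^(ed) - m_i
     q divides m_i^(ed) - m_i
• Because p, q both prime and distinct:
     pq = n divides m_i^(ed) - m_i
• So m_i^(ed) ≡ m_i (mod n), i.e., c_i^d mod n = m_i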
   Misuses of RSA Break Secrecy
• When encrypting, what if plaintext drawn from
  very small set (e.g., {“yes”, “no”})?
• Employees escrow secret documents, encrypted
  with company’s public key
  – Upon firing or death of one employee, company
    releases plaintext to another
  – Employee E takes employee A’s ciphertext
    c = m^e mod n, escrows c·2^e mod n
  – Employee E fired; co-conspirator F gets 2m!
• Chosen ciphertext attack (CCA): eavesdrop
  a ciphertext c; submit specially concocted
  messages for decryption; study resulting
  plaintexts; learn plaintext, m = c^d mod n
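A toy demonstration of the escrow attack with unpadded RSA, using the illustrative key p = 61, q = 53, e = 17 (our choice). The secret m is picked below n/2 so that 2m does not wrap around the modulus.

```python
# Company key pair (toy sizes; d is known only to the company)
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+

# Employee A escrows a secret m under the company's public key.
m = 1234
c = pow(m, e, n)

# Employee E sees c and escrows c * 2^e mod n under E's own name.
c_forged = (c * pow(2, e, n)) % n

# E is fired; the company decrypts E's escrowed document for F.
released = pow(c_forged, d, n)           # equals 2m mod n

# F halves the result (multiplies by 2^-1 mod n) to obtain A's secret.
recovered = (released * pow(2, -1, n)) % n
print(released, recovered)
```

The company never decrypted A's ciphertext, yet A's plaintext leaked, because (m^e)(2^e) = (2m)^e mod n.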
   RSA: Not Quite Exponentiation
• At first glance, RSA operations appear to be
  raising a message to a power
• But they’re not, really…the mod n means RSA is in
  fact a trap-door permutation
  – Map one element, m, of set {0, …, n-1} to another, c
  – Not invertible without knowing d
• Non-invertibility applies to whole of m and c; not
  to individual bits of m and c, or other properties
  over m and c, e.g., parity of m
  – In escrow attack, multiplicative relationship among
    RSA ciphertexts exists, despite non-invertibility
• It’s possible that learning even one bit of m may
  help recover all of m from c
Adaptive Chosen Ciphertext Attack on
           RSA in SSL 3.0
• SSL 3.0 encrypted with RSA by padding plaintext
  into blocks using PKCS #1 standard, as follows:
  – 0x00 | 0x02 |
    8 or more non-zero random bytes | 0x00 |
    plaintext block
• SSL decrypts received ciphertext, checks if result
  in this format; returns “format error” if not!
• Bleichenbacher’s adaptive CCA attack: with
  about one million messages to server, attacker
  can recover m for previously eavesdropped
  ciphertext c = m^e mod n
  – When chosen ciphertext accepted by server, attacker
    knows first two plaintext bytes with certainty!
 Making RSA Secure Against Adaptive
            CCA Attacks
• Intuition: want plaintext input to RSA to be all-
  or-nothing transform of actual message
   – e.g., so that multiplicative property over ciphertexts
     doesn’t reveal message, and knowing one bit doesn’t
     reveal anything about whole message
• Desirable transform properties:
   – Randomness: unique plaintext for repeated identical
     messages
   – Redundancy: make most strings invalid ciphertexts
   – Entanglement: knowing partial information about
     input to RSA should reveal nothing about message
   – Invertibility: of course, must be able to recover
     original message when decrypting

         Practical Padding for RSA:
              OAEP+ [Shoup]




• Transforms message M into RSA input M’
• Proven adaptive-CCA secure only in the random oracle model;
  with a real hash function, security is heuristic
     Digital Signatures with RSA
• RSA trap-door permutation also useful for digital
  signatures
• Public-key signature operations:
  – Sign: S(K^-1, m) → {m}_{K^-1}
  – Verify: V(K, {m}_{K^-1}, m) → {true, false}
• Provides integrity, like a MAC:
  – Cannot produce valid <m, {m}_{K^-1}> pair without
    knowing K^-1
• With RSA:
  – Sign using private key, using trap-door applied when
    decrypting
  – Verify using public key, using permutation applied
    when encrypting
   Multiplicative Attack Against RSA
               Signatures
• As in CCA, attacker may try to exploit
  multiplicative relationship among RSA
  permutation inputs and outputs, to decrypt
  eavesdropped ciphertexts
• Eve stores ciphertext c encrypted for Alice,
  wants to recover corresponding m
• Using Alice’s public key, {n, e}, Eve:
  – Chooses random number r < n
  – Computes y = c·r^e mod n
  – Eve asks Alice to sign y
  – Alice sends Eve y^d mod n = c^d·r^(ed) mod n = r·c^d mod n
  – Eve computes r^-1 mod n, then recovers
              m = c^d mod n = r^-1·r·c^d mod n
• Lesson:
  Don’t sign whole messages presented to you
  by others!
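A toy run of the attack with unpadded RSA signatures, using the illustrative key p = 61, q = 53, e = 17 (our choice). The function alice_sign stands in for Alice signing whatever she is handed; Eve additionally needs r to be relatively prime to n so that r^-1 exists.

```python
import secrets
from math import gcd

# Alice's key pair (toy sizes)
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # Python 3.8+

def alice_sign(y: int) -> int:
    """Alice naively signs the raw value she is given: y^d mod n."""
    return pow(y, d, n)

# A message encrypted to Alice, overheard by Eve.
m = 65
c = pow(m, e, n)

# Eve blinds c with a random r that is invertible mod n.
while True:
    r = secrets.randbelow(n - 2) + 2     # 2 <= r < n
    if gcd(r, n) == 1:
        break
y = (c * pow(r, e, n)) % n               # looks unrelated to c

s = alice_sign(y)                        # = r * c^d mod n
recovered = (pow(r, -1, n) * s) % n      # strip off r, leaving c^d = m
print(recovered)
```

Alice cannot recognize y as related to c, which is why signing only a hash of the message is recommended.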
 Only Sign Message Hashes with RSA!

• Again, want all-or-nothing transform over
  message before signing with trap door
• Full-domain hash:
  – Before signing message, compute hash of
    message sized to be same number of bits as
    RSA modulus n
  – Sign the hash, not the message
  – Hash reveals nothing about underlying
    message, nor messages arithmetically related
    to it
         Costs of Cryptography

• Public-key operations significantly more
  computationally expensive than symmetric-key
  ones
• Modern CPU can symmetrically encrypt and MAC
  faster than 100 Mbps
• Public-key encryption typically 100X slower than
  symmetric crypto
  – This relationship changes as hardware changes!
• Result: tend to use public-key encryption and
  signatures only on short messages
        Hybrid Cryptography

• Goal: mix speed of symmetric-key with
  flexibility of public-key cryptography
• Send symmetric key encrypted with public
  key; message encrypted with symmetric
  key




   Pitfall: Public Key Provenance

• Suppose client wishes to know it’s talking
  to particular server
• Where does client get server’s public key?
• How does client know it has correct public
  key for real server, and not attacker?
• Man-in-the-middle attack:
  – Client connects to attacker
  – Attacker gives client attacker’s public key
  – Client believes communicating with real server

           Further Reading

• The MIT Guide to Picking Locks
• Schneier, Bruce, Applied Cryptography, 2nd
  ed.
• Bleichenbacher, Daniel, Chosen Ciphertext
  Attacks Against Protocols Based on the
  RSA Encryption Standard PKCS #1


