Byzantine fault tolerance Rice University

					Byzantine fault-tolerance

        COMP 413
         Fall 2002
                Overview
• Models
  – Synchronous vs. asynchronous systems
  – Byzantine failure model
• Secure storage with self-certifying data
• Byzantine quorums
• Byzantine state machines
                  Models
Synchronous system: bounded message
  delays (implies reliable network!)
Asynchronous system: message delays are
  unbounded

In practice (Internet): reasonable to assume
  that network failures are eventually fixed
  (weak synchrony assumption).
             Model (cont’d)
• Data and services (state machines) can be
  replicated on a set of nodes R.
• Each node in R fails independently, with the
  same probability (iid)
• Can specify a bound f on the number of
  nodes that can fail simultaneously
            Model (cont’d)
Byzantine failures
• no assumption about nature of fault
• failed nodes can behave in arbitrary ways
• may act as intelligent adversary
  (compromised node), with full knowledge
  of the protocols
• failed nodes may conspire (act as one)
Self-certifying data
          Byzantine quorums
• Data is not self-certifying (multiple writers
  without shared keys)
• Idea: replicate data on sufficient number of
  replicas (relative to f) to be able to rely on
  majority vote
Byzantine quorums: r/w variable
Representative problem: implement a
 read/write variable

Assuming no concurrent reads or writes, for now
Assuming trusted clients, for now
Byzantine quorums: r/w variable
How many replicas do we need?
• clearly, need at least 2f+1, so we have a majority
  of good nodes
• write(x): send x to all replicas, wait for
  acknowledgments (must get at least f+1)
• read(x): request x from all replicas, wait for
  responses, take majority vote (if no concurrent
  writes, must get f+1 identical votes!)
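A minimal sketch of the scheme above, assuming an in-memory, synchronous setting where every replica answers. The `Replica` and `FaultyReplica` classes and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

class Replica:
    """A correct replica: stores and returns the last written value."""
    def __init__(self):
        self.value = None

    def write(self, x):
        self.value = x
        return True              # acknowledgment

    def read(self):
        return self.value

class FaultyReplica(Replica):
    """A Byzantine replica; in this sketch it acknowledges writes but lies on reads."""
    def read(self):
        return "garbage"

def write(replicas, x, f):
    # Send x to all replicas; the write succeeds with at least f+1 acknowledgments.
    acks = sum(1 for r in replicas if r.write(x))
    return acks >= f + 1

def read(replicas, f):
    # Ask all replicas and take the most common answer; require f+1 matching votes.
    votes = Counter(r.read() for r in replicas)
    value, count = votes.most_common(1)[0]
    if count < f + 1:
        raise RuntimeError("no value has f+1 matching votes")
    return value

f = 1
replicas = [Replica(), Replica(), FaultyReplica()]   # 2f+1 = 3 replicas
write_ok = write(replicas, "x=42", f)
result = read(replicas, f)
print(write_ok, result)
```

With f = 1, the two correct replicas outvote the single faulty one, so the read returns the written value.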
[Figure: replica sets R (read) and W (write)]
Byzantine quorums: r/w variable
Does this work? Yes, but only if
• system is synchronous (bounded msg delay)
• faulty nodes cannot forge messages
  (messages are authenticated!)
Byzantine quorums: r/w variable
Now, assume
• Weak synchrony (network failures are fixed
  eventually)
• messages are authenticated (e.g., signed
  with sender’s private key)
Byzantine quorums: r/w variable
Let’s try 3f+1 replicas (known lower bound)
• write(x): send x to all replicas, wait for 2f+1
  responses (must have at least f+1 good replicas
  with correct value)
• read(x): request x from all replicas, wait for 2f+1
  responses, take majority vote (if no concurrent
  writes, must get f+1 identical votes!? – no, it is
  possible that the f nodes that did not respond were
  good nodes!)
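The failure case above can be replayed for f = 1. This is a hand-picked scenario, not a protocol implementation: which replica is faulty and which replicas respond late are assumptions made for the illustration.

```python
from collections import Counter

f = 1
n = 3 * f + 1          # 4 replicas
quorum = 2 * f + 1     # wait for 3 responses

faulty = {2}                               # assumption: replica 2 is Byzantine
stored = {i: "old" for i in range(n)}      # every replica starts with the old value

# write("new"): replicas 0, 1, 2 respond; replica 3 is correct but slow.
write_responders = [0, 1, 2]
for i in write_responders:
    if i not in faulty:
        stored[i] = "new"                  # only correct replicas really store it

# read: replicas 1, 2, 3 respond; replica 0 (correct, holds "new") is slow.
read_responders = [1, 2, 3]

def answer(i):
    # The faulty replica reports the old value; correct replicas report what they store.
    return "old" if i in faulty else stored[i]

votes = Counter(answer(i) for i in read_responders)
print(votes)
```

Only replica 1 votes for the new value, which is fewer than f+1 = 2 votes, while the stale and faulty replicas together give two votes for the old value.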
[Figure: replica sets R (read) and W (write)]
Byzantine quorums: r/w variable
Let’s try 4f+1 replicas
• write(x): send x to all replicas, wait for 3f+1
  responses (must have at least 2f+1 good replicas
  with correct value)
• read(x): request x from all replicas, wait for 3f+1
  responses, take majority vote (if no concurrent
  writes, must get f+1 identical votes!? – no, it is
  possible that the f faulty nodes vote with the good
  nodes that have an old value of x!)
[Figure: replica sets R (read) and W (write)]
 Byzantine quorums: r/w variable
Let’s try 5f+1 replicas
• write(x): send x to all replicas, wait for 4f+1
  responses (must have at least 3f+1 good replicas
  with correct value)
• read(x): request x from all replicas, wait for 4f+1
  responses, take majority vote (if no concurrent
  writes, must get f+1 identical votes!)
• Actually, can use only 5f replicas if data is written
  with monotonically increasing timestamps
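The counting argument behind the 3f+1, 4f+1 and 5f+1 attempts can be checked with a few lines of arithmetic. The function name `worst_case_votes` is hypothetical; it assumes a write acknowledged by `quorum` replicas, followed by a read that hears from `quorum` replicas, with no concurrent writes.

```python
def worst_case_votes(n, quorum, f):
    """Return (min_new, max_old): the fewest votes the latest value can get in a
    read quorum, and the most votes a stale value can get, in the worst case."""
    # Of the replicas that acknowledged the write, up to f may be faulty.
    good_current = quorum - f
    # The read misses n - quorum replicas, and all of them may be good and current.
    min_new = good_current - (n - quorum)
    # Everyone else in the read quorum (faulty plus stale good replicas) may vote "old".
    max_old = quorum - min_new
    return min_new, max_old

f = 1
table = {}
for k in (3, 4, 5):
    n, quorum = k * f + 1, (k - 1) * f + 1
    table[n] = worst_case_votes(n, quorum, f)
    print(f"n={n} quorum={quorum}: new>={table[n][0]}, old<={table[n][1]}")
```

For f = 1, only the 5f+1 configuration guarantees that the latest value strictly outnumbers the stale one (3 votes against at most 2); with 4f+1 the vote can tie, and with 3f+1 the stale value can win.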
[Figure: replica sets R (read) and W (write)]
Byzantine quorums: r/w variable
Still rely on trusted clients
• Malicious client could send different values to
  replicas, or send value to fewer than a full quorum
• To fix this, need a Byzantine agreement protocol
  among the replicas

Still don’t handle concurrent accesses
Still don’t handle group changes
      Byzantine state machine
BFT (Castro, 2000)
• Can implement any service that behaves
  like a deterministic state machine
• Can tolerate malicious clients
• Safe with concurrent requests
• Requires 3f+1 replicas
• 5 rounds of messages
       Byzantine state machine
• Clients send requests to one replica
• Correct replicas execute all requests in same order
• Atomic multicast protocol among replicas ensures
  that all replicas receive and execute all requests in
  the same order
• Since all replicas start in same state, correct
  replicas produce identical result
• Client waits for f+1 identical results from different
  replicas
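The client-side rule in the last bullet can be sketched as follows. `ReplyCollector` is a hypothetical helper, and results are assumed to be hashable values; signature checking is left out.

```python
from collections import defaultdict

class ReplyCollector:
    """Accept a result once f+1 different replicas have reported it."""
    def __init__(self, f):
        self.f = f
        self.voters = defaultdict(set)   # result -> ids of replicas that reported it

    def add(self, replica_id, result):
        # A set ignores duplicate replies from the same replica.
        self.voters[result].add(replica_id)
        if len(self.voters[result]) >= self.f + 1:
            return result                # at least one correct replica vouches for it
        return None

collector = ReplyCollector(f=1)
first = collector.add(0, "ok")       # one vote for "ok"
second = collector.add(3, "bad")     # a faulty replica disagrees
third = collector.add(0, "ok")       # duplicate from replica 0 does not count twice
fourth = collector.add(2, "ok")      # second distinct replica: accepted
print(first, second, third, fourth)
```

Since at most f replicas are faulty, any result reported by f+1 different replicas was produced by at least one correct replica.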
BFT protocol
       BFT: Protocol overview
• Client c sends m = <REQUEST,o,t,c>σc to the
  primary. (o=operation,t=monotonic timestamp)
• Primary p assigns seq# n to m and sends
  <PRE-PREPARE,v,n,m> σp to other replicas. (v=current
  view, i.e., replica set)
• If replica i accepts the message, it sends
  <PREPARE,v,n,d,i> σi to other replicas. (d is hash
  of the request). Signals that i agrees to assign n to
  m in v.
       BFT: Protocol overview
• Once replica i has a pre-prepare and 2f+1
  matching prepare messages, it sends
  <COMMIT,v,n,d,i> σi to other replicas. At this
  point, correct replicas agree on an order of
  requests within a view.
• Once replica i has 2f+1 matching prepare and
  commit messages, it executes m, then sends
  <REPLY,v,t,c,i,r> σi to the client. (The need for
  this last step has to do with view changes.)
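A rough sketch of the bookkeeping one replica might keep for a single (view, seq#) pair. The class `Slot` and its method names are hypothetical. The thresholds are copied from the bullets above as written; signatures, view changes, in-order execution, and how a replica's own messages are counted are not modeled.

```python
from collections import defaultdict

class Slot:
    """State for one sequence number n in one view v at replica i."""
    def __init__(self, f):
        self.f = f
        self.digest = None                 # digest d from the accepted pre-prepare
        self.prepares = defaultdict(set)   # digest -> ids of replicas that sent PREPARE
        self.commits = defaultdict(set)    # digest -> ids of replicas that sent COMMIT

    def on_pre_prepare(self, digest):
        if self.digest is None:            # accept only the first pre-prepare
            self.digest = digest

    def on_prepare(self, sender, digest):
        self.prepares[digest].add(sender)

    def on_commit(self, sender, digest):
        self.commits[digest].add(sender)

    def prepared(self):
        # Pre-prepare plus 2f+1 prepares that match its digest.
        return (self.digest is not None
                and len(self.prepares[self.digest]) >= 2 * self.f + 1)

    def committed(self):
        # Prepared plus 2f+1 matching commits: the request may be executed.
        return (self.prepared()
                and len(self.commits[self.digest]) >= 2 * self.f + 1)

slot = Slot(f=1)
slot.on_pre_prepare("d1")
slot.on_prepare(0, "d1")
slot.on_prepare(1, "d1")
slot.on_prepare(2, "d2")               # mismatching digest is counted separately
prepared_early = slot.prepared()
slot.on_prepare(3, "d1")
prepared_late = slot.prepared()
for sender in (0, 1, 3):
    slot.on_commit(sender, "d1")
print(prepared_early, prepared_late, slot.committed())
```

Counting distinct sender ids per digest means a faulty replica cannot inflate the count by repeating a message or by voting for a different digest.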
                      BFT
• More complexity related to view changes and
  garbage collection of message logs
• Public-key signatures are the bottleneck: a
  variation of the protocol uses symmetric crypto
  (MACs) to provide authenticated channels. (Not
  easy: MACs are less powerful, since they can’t prove
  authenticity to a third party!)
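Why a MAC cannot prove authenticity to a third party can be illustrated with Python's standard `hmac` module. The key and the message encoding below are made up for the example and are not the BFT wire format.

```python
import hashlib
import hmac

shared_key = b"key shared by replicas i and j"    # hypothetical pairwise key
message = b"<PREPARE,v=0,n=7,d=abc,i=1>"          # illustrative encoding only

def mac(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

# Replica i authenticates its message to replica j.
tag_from_i = mac(shared_key, message)

# Replica j verifies the tag with the same shared key.
verified_by_j = hmac.compare_digest(tag_from_i, mac(shared_key, message))

# But j holds the same key, so j can produce a valid tag for a message i never sent.
forged = b"<PREPARE,v=0,n=7,d=zzz,i=1>"
tag_forged_by_j = mac(shared_key, forged)
forgery_verifies = hmac.compare_digest(tag_forged_by_j, mac(shared_key, forged))

print(verified_by_j, forgery_verifies)
```

Both tags verify under the shared key, so a third party shown a message and its tag cannot tell whether i or j created it. A signature made with i's private key does not have this problem.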