Scalable Byzantine Fault Tolerance
March 5, 2010, 7:25 pm
Filed under: Uncategorized | Tags: , , ,

Valerie King and I recently finished a paper that I wanted to talk about here.  First, some motivation.  For a whole slew of reasons, adversarial, aka Byzantine, attacks are garnering increasing attention now in the systems community.  For example, the following position paper, from ’08, makes a great case for Byzantine Fault Tolerance (BFT): “BFT: The Time is Now“.  Unfortunately, system folk have been less than enamored with the efficiency of Byzantine agreement algorithms:

  • “Unfortunately, Byzantine agreement requires a number of messages quadratic in the number of participants, so it is infeasible for use in synchronizing a large number of replicas” [1]
  • “Eventually batching cannot compensate for the quadratic number of messages [of Practical Byzantine Fault Tolerance (PBFT)]” [2]
  • “The communication overhead of Byzantine Agreement is inherently large” [3]

There have been many attempts to circumvent this inefficiency by 1) trying to ensure Byzantine fault tolerance in a particular system without solving Byzantine agreement; and 2) trying to solve Byzantine agreement in special cases for a type of adversary that is unique to that paper.

I think this is wrong.  Why?  To build secure systems, we need secure and trusted components, and so we shouldn’t create new components (i.e. subroutines robust to Byzantine faults) from scratch for each new system.

So why the problem with Byzantine agreement?  The problem is not the problem. Despite what naysayers may say about Byzantine agreement(BA) being “too paranoid”, “too hard”, or  “too abstract”, BA is the right problem formulation for building secure networks.  To see this, read the intros of the dozen systems papers from the last several years that say that a key component of what they want to do reduces to BA.  The problem is the solution.  Or rather the lack of a solution.  We still don’t have a general algorithm to solve BA that is practical enough to plug in to the many systems that want to solve it.  So despite all the work in this area, I think BA is still a major unsolved problem in distributed computing.

So now is the point where I’d like to tell you that we solved this problem.  Sadly we haven’t.  However, we have been working hard in the past couple of years designing algorithms for BA that reduce communication overhead.  In particular, we’ve been able to design algorithms where the number of messages each node sends out is about O(sqrt(n)) instead of O(n) (technically soft-O(sqrt(n)), which ignores log factors).  The price we pay is a non-zero probability of failure, but a probability of failure that is polynomially small in the network size.  The bigger the network, the smaller the probability of failure.

So then what about this new paper?  Our new paper describes an algorithm that solves BA with about O(sqrt(n)) messages and polylog latency, against an adaptive adversary, which can choose which nodes to take over at any time during the algorithm, up to taking over up to a 1/3 fraction of the nodes.  This is the first time we went head-to-head with an adaptive adversary and it definitely makes the problem more challenging and interesting.  We had to use a slew of new techniques (samplers, a kind of iterated secret sharing, a new technique for going from almost everywhere to everywhere BA) that I’d like to talk about in another post.  Standard caveats for theory research apply (hidden constants, hidden log terms, etc, etc) but I do think there are new ideas in this paper that can be useful in designing a practical algorithm.

OK, without any additional fanfare,  here is the paper.


[1] S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz. Pond: the OceanStore prototype. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies, pages 1–14, 2003.

[2] J. Cowling, D. Myers, B. Liskov, R. Rodrigues, and L. Shrira. Hq replication: A hybrid quorum protocol for byzantine fault tolerance. In In Proceedings of Operating Systems Design and Implementation (OSDI), San Diego, CA, USA, 2005.

[3] Chien-Fu Cheng, Shu-Ching Wang, and Tyne Liang. The anatomy study of server-initial agreement for general hierarchy wired/wireless networks. Computer Standards & Interfaces, 31(1):219 – 226, 2009.


1 Comment so far
Leave a comment

Pretty exciting, Jared. All the best.

Comment by Amitabh Trehan

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: