Congratulations to my student Olumuyiwa (“M”) Oluwasanmi who successfully defended his dissertation on “Practical, Scalable Algorithms for Byzantine agreement” last week!
Usually on this blog, I allow only a 1 slide dissertation summary, but today I’m making an exception for 2 slides (at least they have the same x-axis
. The slides below summarize the performance of 4 Byzantine agreement algorithms. The CKS algorithm is from the paper “Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography“. The remaining 3 algorithms are from M’s dissertation, and the algorithms with the RB prefix assume the existence of a random beacon.
Filed under: Uncategorized | Tags: Byzantine agreement, distributed computing, grants, theory
The process of grant writing was completely opaque to me when I was a grad student. As a faculty member, I now take the position that grant writing is intrinsically important, and that all grad students should learn something about it in order to get insight into the long-term planning process for doing research. For this reason, it’d be nice to see more examples of funded grants publicly available.
Last January, I posted a one page project summary for one of my funded NSF proposals. This year again I was lucky enough to have a proposal funded and I’m posting the one page summary below. I hope that some readers find it useful.
Filed under: Uncategorized | Tags: algorithms, Byzantine agreement, theory
The September SIGACT distributed computing column is now available at Technion and at MIT. This month Valerie and I contributed a survey on scalable Byzantine Agreement – Marko Vukolic considers BA for cloud computing providers. Idit’s intro to the column is below.
“After almost 30 years of research on Byzantine Agreement (BA), the problem continues to be relevant and to re-invent itself in new ways. This column discusses two new research directions that further push the scale of BA. It suggests new domains where BA can, and perhaps should, be deployed. First, our main contribution, by Valerie King and Jared Saia, argues for running BA in setting with a large number of nodes (or processors). Valerie and Jared survey new BA protocols whose communication complexity is scalable in the number of participating processors. This, they argue, enables their deployment in larger-scale domains for which BA was considered infeasible before. The second contribution, by Marko Vukolic, considers another emerging domain for BA. It calls for wider-scale deployment of BA protocols, not among many processors, but rather over multiple cloud computing providers.
The column ends with a short announcement about Morgan Claypool’s new monograph series on Distributed Computing Theory, edited by Nancy Lynch.
Many thanks to Valerie, Jared, and Marko for sharing their insights!”
Filed under: Uncategorized | Tags: algorithms, Byzantine agreement, conferences, distributed computing, PODC, theory
The end of the semester here at UNM just about killed me. In addition to the usual academic hubbub, I hosted a visitor, submitted a paper, finished up the camera ready for our PODC paper, and dealt with my toddler who decided it would be a good couple of weeks to wake up every night at 2am (teething?, headache?, likes a dose of cherry-flavored children’s tylenol as a nightcap?)
The camera-ready of our paper, “Breaking the O(n^2) Bit Barrier: Scalable Byzantine agreement with an Adaptive Adversary” (that I blogged about previously) is now available here. This was probably the most time I’ve spent going from an accepted paper to a camera ready – mostly because the algorithm in the paper consisted of several small parts with many connections between the parts. Hopefully, our new version makes everything much easier to understand.
Some great news we received, right after submitting the camera ready, is that our paper was selected to be in the “best paper session” of PODC. The list of all such papers is below – I know that at least some of these papers will be invited to a special issue of the JACM, and all of them look interesting. The best paper session is a new thing for PODC. I definitely like the idea of having multiple best papers, it gives more information about the assessment of the PC than a single best paper. I’ll probably read through most of these before the conference.
Deterministic Distributed Vertex Coloring in Polylogarithmic Time
Barenboim, Elkin
Breaking the O(n^2) Bit Barrier: Scalable Byzantine agreement with an Adaptive Adversary
King, Saia
Optimal Gradient Clock Synchronization in Dynamic Networks
Lenzen, Kuhn, Locher, Oshman
Online set packing and competitive scheduling of multi-part tasks
Emek, Halldorsson, Mansour, Patt-Shamir, Radhakrishnan, Rawitz
How to Meet when you Forget: Log-space Rendezvous in Arbitrary Graphs
Czyzowicz, Kosowski, Pelc
A Modular Approach to Shared-memory Consensus, with Applications to the Probabilistic-write Model
Aspnes
Constant RMR Solutions to Reader Writer Synchronization
Bhatt, Jayanti
Filed under: Uncategorized | Tags: algorithms, Byzantine agreement, security, theory
Valerie King and I recently finished a paper that I wanted to talk about here. First, some motivation. For a whole slew of reasons, adversarial, aka Byzantine, attacks are garnering increasing attention now in the systems community. For example, the following position paper, from ’08, makes a great case for Byzantine Fault Tolerance (BFT): “BFT: The Time is Now“. Unfortunately, system folk have been less than enamored with the efficiency of Byzantine agreement algorithms:
- “Unfortunately, Byzantine agreement requires a number of messages quadratic in the number of participants, so it is infeasible for use in synchronizing a large number of replicas” [1]
- “Eventually batching cannot compensate for the quadratic number of messages [of Practical Byzantine Fault Tolerance (PBFT)]” [2]
- “The communication overhead of Byzantine Agreement is inherently large” [3]
There have been many attempts to circumvent this inefficiency by 1) trying to ensure Byzantine fault tolerance in a particular system without solving Byzantine agreement; and 2) trying to solve Byzantine agreement in special cases for a type of adversary that is unique to that paper.
I think this is wrong. Why? To build secure systems, we need secure and trusted components, and so we shouldn’t create new components (i.e. subroutines robust to Byzantine faults) from scratch for each new system.
So why the problem with Byzantine agreement? The problem is not the problem. Despite what naysayers may say about Byzantine agreement(BA) being “too paranoid”, “too hard”, or “too abstract”, BA is the right problem formulation for building secure networks. To see this, read the intros of the dozen systems papers from the last several years that say that a key component of what they want to do reduces to BA. The problem is the solution. Or rather the lack of a solution. We still don’t have a general algorithm to solve BA that is practical enough to plug in to the many systems that want to solve it. So despite all the work in this area, I think BA is still a major unsolved problem in distributed computing.
So now is the point where I’d like to tell you that we solved this problem. Sadly we haven’t. However, we have been working hard in the past couple of years designing algorithms for BA that reduce communication overhead. In particular, we’ve been able to design algorithms where the number of messages each node sends out is about O(sqrt(n)) instead of O(n) (technically soft-O(sqrt(n)), which ignores log factors). The price we pay is a non-zero probability of failure, but a probability of failure that is polynomially small in the network size. The bigger the network, the smaller the probability of failure.
So then what about this new paper? Our new paper describes an algorithm that solves BA with about O(sqrt(n)) messages and polylog latency, against an adaptive adversary, which can choose which nodes to take over at any time during the algorithm, up to taking over up to a 1/3 fraction of the nodes. This is the first time we went head-to-head with an adaptive adversary and it definitely makes the problem more challenging and interesting. We had to use a slew of new techniques (samplers, a kind of iterated secret sharing, a new technique for going from almost everywhere to everywhere BA) that I’d like to talk about in another post. Standard caveats for theory research apply (hidden constants, hidden log terms, etc, etc) but I do think there are new ideas in this paper that can be useful in designing a practical algorithm.
OK, without any additional fanfare, here is the paper.
References:
[1] S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz. Pond: the OceanStore prototype. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies, pages 1–14, 2003.
[2] J. Cowling, D. Myers, B. Liskov, R. Rodrigues, and L. Shrira. Hq replication: A hybrid quorum protocol for byzantine fault tolerance. In In Proceedings of Operating Systems Design and Implementation (OSDI), San Diego, CA, USA, 2005.
[3] Chien-Fu Cheng, Shu-Ching Wang, and Tyne Liang. The anatomy study of server-initial agreement for general hierarchy wired/wireless networks. Computer Standards & Interfaces, 31(1):219 – 226, 2009.

