Filed under: Uncategorized | Tags: consensus, distributed computing, theory

Long overdue (both the award and my reporting of it here)

Filed under: Uncategorized | Tags: algorithms, Bing, cloud computing, consensus, distributed computing, Hotmail, Idit Keidar, Lidong Zhou, Microsoft Research, Paxos, SIGACT, theory

A sneak preview of the September SIGACT News Distributed Computing Column is now available here (Technion) and here (MIT). Before I get into the newest column, I would just like to give a shout out to Idit Keidar, who I think has done an exceptional job of maintaining and advertising the SIGACT Distributed Computing column over the past few years. Archived copies of past columns are on the above web sites – check them out.

The column for September, by Lidong Zhou lead researcher at Microsoft Research Asia , is about the tension between theoretical and practical research in distributed systems. In fact, a major focus of the article (and a timely one for this blog) is practical ways to solve consensus in order to ensure reliability in large-scale distributed systems. There are several interesting things I learned (or relearned) from this column including:

- The replicated machine approach based on consensus is considered by industry (well one good researcher at Microsoft at least) to be directly applicable to the problem of cloud computing.
- Algorithmic simplicity and graceful degradation of security guarantees are two properties desired by practitioners that currently seem to be ignored by theoreticians.
- One of the reasons for the popularity of the Paxos algorithm for solving consensus is its flexibility: it captures critical parts of the consensus problem but leaves “non-essential” details such as the exact protocol for choosing a leader unspecified.
- Lidong mentions that much of the inspiration for the column came from his “responsibility for the distributed systems behind various on-line services, such as Hotmail and the Bing search engine.” I’m curious if this means that algorithms for consensus, such as Paxos, are at the heart of the robustness mechanisms for such systems. If so, then the consensus problem is even more pervasive than I thought.

Filed under: Uncategorized | Tags: algorithms, consensus, game theory, PODC, theory

**Editors Note: This is the first day of reports from PODC ’09 by my student Amitabh Trehan. Principles of Distributed Computing (PODC) is one of the premiere conferences focusing on theory of distributed computing.**

An interesting talk in the morning was the presentation of the best student paper Max Registers, Counters and Monotone Circuits (Aspnes, Attiya and Censor) presented by Keren Censor. The final message of the talk was `Lower bounds do not always have the final say ‘ (quoted from the talk). The paper deals with implementation of concurrent data structures in shared memory systems, where n processes communicate by reading and writing to shared multi-reader multi-writer registers. A lower bound given by Jayanti et al shows that these operations take Omega(n) space and Omega(n) time steps in the worst case, for many common data structures. On careful analysis of the lower bound proof, the authors realised that they could develop sub-linear algorithms for many useful applications where the worst case would not occur. These are for algorithms where the number of operations will be bounded e.g. for applications which have a limited life-time or several instances of the data structure can be changed. They go on to show how some of these structures can be constructed with some nice use of recursion and my favorite data structures – trees! This reminded me of a discussion I had the previous evening with Victor Luchangco about Nancy Lynch. He contended that one trait contributing to the success of her work (and of her students) was the rigorous questioning of every assumption and definition of interest they came across. A good lesson!

Robert Van Renesse‘s keynote talk **`Refining the way to Consensus**‘ set up the stage for later talks on consensus. His talk described several refinements to basic consensus algorithms from simple to non-blocking consensus, to election consensus, to voters consensus, and finally to recommender consensus. He stressed the importance of exposing undergraduates to the consensus problem, because of the fact that this is a problem they are likely to see again the real world. What drew chuckles was a cartoon showing the ‘ideal’ programmer, a faceless human stuck to the computer screen and keyboard coding to strictly meet the specifications, and an ‘intuitive’ programmer, a happy guy with his back to the computer smiling having no idea what the specifications he was implementing were but knowing when he was done with his coding job. **Editor Note: Look for more information on the consensus problem in a blog post in the next week or two. **

There were two excellent `game theoretic’ talks in the afternoon. Georgios Piliouras presented his paper with Robert Kleinberg and Eva Tardos `Load Balancing Without Regret in the Bulletin Board Model‘. They consider load balancing games in the bulletin board model (players can find out delay on all machines, but have no information on what their delay would have been if they had selected another machine) and show solutions using regret-minimization which are exponentially better than the correlated equilibrium. No-regret algorithms are an example of alternative (to the weaknesses of Nash equilibria) solution concepts based on average outcome of self-adapting agents who react to each other’s strategies in repeated play of the game. Martin Hoefer presented his paper with Ackermann, Berenbrink and Fischer `Concurrent Imitation Dynamics in Congestion Games‘ which discusses the dynamics emerging when agents sample and possibly imitating other agent’s strategies when to do so will improve their utility. Their main result is to show that this imitation strategy leads to rapid convergence (logarithmic in number of players) to approximate equilibria for congestion games. An approximate equilibrium is one where only a small fraction of the players have latency much better or worse than the average latency.