Machinations


The Surprising Persistence of Peer-to-Peer

Maxwell Young and I were talking the other day about the ups and downs of research in peer-to-peer systems. In 2002, when I got my PhD, I and everyone I knew at UW felt that the popularity of p2p research had peaked and would quickly taper off in a year or two.  However, this morning, seven years later, I’m reviewing IPDPS submissions, and I see that about 20% of the papers are on p2p (or their close relative, overlay networks).  What’s going on?

Partly, I think the interest in academic circles comes from the fact that p2p research lets us study what happens when there is no leader.  There are many challenging and fun problems in designing distributed systems that work when all components of the system are equal: equal in terms of the resources available, and also equal in the sense that every member of the system both uses the system and contributes to it.   Maybe p2p thinking suits the egalitarian bent of academics?   Maybe it comes from a desire to imitate natural systems like ants and bees?

However, a perennial question is: are there legitimate uses of p2p systems?  Isn’t the trend currently in the opposite direction, with cloud computing promising that someday networks will consist of mindless clients on one side and computationally powerful servers on the other?  In such a situation, will there be much need for direct communication between the clients?  It’s hard for me, at least, to predict how these trends will eventually play out.  However, I would not be surprised if both the p2p extreme and the weak-client, powerful-server extreme continue to exist side by side for a long time to come.

Below, I wanted to list some potentially legitimate uses of p2p that I have heard about recently. I’d love to hear about others, or arguments for or against the continued existence of p2p.  Here are some of the big systems (or ideas for systems) that I know about now.

  • Vanish: A system that prevents archiving of sensitive data.  In other words, Vanish attempts to ensure that data like private email exchanges, photos, or messages can be given a deadline after which they simply can no longer be reconstructed.  To do this, Vanish breaks up content into pieces using Shamir secret sharing (see the sketch after this list), distributes these pieces across a p2p network, and depends on sufficiently active churn in the peer-to-peer network to ensure that eventually enough of the pieces will leave the network and the original message will be lost forever.  Vanish got a nice writeup in the New York Times in July, but the original system has been shown to be vulnerable to a certain type of Sybil attack in this paper.
  • Akamai is a company with a billion-dollar market cap that enables Internet content and application delivery.  As I understand it, the “peers” in the Akamai system are actually companies, and the Akamai network ensures robust and efficient delivery of content from these “peers” to end users.  This paper is what enabled Akamai to get its first round of VC funding, I hear.  But I’m not sure whether the algorithms used now still have any connection to the paper.
  • Skype is a peer-to-peer system for voice calls over the Internet that I used a lot when I was on sabbatical in Europe.  In my experience, the voice quality and reliability of Google Chat were much better than Skype’s, but somehow it was much easier for us to get friends and family to use Skype than Google Chat.  I still use Skype nowadays for research conference calls.
  • BitTorrent is a p2p system allowing for quick collaborative downloading of large data items.  Estimates are that it consumes about one quarter to one half of all traffic on the Internet.  I don’t know how much of this traffic is “legitimate”, but at least some portion of BitTorrent bandwidth has been used by publishers for distribution of free music, TV shows, and software.  Vuze is a BitTorrent client with over 1 million users: clearly a well-used network, and perhaps the largest overlay network on the Internet.
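
To make the Vanish idea above a bit more concrete, here is a minimal sketch of Shamir secret sharing in Python.  The prime, the number of shares, and the threshold are made up for illustration and are not Vanish’s actual parameters; roughly speaking, the real system encrypts the data with a random key and sprinkles shares of that key across DHT nodes, so the data dies when the shares do.

    # A minimal sketch of the Shamir secret-sharing idea behind Vanish.
    # Parameters (prime, share count, threshold) are illustrative only.
    import random

    PRIME = 2**127 - 1  # a Mersenne prime, large enough for a 16-byte key

    def split(secret, n, k):
        """Split `secret` into n shares, any k of which reconstruct it."""
        coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
        shares = []
        for x in range(1, n + 1):
            y = 0
            for c in reversed(coeffs):      # evaluate the polynomial at x (Horner's rule)
                y = (y * x + c) % PRIME
            shares.append((x, y))
        return shares

    def reconstruct(shares):
        """Lagrange interpolation at 0 recovers the secret from k shares."""
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num, den = 1, 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = (num * -xj) % PRIME
                    den = (den * (xi - xj)) % PRIME
            secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
        return secret

    key = random.randrange(PRIME)           # the data-encryption key to protect
    shares = split(key, n=20, k=10)
    assert reconstruct(shares[:10]) == key  # any 10 shares suffice...
    # ...but once churn erases more than 10 of the 20 shares, the key is gone.

Once churn has erased more than n − k shares, reconstruction becomes information-theoretically impossible, which is exactly the expiration mechanism Vanish relies on.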


Sneak Preview

A sneak preview of the September SIGACT News Distributed Computing Column is now available here (Technion) and here (MIT).  Before I get into the newest column, I would just like to give a shout out to Idit Keidar, who I think has done an exceptional job of maintaining and advertising the SIGACT Distributed Computing column over the past few years.  Archived copies of past columns are on the above web sites – check them out.

The column for September, by Lidong Zhou, lead researcher at Microsoft Research Asia, is about the tension between theoretical and practical research in distributed systems.  In fact, a major focus of the article (and a timely one for this blog) is practical ways to solve consensus in order to ensure reliability in large-scale distributed systems.  There are several interesting things I learned (or relearned) from this column, including:

  • The replicated state machine approach based on consensus is considered by industry (well, one good researcher at Microsoft, at least) to be directly applicable to the problem of cloud computing.
  • Algorithmic simplicity and graceful degradation of security guarantees are two properties desired by practitioners that currently seem to be ignored by theoreticians.
  • One of the reasons for the popularity of the Paxos algorithm for solving consensus is its flexibility: it captures critical parts of the consensus problem but leaves “non-essential” details, such as the exact protocol for choosing a leader, unspecified (see the sketch after this list).
  • Lidong mentions that much of the inspiration for the column came from his “responsibility for the distributed systems behind various on-line services, such as Hotmail and the Bing search engine.”  I’m curious if this means that algorithms for consensus, such as Paxos, are at the heart of the robustness mechanisms for such systems.  If so, then the consensus problem is even more pervasive than I thought.
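
As a footnote to the Paxos point above, here is a toy single-decree acceptor in Python.  It is only meant to illustrate the flexibility Lidong describes: the promise/accept rules are pinned down exactly, while everything about how a proposer becomes leader, how it picks ballot numbers, and how it retries is left to the implementor.  This is an illustrative sketch, not anything resembling the actual systems behind Hotmail or Bing.

    # A toy single-decree Paxos acceptor.  The promise/accept rules below are
    # the part Paxos specifies exactly; leader election and ballot-number
    # management are deliberately left open.

    class Acceptor:
        def __init__(self):
            self.promised = -1       # highest ballot promised so far
            self.accepted = None     # (ballot, value) last accepted, if any

        def prepare(self, ballot):
            """Phase 1b: promise to ignore anything below `ballot`."""
            if ballot > self.promised:
                self.promised = ballot
                return ("promise", self.accepted)
            return ("reject", None)

        def accept(self, ballot, value):
            """Phase 2b: accept unless a higher ballot has been promised."""
            if ballot >= self.promised:
                self.promised = ballot
                self.accepted = (ballot, value)
                return "accepted"
            return "rejected"

    # A proposer that has somehow become leader (Paxos does not say how) runs
    # phase 1 with a majority; if any promise carries a previously accepted
    # value, it must re-propose the one with the highest ballot, otherwise it
    # is free to propose its own value.
    acceptors = [Acceptor() for _ in range(3)]
    replies = [a.prepare(1) for a in acceptors]
    promises = [r for r in replies if r[0] == "promise"]
    if len(promises) > len(acceptors) // 2:
        prior = max((r[1] for r in promises if r[1] is not None), default=None)
        value = prior[1] if prior else "leader's own value"
        votes = [a.accept(1, value) for a in acceptors]
        print("decided" if votes.count("accepted") > len(acceptors) // 2 else "not yet")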