Machinations


The Surprising Persistence of Peer-to-Peer

Maxwell Young and I were talking the other day about the ups and downs of research in peer-to-peer systems. In 2002, when I got my PhD, I and everyone I knew at UW felt that the popularity of p2p research was at a crescendo and would quickly taper off in a year or two.  However, this morning, seven years later, I’m reviewing IPDPS submissions, and I see that about 20% of the papers are on p2p (or its close relative, overlay networks).  What’s going on?

Partly, I think the interest in academic circles comes from the fact that p2p research lets us study what happens when there is no leader.  There are many challenging and fun problems in designing distributed systems that work when all components of the system are equal; that is, equal in terms of the resources available, and also equal in the sense that every member of the system both uses the system and contributes to it.   Maybe p2p thinking suits the egalitarian bent of academics?   Maybe it comes from a desire to imitate natural systems like ants and bees?

However, a perennial question is: are there legitimate uses of p2p systems?  Isn’t the trend currently in the opposite direction, with cloud computing promising that someday networks will consist of mindless clients on one side and computationally powerful servers on the other?  In such a situation, will there be much need for direct communication between the clients?  It’s hard for me, at least, to predict how these trends will eventually play out.  However, I would not be surprised if both the p2p extreme and the weak-client, powerful-server extreme continue to exist side by side for a long time to come.

I did want to try to list below some potential legitimate uses of p2p that I have heard about recently. I’d love to hear about others, or arguments for or against the continued existence of p2p.  Here are some of the big systems (or ideas for systems) that I know about now.

  • Vanish: A system that prevents archiving of sensitive data.  In other words, Vanish attempts to ensure that data like private email exchanges, photos, or messages can be given a deadline after which they simply can no longer be reconstructed.  To do this, Vanish breaks up content into pieces using Shamir secret sharing, distributes these pieces across a p2p network, and depends on sufficiently active churn in the peer-to-peer network to ensure that eventually enough of these pieces leave the network, so that the original message is lost forever (a small sketch of the secret-sharing primitive appears after this list).  Vanish got a nice writeup in the New York Times in July, but the original system has been shown to be vulnerable to a certain type of Sybil attack in this paper.
  • Akamai is a company with a billion-dollar market cap that enables Internet content and application delivery.  As I understand it, the “peers” in the Akamai system are actually companies, and the Akamai network ensures robust and efficient delivery of content from these “peers” to end users.  This paper is what enabled Akamai to get its first round of VC funding, I hear, but I’m not sure whether the algorithms used now still have any connection to the paper.
  • Skype is a peer-to-peer system for voice calls over the Internet that I used a lot when I was on sabbatical in Europe.  In my experience, the voice quality and reliability of Google Chat were much better than Skype’s, but somehow it was much easier to get friends and family to use Skype than Google Chat.  I still use Skype for research conference calls.
  • BitTorrent is a p2p system that allows quick, collaborative downloading of large data items.  Estimates are that it consumes about one quarter to one half of all traffic on the Internet.  I don’t know how much of this traffic is “legitimate”, but at least some portion of BitTorrent bandwidth has been used by publishers for distribution of free music, TV shows, and software.  Vuze is a BitTorrent client with over 1 million users: clearly a well-used network, and perhaps the largest overlay network on the Internet.
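To make the Vanish mechanism above concrete, here is a minimal sketch of the threshold secret-sharing primitive it builds on (Shamir’s scheme over a prime field).  The field size, threshold, and key value below are illustrative choices for this example, not Vanish’s actual parameters; in the real system the shares are pushed into a DHT, and natural churn is what eventually erodes them below the threshold.

    import random

    PRIME = 2**127 - 1  # a toy prime field for the example; Vanish's real parameters differ

    def split_secret(secret, k, n, prime=PRIME):
        """Split `secret` into n shares so that any k of them reconstruct it."""
        # Random polynomial of degree k-1 whose constant term is the secret.
        coeffs = [secret] + [random.randrange(prime) for _ in range(k - 1)]
        shares = []
        for x in range(1, n + 1):
            y = sum(c * pow(x, power, prime) for power, c in enumerate(coeffs)) % prime
            shares.append((x, y))
        return shares

    def recover_secret(shares, prime=PRIME):
        """Lagrange interpolation at x = 0 over the prime field."""
        secret = 0
        for x_i, y_i in shares:
            num, den = 1, 1
            for x_j, _ in shares:
                if x_j != x_i:
                    num = (num * -x_j) % prime
                    den = (den * (x_i - x_j)) % prime
            secret = (secret + y_i * num * pow(den, -1, prime)) % prime
        return secret

    key = 123456789                     # stands in for the secret Vanish protects
    shares = split_secret(key, k=7, n=10)
    assert recover_secret(shares[:7]) == key   # 7 surviving shares: still recoverable
    # Once churn leaves fewer than 7 shares in the network, the key is gone for good.

The threshold trades robustness against security: set it too high and legitimate owners lose their data early; set it too low and the data lingers longer than intended.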

6 Comments so far

I have the feeling that you are forgetting two applications that have received (and will probably continue to receive in the coming years) the attention of scientists:
– PPLive is a peer-to-peer video streaming system. The number of visitors to the PPLive website reached 50 million for the opening ceremony of the Olympics. Various theoretical problems are still open; the field includes both video-on-demand and live streaming problems.
– Peer-to-peer virtual worlds, especially games. A conference like Netgames is a nice example of the growing interest in including peer-to-peer exchanges within games.

Comment by Gwendal

A slightly cynical explanation for the continued academic popularity of P2P systems is that they tend to yield hard problems. A hard problem is a delightful thing in academia, because it requires a sophisticated and intellectually gratifying solution. However, hard problems are obviously undesirable for system builders, so unless the benefits of P2P systems justify their inherent complexities, I think they will continue to be a minority solution. For most problems, just elect a leader, and handle the leader-less situation via the consensus protocol / leader-election procedure.
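For what that last suggestion can look like in practice, here is a minimal sketch, assuming every node knows the full membership list and has some failure detector: the lowest-numbered node that responds becomes leader, in the spirit of the classic bully algorithm. The names and parameters are made up for illustration; a real deployment would layer a proper consensus protocol (e.g. Paxos) underneath.

    def elect_leader(node_ids, is_alive):
        """Toy election: the lowest-id responsive node becomes the leader.

        `is_alive` stands in for a real failure detector (heartbeats, pings).
        """
        for node in sorted(node_ids):
            if is_alive(node):
                return node
        raise RuntimeError("no live nodes")

    alive = {2, 3, 5}                                    # node 1 has crashed
    assert elect_leader([1, 2, 3, 5], lambda n: n in alive) == 2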

There are many challenging and fun problems in designing distributed systems that work when all components of the system are equal

True, although many systems exhibit asymmetrical resources or communication patterns (e.g. locality of reference). Introducing inequality into the “pure” P2P system may then be a sensible optimization — SuperNodes in Skype, for example.

Comment by neilconway

Another big DHT is KAD, which, according to “A Global View of KAD” by Steiner et al., gives Vuze a run for its money in terms of size. But again, who knows how much of the traffic isn’t breaking copyright 🙂

Comment by Max

On the other side, P2P makes many things so simple. Links can be created at will. The game is played at a higher layer, where market discipline can be enforced practically. There are lots of resources available (p2p now consumes a big share of global traffic). It is the foundation of hot software trends (social networking). There are still plenty of older-generation client-server applications to be conquered. Once the cloud becomes standardized and entrenched, competition/brokering and interworking of services will invite yet more P2P. And, unlike low-level networking, which mostly happens at the edges, P2P seems to permit lots of activity in the core. It should take a very long time to become discredited! I look forward to practical version control systems that are truly P2P, without the scaling problems that, say, virtual synchrony ran into.

Comment by TH

[…] The Surprising Persistence of Peer-To-Peer discusses the author’s view that P2P research has remained “surprisingly” relevant and active in recent years. […]

Pingback by “Looking Up Data In P2P Systems” « Everything is Data



