Filed under: Uncategorized | Tags: algorithms, data streams, game theory, greed, local, mapreduce, theory

We just recently finished talking about greedy algorithms in my graduate algorithms class. In the past few months, I’ve developed a new appreciation for greed. Part of this is due to a discussion with Alan Borodin at a recent workshop, who made a good case for greediness as a stand-in for simplicity. Anyone working in algorithms will agree that simple algorithms are better; even more so in the area of distributed computing, where even simple algorithms can be notoriously difficult to implement. Unfortunately, it’s hard to improve what you can’t measure, and so there is a real need for good measures of algorithmic simplicity. This makes me think of several important techniques in algorithm design and how these might intersect with the notion of simplicity.

- Local algorithms: In distributed computing, local algorithms are those that run in time independent of the network size – i.e. each node in the network only receives information from its local neighborhood. Clearly, these types of algorithms are more simple than algorithms that require information to be exchanged over long distances in the network. Approximate edge coloring of graphs and some resource allocation problems are amenable to this approach. See the two great papers: What can be computed locally? by Naor and Stockmeyer and What cannot be computed locally by Kuhn, Moscibroda and Wattenhoffer for more on this area.
- Data streams: Algorithms that use little space are likely to be simple. Data stream algorithms use
**very**little space (or more specifically the amount of space used is very small compared to the input size). While the analysis for many of these algorithms can be quite sophisticated, the algorithms themselves can usually be described in less than a quarter of a page of pseudo-code. Muthukrishnan’s book is a great resource in this area (plus it is freely available here). Recently, there has been a lot of interest in applying the data stream model in a distributed setting. See for example the Massive, Unordered Data (MUD) model, which tries to capture the approach of systems like Mapreduce. - Greedy algorithms : Many of us are familiar with several algorithms of this type. Perhaps Kruskall’s algorithm is the most frequently used greedy algorithm that is always correct. Of course, there are many, many heuristics that are greedy that are not guaranteed to be correct. I wonder what are the most frequently used greedy approximation algorithms. Nothing comes to mind except for the greedy set cover algorithm or greedy approximation algorithms for maximizing submodular functions.
- “Selfish” algorithms: This is a phrase I just made up to describe distributed algorithms that require each node to behave selfishly. In particular, these are problems for which 1) the social welfare in any Nash equilibria is close to the optimal social welfare, and 2) the players can quickly converge to a Nash by following a locally optimal strategy. There is a
**lot**of interest in these types of algorithms in the distributed computing community right now, partially, I think, because of the great fact that they are simple to describe and implement. Selfish algorithms are also simple in the sense that they may arise naturally when large groups of agents get together; in some sense, the algorithmic engineer is (at least partially) removed from the picture.

Are there other ways of formulating problems that ensure “simple” algorithms? Perhaps you, dear reader, can help me flesh out this list.

**1 Comment so far**

Leave a comment

Whenever I see that someone is teaching Greedy algorithms in an algorithms class, I feel obliged to promote my “randomized greedy” variant, Bubblesearch, which is about as simple as greedy but finds better solutions if you have more time. The paper can be found at

http://www.eecs.harvard.edu/~michaelm/postscripts/ipl2006.pdf .

Comment by Michael MitzenmacherOctober 29, 2009 @ 11:56 pm