A recent press release gives some information on how censorship is performed in the Chinese version of Skype; the results are discussed in more detail in a paper by Jeffrey Knockel, Jed Crandall and myself. Currently, a significant portion of Internet censorship is keyword-based: any content that contains a keyword on some blacklist is censored. Countries that perform keyword censorship generally try to hide these blacklists, probably both for political reasons and to make it harder to evade censorship by using neologisms: new words that have the same meaning but are not on the blacklist.

In some peer-to-peer applications, this censorship is done on the client side: there is a subroutine in, e.g., Skype chat that checks whether an outgoing message contains a keyword on the blacklist. If you are the censor and you don't care about revealing the blacklist, then there are techniques for doing this efficiently (hint: FSAs). However, it's an interesting (but evil) theoretical problem to think about how to do keyword censorship efficiently if you are also trying to hide the blacklist; in particular, to hide it from someone who may be running your executable in a debugger. Hint: the Chinese Skype program did it incorrectly, and that's why we were able to decrypt their blacklist!
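One natural (though imperfect) way to try to hide a blacklist is to ship only one-way hashes of the keywords and hash each token of an outgoing message before lookup, so anyone inspecting the binary sees digests rather than words. The sketch below is my own illustration of that idea, not the scheme TOM-Skype actually used; the keywords, salt and function names are hypothetical, and the approach is still vulnerable to a dictionary attack over plausible keywords:

```python
import hashlib

def digest(word: str, salt: bytes = b"fixed-app-salt") -> bytes:
    # one-way hash; the shipped binary contains only these digests
    return hashlib.sha256(salt + word.encode("utf-8")).digest()

# the censor builds this table offline from the secret keyword list
BLACKLIST_DIGESTS = {digest(w) for w in ["forbidden", "secret-word"]}

def message_is_censored(message: str) -> bool:
    # hash each whitespace-delimited token and look it up; note this
    # only catches exact-token matches, unlike substring matching with
    # an FSA (e.g. Aho-Corasick) over plaintext keywords
    return any(digest(tok) in BLACKLIST_DIGESTS for tok in message.split())

assert message_is_censored("this is forbidden talk")
assert not message_is_censored("this is innocent talk")
```

A debugger still sees each hash comparison happen, which hints at why hiding a blacklist from someone who controls the executable is genuinely hard.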


In today's post, I want to focus on some of the highlights of the lower bound that I mentioned last time for the "Networks of Noisy Gates" problem. Recall that, in this problem, you are given a function $f$ that can be computed with a circuit of regular gates. You want to construct a new circuit with the smallest number of noisy gates that computes $f$ correctly with probability greater than $1/2$. These new noisy gates each fail independently with some small probability $\epsilon$, and when they fail, they produce the complement of their correct output.

The result I want to talk about today is the lower bound by Pippenger, Stamoulis and Tsitsiklis, which shows that there are some functions that can be computed with $O(n)$ gates without noise, but which require $\Omega(n \log n)$ noisy gates. In particular, they show that this is true if $f$ is the exclusive-or of $n$ bits.

The first part of their lower bound is a result showing that a new model of failure, where wires can also fail, can be made equivalent to the old model of failure, where only gates can fail. In particular, in the new model, each wire fails with some probability $\delta$, independently of all other wires and gates, and failure of a wire results in transmitting the complement of the bit that is sent down that wire.

Lemma 3.1 of the paper "Lower Bound for the Redundancy of Self-correcting Arrangements of Unreliable Functional Elements" by Dobrushin and Ortyukov shows the following. Let $\Delta$ be the maximum number of input wires for any gate. Then for any suitable failure probability $\epsilon$ and for any set of inputs to a gate, there exists a vector of malfunction probabilities on the gate and its wires such that the probability that the gate fails to produce the correct output is exactly $\epsilon$. Essentially, this shows that the new model, where we have a small probability of failure on both wires and gates, is stochastically equivalent to the old model, where we have a larger probability of error just on the gates.

Now we assume we have $n$ inputs to our exclusive-or function $f$, and we let $x_1, \ldots, x_n$ be these inputs. For the purposes of the lower bound, we'll assume that each $x_i$ is an independent random variable that is $1$ with probability $1/2$ and $0$ otherwise. Now for this lower bound, we're going to be very generous to the noisy circuit. We're going to assume that 1) no gate ever fails, and 2) the only wires that can fail are those that are directly connected to some input bit (each of these wires fails independently with probability $\delta$).

For the $i$-th input bit, we'll let $m_i$ be the number of wires that are connected directly to that input bit, and hence carry the value $x_i$ if they are not faulty. Now in the back end of the circuit, which is computing the exclusive-or of the input bits, there needs to be some estimate of each value $x_i$. Not surprisingly, it is possible to prove (as the paper does) that the maximum-likelihood way to estimate $x_i$ is to take the majority bit over all the $m_i$ wires that are connected to input $i$. We'll let $X_i$ be an indicator r.v. that is $1$ iff the maximum-likelihood estimate of $x_i$ is wrong. Note that $\Pr(X_i = 1) \geq \delta^{m_i}$, since if *every* wire that carries that value is faulty, then clearly the circuit will use the wrong value.
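As a quick sanity check on that bound, here is a small Monte Carlo simulation (my own illustration, not from the paper): copy an input bit onto $m$ wires, flip each wire independently with probability $\delta$, and decode by majority vote. The observed error rate of the majority decoder must be at least $\delta^m$, since all $m$ wires being flipped is only one of the ways the decoder can err:

```python
import random

def majority_error_rate(m, delta, trials=200_000, seed=1):
    # the true input bit is 0; each of the m wires carrying it flips
    # independently with probability delta; the back end decodes by
    # majority vote over the wires (a tie counts as an error here)
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        flipped = sum(rng.random() < delta for _ in range(m))
        if 2 * flipped >= m:
            errors += 1
    return errors / trials

m, delta = 5, 0.2
rate = majority_error_rate(m, delta)
assert rate >= delta ** m  # Pr(estimate wrong) >= delta^m
```

With $m = 5$ and $\delta = 0.2$, the bound $\delta^m$ is tiny ($0.00032$) while the true majority error rate is closer to $0.06$, which is exactly why the bound is described as generous.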

Next, note that the probability that the noisy circuit fails is just $\Pr(X_1 \oplus X_2 \oplus \cdots \oplus X_n = 1)$. So now we have a cute probability problem: assume you're given $n$ independent indicator random variables, you know the probability that each of them is $1$, and you want to compute the probability that their exclusive-or is $1$. The next lemma solves this problem with the help of generating functions.

*Lemma 1:* Let $X_1, \ldots, X_n$ be independent indicator random variables with $p_i = \Pr(X_i = 1)$. Then $\Pr(X_1 \oplus \cdots \oplus X_n = 1) = \frac{1}{2}\left(1 - \prod_{i=1}^{n}(1 - 2p_i)\right)$.

*Proof:* Let $g_i(z)$ be the generating function for $X_i$. Note that $g_i(z) = (1 - p_i) + p_i z$. Now let $S = \sum_{i=1}^{n} X_i$ and let $g(z) = E[z^S]$ be the generating function for $S$. Now comes the clever part: note that $\Pr(X_1 \oplus \cdots \oplus X_n = 1) = \frac{1}{2}(g(1) - g(-1))$. This is true since $g(1)$ is always $1$ and $(-1)^k$ is $1$ if $k$ is even and $-1$ if $k$ is odd.

But $z^S = \prod_{i=1}^{n} z^{X_i}$, so $g(z) = E\left[\prod_{i=1}^{n} z^{X_i}\right]$. Next, note that $E\left[\prod_{i=1}^{n} z^{X_i}\right] = \prod_{i=1}^{n} E[z^{X_i}] = \prod_{i=1}^{n} g_i(z)$ since the $X_i$ are independent. Thus, $g(-1) = \prod_{i=1}^{n}(1 - 2p_i)$ and $\Pr(X_1 \oplus \cdots \oplus X_n = 1) = \frac{1}{2}\left(1 - \prod_{i=1}^{n}(1 - 2p_i)\right)$. This completes the proof.
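The closed form is easy to check numerically. The following sketch (mine, not the paper's) compares it against brute-force enumeration over all $2^n$ outcomes of the indicators:

```python
import itertools

def xor_prob_formula(ps):
    # Pr[X_1 xor ... xor X_n = 1] = (1 - prod_i (1 - 2 p_i)) / 2
    prod = 1.0
    for p in ps:
        prod *= (1 - 2 * p)
    return (1 - prod) / 2

def xor_prob_bruteforce(ps):
    # sum the probabilities of all outcomes with an odd number of 1s
    total = 0.0
    for bits in itertools.product([0, 1], repeat=len(ps)):
        pr = 1.0
        for b, p in zip(bits, ps):
            pr *= p if b else (1 - p)
        if sum(bits) % 2 == 1:
            total += pr
    return total

ps = [0.1, 0.25, 0.4, 0.05]
assert abs(xor_prob_formula(ps) - xor_prob_bruteforce(ps)) < 1e-12
```

Note the special case the formula makes obvious: if any single $p_i = 1/2$, the corresponding factor is $0$ and the exclusive-or is a fair coin no matter what the other indicators do.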

The remainder of the proof for the lower bound is just algebra. We'll make use of the inequality between arithmetic and geometric means, which states that for any set of nonnegative numbers $a_1, \ldots, a_n$, it holds that $\frac{1}{n}\sum_{i=1}^{n} a_i \geq \left(\prod_{i=1}^{n} a_i\right)^{1/n}$.

*Lemma 2:* Suppose the noisy circuit errs with probability at most $\epsilon'$ for some constant $\epsilon' < 1/2$. Then the total number of wires connected to the input bits satisfies $m = \sum_{i=1}^{n} m_i = \Omega(n \log n)$.

*Proof:* Let $P$ be the probability of error for the noisy circuit. By assumption, $P \leq \epsilon' < 1/2$. Using Lemma 1 and our bound $\Pr(X_i = 1) \geq \delta^{m_i}$, we have that

$\epsilon' \geq P \geq \frac{1}{2}\left(1 - \prod_{i=1}^{n}\left(1 - 2\delta^{m_i}\right)\right)$, and hence $\prod_{i=1}^{n}\left(1 - 2\delta^{m_i}\right) \geq 1 - 2\epsilon'$.

Now using the inequality between arithmetic and geometric means, we have that

$1 - 2\epsilon' \leq \prod_{i=1}^{n}\left(1 - 2\delta^{m_i}\right) \leq \left(\frac{1}{n}\sum_{i=1}^{n}\left(1 - 2\delta^{m_i}\right)\right)^{n} = \left(1 - \frac{2}{n}\sum_{i=1}^{n}\delta^{m_i}\right)^{n}.$

Again using the inequality between arithmetic and geometric means (now on the term that is being subtracted) and the fact that $m = \sum_{i=1}^{n} m_i$, we get:

$1 - 2\epsilon' \leq \left(1 - 2\delta^{m/n}\right)^{n}.$

Isolating the $m$ term in the above inequality (using $1 - x \leq e^{-x}$, so that $1 - 2\epsilon' \leq e^{-2n\delta^{m/n}}$) gives us that

$m \geq \frac{n \ln\left(2n / \ln\frac{1}{1 - 2\epsilon'}\right)}{\ln(1/\delta)}.$

Since $\ln\frac{1}{1 - 2\epsilon'}$ is a constant depending only on $\epsilon'$, and since $\ln(1/\delta)$ is a constant, the above inequality gives $m = \Omega(n \log n)$, which completes the proof.
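For concreteness, here is a small numeric sanity check of this chain of inequalities (the symbols $\delta$, $\epsilon'$ and the sample values below are mine): with the wires spread evenly over the inputs, using noticeably more wires per input than the bound suggests drives the Lemma 1 error probability below $\epsilon'$, while noticeably fewer wires pushes it above $\epsilon'$:

```python
import math

def error_prob(n, m_per_input, delta):
    # Lemma 1 with the generous per-input error p_i = delta ** m_i
    p = delta ** m_per_input
    return (1 - (1 - 2 * p) ** n) / 2

def wire_lower_bound(n, delta, eps):
    # m >= n * ln(2n / ln(1/(1 - 2 eps))) / ln(1/delta)
    c = math.log(1 / (1 - 2 * eps))
    return n * math.log(2 * n / c) / math.log(1 / delta)

n, delta, eps = 1024, 0.05, 0.25
m = wire_lower_bound(n, delta, eps)  # grows like n log n
assert error_prob(n, 1.2 * m / n, delta) < eps
assert error_prob(n, 0.8 * m / n, delta) > eps
```

The factors 1.2 and 0.8 leave slack for the $1 - x \leq e^{-x}$ step, which is not tight exactly at the bound.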

Finally, we note that since each gate is assumed to have constant fan-in, if the number of wires is $\Omega(n \log n)$, it follows that the number of gates in the noisy circuit must also be $\Omega(n \log n)$. That basically completes the proof of the lower bound. The interesting thing is that the only difficulty we exploited is the difficulty of estimating the input values when the wires may be faulty!

The current issue of Computer Science Review is devoted to celebrating the research contributions of Papadimitriou [1]. The issue contains a nice mix of accolades and ideas. On the accolades side, I was impressed by the degree to which Papadimitriou has helped create a vibrant theory community in his home country of Greece, which has produced more than its fair share of notable theory researchers.

On the idea side, there are two survey papers of work connected to Papadimitriou that I particularly liked, one by Koutsoupias on the k-server problem and one by Kleinberg and Raghavan on internet structure, network routing and web information. One of the very first dissertations I ever read was Koutsoupias’ dissertation (done under Papa.), which showed that the work function algorithm for the k-server problem had a competitive ratio of 2k-1, where the previous best known ratio was exponential in k. This dissertation helped me fall in love with CS theory: beautiful math and ideas, a crisp problem statement, a clear understanding of when progress is made on a problem. Plus, the entire dissertation is only 41 pages – that’s perhaps the best way to appreciate what a major result it was!

Bonus Papadimitriou link: Check out the following Papa. talk on why “CS is the new math”.

[1] Tip of the hat to journal editor-in-chief and friend Josep Diaz who air-mailed me the current issue of the journal. A very enjoyable part of my sabbatical last year was spent in Barcelona at UPC working with Josep.

Pierre Fraigniaud and George Giakkoupis from the University of Paris Diderot have a really nice paper in the upcoming Principles of Distributed Computing (PODC) conference. Their paper, titled "The Effect of Power-Law Degrees on the Navigability of Small Worlds", builds on the classic paper by Jon Kleinberg on navigating small-world networks.

Jon Kleinberg's classic paper concerns navigation in a grid network where each node, in addition to its local edges, has one additional long-range edge. What Kleinberg showed is that, provided each node's long-range link covers a distance d with probability proportional to d^{-2}, a greedy routing algorithm will ensure that any node can reach any other node in the network within no more than about log^2 n hops, where n is the number of nodes in the network [1]. Moreover, the exponent in this probability is pretty important: even a slight deviation from -2 results in networks that cannot be efficiently navigated by greedy algorithms. Kleinberg was thus one of the first people to describe a type of network that might mimic the social network that allowed quick routing in Stanley Milgram's famous six degrees of separation experiments.
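To get a feel for Kleinberg's result, here is a toy simulation (my own sketch, not code from any of the papers): greedy routing on an n x n grid, where each node has one long-range contact sampled with probability proportional to d^{-2}. Sampling each node's long-range link lazily, only when the route visits it, is harmless here because greedy routing never revisits a node (the distance to the target strictly decreases every hop):

```python
import random

def long_range_contact(u, n, r=2.0, rng=random):
    # sample one long-range contact for grid node u, with
    # Pr[v] proportional to (Manhattan distance from u to v) ** (-r)
    ux, uy = u
    nodes, weights = [], []
    for x in range(n):
        for y in range(n):
            d = abs(x - ux) + abs(y - uy)
            if d > 0:
                nodes.append((x, y))
                weights.append(d ** -r)
    return rng.choices(nodes, weights=weights)[0]

def greedy_hops(s, t, n, r=2.0, rng=random):
    # greedy routing: always step to the known contact closest to t
    dist = lambda v: abs(v[0] - t[0]) + abs(v[1] - t[1])
    cur, hops = s, 0
    while cur != t:
        x, y = cur
        nbrs = [(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < n and 0 <= y + dy < n]
        nbrs.append(long_range_contact(cur, n, r, rng))
        cur = min(nbrs, key=dist)  # a grid neighbour always makes progress
        hops += 1
    return hops

rng = random.Random(42)
n = 30
hops = greedy_hops((0, 0), (n - 1, n - 1), n, rng=rng)
assert 0 < hops <= 2 * (n - 1)  # never worse than the pure grid walk
```

Averaging hop counts over many source-target pairs, and comparing r = 2 against, say, r = 0 or r = 4, is a nice way to see the sharpness of the exponent that Kleinberg proved.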

So what about this new paper by Pierre and George? Well, for many years the exponent of 2 in log^2 has bothered people. Pierre and George show that it is possible to get rid of it with power laws. In particular, they show that if, instead of each node having exactly one long-distance link, the number of long-distance links per node follows a certain power-law distribution, then greedy routing works in about log n hops. A power-law distribution means that the number of nodes with x long-distance links is proportional to x^{-k} for some fixed constant k. This is a so-called heavy-tailed distribution, which occurs in many natural complex systems. Surprisingly, Pierre and George show that the type of power-law distribution for which greedy routing works is when k is in the range between 2 and 3, which is very similar to the exponent one observes for degree distributions in many naturally occurring social networks.

As far as I know, this is one of the first papers that identifies a nice functional property of power-law distributions. In particular, it shows that power-law distributions are more powerful than other distributions in achieving a specific mathematical goal. Are there other algorithmic or mathematical problems that power-law distributions are "good" for? It looks like a very nice paper.

[1] I'm using the term "about" to mean O(log^2 n), i.e., roughly a function that grows like C log^2 n for some fixed constant C.