Machinations


Self-healing of Byzantine Faults
May 21, 2012, 7:41 pm
Filed under: Uncategorized | Tags: , ,

“You have to have good researchers. Good researchers, all of them.” – Paraphrase of quote from Das Boot.

Sometimes writing a paper can be a challenging slog: goals move as conjectures prove wrong; facts that seem simple turn out to be slippery to prove; experiments must be rerun as the paper evolves; and notation needs to be redone for the sake or readability.

In such a situation, it’s a blessing to have coauthors who will go the extra mile to keep pushing and pushing the paper to completion.  So I would now really like to thank my students Jeffrey Knockel and George Saad for their hard work on our latest paper: “Self-Healing of Byzantine Faults” (abstract below, full paper here).

Note: I wouldn’t exactly compare the research process with serving on a U-boat mission during the decline of the Third Reich.  For me it’s most similar to mountaineering, whose ethos is captured by the motto of the Seattle mountaineers: “Like fun, only different!”

Paper Abstract:

Recent years have seen significant interest in designing networks that are self-healing in the sense that they can automatically recover from adversarial attack. Previous work shows that it is possible for a network to automatically recover, even when an adversary repeatedly deletes nodes in the network. However, there have not yet been any algorithms that self-heal in the case where an adversary takes over nodes in the network. In this paper, we address this gap.

In particular, we describe a communication network over n nodes that ensures the following properties, even when an adversary controls up to t \leq (1/4-\epsilon)n nodes, for any positive \epsilon . First, the network provides point-to-point communication with bandwidth and latency costs that are asymptotically optimal. Second, O(t (log*n)^2) message corruptions occur in expectation, before the adversarially controlled nodes are effectively quarantined so that they cause no more corruptions. We present empirical results showing that our approach may be practical.



BIRS Workshop on “Probabilistic versus Deterministic Techniques for Shared Memory Computation”
February 14, 2012, 3:41 am
Filed under: Uncategorized | Tags: , , ,

Last week, I attended a great workshop in Banff, Canada on “Probabilistic versus Deterministic Approaches to Shared Memory Computation”.  The following is an extremely biased, incomplete and watered down summary – also it only includes morning talks because I was too sleepy in the afternoon to take notes – reader beware.

On Monday morning, Jennifer Welch and I gave back-to-back talks on emulating shared memory objects in “weird” communication models.  Jennifer talked about maintaining shared memory consistency under probabilistic churn in a peer-to-peer network.  In particular, she described recent work that created a shared read-write register in such a model that loses its state only with some small probability.  The technical proofs of correctness relied on careful analysis of continuous time Markov processes.  I believe that this paper describes some of their results in this model.

My own talk was on emulating a single writer, multi-reader in a wireless communication model subject to adversarial jamming (link to slides) – this was based on work with Valerie King and Maxwell Young from last PODC.  I’ve blogged about this somewhat before so won’t repeat myself.  However, if you’re a fan of the golden ratio (and who isn’t?)  then you should read the first half of the slides.  I had a lot of good feedback at the talk on how to extend these results  including: 1) determining what happens when there are multiple communication channels available; 2) what happens if one can use signal processing in the event of a jam to determine what the underlying message was in the case where many processors broadcast the sam underlying message; and 3) determining if one can reduce the power consumption necessary to in order to maintain the state of the shared object, perhaps expending more energy only when there is a need to change state.

On Tuesday, Keren Censor-Hillel and Danny Hendler gave excellent back-to-back talks on restricted use shared objects.  Keren started with upper bounds – in particular how to implement a max count register in the restricted use model and Danny followed up with lower bounds from a paper joint with Aspnes, Attiya, and Censor-Hillel.  A restricted use object is essentially one where some restriction exists on the number of operations that may be applied to an object.  There are two common types of restrictions: 1) m limited use: at most m ops may be applied; and 2) b bounded: at most b values supported by the object.  An old result by Jayanti, Tan and Toeg in ’00 shows that Omega(n) work and space is needed for history-less primitives [Jayanti, Tan, Toueg, '00], but these lower bounds don’t apply to restricted use objects.  There have been several exciting results showing the possibility of implementing such objects in polylog step complexity (see the talk by Gilbert and Bender below for an example of cool applications of these new results).  How can we devise lower bounds for this new type of restricted use object?  Danny discussed the notion of L perturbable objects that intuitively can be perturbed at most L times.  He gave details of a result (joint with Aspnes, Attiya, and Keren) showing that a L perturbable object must have step complexity Omega(min(log L,n)) and space complexity Omega(min(L,n)).  This lower bound is for deterministic objects only; randomized lower bounds are still relatively open (lower bound of Omega(log log m/ log log log m))

On Wednesday, Lisa Higham and Wojciech Golab gave talks on the notion of strong linearizability.   I will just touch on the problem they address.  When a shared object is linearizable, it intuitively means that the history of invocations and responses to that object can be ordered in time in a way that 1) the total order extends the “happens before” partial order over all the operations; and 2) the ordering obeys correctness properties for the shared object.  (I’m probably missing something important here – maybe someone will correct me in the comments).  Many (most?) people think that if operations on an object are linearizable, then everything is great: the object acts like it is atomic in the sense that it appears to the rest of the system as if operations on it occur instantaneously.

Lisa and Wojciech showed that these people are sadly mistaken.  In a recent STOC result, joint with Wojciech Golab and Philipp Woelfel, it’s shown that linearizability does not suffice in the case where processes can use randomization.  They introduce a new concept strong linearizability that is sufficient (and more or less necessary) to ensure correctness of randomized algorithms.  Unfortunately, as discussed by Wojciech, it seems that to ensure strong linearizability for most shared objects requires significantly higher resource costs than to ensure linearizability.  Valerie King brought up an interesting question about whether for certain types of randomized algorithms, a somewhat weaker notion may work (for example, if the algorithm is Monte Carlo and it’s ok if the scheduler can slightly tweak the probability of “bad” events, so long as this probability never gets too large).

[Note from Wojciech: you said “it seems that to ensure strong linearizability for most shared objects requires significantly higher resource costs than to ensure linearizability."  Is your impression based by any chance on the impossibility results I gave during my presentation?  Those actually pertain to first-step and first-update linearizability, one of which is strictly stronger than strong linearizability, and the other is incomparable.  For strong linearizability itself, there are several upper bounds that Lisa and I didn’t mention as prominently as we should have.  In particular, known universal constructions for lock-based objects and wait-free objects tend to be strongly linearizable.  Thus, the message we wanted to get across is that strong linearizability is a practical property because it’s readily attainable, and in several important cases it comes at no additional cost beyond the cost of ordinary linearizability.  (That’s in contrast to first-step and first-update linearizability.)]

Next, Hagit Attiya led an interesting discussion on how to motivate our work in distributed computing, pointing out that other CS areas, like Machine Learning, are frequently better at “selling” their results.  Maurice pointed out that many results that originated in the PODC community (like Byzantine agreement) have been “rediscovered” by the systems community, frequently without significant attibution/homage to the distributed algorithms community.  How can we more effectively advertise these results outside our own community?  Hint: it may help to have fewer models.

On Wednesday afternoon, we took a road trip to Lake Louise where we saw a great collection of ice sculptures, walked around a beautiful snow covered lake, and learned that penny loafers aren’t the best footwear for a glacial approach.

On Thursday, Seth Gilbert and Michael Bender gave a great joint talk on their recent FOCS paper on “Mutual Exclusion with O(log^2 log n) Amortized Work”.  A recent algorithm for this problem was by Danny Hendler and Phillip Woefel (one of the workshop organizers), which showed that it was possible to breakthrough a Theta(log n) barrier (shown by Attiya, Hendler and Woefel);  this previous algorithm achieved Theta(log n/ (log log n)) work against an adaptive adversary (in the shared memory world, this essentially means an adversary in the full information communication model).  Seth and Michael’s work assumes a weaker adversary that is oblivious in that it can’t see past coin flips by the processors.  Their new algorithm is also Monte Carlo in that it ensures each processor gets into the critical section only with high prob.

Some key technical ideas behind their new result are: 1) to use a dense array to store processors that are waiting to enter the critical section; and 2) to create and use clever approximation and work-efficient counters (remember you can only afford O(log^2 log n) work per counter) in order to dynamically manage the array of processors that are waiting.  An interesting open problem: Can we prove that an adaptive adv. (i.e. one that has full information) can force at least Omega(log n/log log n) work even if the algorithm ensures access to the critical section only with high prob; or alternatively can we design an algorithm in this model that does better?

Valerie King gave a nice talk in the afternoon, half of which was on connections between the shared memory model and the message passing model for a consensus problem.  We’re likely writing a paper on this problem so I’ll probably blog about it later here.

Friday was overcast and cold, which made it a little bit easier to say goodbye to beautiful Banff.

[Thanks to Wojciech Golab for helpful corrections]



A Swarm of Quadrotors
February 3, 2012, 3:03 am
Filed under: Uncategorized | Tags: ,

I’m kind of geeking out on this right now.  This is really a great looking demonstration – interesting to think about the research problems they will face if the number of quadrotors increases (and the size of each decreases).



SIGACT Distributed Computing Column
December 17, 2011, 12:49 am
Filed under: Uncategorized | Tags: ,

There’s a nice “Year in Review” SIGACT Distributed Computing column this month.  Summary from editor, Idit Keidar, below:

The last column of the year is dedicated, as always, to a review of distributed computing awards and conferences in 2011. It includes:

* “Sharing Memories, Robustly” by Hagit Attiya
* “Prize for Innovation in Distributed Computing”
* “Review of PODC 2011″ by Maryam Helmi
* “Review of DISC 2011″ by Michael Hakimi and Adam Morrison
* “Review of SIROCCO 2011″ by Adrian Kosowski, Dominik Pajak and Zuzanna Stamirowska

The column is online on the column’s archive, both at the Technion:

http://www.ee.technion.ac.il/~idish/sigactNews

and at MIT:

http://people.csail.mit.edu/idish/sigactNews



Yao’s Millionaire Problem
November 18, 2011, 4:27 pm
Filed under: Uncategorized | Tags: , , ,

In this post, I want to talk about a research problem that I think even our new Republican legislatures can get excited about: Yao’s millionaire problem.  In this problem, two millionaires want to determine who is richer, but neither wants to reveal their private net worth.  Can we develop a protocol to help these millionaires?  This problem is important  because it kicked off the the study of secure multiparty computation.

Below is a picture of the protocol originally proposed by Yao.  Alice and Bob are the two millionaires and i and j are the private net worths of Alice and Bob respectively.  The protocol makes use of public key cryptography: Alice is assumed to be able to generate a public encryption key E_a() along with a private decryption key D_a().  To see that the protocol reveals who is richer, note that y_j = D_a(k) = x.  Thus w_j = x if j<= i and w_j = x+1 if j>i.  Showing that the protocol doesn’t reveal any information beyond who is richer is more challenging and is presented in detail in  the paper.

 

 

I should point out that this protocol takes exponential time and that this run time has been improved in subsequent papers.  However, there is a question about this problem I haven’t been able to answer.  In the case where there are n players, do we need private channels or cryptographic assumptions to solve the problem?  Are private channels needed even if we’re happy with a Monte Carlo solution?  I’ve seen several papers that remove cryptographic assumptions, but none that seem to remove the need for private channels.  Conversely, I’ve seen no papers that prove that private channels are necessary…



Networks of Noisy Gates (Part 2)
September 20, 2011, 3:28 pm
Filed under: Uncategorized | Tags: , , ,

In today’s post, I want to focus on some of the highlights of the lower bound that I mentioned last time on the “Networks of Noisy Gates“ problem. Recall that, in this problem, you are given a function {f} that can be computed with a circuit of {n} regular gates. You want to construct a new circuit with the smallest number of noisy gates that computes {f} correctly with probability greater than {1/2}. These new noisy gates each fail independently with some small probability {\epsilon} and when they fail, they produce the complement of their correct output.

The result I want to talk about today is the lower bound by Pippenger, Stamoulis and Tsitsiklis which shows that there are some functions that can be computed with {n} gates without noise, but which require {\Omega(n \log n)} noisy gates. In particular, they show that this is true if {f} is the exclusive or function.

The first part of their lower bound is a result that shows a new model of failure where wires can also fail can be made equivalent to the old model of failure where only gates can fail. In particular, in the new model, each wire fails with some probability {\delta}, independently of all other wires and gates, and failure of a wire results in transmitting the complement of the bit that is sent down that wire.

Lemma 3.1 of the paper “Lower Bound for the Redundancy of Self-correcting Arrangements of Unreliable Functional Elements” by Dobrushin and Ortyukov shows the following. Let {\Delta} be the maximum number of input wires for any gate. Then for any {\delta \in [0, \epsilon/\Delta]} and for any set of inputs to a gate, there exists a vector of malfunction probabilities on the gate such that the probability that the gate fails to produce the correct output is exactly {\epsilon}. Essentially this shows that, the new model where we have a small probability of failure on both wires and gates is stochastically equivalent to the old model where we have a larger probability of error just on the gates.

Now we assume we have {n} inputs to the to our exclusive-or function {f}, and we let {X_{1}, X_{2}, \ldots, X_{n}} be these inputs. For the purposes of the lower bound, we’ll assume that each {X_{i}} is an independent random variable that is {1} with probability {1/2} and {0} otherwise. Now for this lower bound, we’re going to be very generous to the noisy circuit. We’re going to assume that 1) no gate ever fails and 2) the only wires that can fail are those that are directly connected to some input bit ( each of these wires fails independently with probability {\delta}).

For the {i}-th input bit, we’ll let {m_{i}} be the number of wires that are connected directly to that input bit, and hence carry the value {X_{i}} if they are not faulty. Now in the backend of the circuit, which is computing the exclusive-or of the input bits, there needs to be some estimate for each value of {X_{i}}. Not surprisingly, it is possible to prove (as the paper does) that Maximum Likelihood way to estimate {X_{i}} is to take the majority bit over all the {m_{i}} wires that are connected to input {X_{i}}. We’ll let {Z_{i}} be an indicator r.v. that is {1} iff the Maximum Likelihood estimate for {X_{i}} is wrong. Note that {Pr(Z_{i} = 1) \geq \delta^{m_{i}}}, since if every wire that carries that value {X_{i}} is faulty, then clearly the circuit will use the wrong value.

Next, note that the probability that the noisy circuit fails is just {Pr(Z_{1} \oplus \cdots \oplus Z_{n} = 1)}. So now we have a cute probability problem: assume you’re give {n} independent indicator random variables, you know the probability that each of them is {1}, and you want to compute the probability that their exclusive-or is {1}. The next lemma solves this problem with the help of generating functions.

Lemma 1

\displaystyle 1 - 2 Pr(Z_{1} \oplus \cdots \oplus Z_{n} = 1) = \prod_{i=1}^{n} (1 - 2 Pr(Z_{i} = 1))

Proof: Let {\phi_{i}(t) = E(t^{Z_{i}})} be the generating function for {Z_{i}}. Note that {\phi_{i}(t) = 1- Pr(Z_{i}= 1) + t Pr(Z_{i} = 1)}. Now let {Z = \sum_{i=1} ^{n} Z_{i}} and let {\psi(t) = E(t^{Z})}. Now comes the clever part: note that {E((1^{Z} - (-1)^{Z})/2) = Pr(Z_{1} \oplus \cdots \oplus Z_{n} = 1)}. This is true since {1^{Z}} is always {1} and {-1^{Z}} is {1} if {Z} is even and {-1} if {Z} is odd.

But by linearity of expectation, {Pr(Z_{1} \oplus \cdots \oplus Z_{n} = 1) = E((1^{Z} - (-1)^{Z})/2) = (\psi(1) - \psi(-1)) / 2}. Next, note that {\psi(t) = \prod_{i=1}^{n} \phi_{i}(t)} since the {Z_{i}} are independent. Thus, {\psi(1) = 1} and {\psi(-1) = 1-2 Pr(Z_{i} = 1)}. This completes the proof. \Box

The remainder of the proof for the lower bound is just algebra. We’ll make use of the inequality between arithmetic and geometric means, which states that for any set of {n} numbers, {x_{1}, x_{2}, \ldots x_{n}}, it holds that {(1/n) \sum_{i=1}^{n} x_{i} \geq \sqrt[n] {\prod_{i=1}^{n} x_{i}}}.

Lemma 2 { \sum_{i=1}^{n} m_{i}} is {\Omega(n \log n) }

Proof: Let {p<1/2} be the probability of error for the noisy circuit. By assumption, {Pr(Z_{1} \oplus \cdots \oplus Z_{n} = 1) < p}. Using Lemma~1 and our bound on {Pr(Z_{i} = 1)}, we have that

\displaystyle 1-2p \leq \prod_{i=1}^{n} (1 - 2 \delta^{m_{i}})

Now using the inequality between arithmetic and geometric means, we have that

\displaystyle 1-2p \leq \left(1/n \sum_{i=1}^{n} (1 - 2 \delta^{m_{i}})\right)^{n} = \left(1 - 2/n \sum_{i=1}^{n} \delta^{m_{i}}\right)^{n}

Again using the inequality between arithmetic and geometric means (now on the term that is being subtracted) and the fact that {1-x \leq e^{-x}}, we get:

\displaystyle 1-2p \leq \left( 1-2\delta^{1/n \sum_{i=1}^{n} m_{i}} \right)^{n} \leq \textrm{exp}(-2n\delta^{1/n \sum_{i=1}^{n} m_{i}})

Isolating the {\sum_{i=1}^{n} m_{i}} term in the above inequality gives us that

\displaystyle \sum_{i=1}^{n} m_{i} \geq n \frac{\ln (2n) - \ln \ln (1/(1-2p))}{\ln (1/\delta)}

Since {\delta} is a constant depending only on {\epsilon} and since {p<1/2}, the above inequality completes the proof. \Box

Finally, we note that since each gate is assumed to have constant fan in, if the number of wires is {\Omega(n \log n)}, it follows that the number of gates in the noisy circuit must also be {\Omega(n \log n)}. That basically completes the proof of the lower bound. The interesting thing is that the only difficulty we exploited is the difficulty of estimating the input values in the situation where the wires may be faulty!



SODA accepted papers
September 14, 2011, 3:14 am
Filed under: Uncategorized | Tags: , ,

As has already been pointed out in several blogs, accepted papers for SODA  are now up on the web.  I’m on the program committee this year so can tell war stories about reviewing 53 papers over several weeks, but that’s another blog post.  For now, I wanted to list some of the papers that I think will be of interest to the distributed computing community:

Gathering despite mischief
Yoann Dieudonne, Andrzej Pelc and David Pele

Networks Cannot Compute Their Diameter in Sublinear Time
Silvio Frischknecht, Stephan Holzer and Roger Wattenhofer

Information Dissemination via Random Walks in d-Dimensional Space
Henry Lam, Zhenming Liu, Michael Mitzenmacher, Xiaorui Sun and Yajun Wang

Rumor Spreading and Vertex Expansion
George Giakkoupis and Thomas Sauerwald

Towards Robust and Efficient Computation in Dynamic Peer-to-Peer Networks
John Augustine, Gopal Pandurangan, Peter Robinson and Eli Upfal

Lower Bounds for Number-in-Hand Multiparty Communication Complexity, Made Easy
Jeff Phillips, Elad Verbin and Qin Zhan

SINR Diagram with Interference Cancellation
Chen Avin, Asaf Cohen, Yoram Haddad, Erez Kantor, Zvi Lotker, Merav Parter and David Peleg

Ultra-Fast Rumor Spreading in Social Networks
Nikolaos Fountoulakis, Konstantinos Panagiotou and Thomas Sauerwald

A Little Advice Can Be Very Helpful
Arkadev Chattopadhyay, Jeff Edmonds, Faith Ellen and Toniann Pitassi

Physarum Can Compute Shortest Paths
Vincenzo Bonifaci, Kurt Mehlhorn and Girish Varma

Parallelism and Time in Hierarchical Self-Assembly
Ho-Lin Chen and David Doty


Biocomputing
September 7, 2011, 9:10 pm
Filed under: Uncategorized | Tags: ,

Very interesting article: Profiler at the Cellular Level about building a diagnostic biological computing network to detect cancer cells in humans.  I’ve seen many talks on biocomputing that have proposed cancer detection as a motivating problem.  However, this is the first time I’ve seen researchers directly attacking this problem.

From what I understand about this research area, reliability is still a major problem.  In particular, biological gates are significantly less reliable than silicon gates, and it’s unclear how to best improve reliability in the case where large biological circuits are needed.  This might lead to some very interesting problems in network reliability.



NSF Proposal
August 18, 2011, 3:01 pm
Filed under: Uncategorized | Tags: , , ,

The process of grant writing was completely opaque to me when I was a grad student. As a faculty member, I now take the position that grant writing is intrinsically important, and that all grad students should learn something about it in order to get insight into the long-term planning process for doing research.  For this reason, it’d be nice to see more examples of funded grants publicly available.

Last January, I posted a one page project summary for one of my funded NSF proposals.  This year again I was lucky enough to have a proposal funded and I’m posting the one page summary below.  I hope that some readers find it useful.

consensus-summary



Congratulations
July 15, 2011, 5:05 pm
Filed under: Uncategorized | Tags: , ,

Congratulations to Dr. Maxwell Young who successfully defended his dissertation on Resource-Efficient Communication in the Presence of Adversaries at the University of Waterloo yesterday.  Max got his Masters degree with me at UNM way back when I first started as a faculty member, and he was the very first student with whom I published a paper.  He’s hard-working, talented, and also a lot of fun to work with.  One of the great pleasures of academia is seeing student develop as researchers, and then have the possibility to continue to work with them as colleagues.  Max and I have been actively working together for many years now, and I’m looking forward to visiting him in Singapore  during his post doc with Seth Gilbert next year.

Congratulations Max!!!




Follow

Get every new post delivered to your Inbox.