My student Abhinav Aggarwal has just written the inaugural post on his new blog (the name of which, I hope, is not a comment on the soporific qualities of my advisement :) His first post is on the Clever algorithm by Jon Kleinberg. This is inarguably, the paper that started the modern, spectral-based approach to web search that Google has built on so successfully.

Some questions worth pondering while reading through Abhinav’s summary (and hopefully also the seminal paper itself): 1) Why did PageRank “beat out” the Clever algorithm in real-world web search? 2) Are there domains besides web search where one might use a “top left and right eigenvector” approach like the Clever algorithm? 3) What about the “soft” results in the paper like the use of the second eigenvector for clustering? Can these be formalized in an interesting way?

*[The following report on the Counting Complexity and the Phase Transitions Boot Camp at the Simons Institute was written by my student Abhinav Aggarwal]*

### 2016 Boot Camp on Counting Complexity and Phase Transitions

### Jan 25-28, Simons Institute for the Theory of Computing, University of California, Berkeley

Recently I visited the Simons Institute at UC Berkeley to attend a boot camp on counting complexity and phase transitions. The program for the same is available here. There were a total of 16 talks spread over 4 days, with various speakers covering topics from basics of counting complexity, dichotomy theorems, Markov chain mixing times and random instances. All the videos from the lectures are available here. The following is a brief summary of talks in two areas that seemed interesting to me and are related to my research at UNM.

Nayantara Bhatnagar from University of Delaware and Ivona Bezakova from Rochester Institute of Technology presented 3 talks about properties of Markov chains and their mixing times, along with a couple of applications of the same. The talks started with basics of Markov chain properties and conditions under which a given chain is ergodic (irreducible, aperiodic and finite). Once an ergodic Markov chain is obtained, a metric called total variation distance is defined between the initial distribution on the states and the stationary distribution of this chain (which is unique because of ergodicity). This distance measures how close the former is to latter, purely on the basis of the difference in probability mass allocated by the former to the events in the sample space.

A well-known result is that for stochastic matrices (like the transition probability matrix for Markov chains), the decrease in the distance from the current distribution to the stationary distribution over time depends solely on the spectral gap. The smaller this gap, the larger the distance, and consequently, the more time taken by the Markov chain to approach its stationary distribution. This property is exploited by techniques called Markov chain Monte Carlo (MCMC), which are used heavily for sampling from non-trivial distributions. Some examples of this technique that were presented in the talks include sampling a uniformly random coloring of a given undirected graph and approximate counting of the number of matchings in a given graph. The details of the construction and proof can be found here.

The talks continued with a discussion of various techniques that are popular while sampling and counting using Markov chains. Four of these techniques are:

**Almost uniform sampler**– Given a tolerance parameter*d >*0, produce a sample from the distribution that is within*d*total variation distance from the uniform distribution.**Fully polynomial almost uniform sampler (FPAUS) –**An almost uniform sampler that runs in time polynomial in the input size and log 1/*d***Randomized approximation scheme –**Given a counting problem and a tolerance parameter ε>0, produce a count which is within ±ε of the actual count with probability at least 3/4.**Fully polynomial randomized approximation scheme (FPRAS) –**A randomized approximation scheme that runs in time polynomial in the input size and (1/ε).

Upon defining these techniques, our goal then becomes to design a Markov chain with small mixing time so that the runtime of FPRAS can be minimized. The talks further discussed details about coupling theory, both between Markov chains and probability distributions. Nayantara presented these concepts using coupling between two sequences of coin tosses and the famous card shuffling example, which is used to prove that the top-to-random shuffle takes O(n log n) steps to produce a perfect shuffle of a deck of n cards. Within these many steps, the total variation distance of the Markov chain underlying this shuffling reaches sufficiently close to the uniform distribution.

Another session that I thoroughly enjoyed was the one on approximate counting by Leslie Ann Goldberg. The main topic covered in her talk was relative complexity and its relation to the #BIS problem (Bipartite Independent Sets). She started with the discussion of the Potts model and the partition function in the context. This was in relation to the number of proper q-colorings in a given graph. The aim of the model was to approximate the count of these colorings. She discussed an FRPAS algorithm for the same, which outputs a rational number z such that for a given tolerance ε > 0 and the actual count C, we have Pr (Ce^-ε <= z <= Ce^ε) >= 3/4. The counting problem using FRPAS can be NP Hard in general., however Leslie showed that the problem cannot be much harder than that.

An outline of the proof uses the bisection technique by Vazirani and Valiant, which shows that #SAT can be approximated by a probabilistic polynomial time Turing machine using an oracle for SAT. Leslie then defined the relative complexity for approximate counting using AP-reductions from a function f to a function g. Concretely, a function f is AP-reducible to a function g if 1) there exists a randomized algorithm A to compute f using an oracle for g, and 2) A is a randomized approximation scheme for f whenever the oracle is a randomized approximation scheme for g. This makes the class of functions with an FPRAS closed under AP-reductions.

An immediate dichotomy result that comes out of this formulation is that if NP != RP, then within #P, all FPRASable problems form a class and the rest remain unFPRASable. All problems in a given class are AP-inter-reducible, but an FPRAS doesn’t necessarily exist unless NP = RP. However, within the class of unFPRASable problems, a dichotomy further exists. This class is partitioned into two subclasses, one of which consists of problems that are AP-reducible to #SAT and the other consists of problems that are AP-reducible to #BIS. Leslie concluded her talk by giving an example of this trichotomy in the context of graph-homomorphisms.

Filed under: Uncategorized | Tags: distributed computing, students, theory

Congratulations to Mahnush Movahedi who today just passed with distinction her dissertation defense on Efficient and Robust Distributed Computing. Serving on her dissertation committee were Sean Luan (UNM), David Evans (UVA) and Maxwell Young (MSU). After graduation, Mahnush will be will be moving to the University of Virginia as a postdoc to work with David Evans on Secure Computation.

It’s a great pleasure to have worked with Mahnush! She and Mahdi will both be greatly missed in our research group.

Filed under: Uncategorized | Tags: distributed computing, students, theory

Congratulations to Mahdi Zamani who just passed with distinction his

dissertation defense on “Scalable and Robust Algorithms for Privacy-Preserving Applications”. After graduation, Mahdi will be will be moving to Yale CS as a Postdoc to work with Joan Feigenbaum and Bryan Ford on the Dissent Project. Serving on his dissertation committee were Jed Crandall, Trilce Estrada and Brian Ford(EPFL).

It’s a great pleasure to have worked with Mahdi and he’ll really be missed in both our own research group and at UNM.

*[The following report on the Securing Computation Workshop at the Simons Institute was written by my students Mahnush Movahedi and Mahdi Zamani. **A polished version of this report is available HERE.]*

### 2015 Workshop on Securing Computation

### June 8-12, Simons Institute for the Theory of Computing, University of California, Berkeley

### Lower Bounds for Information-Theoretic MPC, *Ivan Damgård*,* Aarhus University*

*multi-party computation (MPC)*protocols to cryptographic techniques such as FHE: The unconditional protocols are usually much more computationally-efficient than FHE protocols. On the other hand, unconditional techniques usually require lots of interactions and their round complexity is usually large. One interesting question to ask here is that “Is it possible to design an unconditionally-secure FHE that has the benefits of both schemes?” The answer is, unfortunately, still unknown, but we have some partial answers. The goal here is to determine how many messages we need to transmit in order to compute a non-trivial function (such as the AND of bits each coming from one player) securely. Ivan argued that if there are n players with t semi-honest corruptions, the protocol must send Ω(nt) messages. For example, for n=3 and t=1, we need to send at least six messages to compute the AND functionality. The intuition behind this is that every player must communicate with at least t+1 players, and must later receive another message to compute the output. So, for each player we count (t+1)+2 messages sent or received, and thus the total becomes at least n(t+3)/2 messages sent. One thing that I should point out here (and which my own research touches on) is that they assume all-to-all communication. Several protocols (such as [DKMS12, BGT13, DKMS14]) break this lower bound by requiring each party to only communicate with a small (polylog size) set of parties (see [SZ15] for a complete discussion on sublinear MPC results).

*gate-by-gate*protocols, where a circuit is evaluated gate-by-gate: the invariant is that we secret-share the inputs to each gate, and then evaluate the gate functionality over the secret-shared inputs to generate the gate output. Known protocols with this strategy have Ω(n|C|) communication complexity and Ω(|C|) round complexity, where |C| is the circuit size. But, are these optimal? The bottleneck for gate-by-gate protocols is the multiplication gate, which Ivan argues to take at least t message to be evaluated. Thus, the best known results are asymptotically optimal. Ivan then finished his talk by showing that the same lower bounds exist for the case of dishonest majority, where a preprocessing phase is necessary for MPC. Thus, to really improve the existing protocols, we need a fundamentally different approach.

### Obfuscation: Past, Present, and Possible Futures, *Amit Sahai*,* UCLA*

In program obfuscation, a given program P1 is converted to another program P2 that represents the same functionality (i.e., the same I/O) while an adversary who can see P2 cannot understand its logic. From a cryptographic perspective, this is like having a software that can keep a secret. This leads to a new notion of program obfuscation called *indistinguishability obfuscation (iO)*, where a polynomial-time adversary cannot distinguish between the obfuscations of two equivalent programs. The first mathematical construction of an indistinguishability obfuscator was proposed by Garg-Gentry-Halevi in 2013. The main idea to obfuscate programs using structured noise rather than just random noise. When the program is evaluated, the noise cancels out and the correct output is obtained.

To this end, a *multilinear map* can be used to encode field elements with specific algebraic manipulations and a Boolean function that returns Yes/No when the encoded value is zero or not. The program is first converted to a sequence matrices using a technique called oblivious matrix branching program. Then, the Kilian’s randomization technique is used to generate structured noise: the matrices are multiplied by a sequence of random matrices. The key challenge to the current obfuscation scheme is the input-mixing attack. A technique is required to enforce honest behavior in some way. Amit finished his talk by arguing that multilinear maps provide a new hardness generating technique at the level of Diffie-Hellman and learning with error hardness assumptions, but they have mainly been used for obfuscation. One open question is that can multilinear maps be used for solving other security problems?

### Two-Round MPC via Multi-Key FHE,* Daniel Wichs, Northeastern University*

*fully homomorphic encryption (FHE)*scheme allows to perform secure computation over encrypted data without decrypting it [Gen09]. Asharov et al. [AJLTVW12] show how FHE can be used to construct constant-round MPC by designing a

*threshold FHE (TFHE)*technique that provides Byzantine-resilience and circuit-independent communication cost. All parties first encrypt their inputs under an FHE scheme (such as that of [BGV12]) and send the encrypted values to all other parties. Then, each party evaluates the desired function over the encrypted inputs via homomorphism, and eventually participates in a

*distributed decryption*protocol to decrypt the output. This protocol requires a

*distributed key generation*algorithm to agree on a common public key and a secret-shared private key. While the MPC protocol of [AJLTVW12] requires three rounds of communication, Daniel described a new FHE-based MPC protocol that requires only two rounds of communication. This result as well as [AJLTVW12] are secure in the common random string (CRS) model ( an algorithm is secure in the CRS model if it assumes that all parties have access to a common random string taken from a predetermined distribution). This assumption is required for malicious security in both work. Also, both results require a

*non-interactive zero-knowledge (NIZK)*technique to enforce honest behavior in the malicious setting.

*multi-key FHE*scheme that can compute over a set ciphertext each encrypted with a different key. The result will be another ciphertext that can only be decrypted with a certain number of private keys. Recently, Clear and McGoldrick [CM14] proposed a multi-key FHE scheme using the

*learning with error (LWE)*hardness assumption (the LWE problem asks to recover a secret from a sequence of “approximate” random linear equations// . Algorithms based on the LWE assumption belong to the so-called

*post-quantum cryptography*, where the security cannot be broken in feasible time even using a quantum computer).

### Efficient Multiparty Protocols via Log-Depth Threshold Formulae, *Ron Rothblum*,* Weizmann Institute*

*player emulation*; the computational task run by a player is emulated by other players. This allows us to reduce the construction of an // -party protocol to the construction of (small) constant-party protocols which are much easier to design. This idea consists of two steps:

- Design a protocol for a small number of parties (say, 3 or 4) which achieves security against a single corrupted party. Such protocols are typically easy to construct as they may employ techniques that do not scale well with the number of corrupted parties.
- Recursively compose with itself to obtain an efficient n-party protocol which achieves security against a constant fraction of corrupted parties.

The reduction idea is described in the following. First, consider a trivial n-party setting that includes a trusted party τ. The parties send their inputs to τ whom locally computes the MPC functionality and returns the output back to all parties. Now, replace the trusted party with a small constant number of virtual parties, say v1,v2,v3, which emulate the behavior of τ. The new protocol is secure as long as the adversary does not control a majority of the virtual parties. We now proceed by replacing v1 by with three new virtual parties w1,w2,w3. The new protocol is secure even if the adversary controls either one of v2,v3 and one of the w’s, or two of the w’s. Using this, Ron constructed a formula as shown in Figure 2↑, which can be used to build a formula for the entire protocol as shown in Figure 3↓, where the leaves are the real parties.

### Rethinking Secure Computation – A Greedy Approach, *Muthu Venkitasubramaniam*,* University of Rochester*

The general framework for secure computation consists of two steps: (1) Compile a functionality into one of several standard representations such as a circuit (Boolean or arithmetic) or an oblivious RAM; (2) Use a generic scheme such as [Yao82, GMW87, BGW88, CCD88] to securely evaluate the functionality. However, for some certain problems such as Private Set Intersection, specific solutions can result in significantly more efficient solutions. Inspired by these, Muthu presented a new algorithmic approach for designing secure two-party computation protocols. In their model (join work with Abhi Shelat from UVA), they show that several problems such as convex hull, minimum spanning tree, unit job scheduling, and single source shortest path can be securely computed only by using secure comparison.

The underlying idea among all of these problems is that each of them has some known greedy algorithm. Accordingly, Muthu described a new framework called *Greedy-Millionaire Framework*, where a function f is defined *secure greedy compatible* if there exists it has the following three properties:

- Unique Solution: Given inputs U and V of Alice and Bob, f(U,V) is unique;
- Unique Order: There is a unique order in which the output is presented in the greedy algorithm;
- Local Updatability: The local greedy heuristic can be computed using a comparison operation over the function locally computed over each of the inputs.

### Cryptocurrencies and Smart Contracts, *Elaine Shi*,* University of Maryland*

*blockchains*. A blockchain is a public ledger of all Bitcoin transactions that have ever been executed. It is constantly growing as “completed” blocks are added to it with a new set of recordings (read more about blockchains here). The blockchain used in bitcoin is guaranteed to be always correct but it does not guarantee privacy of users. Elaine described a universally-composable model of the bitcoin protocol, where the functionality F_blockchain exposes internal state of the blockchain to everyone, and the security proof is conducted in the F_blockchain-hybrid model.

*Ethereum*, a new cryptocurrency that consists of a ledger and user-defined smart contracts. The contracts run transactions (e.g., transfer x amount from y) over the ledger and the ledger ensures consensus among all users. A system called

*Hawk*provides privacy-preserving smart contracts. Hawk implements contracts as auctions: each auction receives bids from the users and the auctioneer closes the auction at some point. One challenging problem here is that all bids are visible to the public so, for example, a smart user can wait for others and then submit its bid. In their model, the auctioneer (also called the manager) is assumed to be semi-trusted; it is standard to implement the manager using MPC but for efficiency purposes this is not considered in their model. Hawk supports financial fairness by punishing bad behaviors.

### References

[AJLTVW12] . Multiparty Computation with Low Communication, Computation and Interaction via Threshold FHE. In *Advances in Cryptology — EUROCRYPT 2012* . Springer Berlin Heidelberg, 2012.

[BGT13] . Communication locality in secure multi-party computation: how to run sublinear algorithms in a distributed setting. *Proceedings of the 10th theory of cryptography conference on Theory of Cryptography*, pp. 356—376, 2013.

[BGV12] . Fully Homomorphic Encryption Without Bootstrapping. *Proceedings of the 3rd Innovations in Theoretical Computer Science Conference*, pp. 309—325, 2012.

[CCD88] . Multiparty unconditionally secure protocols. *Proceedings of the twentieth annual ACM symposium on Theory of computing*, pp. 11—19, 1988.

[DKMS12] . Brief announcement: breaking the // bit barrier, secure multiparty computation with a static adversary. *Proceedings of the 2012 ACM symposium on Principles of distributed computing*, pp. 227—228, 2012.

[DKMS14] . Quorums Quicken Queries: Efficient Asynchronous Secure Multiparty Computation. In *Distributed Computing and Networking* . Springer Berlin Heidelberg, 2014.

[Gen09] . Fully homomorphic encryption using ideal lattices. *Proceedings of the 41st annual ACM symposium on Theory of computing*, pp. 169—178, 2009.

[BGW88] . Completeness Theorems for Non-Cryptographic Fault-Tolerant Distributed Computing. *Proceedings of the Twentieth ACM Symposium on the Theory of Computing (STOC)*, pp. 1—10, 1988.

[CM14] : *Multi-Identity and Multi-Key Leveled FHE from Learning with Errors*. Cryptology ePrint Archive, Report 2014/798, 2014.

[GMW87] . How to play ANY mental game. *Proceedings of the nineteenth annual ACM symposium on Theory of computing*, pp. 218—229, 1987.

[SZ15] . Recent Results in Scalable Multi-Party Computation. In *SOFSEM 2015: Theory and Practice of Computer Science* (Italiano, GiuseppeF. and Margaria-Steffen, Tiziana and Pokorný, Jaroslav and Quisquater, Jean-Jacques and Wattenhofer, Roger, ed.). Springer Berlin Heidelberg, 2015. URL: http://dx.doi.org/10.1007/978-3-662-46078-8_3.

[Yao82] . Protocols for secure computations. *Proceedings of the 23rd Annual Symposium on Foundations of Computer Science*, pp. 160—164, 1982.

Filed under: Uncategorized | Tags: distributed computing, interactive computation, reliability, theory

How do you compute over a noisy channel? Say Alice and Bob want to run a distributed algorithm, but they can only communicate over a channel where each bit is flipped either stochastically or adversarially. What is the smallest number of bits they must send in order to obtain the correct output of the algorithm, with high probability? This general problem is called Interactive Communication, or sometimes Interactive Coding/Computation – let’s use IC for short. It doesn’t take much imagination to think of applications of IC: enabling sensor nodes to compute a function on their inputs in a noisy environment; designing reliable algorithms for natural algorithms [1]; and designing attack-resistant algorithms [2], just as some examples.

Wait, you may be thinking, isn’t this problem solved by error correcting codes? No – the bits sent in the distributed algorithm may be determined only one at a time, in a dynamic manner. Even for stochastic noise, if we naively encode each bit sent, in order to get a union bound ensuring the probability of any decoding error is small, we’ll need logarithmic blowup for each bit. We can do better.

In fact, in the first paper on this problem, Coding for Interactive Communication, Leonard Schulman described an algorithm that requires only constant blowup, even when a constant fraction of the bits on the channel are flipped **adversarially**. Schulman’s algorithm was based on a new type of code, tree codes, designed specifically for interactive communication. After this paper was published in 1996, subsequent work focused on improving the error tolerance of Schulman’s algorithm (i.e. tolerating a larger constant fraction of errors); tuning the communication blowup to the (known) error rate; and considering the situation where the computation occurs over more than 2 parties. See the survey paper Coding for interactive computation: progress and challenges (especially Sections 3 and 4) for details.

I’ll note here that in a sense all of information theory can be though of as a subset of IC. In particular, information theory considers “only” the class of distributed algorithms where Alice has an input that she wants to send to Bob. In IC, we need to handle all possible distributed algorithms, and so in a way, it’s pretty amazing that a constant adversarial noise rate can even be tolerated.

A very recent breakthrough in IC occurred in FOCS 2014, where Bernhard Haeupler in the paper Interactive Channel Capacity Revisited, made huge progress on the problem of tuning the communication blowup to a known error rate. In particular, he described an algorithm that given a fixed and known adversarial noise rate \epsilon, achieves a communication blowup as a function of \epsilon that is conjectured to be optimal.

So what if \epsilon is not known in advance? I’m going to end this post with a short advertisement for a recent paper with some collaborators of mine that begins to address that problem. The paper is Interactive Communication with Unknown Noise Rate, with Varsha Dani, Mahnush Movahedi, and Maxwell Young that will be in this upcoming ICALP. This is the first paper I’ve written about IC, and I hope it won’t be the last. In fact, I’d like to see more people in the distributed computing community consider the problem. In particular, it’d be nice to make progress on the (currently very hard) problem of somehow generalizing results to more than 2 parties.

[1] For example, see this paper for a closely related problem: Breathe before Speaking: Efficient Information Dissemination Despite Noisy, Limited and Anonymous Communication

[2] This last domain is particularly interesting to me. An obvious connection is attacks on communication channels. A less-obvious connection is attacks on nodes. If communication from each node is bounded, as is increasingly common for algorithms on large networks, there may be a fairly tight connection between attacks on communication channels and attacks on nodes. Thus, there may be connections to attacks like Byzantine faults.

If you’re in New Haven next week, you could do worse than checking out this talk at Yale’s CS Colloquium by my stellar PhD student Mahdi Zamani.