*[The following blog report on SSS’14 was written by my student **George Saad**]*

*16th **International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS 2014)* was held in Paderborn, Germany, from Sep 28 to Oct 1, 2014. This conference discusses the design and development of distributed systems with fault-tolerance and self-* properties.

SSS 2014 had seven sessions: self-stabilization I/II/III; Dependable Systems; Formal Methods, Safety, and Security; Ad-hoc, Sensor and Mobile Networks and Cyberphysical Systems; and Cloud Computing, P2P, Self-organizing and Autonomous Systems.

On the first day, which was tutorial day on Self-Organizing Physical Systems, *Marco Dorigo*, a professor in Universite Libre de Bruxelles and University of Paderborn, gave the fourth talk for Self-organizing Swarms. In this talk, Dorigo showed how to use ants to find the shortest path of a pair of nodes in a network using artificial pheromones. The ants choose one path from a set of paths stochastically, depending on the amount of pheromones of previous ants visited such paths. Note that there are other strategies considered for solving this problem such as separation, alignment and cohesion. Interestingly, finding the shortest path in this way can be used to obtain arg min f(x).

However, artificial pheromones are not practical in many situations. Thus, goal search and path formation are studied in the absence of pheromones. In particular, robots are assigned as points in order to form a path to the goal, after which all other robots will follow such path.

Homogenous robots can cooperate together to perform tasks that cannot be done by individual robots, such as crossing big holes and climbing steep hills. Adaptive rotation is one of the strategies to enable a group of robots to climb high hills. Moreover, the robots do light-flashing to synchronize in order to cooperate properly.

Heterogeneous robots are also considered. Note that they are heterogeneous in the sense that they have different capabilities. Thus, they work together to empower their combined capabilities in order to perform harder tasks. For instance, there are three types of robots: eye-robot, arm-robot and foot-robot. In a popular scenario, they cooperate together to look for and bring a book. First, a eye-robot flies seeking for the book. Once the book is located. A set of foot-robots is notified in order to move to that location carrying an arm-robot. Eventually, the arm-robot catches the book. After that, all these robots return back to the starting point.

In Self-Stabilization I session, *Volker Turau*, a professor in Hamburg University of Technology, presented his paper, “A self-stabilizing algorithm for edge monitoring problem”.

In wireless sensor networks, the nodes sense the environment and transmit (or forward) the data in the network. In the presence of adversary, a set of compromised nodes may disrupt the network in the sense of corrupting or dropping messages. Thus, a set of nodes is chosen in order to monitor all communications on edges using k-hop knowledge. In k-hop knowledge, a monitoring node x can monitor the communication on any edge whose endpoint is reachable by at most k hops from node x. The challenging task is to find the minimum set of monitoring nodes in a network. This problem is NP-Complete even for 1-hop knowledge.

Two distributed approximation algorithms are provided as previous work to solve this problem in synchronous model with no transient faults. In this paper, the authors provided a self-stabilizing algorithm, which finds the minimum set of monitoring nodes in the presence of transient faults in asynchronous model. In this algorithm, each node has a state which determines if this node is in or out of the monitoring set, and each node maintains a set of monitoring nodes for each of its adjacent edges. In each step of the algorithm, the state of each node changes only after having the permission of all neighboring nodes.

We presented our paper, *Self-Healing Computation*, in Self-Stabilization II session. Our contribution is that we developed a self-healing algorithm, for computation networks, which 1) detects computation corruptions made by Byzantine nodes; and 2) segregates such nodes, so that eventually no more corruptions occur.

We show that our self-healing algorithm reduces asymptotically the message cost compared to non-self-healing algorithms. Moreover, our experimental results show that the message cost is reduced by a factor of 425 compared to the naïve computation for a network of size 8k.

In this paper, we have an interesting result: informally, given any tree of size n, and each node survives independently with a constant probability, the probability of having a subtree of surviving nodes of size Ω(log n) is at most ½.

In Self-Stabilization III session, *Toshimitsu Masuzawa*, a professor in Osaka University, presented the paper, “Edge Coloring Despite Transient and Permanent Faults”. In this paper, the authors provided a self-stabilizing algorithm, which colors the edges of an arbitrary graph so that every node has no two edges of the same color in the presence of Byzantine adversary and in ring topology. The basic idea is that coloring is performed in steps, where one node x proposes colors to its adjacent edges in one step. After setting these colors, if a neighboring node y has two incident edges of the same color, then node y proposes a different color to the edge (x, y). However, this kind of color proposals may not terminate in the presence of a Byzantine node. To overcome this problem, the authors used a rotating priority procedure, where each node has a priority to propose a color in case of conflict, and these priorities change in a round-robin fashion. Unfortunately, this algorithm does not color the graph with the minimum number of colors required. My consideration is that will their algorithm color the graph properly and terminate in case that there is a good node surrounded with two Byzantine nodes in a ring topology?

Also, *Hung Tran-The*, a graduate student in Universidade de Lisboa, presented his paper, “Tight Bounds for Stabilizing Uniform Consensus in Mobile Networks”. He provided a self-stabilizing algorithm, in which an agreement of nodes is obtained in the presence of crashes, send-omissions or general omissions. However, they claimed that there is no self-stabilizing algorithm in the presence of Byzantine adversary. This is arguable due to the existence of Byzantine Agreement. My consideration is that cannot we implement Byzantine agreement as a self-stabilizing algorithm?

Beside these scientific contents, I have a few comments about Paderborn city. Paderborn is a beautiful and quiet city in Germany. In this city, the mayor of Paderborn invited us to Paderborn Town-Hall. He gave us a presentation showing how beautiful Paderborn city is.

He told us that “Pader” of “Paderborn” means water spring, where Paderborn has many water springs.

We also visited a wonderful palace, Schloss *Corvey*, which is 57 km away from Paderborn.

Afterwards, we spent good time in a restaurant near to the palace.

Foundations of Mobile Computing (FOMC) was one of the most fun workshops I’ve organized. Many thanks to my co-organizer Maxwell Young.

Slides from talks are available at the link above.

*[The following blog report on MMDS’14 was written by my student Mahdi Zamani - Ed]*

Recently, I attended the MMDS workshop hosted by UC Berkeley. The workshop consisted of 40 talks distributed in four days and a poster session. It focused on algorithmic and statistical problems in large scale data analysis and brought together researchers from several areas including computer science, mathematics, statistics, and Big Data practitioners. In the third day of the workshop, I presented a poster about our recent work on multi-party sorting of large data sets. The following is a brief summary of a few talks that seemed interesting to me and my current research projects. The workshop program is available here.

Dan Suciu from University of Washington talked about running queries on very large databases. He argued that traditionally database queries were measured in terms of disk I/O but in Big Data query processing, since the database is store on distributed clusters, communication is the bottleneck. Now, the problem is to run a query on p servers via a *Massive Parallel Communication (MPC) *model, where the goal is to minimize the load on each server. In this model, the database records are partitioned among p servers each getting M/p, where M is the total number of records. Then, the database operations (like SQL joins) are performed on the partitions. As an example, Dan mentions a popular operation in Big Data analysis called *triangle* that is defined over three relations: Q(x,y,z)=R(x,y) join S(y,z) join T(z,x). Using a naive approach, this query can be computed in two communication rounds, doing two separate joins. But, surprisingly, it can be computed in a single communication round with reasonable load on each server using a simple technique described by Dan in [3].

Bill Howe from University of Washington talked about Myria, a web-based framework written in Java for Big Data query processing. He is motivated by the fact that at least we, as scientists, spend more than 90% of our time handling data for our experiments rather than designing new techniques and algorithms (I think by scientists he means practitioners here. I believe theoreticians spend much less time working with data :). Bill explains that although current Hadoop optimizations are more than 100 times faster than Hadoop itself, we still need faster and simpler frameworks that can be at least used by non-expert scientists. The main idea behind Myria is to blur the distinction between relational algebra and linear algebra. In other words, every query in the relational domain is translated to simple linear algebraic equations that can be computed very fast. Myria provides an imperative-declarative language that runs queries in parallel.

Yiannis Koutis from University of Puerto Rico talked about spectral algorithms for mining data from large graphs. Spectral analysis is an important tool for *graph partitioning*, where the set of vertices of a large graph is divided into smaller components such that the number of edges running between separated components is small. This is often an important subproblem for complexity reduction or parallelization. A cut that provides the smallest number of such edges is called the *sparsest cut*. Such a cut bisects the graph into the most important components. Given an adjacency matrix A of a graph G, the general idea behind spectral graph partitioning is that the eigenvector associated with the second smallest eigenvalue of the Laplacian matrix L of A is the sparsest cut of G (given a matrix A and the degree matrix of A denoted by D, the *Laplacian matrix* of can be computed as L=D-A). Since finding the exact eigenvalues of the Laplacian matrix is often computationally intensive, Yiannis argues that one can find a good (small) sparse cut by approximating the eigenvalues. However, it is an open problem whether one can find better cuts using another technique with similar computational costs. At the end of the talk, I asked Yiannis whether probabilistic methods (such as random walks) are slower than spectral methods. He answered that probabilistic methods are usually local, and global random walks are often much slower than global spectral analysis.

David Woodruff from IBM Almaden Research talked about an optimal CUR matrix decomposition technique. Given a matrix A, the goal is to find three matrices C, U, and R such that A=C.U.R. A solution to this problem can be used in many applications such as recommendation systems. For example, Netflix is interested in fast techniques for decomposing its huge users-by-movies matrix into a matrix of the most important users and a matrix of the most important movies. Since the exact decomposition is often very computationally expensive, an approximation is calculated instead. Although there are fast randomized algorithms for solving this problem (e.g., [2]), David proposes their asymptotically-optimal deterministic algorithm published recently in STOC 2014 [1].

Xavier Amatriain from Netflix talked about current Big Data challenges of his company. Until 2007, the company had over 100 million movie ratings and now it holds about 5 billion ratings. The company currently has 44 million users around the world and holds more than 5 billion hours of movies. The users play about 50 million movies and submit 5 million ratings per day. This contributes to about 32% of US daily downstream traffic. From this huge pile of data, Netflix needs to extract ranking and similarity information about the movies. Their main approach to this end is to employ distributed machine learning algorithms. Xavier argues that in today’s world more data sometimes beats better algorithms. He backs up his claim by a quote from Peter Norving, Director of Research at Google: “*Google does not have better algorithms, only more data*” (see this). Xavier continues by arguing that training algorithms can run in parallel because training data are often independent of each other. He also describe how they have scaled their algorithms by distributing at three different levels: across regions or subsets of the population, at the hyperparameter optimization stage, and at the model training level.

*[A polished version of this report is available here - Ed]*

## References

[1] : “Optimal CUR Matrix Decompositions”, *Proceedings of the 46th Annual ACM Symposium on Theory of Computing*, pp. 353—362, 2014. URL: http://doi.acm.org/10.1145/2591796.2591819.

[2] : “CUR Matrix Decompositions for Improved Data Analysis”, *Proceedings of the National Academy of Sciences*, pp. 697—702, 2009.

Filed under: Uncategorized | Tags: biology, distributed computing, game theory, theory

The Game Theory of Life is a nice writeup of a recent PNAS paper Algorithms, Games and Evolution by Chastaina, Livnatb, Papadimitriou and Vazirani. As a CS researcher with an amateur’s interest in biology, I really enjoyed this paper.

The paper shows a certain equivalence between two algorithms. The first algorithm, I’ll call “evolution”. This is an approximation (using replicator dynamics) to what happens in real Evolution in nature. The second algorithm is the multiplicative weight update algorithm (MWUA) which we all know and love in CS. MWUA is also known as weighted majority or winnow to many people in CS theory.

The paper proves that the outputs of “evolution” are equivalent to MWUA. They then show this implies that “evolution” is equivalent to a process that tries to maximize utility *plus* entropy (equation 3 in the paper). The utility part is not surprising since we’d certainly expect Evolution to be concerned with it. The “plus entropy” part is surprising, and perhaps explains the surprising diversity of life. Essentially sexual replication in the “evolution” algorithm is what gives rise to this additional entropy part.

I think it’s a pretty interesting paper. Intuitively, it proves that a simple kind of genetic algorithm (i.e. “evolution”) will give equivalent results to the MWUA (i.e. see the equivalence of Figures 1 d) and 2 d) in the paper). A nice result from the CS perspective. However, there are a few “rah rah computer science” comments in the paper that may turn off some biologists.

I think the key thing that would really impress biologists would be making the “evolution” algorithm more realistic. To do this would require improving Nagylaki’s theorem which is used in the paper for analyzing the “evolution” algorithm. The authors claim Nagylaki’s theorem can already be used to handle diploidy and partial recombination. Mutation is an area mentioned for future work. Modeling the interaction of organisms seems much harder. It’ll be interesting to see the followup work on this paper.

Filed under: Uncategorized | Tags: secure multiparty computation, theory, workshops

Last month I went to a workshop on secure multiparty computation that was organized by IARPA. I’ve had mixed luck with these types of workshops that are organized by funding agencies. However, this one was pretty good, because of the timeliness of the topic, well-prepared talks, and the fact that the participants were some of the big names in security and MPC.

This was also one first times I’ve visited and given a talk at MIT, which was a lot of fun. Many of the talk slides are included in the workshop link above.

*The June 2014 edition of the Distributed Computing Column in SIGACT News is devoted to a review article on Transactional Memory by Gokarna Sharma and Costas Busch. An archive of the column is available online at:*

*https://parasol.tamu.edu/~welch/sigactNews/*

*Also, if you, like me, are having difficulty obtaining the full version of the March 2014 column, which reviewed a Dagstuhl seminar on Consistency in Distributed Systems by Bettina Kemme, G. Ramalingam, Andre Schiper, and Marc Shapiro, you can get it at the archive.*

*Enjoy!*

*– Jennifer Welch*

Filed under: Uncategorized | Tags: consensus, distributed computing, theory

Long overdue (both the award and my reporting of it here)