Matching and covering in streaming graphs, Sept. 2016. Invited keynote talk at DISC 2016.

Problems related to (maximum) matchings and vertex covering in graph have a long history in Combinatorics and Computer Science. They arise in many contexts, from choosing which advertisements to display to online users, to characterizing properties of chemical compounds. Stable matchings have a suite of applications, from assigning students to universities, to arranging organ donations. These have been addressed in a variety of different computation models, from the traditional RAM model, to more recent sublinear (property testing) and external memory (MapReduce) models. Matching has also been studied for a number of classes of input graph: including general graphs, bipartite graphs, weighted graphs, and those with some sparsity structure.

We focus on the streaming case, where each edge is seen once only, and we are restricted to space sublinear in the size of the graph (ie., no. of its vertices). In this case, the objective is to find (approximately) the size of the matching. Even here, results for general graphs are either weak or make assumptions about the input or the stream order. In this talk, we describe work which seeks to improve the guarantees in various ways. First, we consider the case when we are given a promise on the size of the solution: the matching is of size at most k, say. This puts us in the realm of parameterized algorithms and kernelization, but with a streaming twist. We show that algorithms to find a maximal matching can have space which grows quadratically with k. Second, we consider restricting to graphs that have some measure of sparsity — bounded arboricity, or bounded degree. This aligns with reality, where most massive graphs have asymptotically fewer than O(n2) edges. In this case, we show algorithms whose space cost is polylogarithmic in the size of the input, multiplied by a constant that depends on the level of sparsity, in order to estimate the size of the maximum matching. The techniques used rely on ideas of sampling and sketching, developed to handle data which arrives as a stream of observations, coupled with analysis of the resulting randomized algorithms.

bib | slides ] Back


This file was generated by bibtex2html 1.92.