Concern about how to collect sensitive user data without compromising individual privacy is a major barrier to greater availability of data. The model of Local Differential Privacy has recently gained favor and enjoys widespread adoption for data gathering from browsers and apps. Deployed methods use Randomized Response, which applies only when the user data is a single bit. We study general mechanisms for data release which allow the release of statistics from small groups. We formalize this by introducing a set of desirable properties that such mechanisms can obey. Any combination of these can be satisfied by solving a linear program which minimizes a cost function. We also provide explicit constructions that are optimal for certain combinations of properties, and show a closed form for their cost. In the end, there are only three distinct optimal mechanisms to choose between: one is the well-known (truncated) geometric mechanism; the second a novel mechanism that we introduce, and the third is found as the solution to a particular LP. Through a set of experiments on real and synthetic data we determine which is preferable in practice, for different combinations of data distributions and privacy parameters. Joint work with Tejas Kulkarni and Divesh Srivastava
[ bib | video | slides ] Back
This file was generated by bibtex2html 1.92.