Differentially-Private Collaborative Online Personalized Mean Estimation
Yauhen Yakimenka, Chung-Wei Weng, Hsuan-Yin Lin, Eirik Rosnes, Jörg Kliewer
TL;DR
This work addresses online personalized mean estimation across multiple agents under differential privacy. It introduces two privacy mechanisms (PM-I and PM-II) and two data-variance estimation schemes, combined with a hypothesis-testing-based decision rule that identifies agents sharing the same mean and a linear statistic that fuses the information received from the selected peers. Theoretical results show faster convergence than a fully local approach under Bernstein-type conditions. Analytical performance curves for an oracle class estimator and extensive simulations confirm the benefit of private collaboration. In the considered scenario, online private collaboration performs comparably to the ideal case in which all data is public, which motivates practical deployment with privacy guarantees and future work on communication-efficient designs.
Abstract
We consider the problem of collaborative personalized mean estimation under a privacy constraint in an environment of several agents continuously receiving data according to arbitrary unknown agent-specific distributions. In particular, we provide a method based on hypothesis testing coupled with differential privacy and data variance estimation. Two privacy mechanisms and two data variance estimation schemes are proposed, and we provide a theoretical convergence analysis of the proposed algorithm for any bounded unknown distributions on the agents' data, showing that collaboration provides faster convergence than a fully local approach where agents do not share data. Moreover, we provide analytical performance curves for the case with an oracle class estimator, i.e., the case in which the class structure of the agents is known, where agents receiving data from distributions with the same mean are considered to be in the same class. The theoretical faster-than-local convergence guarantee is backed up by extensive numerical results showing that, for a considered scenario, the proposed approach indeed converges much faster than a fully local approach and performs comparably to the ideal case where all data is public. This illustrates the benefit of private collaboration in an online setting.
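The three ingredients named above (a privacy mechanism, a hypothesis test for "same mean", and a linear combination of the selected peers' estimates) can be illustrated with a minimal sketch. It is not the paper's PM-I or PM-II mechanism, nor its decision rule or variance estimators. As stand-ins, it assumes the standard Laplace mechanism applied to the mean of data clipped to a known range, a simple threshold test on the difference of two released means, and inverse-variance weights for the fusion step. All function names, the threshold `z`, and the caller-supplied data variance are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean(samples, lo, hi, epsilon, rng):
    """Release the mean of samples clipped to [lo, hi] with Laplace noise.

    Assumption: standard epsilon-DP Laplace mechanism with a public sample
    count; replacing one sample moves the mean by at most (hi - lo) / n.
    """
    n = len(samples)
    clipped = [min(max(x, lo), hi) for x in samples]
    sensitivity = (hi - lo) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


def released_variance(data_var, n, lo, hi, epsilon):
    """Variance of the released mean: sampling part plus Laplace part (2 b^2)."""
    b = (hi - lo) / (n * epsilon)
    return data_var / n + 2.0 * b * b


def same_mean(m_a, var_a, m_b, var_b, z=3.0):
    """Hypothetical decision rule: keep the 'same mean' hypothesis when the
    gap between two released means is within z standard deviations."""
    return abs(m_a - m_b) <= z * math.sqrt(var_a + var_b)


def fuse(estimates):
    """Linear statistic: inverse-variance weighted average of (mean, variance)."""
    weights = [1.0 / v for _, v in estimates]
    total = sum(w * m for w, (m, _) in zip(weights, estimates))
    return total / sum(weights)
```

In this sketch an agent would call `private_mean` on its own data, compare its local estimate against each peer's released value with `same_mean`, and pass the accepted pairs to `fuse`. For data bounded in `[lo, hi]`, the worst-case value `(hi - lo) ** 2 / 4` can serve as `data_var` when no variance estimate is available.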
