Distributed HDMM: Scalable, Distributed, Accurate, and Differentially Private Query Workloads without a Trusted Curator
Ratang Sedimo, Ivoline C. Ngong, Jami Lashua, Joseph P. Near
TL;DR
Distributed HDMM addresses the challenge of answering high-dimensional query workloads under differential privacy without a trusted curator by combining the central-model High-Dimensional Matrix Mechanism (HDMM) with secure aggregation. The method broadcasts an optimized strategy matrix, enables each client to compute a local HDMM measurement, adds discrete Gaussian noise, and securely aggregates these contributions to reconstruct the private workload answers, preserving privacy under semi-honest and malicious threat models. The authors provide formal privacy guarantees, analyze computation and communication costs, and demonstrate scalability to thousands of clients with utility close to central HDMM, outperforming local and shuffle-based baselines. The work also includes an open-source implementation and extensive evaluation on Census SF1 and Adult workloads, highlighting practical deployment considerations for federated settings.
Abstract
We present the Distributed High-Dimensional Matrix Mechanism (Distributed HDMM), a protocol for answering workloads of linear queries on distributed data that provides the accuracy of central-model HDMM without a trusted curator. Distributed HDMM leverages a secure aggregation protocol to evaluate HDMM on distributed data, and is secure in the context of a malicious aggregator and malicious clients (assuming an honest majority). Our preliminary empirical evaluation shows that Distributed HDMM can run on realistic datasets and workloads with thousands of clients in less than one minute.
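To make the data flow concrete, the following is a minimal single-process sketch of the pipeline described above: a shared strategy matrix, per-client noisy measurements, masked aggregation, and reconstruction of the workload answers. It is an illustration under stated assumptions, not the authors' implementation. The 4x4 Hadamard strategy, the three-query workload, the modulus, the truncated noise sampler, the even split of noise across clients, and all function names are hypothetical choices made for the example. The pairwise masks only simulate the cancellation property of a secure aggregation protocol and offer no security as written.

```python
import math
import random

MODULUS = 2**32  # hypothetical ring size for masked arithmetic

# Hypothetical strategy: 4x4 Hadamard matrix H, with H @ H = 4 I, so H^-1 = H / 4.
STRATEGY = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]
# Hypothetical workload over 4 histogram bins: total count and two range queries.
WORKLOAD = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def discrete_gaussian(rng, sigma):
    """Approximate sampler: discrete Gaussian truncated to +/- 10 sigma."""
    if sigma <= 0:
        return 0
    bound = math.ceil(10 * sigma)
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in support]
    return rng.choices(support, weights=weights, k=1)[0]

def pairwise_masks(n_clients, dim, rng):
    """Simulated pairwise masks: each pair shares r; one adds it, one subtracts it."""
    masks = [[0] * dim for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            for d in range(dim):
                r = rng.randrange(MODULUS)  # not cryptographically secure
                masks[i][d] = (masks[i][d] + r) % MODULUS
                masks[j][d] = (masks[j][d] - r) % MODULUS
    return masks

def client_message(bin_index, sigma_local, mask, rng):
    """One client's masked, noisy strategy measurement of its one-hot record."""
    x = [0] * len(STRATEGY[0])
    x[bin_index] = 1
    noisy = [v + discrete_gaussian(rng, sigma_local) for v in matvec(STRATEGY, x)]
    return [(v + m) % MODULUS for v, m in zip(noisy, mask)]

def centered(value):
    """Map a residue mod MODULUS back to a signed integer."""
    return value - MODULUS if value >= MODULUS // 2 else value

def run_protocol(client_bins, sigma, seed=0):
    rng = random.Random(seed)
    n = len(client_bins)
    # Assumption: noise is split evenly so the summed noise has scale about sigma.
    sigma_local = sigma / math.sqrt(n)
    masks = pairwise_masks(n, len(STRATEGY), rng)
    messages = [client_message(b, sigma_local, masks[i], rng)
                for i, b in enumerate(client_bins)]
    # Aggregator sees only masked vectors; the masks cancel in the sum.
    total = [centered(sum(col) % MODULUS) for col in zip(*messages)]
    # Reconstruct the histogram estimate, then answer the workload.
    x_hat = [v / 4 for v in matvec(STRATEGY, total)]
    return matvec(WORKLOAD, x_hat)

if __name__ == "__main__":
    print(run_protocol([0, 1, 1, 3, 2], sigma=2.0))
```

With `sigma=0` the sketch returns the exact workload answers, which is a convenient check that masking and reconstruction are wired correctly. A real deployment would replace the simulated masks with an actual secure aggregation protocol and use an exact discrete Gaussian sampler calibrated to the strategy's sensitivity.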
