Privacy-Preserving Fully Distributed Gaussian Process Regression
Yeongjun Jang, Kaoru Teranishi, Jihoon Suh, Takashi Tanaka
TL;DR
This work addresses privacy in distributed Gaussian process regression (GPR) by introducing a privacy-preserving, fully distributed GPR protocol built on secure multi-party computation and secure average consensus. The protocol keeps each agent's local dataset confidential while the agents' local models converge to the global distributed GPR solution, without relying on a trusted central server. Formal privacy guarantees are given in the simulation-based security paradigm under a semi-honest adversary model. An extension enables privacy-preserving kernel hyperparameter optimization, using local log marginal likelihoods and a consensus-based approach. Experimental results on real benchmarks demonstrate the method’s effectiveness, scalability, and practical applicability in privacy-sensitive collaborative learning tasks such as healthcare and finance.
Abstract
Although distributed Gaussian process regression (GPR) enables multiple agents with separate datasets to jointly learn a model of the target function, its collaborative nature poses risks of private data leakage. To address this, we propose a privacy-preserving fully distributed GPR protocol based on secure multi-party computation (SMPC) that preserves the confidentiality of each agent's local dataset. Building upon a secure distributed average consensus algorithm, the protocol guarantees that each agent's local model practically converges to the same global model that would be obtained by standard distributed GPR. Further, we adopt the paradigm of simulation-based security to provide formal privacy guarantees, and we extend the proposed protocol to enable kernel hyperparameter optimization, which is critical yet often overlooked in the literature. Experimental results demonstrate the effectiveness and practical applicability of the proposed method.
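To make the role of average consensus concrete, the following is a minimal Python sketch, not the protocol proposed in this work. It assumes a four-agent ring network and uses pairwise-cancelling random masks as a simple stand-in for the SMPC machinery: each agent hides its local value behind masks that sum to zero across the network, so running ordinary average consensus on the masked values still reaches the true average. All names (`masked_initial_values`, `average_consensus`, the ring topology, the step size) are hypothetical choices made for illustration, and real-valued masks like these do not by themselves give the formal guarantees discussed above.

```python
import random


def masked_initial_values(values, seed=0, scale=100.0):
    """Hide each value behind pairwise masks that cancel in the sum.

    For every pair (i, j), agent i adds a shared random number r and
    agent j subtracts it, so the network-wide sum is unchanged.
    (Illustrative assumption; not the paper's SMPC construction.)
    """
    rng = random.Random(seed)
    masked = list(values)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.uniform(-scale, scale)
            masked[i] += r
            masked[j] -= r
    return masked


def average_consensus(x0, neighbors, step, iters):
    """Standard linear consensus: each agent moves toward its neighbors."""
    x = list(x0)
    for _ in range(iters):
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical setup: four agents on a ring, each holding one local value.
local_values = [1.0, 2.0, 3.0, 6.0]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

masked = masked_initial_values(local_values)
final = average_consensus(masked, ring, step=0.25, iters=100)
true_average = sum(local_values) / len(local_values)
```

In this sketch every entry of `final` approaches `true_average`, even though no agent ever transmits its unmasked value. In distributed GPR the averaged quantities would be local model statistics rather than scalars, but the consensus step has the same shape.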
