Privacy-aware Gaussian Process Regression
Rui Tuo, Haoyuan Chen, Raktim Bhattacharya
TL;DR
This work introduces privacy-aware Gaussian process regression: optimally designed correlated Gaussian noise is added to the training data so that the predictive variance at sensitive inputs stays above a prespecified minimum, which avoids divulging private information while preserving useful predictions elsewhere. The optimization of the noise covariance Σ is shown to be a semidefinite program (SDP), with an explicit closed-form PSD part in the finite-sensitive-input setting. A kernel-based extension, formulated through RKHS inner products, handles continuous privacy constraints over infinite input regions; it yields a uniformly privacy-aware solution that can be approximated using dense finite subsets of the region, which keeps computation practical. The method is illustrated on a space-object tracking scenario and a real census dataset, showing favorable privacy-utility tradeoffs and computational efficiency relative to differential-privacy baselines, along with guidance on utility validation and scalability.
Abstract
We propose a novel theoretical and methodological framework for Gaussian process regression subject to privacy constraints. The proposed method can be used when a data owner is unwilling to share a high-fidelity supervised learning model built from their data with the public due to privacy concerns. The key idea of the proposed method is to add synthetic noise to the data until the predictive variance of the Gaussian process model reaches a prespecified privacy level. The optimal covariance matrix of the synthetic noise is formulated in terms of semi-definite programming. We also introduce the formulation of privacy-aware solutions under continuous privacy constraints using kernel-based approaches, and study their theoretical properties. The proposed method is illustrated by considering a model that tracks the trajectories of satellites and a real application on a census dataset.
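The key idea in the abstract, adding synthetic noise until the predictive variance at a sensitive input reaches a privacy level, can be illustrated with a minimal sketch. This is not the paper's semidefinite program: it assumes a deliberately simplified isotropic noise covariance Σ = sI and finds the smallest scale s by bisection. The RBF kernel, its length scale, the toy inputs, and every function name below are hypothetical choices made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel on 1-D inputs (an assumed kernel choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def predictive_variance(x_train, x_test, noise_cov, jitter=1e-8):
    """GP posterior variance at x_test when the data carry noise covariance noise_cov."""
    n = len(x_train)
    K = rbf_kernel(x_train, x_train) + noise_cov + jitter * np.eye(n)
    k_star = rbf_kernel(x_train, x_test)
    solved = np.linalg.solve(K, k_star)
    prior_var = np.ones(len(x_test))  # k(x, x) = 1 for this kernel
    return prior_var - np.sum(k_star * solved, axis=0)


def isotropic_noise_for_privacy(x_train, x_sensitive, privacy_level,
                                upper=1e6, iters=100):
    """Smallest s (up to bisection tolerance) such that Sigma = s * I lifts the
    predictive variance at every sensitive input to at least privacy_level.
    Assumes privacy_level is below the prior variance, so that `upper` is feasible."""
    n = len(x_train)
    lower = 0.0
    for _ in range(iters):
        mid = 0.5 * (lower + upper)
        var = predictive_variance(x_train, x_sensitive, mid * np.eye(n)).min()
        if var >= privacy_level:
            upper = mid  # feasible: try less noise
        else:
            lower = mid  # infeasible: need more noise
    return upper


x_train = np.linspace(0.0, 1.0, 8)
x_sensitive = np.array([0.5])
privacy_level = 0.5

s = isotropic_noise_for_privacy(x_train, x_sensitive, privacy_level)
var_before = predictive_variance(x_train, x_sensitive, np.zeros((8, 8)))
var_after = predictive_variance(x_train, x_sensitive, s * np.eye(8))
print("noise scale:", s)
print("variance at sensitive input before/after:", var_before[0], var_after[0])
```

Isotropic noise inflates the predictive variance everywhere, including at non-sensitive inputs. The correlated noise covariance described in the abstract is chosen by optimization precisely to avoid that loss of utility, so this sketch should be read only as a demonstration of the variance constraint, not of the optimal design.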
