Privacy-preserving Non-negative Matrix Factorization with Outliers
Swapnil Saha, Hafiz Imtiaz
TL;DR
The paper addresses privacy concerns in non-negative matrix factorization (NMF) by introducing a privacy-preserving NMF framework that is robust to outliers. It formulates a robust objective V ≈ WH + R with non-negativity and bounded-outlier constraints, and develops a two-node training architecture that injects Gaussian noise into the gradient to achieve differential privacy (DP), analyzed via Rényi DP. By deriving the L2-sensitivity of A = (1/N)HH^T and B = (1/N)(V−R)H^T and applying the Gaussian mechanism, the authors provide a multi-stage DP guarantee and an optimal α for tight accounting. Empirically, on six real datasets (text and face images), the private method shows a small utility gap relative to the non-private algorithm, demonstrating practical DP-NMF with outliers and offering guidance on privacy budget allocation through an explicit ε-trajectory.
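As a rough illustration of the mechanism summarized above, the following Python sketch perturbs the statistics A and B with Gaussian noise and converts a composed Rényi DP bound into an (ε, δ) guarantee. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the i.i.d. noise model, and the use of a single shared sensitivity and noise level in the accounting are all hypothetical, and the paper's derived sensitivities are not reproduced here. The accounting uses the standard facts that a Gaussian mechanism with L2-sensitivity Δ and noise scale σ satisfies (α, αΔ²/(2σ²))-RDP, that RDP composes additively, and that (α, ρ)-RDP implies (ρ + log(1/δ)/(α−1), δ)-DP.

```python
import numpy as np


def noisy_statistics(V, R, H, sigma_a, sigma_b, rng):
    """Return Gaussian-perturbed copies of A = HH^T/N and B = (V-R)H^T/N.

    V, R: (D, N) data and outlier matrices; H: (K, N) coefficient matrix.
    sigma_a, sigma_b: noise standard deviations, assumed to be calibrated
    to the L2-sensitivities of A and B (hypothetical parameters here).
    """
    N = V.shape[1]
    A = H @ H.T / N
    B = (V - R) @ H.T / N
    # Assumption: independent Gaussian noise on every entry.
    A_hat = A + rng.normal(0.0, sigma_a, size=A.shape)
    B_hat = B + rng.normal(0.0, sigma_b, size=B.shape)
    return A_hat, B_hat


def epsilon_from_rdp(alpha, sigma, sensitivity, releases, delta):
    """(epsilon, delta)-DP bound after `releases` Gaussian releases.

    Assumes every release shares the same sensitivity and sigma.
    """
    rdp = releases * alpha * sensitivity**2 / (2.0 * sigma**2)
    return rdp + np.log(1.0 / delta) / (alpha - 1.0)


def best_alpha(sigma, sensitivity, releases, delta):
    """Order alpha > 1 that minimizes epsilon_from_rdp (closed form)."""
    return 1.0 + np.sqrt(
        2.0 * sigma**2 * np.log(1.0 / delta) / (releases * sensitivity**2)
    )


# Usage with made-up dimensions and noise levels.
rng = np.random.default_rng(0)
V = rng.random((5, 8))
R = np.zeros((5, 8))
H = rng.random((3, 8))
A_hat, B_hat = noisy_statistics(V, R, H, sigma_a=0.1, sigma_b=0.1, rng=rng)
alpha_star = best_alpha(sigma=1.0, sensitivity=0.1, releases=100, delta=1e-5)
eps = epsilon_from_rdp(alpha_star, 1.0, 0.1, 100, 1e-5)
```

Evaluating `epsilon_from_rdp` with an increasing number of releases gives one simple way to trace how ε grows over iterations under these assumptions.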
Abstract
Non-negative matrix factorization is a popular unsupervised machine learning algorithm for extracting meaningful features from data that are inherently non-negative. However, such data sets often contain privacy-sensitive user data, so we may need to take steps to ensure the privacy of the users while analyzing the data. In this work, we focus on developing a non-negative matrix factorization algorithm in the privacy-preserving framework. More specifically, we propose a novel privacy-preserving algorithm for non-negative matrix factorization that can operate on private data while achieving results comparable to those of the non-private algorithm. We design the framework so that one can select the degree of privacy guarantee based on the utility gap. We demonstrate the performance of our proposed framework on six real data sets. The experimental results show that our proposed method can achieve performance very close to that of the non-private algorithm in some parameter regimes, while ensuring strict privacy.
