Privacy-preserving Non-negative Matrix Factorization with Outliers

Swapnil Saha, Hafiz Imtiaz

TL;DR

The paper addresses privacy concerns in non-negative matrix factorization (NMF) by introducing a privacy-preserving NMF framework that is robust to outliers. It formulates a robust objective V ≈ WH + R with non-negativity and bounded-outlier constraints, and develops a two-node training architecture that injects Gaussian noise into the gradient to achieve differential privacy (DP), with the privacy loss analyzed via Rényi DP (RDP). By deriving the L2-sensitivity of A = (1/N)HH^T and B = (1/N)(V−R)H^T and applying the Gaussian mechanism, the authors provide a multi-stage DP guarantee and an optimal Rényi order α for tight accounting. Empirically, across six real datasets (text and face images), the private method achieves a small utility gap relative to the non-private algorithm, demonstrating practical DP-NMF with outliers and offering guidance on privacy budget allocation through an explicit ε-trajectory.
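To make the noise-injection step concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the least-squares objective (1/(2N))‖V − WH − R‖², whose gradient with respect to W is WA − B, and it perturbs A and B with i.i.d. Gaussian noise before a projected gradient step. The function name `noisy_gradient_step`, the noise scales `sigma_a` and `sigma_b`, and the step size are hypothetical; in the paper the noise scales would follow from the derived L2-sensitivities, which are not reproduced in this summary.

```python
import numpy as np

def noisy_gradient_step(W, V, R, H, sigma_a, sigma_b, step, rng):
    """One projected gradient step on W using noise-perturbed statistics.

    Shapes (assumed): V and R are D x N, W is D x K, H is K x N.
    """
    N = V.shape[1]                      # number of samples (columns of V)
    A = H @ H.T / N                     # A = (1/N) H H^T, shape K x K
    B = (V - R) @ H.T / N               # B = (1/N) (V - R) H^T, shape D x K

    # Gaussian mechanism: add i.i.d. noise to each statistic.
    # sigma_a and sigma_b are placeholders, not calibrated values.
    A_hat = A + rng.normal(0.0, sigma_a, size=A.shape)
    B_hat = B + rng.normal(0.0, sigma_b, size=B.shape)

    grad = W @ A_hat - B_hat            # noisy gradient of the assumed objective
    return np.maximum(W - step * grad, 0.0)   # project onto W >= 0

# Toy data: an exact factorization with no outliers.
rng = np.random.default_rng(0)
D, K, N = 6, 3, 50
W_true = rng.random((D, K))
H = rng.random((K, N))
R = np.zeros((D, N))
V = W_true @ H + R

W_private = noisy_gradient_step(W_true, V, R, H, sigma_a=0.01, sigma_b=0.01,
                                step=0.1, rng=rng)
W_exact = noisy_gradient_step(W_true, V, R, H, sigma_a=0.0, sigma_b=0.0,
                              step=0.1, rng=rng)
```

With zero noise the gradient vanishes at the true factor, so `W_exact` stays at `W_true`, while `W_private` drifts slightly. That drift is a small-scale picture of the utility gap the summary refers to.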

Abstract

Non-negative matrix factorization is a popular unsupervised machine learning algorithm for extracting meaningful features from inherently non-negative data. However, such data sets often contain privacy-sensitive user data, so steps must be taken to protect the privacy of the users while analyzing the data. In this work, we focus on developing a non-negative matrix factorization algorithm in the privacy-preserving framework. More specifically, we propose a novel privacy-preserving algorithm for non-negative matrix factorization capable of operating on private data, while achieving results comparable to those of the non-private algorithm. We design the framework such that one can select the degree of privacy guarantee based on the utility gap. We show our proposed framework's performance on six real data sets. The experimental results show that our proposed method can achieve performance very close to that of the non-private algorithm under some parameter regimes, while ensuring strict privacy.

Paper Structure (21 sections, 4 theorems, 20 equations, 11 figures, 4 tables, 2 algorithms)

Key Result

Proposition 1

(From RDP to DP, Mironov 2017). If $f$ is an $(\alpha,\epsilon_r)$-RDP mechanism, it also satisfies $\left(\epsilon_r+\frac{\log(1/\delta)}{\alpha-1},\delta\right)$-differential privacy for any $0<\delta<1$.
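The conversion in Proposition 1 can be sketched numerically as follows. The function `rdp_to_dp` is a direct transcription of the proposition. The other helpers are assumptions added for illustration: they use the standard RDP bound for a single Gaussian mechanism, $\epsilon_r(\alpha)=\alpha\Delta^2/(2\sigma^2)$, which is not stated in this summary, and they minimize the resulting $\epsilon$ over $\alpha$ in closed form. The paper's own optimal $\alpha$ covers its multi-stage composition and may differ from this single-mechanism case.

```python
import math

def rdp_to_dp(alpha, eps_rdp, delta):
    """Proposition 1: (alpha, eps_rdp)-RDP implies (eps, delta)-DP."""
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return eps_rdp + math.log(1.0 / delta) / (alpha - 1.0)

def gaussian_rdp(alpha, sensitivity, sigma):
    """Standard RDP bound for one Gaussian mechanism (assumed here)."""
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)

def gaussian_eps(alpha, sensitivity, sigma, delta):
    """DP epsilon obtained from the Gaussian RDP bound at a given alpha."""
    return rdp_to_dp(alpha, gaussian_rdp(alpha, sensitivity, sigma), delta)

def optimal_alpha_gaussian(sensitivity, sigma, delta):
    """Minimizer of alpha*c + log(1/delta)/(alpha - 1), with c = Delta^2/(2 sigma^2).

    Setting the derivative to zero gives alpha = 1 + sqrt(log(1/delta) / c).
    """
    c = sensitivity ** 2 / (2.0 * sigma ** 2)
    return 1.0 + math.sqrt(math.log(1.0 / delta) / c)

# Hypothetical values, chosen only to exercise the functions.
sensitivity, sigma, delta = 1.0, 2.0, 1e-5
alpha_star = optimal_alpha_gaussian(sensitivity, sigma, delta)
eps_star = gaussian_eps(alpha_star, sensitivity, sigma, delta)
```

Because the expression being minimized is convex in $\alpha$ on $(1,\infty)$, the stationary point is the global minimum, so any other choice of $\alpha$ yields a looser $\epsilon$ for the same $\delta$.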

Figures (11)

  • Figure 1: Schematic diagram of privacy-preserving NMF
  • Figure 2: Utility Comparison on Text Data Set
  • Figure 3: Overall $\epsilon$ and Objective Value on Text Data Set
  • Figure 4: Topic Word Comparison
  • Figure 5: Utility Comparison on Face Image Data Set
  • ...and 6 more figures

Theorems & Definitions (9)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Theorem 3.1: Privacy of the private NMF algorithm with outliers
  • Proof