Differentially Private Geodesic and Linear Regression

Aditya Kulkarni, Carlos Soto

TL;DR

The paper addresses privately releasing parameters of geodesic regression when responses lie on Riemannian manifolds. It extends the K-Norm Gradient mechanism to manifolds and derives per-parameter sensitivity bounds that depend on Jacobi fields and curvature, with global, curvature-bounded guarantees via Rauch comparison; the key bounds are $\Delta_p \le \frac{2\tau}{n}\|J_p(1)\|$ and an analogous bound for $\Delta_v$. Empirical validation on $S^2$ and a Euclidean specialization demonstrates controlled privacy-utility tradeoffs, including competitive MSE performance against private linear-regression baselines on real data. The work enables privacy-preserving inference for non-Euclidean data common in medical imaging and computer vision, while highlighting challenges in sampling from DP mechanisms on manifolds and suggesting directions for extending to other curved spaces like Kendall’s shape space.

Abstract

In statistical applications it has become increasingly common to encounter data structures that live on non-linear spaces such as manifolds. Classical linear regression, one of the most fundamental methodologies of statistical learning, captures the relationship between an independent variable and a response variable, both of which are assumed to live in Euclidean space. Geodesic regression emerged as an extension in which the response variable lives on a Riemannian manifold. The parameters of geodesic regression, as with linear regression, capture relationships in sensitive data, and hence one should consider privacy protection for these parameters. We consider releasing Differentially Private (DP) parameters of geodesic regression via the K-Norm Gradient (KNG) mechanism for Riemannian manifolds. We derive theoretical bounds for the sensitivity of the parameters, showing they are tied to their respective Jacobi fields and hence to the curvature of the space. This corroborates recent findings on differential privacy for the Fréchet mean. We demonstrate the efficacy of our methodology on the sphere, $\mathbb{S}^2\subset\mathbb{R}^3$, and, since it is general to Riemannian manifolds, on Euclidean space viewed as a manifold, where geodesic regression simplifies to linear regression. Our methodology is general to any Riemannian manifold and is thus suitable for data in domains such as medical imaging and computer vision.
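To make the reduction from geodesic to linear regression concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard least-squares energy for geodesic regression, $E(p,v) = \frac{1}{2n}\sum_i d(\mathrm{Exp}(p, x_i v), y_i)^2$; the $\frac{1}{2n}$ normalization and all function names (`sphere_exp`, `sphere_dist`, `euclid_exp`, `euclid_dist`, `geodesic_energy`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p
    with initial tangent velocity v (v is assumed orthogonal to p)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def sphere_dist(a, b):
    """Geodesic (great-circle) distance between unit vectors a and b."""
    return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

def euclid_exp(p, v):
    """Exponential map in Euclidean space: straight-line translation."""
    return p + v

def euclid_dist(a, b):
    """Euclidean distance."""
    return np.linalg.norm(a - b)

def geodesic_energy(p, v, x, y, exp_map, dist):
    """Assumed least-squares energy: (1 / 2n) * sum_i d(Exp(p, x_i v), y_i)^2."""
    sq = [dist(exp_map(p, xi * v), yi) ** 2 for xi, yi in zip(x, y)]
    return 0.5 * np.mean(sq)

# Sphere: responses generated exactly on a geodesic give (near) zero energy.
p = np.array([0.0, 0.0, 1.0])
v = np.array([np.pi / 2, 0.0, 0.0])  # tangent at p, since v . p = 0
x = np.linspace(0.0, 1.0, 5)
y = [sphere_exp(p, xi * v) for xi in x]
sphere_energy = geodesic_energy(p, v, x, y, sphere_exp, sphere_dist)

# Euclidean space: the same energy is half the mean squared residual
# of the linear model y = p + x * v.
rng = np.random.default_rng(0)
xe = rng.uniform(size=10)
pe, ve = np.array([1.0]), np.array([2.0])
ye = [pe + xi * ve + 0.1 * rng.normal(size=1) for xi in xe]
euclid_energy = geodesic_energy(pe, ve, xe, ye, euclid_exp, euclid_dist)
```

Passing the exponential map and distance as arguments keeps the energy itself manifold-agnostic, which mirrors the claim that the methodology is general to any Riemannian manifold.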

Paper Structure

This paper contains 16 sections, 4 theorems, 29 equations, 7 figures, 1 table.

Key Result

Theorem 2.1

For two Riemannian manifolds $\mathcal{M}, \tilde{\mathcal{M}}$ with curvatures $K(\gamma), \tilde{K}(\tilde{\gamma})$ and unit speed geodesics $\gamma:[0,\beta]\rightarrow \mathcal{M}$ and $\tilde{\gamma}:[0,\beta]\rightarrow \tilde{\mathcal{M}}$ and $J,\tilde{J}$ be the Jacobi fields along $\gamma$ ...
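As an illustration of what the Rauch comparison theorem implies in the simplest setting, the sketch below compares Jacobi field norms in spaces of constant sectional curvature $K$. It relies on the standard fact that a normal Jacobi field with $J(0)=0$ and $\|J'(0)\|=1$ satisfies the scalar equation $j'' + Kj = 0$, giving $\sin t$, $t$, and $\sinh t$ for $K = 1, 0, -1$. The function names are hypothetical, and this is not the paper's sensitivity computation.

```python
import numpy as np

def jacobi_norm(K, t):
    """Closed-form |J(t)| for a normal Jacobi field with J(0) = 0 and
    |J'(0)| = 1 in constant sectional curvature K."""
    t = np.asarray(t, dtype=float)
    if K > 0:
        r = np.sqrt(K)
        return np.abs(np.sin(r * t)) / r
    if K < 0:
        r = np.sqrt(-K)
        return np.sinh(r * t) / r
    return t

def integrate_jacobi(K, t_end, steps=2000):
    """Integrate j'' + K j = 0 with j(0) = 0, j'(0) = 1 using classical RK4,
    as a numerical cross-check of the closed form."""
    h = t_end / steps
    state = np.array([0.0, 1.0])  # (j, j')

    def rhs(s):
        return np.array([s[1], -K * s[0]])

    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * h * k1)
        k3 = rhs(state + 0.5 * h * k2)
        k4 = rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state[0]

# Stay before the first conjugate point of the unit sphere (t < pi).
ts = np.linspace(0.0, 1.5, 50)
sphere_field = jacobi_norm(1.0, ts)       # positive curvature
flat_field = jacobi_norm(0.0, ts)         # zero curvature
hyperbolic_field = jacobi_norm(-1.0, ts)  # negative curvature
```

In this constant-curvature case the ordering `sphere_field <= flat_field <= hyperbolic_field` holds pointwise: larger curvature yields smaller Jacobi fields, which is the kind of comparison that lets a curvature bound control a Jacobi-field-dependent quantity.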

Figures (7)

  • Figure 1: Twenty data points (green) on a unit sphere. The green curve is the geodesic formed with the pair ($\hat{p},\hat{v}$). The blue curve is the private geodesic formed with the pair ($\tilde{p}, \tilde{v}$).
  • Figure 2: Histograms for $20$ datapoints on a unit sphere of $200$ samples. Left: Histogram of geodesic distance between $\hat{p}$ and $\tilde{p}_i\in \mathcal{P}$. Right: Histogram of angles between $\hat{v}$ and $\tilde{v}_{1j}\in \mathcal{V}_1$.
  • Figure 3: Histogram of $\bar{\mathcal{E}}_i$, $i=1,2,\ldots,p$, for $20$ datapoints on the sphere and $200$ sampled replicates.
  • Figure 4: Comparison of differentially private (DP) and non-DP linear regression models for the Wine quality dataset. The blue scatter points are the original data points with regression lines from a non-private model (red) and three different DP methods: DP-IBM (green), DP-Opacus (orange), and our proposed DPGR (blue).
  • Figure 5: Each histogram is generated with $100$ pairs of adjacent datasets each with $20$ data points on a unit sphere. Left: Histogram of the ratio $r_p = \Delta^{thy}_p/\Delta^{exp}_p$. Right: Histogram of the ratio $r_v = \Delta^{thy}_v/\Delta^{exp}_v$.
  • ...and 2 more figures

Theorems & Definitions (7)

  • Theorem 2.1: Rauch Comparison Theorem
  • Definition 2.2
  • Lemma 3.3
  • proof
  • Theorem 3.4
  • Theorem D.1
  • proof