
Principal Feature Detection via $φ$-Sobolev Inequalities

Matthew T. C. Li, Youssef Marzouk, Olivier Zahm

TL;DR

The paper proposes an application to Bayesian inverse problems, provides an analogous construction with approximation guarantees that hold in expectation over the data, and extends the proposed dimension reduction strategy to nonlinear feature maps.

Abstract

We investigate the approximation of high-dimensional target measures as low-dimensional updates of a dominating reference measure. This approximation class replaces the associated density with the composition of: (i) a feature map that identifies the leading principal components or features of the target measure, relative to the reference, and (ii) a low-dimensional profile function. When the reference measure satisfies a subspace $φ$-Sobolev inequality, we construct a computationally tractable approximation that yields certifiable error guarantees with respect to the Amari $α$-divergences. Our construction proceeds in two stages. First, for any feature map and any $α$-divergence, we obtain an analytical expression for the optimal profile function. Second, for linear feature maps, the principal features are obtained from eigenvectors of a matrix involving gradients of the log-density. Neither step requires explicit access to normalizing constants. Notably, by leveraging the $φ$-Sobolev inequalities, we demonstrate that these features universally certify approximation errors across the range of $α$-divergences $α\in (0,1]$. We then propose an application to Bayesian inverse problems and provide an analogous construction with approximation guarantees that hold in expectation over the data. We conclude with an extension of the proposed dimension reduction strategy to nonlinear feature maps.
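The second stage described in the abstract (linear features taken from the eigenvectors of a matrix built from gradients of the log-density) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `principal_features`, the choice of sampling distribution, the plain Monte Carlo average, and the toy log-likelihood are all assumptions made for illustration only.

```python
import numpy as np

def principal_features(grad_log_density, samples, r):
    """Estimate r leading linear features from log-density gradients.

    grad_log_density: callable mapping an (n, d) array to an (n, d) array
    samples: (n, d) array of points; which measure they are drawn from
             is an assumption of this sketch, not taken from the paper
    r: number of features to keep
    """
    G = grad_log_density(samples)            # one gradient per row
    H = G.T @ G / G.shape[0]                 # Monte Carlo estimate of E[g g^T]
    eigvals, eigvecs = np.linalg.eigh(H)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs[:, :r], eigvals

# Hypothetical toy problem: log ell(x) = -0.5 * ||A x - y||^2 with rank-2 A,
# so the log-density only varies along two coordinate directions.
rng = np.random.default_rng(0)
d = 10
A = np.zeros((2, d))
A[0, 0], A[1, 1] = 3.0, 1.0
y = np.array([1.0, -1.0])

def grad_log_ell(X):
    # Row-wise gradient: -(A x - y)^T A
    return -(X @ A.T - y) @ A

X = rng.standard_normal((2000, d))           # assumed reference samples
U_r, lam = principal_features(grad_log_ell, X, r=2)
```

In this toy setting the estimated matrix has rank two, so the trailing eigenvalues vanish and the columns of `U_r` span the first two coordinate directions. Note that only gradients of the log-density enter the computation, which is consistent with the abstract's remark that normalizing constants are not needed.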

Paper Structure (25 sections, 12 theorems, 102 equations, 5 figures)


Key Result

Theorem 2.1

Let $\pi$ and $\mu$ be probability measures such that $\mathrm{d}\pi(x) \propto \ell(x)\mathrm{d}\mu(x)$ for some integrable function $\ell:\mathbb{R}^d\rightarrow\mathbb{R}_{\geq0}$. Given a measurable function $\varphi_r : \mathbb{R}^d \rightarrow \mathbb{R}^r$ and $\alpha\in\mathbb{R}$, consider … where … Then, for any integrable function $\widetilde{\ell}_r:\mathbb{R}^r\rightarrow\mathbb{R}_{\geq0}$, …

Figures (5)

  • Figure 1: Visualization of the majorized loss function $t \mapsto \mathcal{J}_\alpha(t)$ for $\alpha \geq 1/2$, defined in \eqref{eq:defJalpha} (solid lines), and its extension for $0 < \alpha < 1/2$, defined in \eqref{eq:Jext} (dashed lines).
  • Figure 2: Comparison of the majorization \eqref{eq:boundloss} across different $\alpha \in (1/2,1]$. The decay of the eigenvalue spectrum is assumed to be algebraic, and the trace normalization of the diagnostic matrix is assumed to be $\sum \lambda_k = 10$ for this example.
  • Figure 3: Comparison of the exact squared Hellinger loss for the linear Gaussian inverse problem (Appendix \ref{sec:lingaussian}), the majorized bound $\mathcal{J}_{1/2}(\sum_{k>r} \lambda_k)$ in \eqref{eq:boundloss}, and the bound $\frac{1}{4} \sum_{k>r}\lambda_k$ derived by Cui and Tong (2021). (Left) Example with algebraically decaying eigenvalues of the diagnostic matrix with $d=100$ and normalization $\sum_{k=1}^d \lambda_k = 7$. The shaded region indicates a vacuous upper bound. (Right) Example with exponentially decaying eigenvalues for $d = 50$ and normalization $\sum_{k=1}^d \lambda_k = 2$.
  • Figure 4: Comparison between the improved majorized loss function $\mathcal{J}^\flat_\alpha(t)$ in \eqref{eq:Jflat} and the majorized loss function $\mathcal{J}_\alpha(t)$ in \eqref{eq:defJalpha} for several choices of $\alpha > 1/2$.
  • Figure 5: Comparison between the improved majorized loss function $\mathcal{J}^\flat_\alpha(t)$ and the majorized loss function $\mathcal{J}_\alpha(t)$ for (left) several choices of $\alpha < 1/2$, and (right) several choices of $\alpha > 1/2$.
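
The tail-sum bound $\frac{1}{4}\sum_{k>r}\lambda_k$ quoted in the Figure 3 caption can be tabulated with a short sketch for the left-panel setting ($d = 100$, $\sum_k \lambda_k = 7$). The decay exponent and the threshold above which the bound is treated as vacuous are assumptions here, since neither is stated in the captions; the majorized loss $\mathcal{J}_{1/2}$ is not evaluated because its definition is not reproduced on this page.

```python
import numpy as np

d = 100
decay_exponent = 2.0      # assumption: the caption only says "algebraic"
trivial_bound = 1.0       # assumption: level above which the bound is vacuous

k = np.arange(1, d + 1)
lam = k ** (-decay_exponent)
lam *= 7.0 / lam.sum()    # trace normalization from the Figure 3 caption

# tail[r] = sum_{k > r} lam_k for r = 0, ..., d (reverse cumulative sum)
tail = np.concatenate((np.cumsum(lam[::-1])[::-1], [0.0]))

# Tail-sum bound attributed to Cui and Tong in the caption
bound = 0.25 * tail

# Smallest feature dimension r for which the bound is informative
r_min = int(np.argmax(bound < trivial_bound))
```

With the trace fixed at 7, the bound at $r = 0$ equals $7/4$, which exceeds any loss bounded by the assumed threshold; this is one way to read the shaded vacuous region in the figure.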

Theorems & Definitions (26)

  • Remark: Linear feature map
  • Theorem 2.1: Pythagorean-like identity
  • Proposition 2.2
  • Definition 3.1: $\phi$-Sobolev inequality (Chafaï, 2004; Bolley and Gentil, 2010)
  • Proposition 3.2
  • Definition 3.3: Subspace $\phi$-Sobolev Inequality
  • Proposition 3.4
  • Theorem 3.5
  • Remark
  • Lemma 3.6: Beckner monotonicity
  • ...and 16 more