A Stable Neural Statistical Dependence Estimator for Autoencoder Feature Analysis

Bo Hu, Jose C Principe

Abstract

Statistical dependence measures such as mutual information are ideal for analyzing autoencoders, but they can be ill-posed for deterministic, static, noise-free networks. We adopt the variational (Gaussian) formulation that makes dependence among inputs, latents, and reconstructions measurable, and we propose a stable neural dependence estimator based on an orthonormal density-ratio decomposition. Unlike MINE, our method avoids input concatenation and product-of-marginals re-pairing, reducing computational cost and improving stability. We introduce an efficient NMF-like scalar objective and demonstrate empirically that assuming Gaussian noise to form an auxiliary variable enables meaningful dependence measurements and supports quantitative feature analysis, with the singular values converging sequentially.

Paper Structure (28 sections, 39 equations, 30 figures, 11 tables, 2 algorithms)

Figures (30)

  • Figure 1: Learning curves for the NMF-like cost. The curves are smooth and stable because no re-pairing is required.
  • Figure 2: A learning curve of MINE on MNIST. The sudden "dip" in the curve is largely due to the re-pairing step for sampling from the product of marginals. The learning curve would be smoother if we lowered the learning rate, but convergence would take significantly longer.
  • Figure 3: Learning curves of singular values.
  • Figure 4: Two-moon dataset: left and right singular functions for the pairs $\{X,Y'\}$ and $\{X,\widehat{\,X\,}\}$. (a), (c), and (d) display 2D singular functions as heatmaps (nine functions shown per panel). (b) displays 1D singular functions as curves (six functions shown).
  • Figure 5: MNIST: left and right singular functions for the pairs $\{X,Y'\}$ and $\{X,\widehat{\,X\,}\}$. In (a), we exclude the trivial constant singular function, which always has a singular value of $1$.
  • ...and 25 more figures
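
The caption of Figure 5 notes that the trivial constant singular function always has a singular value of $1$. The minimal sketch below illustrates that fact in a small discrete setting, which is an assumption made purely for illustration: it is not the neural estimator proposed in the paper. It builds the normalized joint matrix $P(x,y)/\sqrt{p(x)\,p(y)}$ from a joint probability table and takes its SVD. The function name `dependence_singular_values` and the two example tables are hypothetical.

```python
import numpy as np


def dependence_singular_values(joint):
    """Singular values of P(x, y) / sqrt(p(x) p(y)) for a discrete joint table.

    Illustrative assumption: `joint` is a 2-D array of non-negative weights
    whose row and column sums are all strictly positive.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a probability table
    px = joint.sum(axis=1)               # marginal of X (rows)
    py = joint.sum(axis=0)               # marginal of Y (columns)
    normalized = joint / np.sqrt(np.outer(px, py))
    return np.linalg.svd(normalized, compute_uv=False)


# Independent pair: the joint table is the outer product of its marginals.
independent = np.outer([0.3, 0.7], [0.2, 0.5, 0.3])

# Dependent pair: probability mass concentrates on matching corners.
dependent = np.array([[0.40, 0.05, 0.05],
                      [0.05, 0.05, 0.40]])

sv_independent = dependence_singular_values(independent)
sv_dependent = dependence_singular_values(dependent)

print("independent:", sv_independent)
print("dependent:  ", sv_dependent)
```

In both cases the leading singular value equals 1 and corresponds to the constant function, so it carries no information about dependence. In this discrete setting the remaining singular values vanish for the independent table and are strictly positive for the dependent one, which is why the trivial component is excluded when inspecting singular functions.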