Preventing Model Collapse in Gaussian Process Latent Variable Models

Ying Li; Zhidi Lin; Feng Yin; Michael Minyi Zhang

TL;DR

The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms various salient competing models, including state-of-the-art variational autoencoders and other GPLVM variants, in terms of informative latent representations and missing data imputation.

Abstract

Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, leading to a type of model collapse characterized by vague latent representations that do not reflect the underlying data structure. This paper addresses these issues by, first, theoretically examining the impact of projection variance on model collapse through the lens of a linear GPLVM. Second, we tackle model collapse due to inadequate kernel flexibility by integrating the spectral mixture (SM) kernel and a differentiable random Fourier feature (RFF) kernel approximation, which ensures computational scalability and efficiency through off-the-shelf automatic differentiation tools for learning the kernel hyperparameters, projection variance, and latent representations within the variational inference framework. The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms various salient competing models, including state-of-the-art variational autoencoders (VAEs) and other GPLVM variants, in terms of informative latent representations and missing data imputation.
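The idea of combining the SM kernel with a differentiable RFF approximation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional input, the standard SM kernel form $k(\tau) = \sum_i w_i \exp(-2\pi^2 v_i \tau^2) \cos(2\pi \mu_i \tau)$, and a reparameterized draw $s = \mu_i + \sqrt{v_i}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$ for each mixture component. All function and variable names (`sm_kernel`, `draw_noise`, `rff_features`, `rff_kernel`) are hypothetical.

```python
import math
import random


def sm_kernel(tau, weights, means, variances):
    """Exact 1-D spectral mixture kernel evaluated at lag tau."""
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * v * tau ** 2) * math.cos(2.0 * math.pi * mu * tau)
        for w, mu, v in zip(weights, means, variances)
    )


def draw_noise(num_components, half_dim, seed=0):
    """Draw the fixed standard-normal noise, half_dim (= L/2) values per component."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(half_dim)] for _ in range(num_components)]


def rff_features(x, weights, means, variances, noise):
    """Random Fourier features for the SM kernel (illustrative construction).

    Each frequency is written as s = mu + sqrt(v) * eps, so the features are
    smooth functions of (weights, means, variances) once eps is held fixed.
    """
    feats = []
    for w, mu, v, eps_row in zip(weights, means, variances, noise):
        scale = math.sqrt(w / len(eps_row))
        for eps in eps_row:
            s = mu + math.sqrt(v) * eps  # reparameterized spectral sample
            angle = 2.0 * math.pi * s * x
            feats.append(scale * math.cos(angle))
            feats.append(scale * math.sin(angle))
    return feats


def rff_kernel(x1, x2, weights, means, variances, noise):
    """Approximate k(x1 - x2) as an inner product of random features."""
    f1 = rff_features(x1, weights, means, variances, noise)
    f2 = rff_features(x2, weights, means, variances, noise)
    return sum(a * b for a, b in zip(f1, f2))


if __name__ == "__main__":
    weights, means, variances = [0.7, 0.3], [0.5, 2.0], [0.05, 0.1]
    noise = draw_noise(len(weights), half_dim=2000, seed=1)
    for tau in (0.0, 0.2, 0.7):
        approx = rff_kernel(0.1 + tau, 0.1, weights, means, variances, noise)
        print(tau, sm_kernel(tau, weights, means, variances), approx)
```

Because the noise is drawn once and held fixed, the approximate kernel depends on the hyperparameters only through elementary differentiable operations. This is the property that lets an automatic differentiation tool compute gradients with respect to them. Porting the sketch to an autodiff framework and to multi-dimensional latent inputs is left out here.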

Paper Structure (47 sections, 12 theorems, 81 equations, 8 figures, 6 tables, 1 algorithm)

Introduction
Preliminaries
Causes of Model Collapse
Projection Variance Matters
Kernel Function Flexibility Matters
Preventing Model Collapse
Approximate Bayesian Inference
Differentiable RFF Approximation for SM Kernel
Related Work
VAEs.
RFLVMs.
Experiments
Projection Variance Matters
S-shaped Latent Manifold Learning
Real Dataset Evaluation
...and 32 more sections

Key Result

Theorem 3.1

Given the maximization problem in Eq. (eq:GPLVM_MLE), the stationary points $\hat{{\mathbf X}}$ of the linear GPLVM are given by an expression in which ${\mathbf U}_{Q} \triangleq \left[ {\mathbf u}_1, \ldots, {\mathbf u}_Q \right] \in \mathbb{R}^{N \times Q}$ collects arbitrary eigenvectors of $\frac{1}{M} {\mathbf Y} {\mathbf Y}^{\top}$, $\mathbf{R} \in \mathbb{R}^{Q \times Q}$ is an arbitrary orth

Figures (8)

Figure 1: Top: Latent function estimation using GPLVM with preliminary or advanced/flexible kernels. Bottom: (a) 2-D S-shaped latent manifold learned by the proposed advisedRFLVM. (b) 2-D S-shaped latent manifold learned using a preliminary (RBF) kernel. (c) 2-D S-shaped latent manifold learned without optimizing the projection variance. Panels (a)-(c) also show histograms along each dimension of the learned latent manifold.
Figure 2: Left: The number of zero columns (abbreviated as num-zc) in the latent variable ${\mathbf X}$ versus the initialization value of $\sigma^2$ (denoted init-$\sigma^2$). Right: kNN classification accuracy against init-$\sigma^2$. Standard deviation is calculated over five experiments.
Figure 3: Left: Learned latent manifold on the "RBF+periodic" dataset. Right: $\mathrm{R}^2$ scores of different models on two datasets.
Figure 4: Left: $\mathrm{R}^2$ against the number of mixture densities in the SM kernel ($m$). Right: $\mathrm{R}^2$ versus the dimensionality of the random feature ($L/2$).
Figure 5: Latent manifold learning results with $L/2=25$ and different $m$.
...and 3 more figures

Theorems & Definitions (28)

Definition 2.1: Model Collapse
Theorem 3.1
proof
Proposition 3.2
proof
Proposition 3.3
proof
Theorem 4.1
Proposition 4.2
proof
...and 18 more
