Unsupervised discovery of the shared and private geometry in multi-view data

Sai Koukuntla, Joshua B. Julian, Jesse C. Kaminsky, Manuel Schottdorf, David W. Tank, Carlos D. Brody, Adam S. Charles

TL;DR

This work tackles the challenge of uncovering nonlinear shared and private latent structure across paired high-dimensional views without supervision. It introduces SPLICE, a two-step autoencoder framework. The first step disentangles shared and private latents via predictability minimization in a crossed-encoder setup. The second step preserves the intrinsic geometry of each submanifold by projecting the data, estimating geodesic distances with landmark-based methods, and fine-tuning with a geometry-preserving loss. The resulting latent spaces are interpretable and capture the geometry of both shared and private information, as demonstrated on Rotated MNIST, synthetic LGN–V1 data, and real hippocampus–prefrontal cortex recordings, where SPLICE outperforms several baselines. The method enables blind discovery of meaningful cross-view structure and is robust to incorrect prior estimates of latent dimensionality, holding promise for neuroscience and cross-modal sensor fusion applications. Overall, SPLICE advances interpretable, geometry-aware multi-view representation learning in scientific contexts where the true latent dimensionality is unknown a priori.
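
The landmark-based geodesic estimation mentioned above can be illustrated with a minimal sketch. It assumes a symmetric k-nearest-neighbour graph and Dijkstra shortest paths from a few randomly chosen landmarks, which is one common way to approximate geodesics and not necessarily the paper's exact procedure. The function names, the value of `k`, and the toy circle data are all hypothetical.

```python
import heapq
import math
import random


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d  # symmetrise so the graph is undirected
    return graph


def dijkstra(graph, source):
    """Shortest-path length from `source` to every node of the graph."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def landmark_geodesics(points, n_landmarks, k, seed=0):
    """Approximate geodesic distances from random landmarks to all points."""
    rng = random.Random(seed)
    landmarks = rng.sample(range(len(points)), n_landmarks)
    graph = knn_graph(points, k)
    rows = []
    for lm in landmarks:
        dist = dijkstra(graph, lm)
        rows.append([dist[i] for i in range(len(points))])
    return landmarks, rows


# Toy data (an assumption for illustration): 200 evenly spaced points on a unit circle.
n_points = 200
points = [
    (math.cos(2 * math.pi * i / n_points), math.sin(2 * math.pi * i / n_points))
    for i in range(n_points)
]
landmarks, geodesics = landmark_geodesics(points, n_landmarks=5, k=4)
```

On this toy circle, the estimated geodesic distance from a landmark to its antipodal point comes out close to $\pi$ (half the circumference), whereas the straight-line distance between the same two points is 2. That gap is the kind of geometric information a geometry-preserving loss is meant to retain.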

Abstract

Studying complex real-world phenomena often involves data from multiple views (e.g., sensor modalities or brain regions), each capturing different aspects of the underlying system. Within neuroscience, there is growing interest in large-scale simultaneous recordings across multiple brain regions. Understanding the relationship between views (e.g., the neural activity in each region recorded) can reveal fundamental insights into each view and the system as a whole. However, existing methods to characterize such relationships lack the expressivity required to capture nonlinear relationships, describe only shared sources of variance, or discard geometric information that is crucial to drawing insights from data. Here, we present SPLICE: a neural network-based method that infers disentangled, interpretable representations of private and shared latent variables from paired samples of high-dimensional views. Compared to competing methods, we demonstrate that SPLICE 1) disentangles shared and private representations more effectively, 2) yields more interpretable representations by preserving geometry, and 3) is more robust to incorrect a priori estimates of latent dimensionality. We propose our approach as a general-purpose method for finding succinct and interpretable descriptions of paired data sets in terms of disentangled shared and private latent variables.
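
As a reading aid, the observation model illustrated in Figure 1a can be written compactly. The symbols $\bm{s}$, $\bm{z}_A$, $\bm{z}_B$, $g_A$, and $g_B$ are notation assumed here for illustration and are not taken from the paper:

```latex
\bm{x}_A = g_A(\bm{s}, \bm{z}_A), \qquad \bm{x}_B = g_B(\bm{s}, \bm{z}_B)
```

Here $\bm{s}$ stands for the low-dimensional shared latent variables, $\bm{z}_A$ and $\bm{z}_B$ for the latents private to each view, and $g_A$, $g_B$ for unknown nonlinear maps into the high-dimensional observation spaces. Under this assumed notation, disentangling amounts to recovering estimates of $\bm{s}$, $\bm{z}_A$, and $\bm{z}_B$ from paired samples $(\bm{x}_A, \bm{x}_B)$ such that the private estimate for one view carries no information about the other view.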

Paper Structure

This paper contains 22 sections, 7 equations, 4 figures, 2 algorithms.

Figures (4)

  • Figure 1: Problem formulation and model architecture. a) Illustration of the observation model. Low-dimensional private and shared latent variables are combined nonlinearly to form low-dimensional manifolds embedded in the $A$ view and $B$ view high-dimensional observation spaces. b) The SPLICE unsupervised autoencoder network architecture.
  • Figure 2: Rotated MNIST example. a) (left) During training, view $A$ inputs were original MNIST digits and view $B$ inputs were randomly rotated copies of them. (right) SPLICE accurately reconstructs the original and rotated digits. b) The $F_B(\cdot)$ private encoder in SPLICE distills only the rotation angle from input $\bm{x}_B$ and discards digit identity, indicating successful disentangling. c) SPLICE retains the circular 1-D geometry of rotation angle, unlike the method of Lyu et al. [Lyu2021-ts] and DMVAE [Lee2021-kw]. d) Even when given 5x the true number of private latents, SPLICE confines private variance to 2 dimensions, while other methods use all available dimensions and admit variance unrelated to angle.
  • Figure 3: Simulated LGN–V1 experiment. a) The two synthetic brain regions encode 1) location on a linear track (place 1, private to $A$), 2) the 2D position of a vertical visual stimulus bar (shared across $A$ and $B$), and 3) a second linear track location (place 2, private to $B$). b) The visual stimulus drives center-surround and Gabor receptive fields (RFs). Neurons are ordered by RF centers; because the neurons' private place information is unrelated to their visual RF centers, it appears as random noise. c) SPLICE autoencoder network. d) SPLICE correctly estimates the shared latent dimensionality ($d=2$), while reduced-rank regression (RRR) overestimates it as $d=70$. e) Value of the 2D SPLICE shared latents for each trial (each dot is one trial) before applying geometry preservation. f) Same as e, but after applying SPLICE's geometry preservation. $x$ and $y$ positions are encoded by dot size and dot color, respectively. g) Each panel shows the data as in f, but colored by the activity of a randomly chosen neuron: SPLICE reveals that the activity coordinated across the regions has localized RFs that tile the shared space.
  • Figure 4: Neurophysiological data experiment. a) On each trial, mice made a Left/Right decision on a virtual T-maze. The correct response was cued by a visual stimulus in the first half of the stem of the T. Simultaneous recordings yielded 96 neurons in hippocampus and 348 neurons in medial prefrontal cortex. b) SPLICE outperformed RRR [Semedo2019-wt], summarizing the shared space in 2 dimensions instead of 12. c) The shared latent space encodes the animal's position.
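
The latent structure described in the Figure 3a caption (a shared 2D stimulus position plus one private track location per view) can be mimicked with a small toy generator. This is only a sketch under stated assumptions: it uses simple Gaussian tuning curves rather than the center-surround and Gabor RFs of the paper's simulation, and every function name, tuning width, and population size below is hypothetical.

```python
import math
import random


def bump(distance_sq, width=0.15):
    """Gaussian tuning curve: equals 1 when the squared distance is 0."""
    return math.exp(-distance_sq / (2 * width ** 2))


def make_population(n_neurons, rng):
    """Each hypothetical neuron gets a 2-D stimulus RF centre and a 1-D place-field centre."""
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n_neurons)]


def population_response(population, stim_x, stim_y, place):
    """Nonlinear mix of shared (stimulus) and private (place) drive, one value per neuron."""
    return [
        bump((stim_x - cx) ** 2 + (stim_y - cy) ** 2) + bump((place - cp) ** 2)
        for cx, cy, cp in population
    ]


def make_paired_views(n_trials=100, n_neurons=20, seed=0):
    """Generate paired samples whose views share the stimulus but not the place variable."""
    rng = random.Random(seed)
    pop_a = make_population(n_neurons, rng)
    pop_b = make_population(n_neurons, rng)
    view_a, view_b, latents = [], [], []
    for _ in range(n_trials):
        stim_x, stim_y = rng.random(), rng.random()  # shared across A and B
        place_1 = rng.random()                       # private to view A
        place_2 = rng.random()                       # private to view B
        view_a.append(population_response(pop_a, stim_x, stim_y, place_1))
        view_b.append(population_response(pop_b, stim_x, stim_y, place_2))
        latents.append((stim_x, stim_y, place_1, place_2))
    return view_a, view_b, latents


view_a, view_b, latents = make_paired_views()
```

Each row of `view_a` and `view_b` is one paired trial, and `latents` keeps the ground-truth shared and private variables so that a recovered latent space could be compared against them. A multi-view method would see only the two views.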