Spectral Graph Filtering for Modality-Specific Representation Learning

Shira Yoffe, Amit Moscovich, Ariel Jaffe

TL;DR

This paper presents DELVE, a spectral method for extracting modality-specific (differential) latent variables and provides an asymptotic convergence analysis for the method under a product manifold model.

Abstract

Multimodal datasets, where measurements are obtained from multiple sensors, have become central to many scientific domains. In unsupervised settings, most representation learning methods focus on identifying shared latent structures, such as clusters or continuous processes that appear across modalities. However, some aspects of the data may be observed only through a single modality. For example, in computational biology, certain cell subtypes may appear in genetic profiles but not in epigenetic markers. In this paper, we present DELVE, a spectral method for extracting modality-specific (differential) latent variables. Our approach constructs a graph for each modality and leverages differences in their connectivity patterns to design a graph filter that attenuates shared signals while preserving modality-specific components. We provide an asymptotic convergence analysis for our method under a product manifold model. To evaluate the performance of our method, we test its ability to recover differential latent structures in several synthetic and real datasets.

Paper Structure

This paper contains 39 sections, 9 theorems, 97 equations, 10 figures, 7 tables, and 2 algorithms.

Key Result

Theorem 1

For $n \to \infty$ and under assumptions (i)-(iii), with probability greater than $1-4K^2n^{-10} - (4K+6)n^{-9}$, the $k$th eigenvector of the random-walk Laplacian satisfies […], where $v_k^{(rw)}$ is $D$-normalized such that $(v_k^{(rw)})^T D v_k^{(rw)}/(n p) = 1$, $\epsilon_n$ is the bandwidth parameter used for the Gaussian affinity, equal to $\sigma^2/2$ in our notation (see the Gaussian kernel weights equation), and $|\alpha| = 1+o_p(
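The kind of eigenvector convergence this theorem concerns can be checked numerically on a toy example. The sketch below is not the paper's code: it samples points uniformly on a line of length $a$, builds a Gaussian affinity graph, and compares the first non-trivial eigenvector of the random-walk matrix $P = D^{-1}W$ with the vector whose entries are $\cos(\pi x_i/a)$, the target named in the Figure 4 caption. The kernel form $\exp(-\|x_i - x_j\|^2/(2\sigma^2))$ is an assumption, and the function name `random_walk_eigenvectors`, the values of `n` and `sigma`, and the scaling $v^T D v = n$ (which omits the factor $p$ from the theorem) are illustrative choices.

```python
import numpy as np


def random_walk_eigenvectors(X, sigma, k):
    """Leading k eigenpairs of P = D^{-1} W for a Gaussian affinity W.

    Hypothetical helper; the kernel exp(-||x - y||^2 / (2 sigma^2)) is an
    assumed form, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Pairwise squared Euclidean distances (n x n).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    d = W.sum(axis=1)
    # P is similar to the symmetric matrix S = D^{-1/2} W D^{-1/2},
    # so a symmetric eigensolver can be used.
    S = W / np.sqrt(np.outer(d, d))
    evals, U = np.linalg.eigh(S)  # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]
    evals = evals[order]
    # Map eigenvectors of S back to eigenvectors of P: v = D^{-1/2} u.
    V = U[:, order] / np.sqrt(d)[:, None]
    # Each u has unit norm, so v^T D v = 1; rescale so that v^T D v = n.
    V = V * np.sqrt(len(d))
    return evals, V


rng = np.random.default_rng(0)
n, a, sigma = 500, 1.0, 0.05  # illustrative values
x = rng.uniform(0.0, a, size=n)

evals, V = random_walk_eigenvectors(x, sigma, k=2)
v1 = V[:, 1]  # first non-trivial eigenvector (V[:, 0] is constant)
target = np.cos(np.pi * x / a)

# Correlation is invariant to the sign and scale of the eigenvector.
corr = abs(np.corrcoef(v1, target)[0, 1])
print(f"|correlation| with cos(pi x / a): {corr:.4f}")
```

Correlation is used here instead of the $\ell_2$ difference reported in Figure 4 because it sidesteps the sign and scale ambiguity of the eigenvector; reproducing the figure's curve would require repeating the comparison over a range of $n$ with the paper's normalization.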

Figures (10)

  • Figure 1: Rotating dolls. Left: A single image from $X^A$ and $X^B$. Right: Embedding using DELVE's two leading differential vectors, colored by the rabbit's rotation angle $\psi^B$.
  • Figure 2: Illustration of the algorithm using the line ($X^A$) vs. rectangle ($X^B$) example. (II) $G^A$ with nodes colored by $\theta$, and $G^B$ with nodes colored by $\theta$ and $\psi^B$. (III) Top: Points colored by the eigenvectors of $P^A$. Bottom: Points colored by the eigenvectors of $P^B$. (IV) Points colored by the leading differential vector $\delta_0^B$.
  • Figure 3: Line ($X^A$) vs. cube ($X^B$) example. The $\theta$ coordinate is shared between the two datasets. The second and third coordinates $\psi^B_1,\psi^B_2$ are unique to $X^B$. Each point in the scatter plots represents an observation in $X^B$. Left panel: The observations in $X^B$ are colored according to the values of their corresponding entries in the leading differential vectors. Right panel: The observations in $X^B$ are colored according to the values of their corresponding entries in the iterative differential vectors $\delta_0^B$ and $\delta_1^B$. In all figures, the white arrow points in the direction of the dominant parameter.
  • Figure 4: Rectangle vs. Line. Blue curve: $\ell_2$-norm difference between the leading Laplacian eigenvector $v^{(rw)}$, computed from $x_i^A$, and the theoretical vector whose elements equal $\cos(\pi x_i^A/a)$. Orange curve: $\ell_2$-norm difference between the differential vector $\delta^B$ and $\cos(\pi x_{i,1}^B/b)$. For both curves, the difference is shown as a function of the number of samples $n$ on a log-log scale.
  • Figure 5: Multimodal Tori. Top: $X^A$ colored by the leading vector associated with $\psi^A$ in each method. Bottom: $X^B$ colored by the leading vector associated with $\psi^B$.
  • ...and 5 more figures

Theorems & Definitions (17)

  • Theorem 1: Theorem 5.5, cheng2022eigen
  • Theorem 2
  • Lemma 1
  • Theorem 3
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • ...and 7 more