Eigenfunction Extraction for Ordered Representation Learning

Burak Varıcı, Che-Ping Tsai, Ritabrata Ray, Nicholas M. Boffi, Pradeep Ravikumar

TL;DR

The paper tackles identifiability in representation learning by reframing learned features as ordered eigenfunctions of a contextual kernel $k_{XA}$ with associated operator $T_{XA}$. It proposes a modular eigenfunction extraction framework that recovers exact eigenfunctions and eigenvalues, not just an eigenspace, by combining base eigenspace extractors with sequential or joint nesting, or with Rayleigh–Ritz post-processing. The authors formalize four desiderata (compatibility, exact decomposition, unconstrained optimization, efficiency) and show how two mainstream paradigms, LoRA (here short for low-rank approximation) and Rayleigh quotient optimization, fit within this framework and connect to contrastive and non-contrastive learning. Through synthetic kernel experiments and image-representation tasks, they demonstrate that the recovered eigenvalues serve as feature-importance scores that enable adaptive-dimensional representations, and they give practical guidance on method choice (e.g., Rayleigh–Ritz for VICReg, joint nesting for SCL). Overall, the work offers a scalable route to identifiable, ordered representations that support principled efficiency-accuracy tradeoffs in large-scale systems.

Abstract

Recent advances in representation learning reveal that widely used objectives, such as contrastive and non-contrastive, implicitly perform spectral decomposition of a contextual kernel, induced by the relationship between inputs and their contexts. Yet, these methods recover only the linear span of top eigenfunctions of the kernel, whereas exact spectral decomposition is essential for understanding feature ordering and importance. In this work, we propose a general framework to extract ordered and identifiable eigenfunctions, based on modular building blocks designed to satisfy key desiderata, including compatibility with the contextual kernel and scalability to modern settings. We then show how two main methodological paradigms, low-rank approximation and Rayleigh quotient optimization, align with this framework for eigenfunction extraction. Finally, we validate our approach on synthetic kernels and demonstrate on real-world image datasets that the recovered eigenvalues act as effective importance scores for feature selection, enabling principled efficiency-accuracy tradeoffs via adaptive-dimensional representations.
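The gap between recovering a span and recovering ordered eigenfunctions can be illustrated in finite dimensions. The sketch below is a minimal NumPy example, not the paper's implementation: it assumes a symmetric matrix `K` standing in for the kernel operator and a feature matrix `F` whose columns span the top eigenspace only up to an unknown invertible mixing, then applies the Rayleigh–Ritz step mentioned in the TL;DR. The function name `rayleigh_ritz`, the synthetic spectrum, and all sizes are illustrative assumptions.

```python
import numpy as np


def rayleigh_ritz(K, F):
    """Recover ordered eigenpairs of symmetric K restricted to span(F).

    K: (n, n) symmetric matrix (finite-dimensional stand-in for the operator).
    F: (n, d) matrix whose columns span an invariant subspace of K.
    """
    Q, _ = np.linalg.qr(F)            # orthonormal basis of span(F)
    M = Q.T @ K @ Q                   # d x d projection of K onto the subspace
    evals, W = np.linalg.eigh(M)      # eigh returns eigenvalues in ascending order
    order = np.argsort(evals)[::-1]   # reorder to descending importance
    return evals[order], Q @ W[:, order]


rng = np.random.default_rng(0)
n, d = 50, 4

# Synthetic symmetric matrix with a known, well-separated top spectrum.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
spectrum = np.concatenate([[5.0, 3.0, 2.0, 1.0], np.linspace(0.5, 0.01, n - d)])
K = (U * spectrum) @ U.T

# Hypothetical eigenspace-extractor output: the right span, arbitrarily mixed.
R = rng.standard_normal((d, d))
F = U[:, :d] @ R

evals, V = rayleigh_ritz(K, F)

# Each recovered column matches a true eigenvector up to sign.
alignment = np.abs(np.sum(V * U[:, :d], axis=0))
print(evals)
print(alignment)
```

In this toy setting the columns of `F` carry no ordering on their own, while the columns of `V` are ordered by eigenvalue and determined up to sign. Keeping only the first few columns of `V` is the finite-dimensional analogue of an adaptive-dimensional representation.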

Paper Structure

This paper contains 85 sections, 15 theorems, 100 equations, 4 figures, 3 tables, 4 algorithms.

Key Result

Theorem 1

We have the following results for compact operators (Schmidt, 1907) and finite-dimensional matrices (Eckart and Young, 1936; Mirsky, 1960):
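The theorem statement itself is not reproduced on this page. As background only, the finite-dimensional result usually attributed to the cited Eckart–Young and Mirsky papers is that truncating the singular value decomposition gives a best rank-$r$ approximation of a matrix in Frobenius norm, with error equal to the root sum of squares of the discarded singular values. The sketch below checks that classical fact numerically. It is not a rendering of the paper's Theorem 1, and the helper name `best_rank_r` and the matrix sizes are assumptions chosen for illustration.

```python
import numpy as np


def best_rank_r(A, r):
    """Return the rank-r truncated SVD of A and all singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_r = (U[:, :r] * s[:r]) @ Vt[:r]   # keep the r largest singular triplets
    return A_r, s


rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
r = 5

A_r, s = best_rank_r(A, r)
err = np.linalg.norm(A - A_r, "fro")

# Error predicted by the classical result: norm of the discarded singular values.
tail = np.sqrt(np.sum(s[r:] ** 2))

# An arbitrary rank-r competitor, which cannot have a smaller error.
B = rng.standard_normal((30, r)) @ rng.standard_normal((r, 20))
err_B = np.linalg.norm(A - B, "fro")

print(err, tail, err_B)
```

Comparing `err` against a single random competitor does not prove optimality; it only illustrates the inequality that the classical theorem guarantees for every rank-$r$ matrix.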

Figures (4)

  • Figure 1: Comparison of eigenfunctions estimated by different methods (LoRA, RQ, and VICReg) with the ground-truth eigenfunctions (solid gray line). Results are shown for Legendre kernels with Rayleigh–Ritz post-processing ($p=1$, $r=8$, $d=4$).
  • Figure 2: Linear evaluations of VICReg and SCL on different representation sizes on CIFAR-10 and ImageNette.
  • Figure 3: Top 100 eigenvalues of SCL and VICReg on ImageNette.
  • Figure 4: A comparison of eigenfunctions estimated by different methods (LoRA, RQ, and VICReg) against the true eigenfunctions (solid gray line). Each subplot shows the results for a specific kernel type and post-processing technique with parameters. For an input dimension $p=2$, the plot shows the functions $\psi_i(a,a)$ from $a=-1$ to $a=1$.

Theorems & Definitions (32)

  • Theorem 1
  • Remark 1: Alternative objectives
  • Lemma 1
  • Example 1
  • Definition 1: Base Optimization Problem
  • Definition 2: Eigenspace Extractor
  • Definition 3: Orthogonal Nested Minimizers
  • Theorem 2
  • Remark 2
  • Theorem 3
  • ...and 22 more