Operator SVD with Neural Networks via Nested Low-Rank Approximation

J. Jon Ryu, Xiangxiang Xu, H. S. Melihcan Erol, Yuheng Bu, Lizhong Zheng, Gregory W. Wornell

TL;DR

This work introduces NeuralSVD, an unconstrained neural-network framework for learning the top-$L$ singular functions of a linear operator in the correct order. It casts SVD as a low-rank approximation (LoRA) problem and applies nesting to impose the ordering. The key components are Schmidt's LoRA objective, sequential and joint nesting strategies, and their realization in NeuralSVD with disjoint or shared networks. Together they enable efficient gradient-based optimization for non-self-adjoint operators and recover EVD as a special case. Empirical results on analytical PDE operators (e.g., the 2D hydrogen atom and the 2D harmonic oscillator) and on a cross-domain retrieval task based on canonical dependence kernels show that NeuralSVD recovers accurate, orthogonal singular functions and yields structured, compact representations that outperform prior parametric methods such as SpIN and NeuralEF. The approach scales favorably with the number of modes and offers practical benefits for PDEs, spectral embeddings, and cross-domain learning, with open-source implementations to facilitate adoption and comparison.
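To make the low-rank approximation view concrete, the following is a minimal finite-dimensional sketch, not the authors' implementation. It assumes the operator is a plain matrix `T` and that the LoRA objective takes the standard expanded form $-2\,\mathrm{tr}(G^\top T F) + \mathrm{tr}((F^\top F)(G^\top G))$, which equals $\|T - G F^\top\|_F^2 - \|T\|_F^2$. The names `lora_objective`, `F_star`, `G_star`, `F_rand`, and `G_rand` are hypothetical, and splitting $\sqrt{\sigma_\ell}$ evenly between the two factors is one arbitrary choice of scaling.

```python
import numpy as np

def lora_objective(T, F, G):
    """Expanded low-rank approximation loss for a matrix T (assumed form).

    Equals ||T - G F^T||_F^2 - ||T||_F^2, written without forming G F^T.
    F has shape (n, L), G has shape (m, L), T has shape (m, n).
    """
    cross = np.trace(G.T @ T @ F)             # sum_l <g_l, T f_l>
    gram = np.sum((F.T @ F) * (G.T @ G))      # sum_{l,l'} <f_l,f_l'><g_l,g_l'>
    return -2.0 * cross + gram

rng = np.random.default_rng(0)
T = rng.standard_normal((6, 5))
L = 2

# Reference solution from the truncated SVD, with sqrt(sigma) on each side.
U, s, Vt = np.linalg.svd(T, full_matrices=False)
F_star = Vt[:L].T * np.sqrt(s[:L])
G_star = U[:, :L] * np.sqrt(s[:L])
loss_at_svd = lora_objective(T, F_star, G_star)

# An arbitrary competitor: by the Eckart-Young theorem it cannot do better.
F_rand = rng.standard_normal((5, L))
G_rand = rng.standard_normal((6, L))
loss_at_rand = lora_objective(T, F_rand, G_rand)

print(loss_at_svd, -np.sum(s[:L] ** 2), loss_at_rand)
```

In this matrix setting the loss at the truncated SVD equals minus the sum of the top-$L$ squared singular values, and no other rank-$L$ factorization attains a lower value. The objective is unconstrained: orthogonality of the factors is never imposed explicitly.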

Abstract

Computing the eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered a promising alternative to classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning.
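The role of nesting can be illustrated with a small matrix example. This is a sketch under stated assumptions rather than the paper's code: it assumes joint nesting is a weighted sum of the low-rank approximation loss over the prefixes $1{:}\ell$ for $\ell = 1, \dots, L$, and it uses uniform weights purely for illustration. The names `lora_objective`, `joint_nested_objective`, `plain_gap`, and `nested_gap` are hypothetical. The example rotates the two leading singular directions into each other: the plain loss cannot tell the rotated pair from the ordered one, while the nested loss penalizes it.

```python
import numpy as np

def lora_objective(T, F, G):
    """Assumed expanded form of ||T - G F^T||_F^2 - ||T||_F^2."""
    return -2.0 * np.trace(G.T @ T @ F) + np.sum((F.T @ F) * (G.T @ G))

def joint_nested_objective(T, F, G, weights):
    """Weighted sum of the loss over prefixes 1..l (assumed form of joint nesting)."""
    return sum(
        w * lora_objective(T, F[:, :l], G[:, :l])
        for l, w in enumerate(weights, start=1)
    )

rng = np.random.default_rng(0)
T = rng.standard_normal((6, 5))
L = 2

# Ordered reference solution from the truncated SVD.
U, s, Vt = np.linalg.svd(T, full_matrices=False)
F_star = Vt[:L].T * np.sqrt(s[:L])
G_star = U[:, :L] * np.sqrt(s[:L])

# Mix the two modes with a 45-degree rotation; G_rot @ F_rot.T is unchanged.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
F_rot, G_rot = F_star @ R, G_star @ R

weights = np.full(L, 1.0 / L)  # illustrative uniform weights
plain_gap = lora_objective(T, F_rot, G_rot) - lora_objective(T, F_star, G_star)
nested_gap = (joint_nested_objective(T, F_rot, G_rot, weights)
              - joint_nested_objective(T, F_star, G_star, weights))

print(plain_gap, nested_gap)
```

Here `plain_gap` is zero up to rounding, because the rank-2 product is invariant to the rotation, whereas `nested_gap` is strictly positive whenever the top two singular values differ, since the first rotated pair is no longer the best rank-1 approximation.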

Paper Structure

This paper contains 61 sections, 11 theorems, 80 equations, 8 figures, 2 tables.

Key Result

Theorem 3.1

Assume that $\mathpzc{T}\colon\mathcal{F}\to\mathcal{G}$ is compact. Let $((f_{\ell}^\star,g_{\ell}^\star))_{\ell=1}^L\in(\mathcal{F}\times\mathcal{G})^L$ be a global minimizer of $\mathcal{L}_{\mathsf{LoRA}}({\bf f}_{1:L}, {\bf g}_{1:L})$. If $\sigma_L >\sigma_{L+1}$, then

Figures (8)

  • Figure 1: Schematic illustration of NeuralSVD.
  • Figure 2: Sequential nesting.
  • Figure 3: Joint nesting.
  • Figure 5: Visualization of the first 16 eigenfunctions of the 2D hydrogen atom. The first three rows present the learned eigenfunctions by SpIN (128), NeuralEF (512), and NeuralSVD$_{\text{seq}}$ (512), respectively. Due to the memory complexity of SpIN, we ran SpIN with only 9 eigenstates. The learned functions are aligned by an orthogonal transformation via the orthogonal Procrustes method within each degenerate subspace to compare with the ground truth (GT) in the fourth row. The rightmost column visualizes the learned orthogonality.
  • Figure 6: Summary of quantitative evaluations for solving TISEs: (a) 2D hydrogen atom; (b) 2D harmonic oscillator. Non-hatched, light-colored bars represent a batch size of 128, and hatched bars indicate 512. The definitions of reported measures are given in Sec. \ref{app:sec:def_measures}.
  • ...and 3 more figures

Theorems & Definitions (21)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Remark 3.4: Comparison to sequential nesting
  • Remark 4.1: Impact of imperfect orthogonality
  • Lemma 2.1
  • proof
  • Proposition 2.2
  • Theorem 2.3
  • proof: Informal proof
  • ...and 11 more