Accelerated decomposition of bistochastic kernel matrices by low rank approximation

Chris Vales, Dimitrios Giannakis

TL;DR

This work tackles the computational bottleneck of obtaining the eigen-decomposition of bistochastic kernel matrices for large datasets. It introduces a rank-$r$ pivoted partial Cholesky-based strategy to form a low-rank approximation $\tilde{K}=F F^\top$ and then computes the approximate eigenpairs of the bistochastic matrix $\tilde{P}$ with cost $O(N r^2)$, requiring only $N(r+1)$ kernel evaluations. Two acceleration schemes are developed and compared: dilution, which leverages the full dataset information via a sequence of small $r \times r$ factorizations, and subsampling with Nyström extension, which is more parallelizable but incurs higher asymptotic cost. The methods are applied to kernel-based spatiotemporal pattern extraction in chaotic Kuramoto-Sivashinsky dynamics, demonstrating close agreement with true eigenfunctions and highlighting practical trade-offs between accuracy and scalability. Overall, the proposed approach expands the applicability of bistochastic kernel methods to large-scale problems, enabling efficient diffusion-map–style analyses and kernel spectral clustering on big data.

Abstract

We develop an accelerated algorithm for computing an approximate eigenvalue decomposition of bistochastic normalized kernel matrices. Our approach constructs a low-rank approximation of the original kernel matrix by the pivoted partial Cholesky algorithm and uses it to compute an approximate decomposition of its bistochastic normalization without requiring the formation of the full kernel matrix. The cost of the proposed algorithm depends linearly on the size of the employed training dataset and quadratically on the rank of the low-rank approximation, offering a significant cost reduction compared to the naive approach. We apply the proposed algorithm to the kernel-based extraction of spatiotemporal patterns from chaotic dynamics, demonstrating its accuracy while also comparing it with an alternative algorithm consisting of subsampling and Nyström extension.
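The low-rank step described above can be illustrated with a minimal sketch of a generic pivoted partial Cholesky factorization $K \approx F F^\top$ that never forms the full $N \times N$ kernel matrix. This is a textbook version of the technique, not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the stopping tolerance `tol`, and the function names are all assumptions made for illustration. The sketch evaluates the kernel diagonal once ($N$ evaluations) and one kernel column per pivot ($N$ evaluations each), for $N(r+1)$ evaluations in total, and its arithmetic cost is $O(N r^2)$.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel, evaluated row-wise (assumed kernel, for illustration).

    x has shape (m, d); y has shape (d,) or (m, d). Returns m kernel values.
    """
    sq_dist = np.sum((x - y) ** 2, axis=-1)
    return np.exp(-sq_dist / bandwidth**2)


def pivoted_partial_cholesky(data, rank, kernel=gaussian_kernel, tol=1e-12):
    """Return F of shape (N, r') with K ~= F @ F.T, plus the chosen pivots.

    r' equals `rank` unless the residual diagonal falls below `tol` first.
    """
    n = data.shape[0]
    # Diagonal of K: N kernel evaluations (each point paired with itself).
    resid_diag = kernel(data, data).astype(float)
    F = np.zeros((n, rank))
    pivots = []
    for j in range(rank):
        # Greedy pivot: the point with the largest remaining residual.
        p = int(np.argmax(resid_diag))
        if resid_diag[p] <= tol:
            F = F[:, :j]  # kernel matrix is numerically of rank j
            break
        pivots.append(p)
        # One column of K: N kernel evaluations against the pivot point.
        col = kernel(data, data[p])
        # Remove the part already captured by the previous j columns.
        col = col - F[:, :j] @ F[p, :j]
        F[:, j] = col / np.sqrt(resid_diag[p])
        # Update the residual diagonal; clip round-off negatives to zero.
        resid_diag = np.maximum(resid_diag - F[:, j] ** 2, 0.0)
    return F, pivots
```

Calling `F, pivots = pivoted_partial_cholesky(X, r)` on an `(N, d)` data array `X` returns the factor and the indices of the sampled data points. The greedy pivot rule picks the point that is currently worst approximated, so the selected indices tend to spread across the dataset.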

Paper Structure

This paper contains 16 sections, 36 equations, 7 figures, 2 algorithms.

Figures (7)

  • Figure 1: (Left) Space-time heatmap of the true state data obtained by integrating the KS problem \ref{eq:ks} for $500$ time units using the parameter values given in Section \ref{sec:numerics}. (Right) Space-time heatmap of the same state data as in the left plot with black dots indicating the states sampled by the pivoted partial Cholesky algorithm for rank parameter $r=2048$.
  • Figure 2: Comparison of eigenfunctions $\phi_1$ (top row) and $\phi_2$ (bottom row) obtained using the true EVD of the bistochastic kernel matrix (leftmost column), the dilution method (middle column) and the subsampling method (rightmost column).
  • Figure 3: Comparison of eigenfunctions $\phi_4$ (top row) and $\phi_5$ (bottom row) obtained using the true EVD of the bistochastic kernel matrix (leftmost column), the dilution method (middle column) and the subsampling method (rightmost column).
  • Figure 4: Comparison of eigenfunctions $\phi_3$ (top row) and $\phi_6$ (bottom row) obtained using the true EVD of the bistochastic kernel matrix (leftmost column), the dilution method (middle column) and the subsampling method (rightmost column).
  • Figure 5: Comparison of the true eigenvalues with those obtained using the dilution and subsampling methods. In the left plot the horizontal axis is limited to the maximum value of $20\,000$ to facilitate the visual comparison of the dilution and subsampling eigenvalues. The right plot is a subset of the left one focusing on the leading 20 eigenvalues.
  • ...and 2 more figures