Matrix Phylogeny: Compact Spectral Fingerprints for Trap-Robust Preconditioner Selection

Jinwoo Baek

TL;DR

Matrix Phylogeny presents CSF and ASF: compact, eigendecomposition-free fingerprints built from damped Chebyshev moments and Hutchinson trace estimates, made invariant by affine spectral normalization to $[-1,1]$. CSF fixes a small dimension ($K\in\{3,5\}$), while ASF selects $K$ adaptively using energy-tail and Hankel-low-rank tests, so matrix families can be clustered accurately with very few features ($K\le 10$ in practice). Experiments show perfect clustering (ARI $=1.0$) on synthetic and real suites, robust performance on SuiteSparse inputs, and near-oracle preconditioner selection in adversarial settings via a probe-and-switch policy. The work offers a scalable, structure-aware retrieval and recommendation pipeline for large matrix repositories, with a practical default (CSF with $K=5$) and an adaptive alternative (ASF) when domain adaptivity is needed. Overall, the fingerprints are invariant and noise-stable, and they capture essential spectral structure without full eigendecompositions, enabling fast, robust matrix phylogeny and automated solver choices.

Abstract

Matrix Phylogeny introduces compact spectral fingerprints (CSF/ASF) that characterize matrices at the family level. These fingerprints are low-dimensional, eigendecomposition-free descriptors built from Chebyshev trace moments estimated by Hutchinson sketches. A simple affine rescaling to [-1,1] makes them permutation/similarity invariant and robust to global scaling. Across synthetic and real tests, we observe phylogenetic compactness: only a few moments are needed. CSF with K=3-5 already yields perfect clustering (ARI=1.0; silhouettes ~0.89) on four synthetic families and a five-family set including BA vs ER, while ASF adapts the dimension on demand (median K*~9). On a SuiteSparse mini-benchmark (Hutchinson p~100), both CSF-H and ASF-H reach ARI=1.0. Against strong alternatives (eigenvalue histograms + Wasserstein, heat-kernel traces, WL-subtree), CSF-K=5 matches or exceeds accuracy while avoiding eigendecompositions and using far fewer features (K<=10 vs 64/9153). The descriptors are stable to noise (log-log slope ~1.03, R^2~0.993) and support a practical trap->recommend pipeline for automated preconditioner selection. In an adversarial E6+ setting with a probe-and-switch mechanism, our physics-guided recommender attains near-oracle iteration counts (p90 regret=0), whereas a Frobenius 1-NN baseline exhibits large spikes (p90~34-60). CSF/ASF deliver compact (K<=10), fast, invariant fingerprints that enable scalable, structure-aware search and recommendation over large matrix repositories. We recommend CSF with K=5 by default, and ASF when domain-specific adaptivity is desired.
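The construction described in the abstract (affine rescaling of the spectrum to [-1,1], then Chebyshev trace moments estimated with Hutchinson probes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `chebyshev_fingerprint`, the normalization by the matrix size, the Rademacher probes, and the use of `numpy.linalg.eigvalsh` to obtain the spectral bounds are all choices made here for a small symmetric demo. A genuinely eigendecomposition-free version would estimate the extreme eigenvalues with a few Lanczos or power iterations instead, and the damping mentioned in the TL;DR is omitted.

```python
import numpy as np

def chebyshev_fingerprint(A, K=5, probes=100, seed=0):
    """Estimate (1/n) * tr(T_k(A_tilde)) for k = 1..K with Hutchinson probes.

    Hypothetical helper for illustration; assumes A is a real symmetric matrix.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]

    # Assumption: exact spectral bounds via eigvalsh, acceptable only for a
    # small demo. At scale, replace with Lanczos/power-iteration estimates.
    evals = np.linalg.eigvalsh(A)
    lo, hi = evals[0], evals[-1]
    if hi == lo:
        raise ValueError("spectrum is a single point; rescaling is undefined")
    center, half_width = (hi + lo) / 2.0, (hi - lo) / 2.0

    def apply_rescaled(X):
        # Applies A_tilde = (A - center * I) / half_width, spectrum in [-1, 1].
        return (A @ X - center * X) / half_width

    # Rademacher probe vectors, one per column.
    Z = rng.choice([-1.0, 1.0], size=(n, probes))

    # Three-term recurrence: T_0 Z = Z, T_1 Z = A_tilde Z,
    # T_{k+1} Z = 2 A_tilde T_k Z - T_{k-1} Z.
    T_prev, T_curr = Z, apply_rescaled(Z)
    moments = []
    for _ in range(K):
        # Hutchinson estimate: average of z^T T_k(A_tilde) z over probes.
        moments.append(np.sum(Z * T_curr) / (probes * n))
        T_prev, T_curr = T_curr, 2.0 * apply_rescaled(T_curr) - T_prev
    return np.array(moments)
```

Calling `chebyshev_fingerprint(A, K=5)` on a symmetric matrix returns a length-5 vector. Only matrix-vector products with `A` appear inside the loop, which is what makes this style of descriptor cheap for large sparse matrices once the spectral bounds are available.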

Paper Structure

This paper contains 72 sections, 9 theorems, 56 equations, 6 figures, 12 tables, 3 algorithms.

Key Result

Proposition 2

For any invertible $S$, permutation $P$, positive diagonal $D$, and $\alpha>0$, the moments (and hence $\phi_K$) are invariant under the corresponding transformations of the matrix. Proof: see the appendix.
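A short worked argument shows why invariance of this kind holds for trace-based Chebyshev moments. The definitions below are assumptions made for illustration, since this summary does not reproduce the paper's formulas: take $m_k(A)=\tfrac{1}{n}\operatorname{tr}\,T_k(\tilde A)$ with $\tilde A=(A-cI)/h$, $c=(\lambda_{\max}+\lambda_{\min})/2$, and $h=(\lambda_{\max}-\lambda_{\min})/2$.

$$
\begin{aligned}
\text{Similarity: } & \operatorname{spec}(SAS^{-1})=\operatorname{spec}(A)\ \Rightarrow\ \widetilde{SAS^{-1}}=S\tilde A S^{-1},\\
& \operatorname{tr}\,T_k(S\tilde A S^{-1})=\operatorname{tr}\bigl(S\,T_k(\tilde A)\,S^{-1}\bigr)=\operatorname{tr}\,T_k(\tilde A).\\
\text{Scaling: } & \lambda_{\max}(\alpha A)=\alpha\lambda_{\max}(A),\ \lambda_{\min}(\alpha A)=\alpha\lambda_{\min}(A)\ \Rightarrow\ \widetilde{\alpha A}=\frac{\alpha A-\alpha cI}{\alpha h}=\tilde A.
\end{aligned}
$$

A permutation $PAP^{\top}$ is the special case $S=P$, and a diagonal transformation is covered in the same way if $D$ acts by similarity, $DAD^{-1}$. Under these assumed definitions, each $m_k$ and therefore any fingerprint built from $m_1,\dots,m_K$ is unchanged.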

Figures (6)

  • Figure 1: E2: ARI/silhouette overview across methods and views.
  • Figure 2: E2: Confusion matrix (one run; label permutations allowed). ARI $=1.0$.
  • Figure 3: E3: ARI on SuiteSparse (Hutchinson).
  • Figure 4: E4: Probe--quality--time trade-off for CSF-H/ASF-H.
  • Figure 5: E5: Log--log fit of fingerprint distance vs. noise level $\epsilon$.
  • ...and 1 more figure

Theorems & Definitions (11)

  • Remark 1: Scope of the claim
  • Proposition 2: Invariance
  • Theorem 3: Adaptive compactness: stopping & tail control
  • Proposition 4: Hankel low rank: mixture-of-modes
  • Remark 5: Counting convention
  • Theorem 6: Hutchinson concentration
  • Theorem 7: Lipschitz stability
  • Proposition 1: Invariance (appendix)
  • Lemma 1: Hutchinson variance and SE-guard
  • Proposition 2: Energy-tail guarantee
  • ...and 1 more