Matrix Phylogeny: Compact Spectral Fingerprints for Trap-Robust Preconditioner Selection
Jinwoo Baek
TL;DR
Matrix Phylogeny presents CSF and ASF: compact, noise-stable, eigendecomposition-free fingerprints built from damped Chebyshev moments estimated with Hutchinson trace sketches and made invariant by an affine spectral normalization to $[-1,1]$. CSF fixes a small dimension ($K\in\{3,5\}$), while ASF selects $K$ adaptively using energy-tail and Hankel low-rank tests, so matrix families are clustered accurately with few features ($K\le 10$ in practice). Experiments show perfect clustering (ARI $=1.0$) on synthetic and real suites, robust performance on SuiteSparse inputs, and near-oracle preconditioner selection in adversarial settings via a probe-and-switch policy. The result is a scalable, structure-aware retrieval and recommendation pipeline for large matrix repositories, with CSF at $K=5$ as the practical default and ASF as the alternative when domain-specific adaptivity is needed.
Abstract
Matrix Phylogeny introduces compact spectral fingerprints (CSF/ASF) that characterize matrices at the family level. These fingerprints are low-dimensional, eigendecomposition-free descriptors built from Chebyshev trace moments estimated by Hutchinson sketches. A simple affine rescaling of the spectrum to $[-1,1]$ makes them invariant to permutation and similarity transforms and robust to global scaling. Across synthetic and real tests, we observe phylogenetic compactness: only a few moments are needed. CSF with $K=3$ to $5$ already yields perfect clustering (ARI $=1.0$; silhouettes $\approx 0.89$) on four synthetic families and on a five-family set including BA vs ER, while ASF adapts the dimension on demand (median $K^\ast \approx 9$). On a SuiteSparse mini-benchmark (Hutchinson $p \approx 100$), both CSF-H and ASF-H reach ARI $=1.0$. Against strong alternatives (eigenvalue histograms + Wasserstein, heat-kernel traces, WL-subtree), CSF with $K=5$ matches or exceeds their accuracy while avoiding eigendecompositions and using far fewer features ($K\le 10$ vs 64/9153). The descriptors are stable to noise (log-log slope $\approx 1.03$, $R^2 \approx 0.993$) and support a practical trap->recommend pipeline for automated preconditioner selection. In an adversarial E6+ setting with a probe-and-switch mechanism, our physics-guided recommender attains near-oracle iteration counts (p90 regret $=0$), whereas a Frobenius 1-NN baseline exhibits large spikes (p90 $\approx 34$ to $60$). CSF/ASF deliver compact ($K\le 10$), fast, invariant fingerprints that enable scalable, structure-aware search and recommendation over large matrix repositories. We recommend CSF with $K=5$ by default, and ASF when domain-specific adaptivity is desired.
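To make the construction concrete, the following is a minimal sketch of a fixed-dimension fingerprint: the spectrum is rescaled affinely to $[-1,1]$, and the normalized Chebyshev trace moments $\mathrm{tr}\,T_k(\hat{A})/n$ are estimated with Hutchinson probes through the three-term recurrence. It is an illustration under stated assumptions, not the paper's implementation. The function name `chebyshev_fingerprint` and its parameters are hypothetical, the input is assumed to be a dense symmetric matrix, the spectral interval is taken from Gershgorin bounds (one eigendecomposition-free choice among several), and the damping mentioned in the TL;DR is omitted.

```python
import numpy as np

def chebyshev_fingerprint(A, K=5, num_probes=100, seed=0):
    """Estimate tr(T_k(A_hat)) / n for k = 1..K with Hutchinson probes.

    Assumptions (not from the paper): A is dense and symmetric, the spectral
    interval comes from Gershgorin bounds, and no damping is applied.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Gershgorin bounds enclose the spectrum without an eigendecomposition.
    diag = np.diag(A)
    radius = np.abs(A).sum(axis=1) - np.abs(diag)
    lo, hi = (diag - radius).min(), (diag + radius).max()
    center, half_width = (hi + lo) / 2.0, (hi - lo) / 2.0
    if half_width == 0.0:
        half_width = 1.0  # A is a multiple of the identity; A_hat becomes 0.

    # Affine map of the spectrum into [-1, 1].
    A_hat = (A - center * np.eye(n)) / half_width

    # Rademacher probes: E[z^T M z] = tr(M).
    rng = np.random.default_rng(seed)
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))

    # Recurrence T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x), applied to the probes.
    T_prev, T_curr = Z, A_hat @ Z
    moments = []
    for _ in range(K):
        moments.append(np.sum(Z * T_curr) / (num_probes * n))
        T_prev, T_curr = T_curr, 2.0 * (A_hat @ T_curr) - T_prev
    return np.array(moments)
```

With a fixed seed, rescaling `A` by a positive constant leaves the output unchanged up to rounding, because the Gershgorin interval scales with the matrix. A symmetric permutation leaves the exact moments unchanged, but the Hutchinson estimate then agrees only up to sampling noise, which shrinks as `num_probes` grows.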
