Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

Ihor Kendiukhov

TL;DR

Geneformer and scGPT have internalized organized biological knowledge, including pathway membership, protein interactions, functional modules, and hierarchical abstraction, yet they encode minimal causal regulatory logic.

Abstract

Background: Single-cell foundation models such as Geneformer and scGPT encode rich biological information, but whether this includes causal regulatory logic rather than statistical co-expression remains unclear. Sparse autoencoders (SAEs) can resolve superposition in neural networks by decomposing dense activations into interpretable features, yet they have not been systematically applied to biological foundation models.

Results: We trained TopK SAEs on residual stream activations from all layers of Geneformer V2-316M (18 layers, d=1152) and scGPT whole-human (12 layers, d=512), producing atlases of 82,525 and 24,527 features, respectively. Both atlases confirm massive superposition, with 99.8 percent of features invisible to SVD. Systematic characterization reveals rich biological organization: 29 to 59 percent of features annotate to Gene Ontology, KEGG, Reactome, STRING, or TRRUST, with U-shaped layer profiles reflecting hierarchical abstraction. Features organize into co-activation modules (141 in Geneformer, 76 in scGPT), exhibit causal specificity (median 2.36x), and form cross-layer information highways (63 to 99.8 percent). When tested against genome-scale CRISPRi perturbation data, only 3 of 48 transcription factors (6.2 percent) show regulatory-target-specific feature responses. A multi-tissue control yields marginal improvement (10.4 percent, 5 of 48 TFs), establishing model representations as the bottleneck.

Conclusions: These models have internalized organized biological knowledge, including pathway membership, protein interactions, functional modules, and hierarchical abstraction, yet they encode minimal causal regulatory logic. We release both feature atlases as interactive web platforms enabling exploration of more than 107,000 features across 30 layers of two leading single-cell foundation models.
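To make the term "TopK SAE" concrete, the following is a minimal NumPy sketch of a TopK sparse autoencoder forward pass. It is an illustration under stated assumptions, not the paper's implementation: the function names, the tied initialization, the pre-encoder bias subtraction, and the toy sizes (d_model=16, n_features=64, k=4) are all hypothetical. Only the general idea (keep the k largest latent activations per input and reconstruct from them) is taken from the abstract.

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Forward pass of a TopK sparse autoencoder (illustrative sketch).

    x: (batch, d_model) activations, e.g. one layer's residual stream.
    Returns (reconstruction, sparse codes with at most k nonzeros per row).
    """
    # Encoder pre-activations; subtracting b_dec first is a common
    # convention, assumed here rather than taken from the paper.
    pre = (x - b_dec) @ W_enc + b_enc              # (batch, n_features)
    # Column indices of the k largest pre-activations in each row.
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    codes = np.zeros_like(pre)
    # Keep only the top-k entries, clipped at zero; all others stay zero.
    codes[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    recon = codes @ W_dec + b_dec                  # (batch, d_model)
    return recon, codes

def variance_explained(x, recon):
    """Fraction of variance in x captured by the reconstruction."""
    resid = ((x - recon) ** 2).sum()
    total = ((x - x.mean(axis=0)) ** 2).sum()
    return 1.0 - resid / total

# Toy usage with random, untrained weights (hypothetical sizes).
rng = np.random.default_rng(0)
d_model, n_features, k = 16, 64, 4
x = rng.normal(size=(8, d_model))
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
W_dec = W_enc.T.copy()                             # tied init, an assumption
b_enc = np.zeros(n_features)
b_dec = np.zeros(d_model)

recon, codes = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
print("active features per input:", (codes != 0).sum(axis=1))
print("variance explained (untrained):", variance_explained(x, recon))
```

With untrained weights the variance explained is not meaningful; the sketch only shows where the sparsity constraint enters. In a trained SAE, each column of the code matrix corresponds to one candidate feature of the kind catalogued in the atlases.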

Paper Structure (50 sections, 13 figures, 18 tables)

Background
Results
SAE feature atlas reveals massive superposition in Geneformer
Quantifying superposition: SAE features versus SVD
scGPT feature atlas: cross-model replication
Biological annotation reveals a U-shaped layer profile
Early layers (0–4): Molecular machinery.
Middle layers (5–9): Abstract computation.
Mid-late layers (10–12): Re-specialization.
Terminal layers (15–17): Prediction-focused.
Cross-layer tracking: features are layer-specific
Features organize into 141 biologically coherent co-activation modules
Causal patching demonstrates feature-level specificity
scGPT causal patching at layer 7.
Cross-layer information highways
...and 35 more sections

Figures (13)

Figure 1: Geneformer V2-316M SAE feature atlas. TopK SAEs on all 18 layers yield 82,525 features. Reconstruction quality declines with depth while dead features increase.
Figure 2: 99.8% of SAE features are invisible to SVD. Novel features carry 98.7% of annotations and explain 2.4x more variance.
Figure 3: Cross-model comparison on normalized depth. (A) scGPT has higher variance explained at all depths. (B) Geneformer has higher annotation rates. (C) Summary. (D) Dead feature profiles differ.
Figure 4: U-shaped annotation profile across 18 layers. Early layers encode molecular programs; middle layers develop abstract representations; later layers re-specialize then optimize for prediction.
Figure 5: Co-activation modules evolve across layers. Identity shifts from molecular machinery (L0) to integrative programs (L11), reflecting hierarchical abstraction.
...and 8 more figures
