Causal Circuit Tracing Reveals Distinct Computational Architectures in Single-Cell Foundation Models: Inhibitory Dominance, Biological Coherence, and Cross-Model Convergence

Ihor Kendiukhov

TL;DR

This work introduces causal circuit tracing, which ablates SAE features and measures downstream responses, and applies it to Geneformer V2-316M and scGPT whole-human across four conditions. Gene-level CRISPRi validation indicates that the models encode co-expression rather than causal relationships.

Abstract

Motivation: Sparse autoencoders (SAEs) decompose foundation model activations into interpretable features, but causal feature-to-feature interactions across network depth remain uncharacterized for biological foundation models.

Results: We introduce causal circuit tracing, which ablates SAE features and measures downstream responses, and apply it to Geneformer V2-316M and scGPT whole-human across four conditions (96,892 edges, 80,191 forward passes). Both models show approximately 53 percent biological coherence and 65 to 89 percent inhibitory dominance, invariant to architecture and cell type. scGPT produces stronger effects (mean absolute d = 1.40 vs. 1.05) with more balanced dynamics. Cross-model consensus yields 1,142 conserved domain pairs (10.6x enrichment, p < 0.001). Disease-associated domains are 3.59x more likely to appear in the consensus set. Gene-level CRISPRi validation shows 56.4 percent directional accuracy, indicating that the models encode co-expression rather than causal relationships.
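The effect sizes quoted above are Cohen's d values. As a minimal sketch of how one such edge might be scored, the snippet below compares a downstream feature's activations with and without ablation of an upstream feature. The pooled-standard-deviation form of Cohen's d, the sign convention (ablated minus baseline), the function names, and the toy activation values are all assumptions for illustration, not the paper's implementation; only the |d| > 1.0 and |d| > 2.0 thresholds come from the Figure 1 caption.

```python
import math
import statistics


def cohens_d(ablated, baseline):
    """Cohen's d with a pooled standard deviation.

    Sign convention (an assumption): ablated minus baseline, so a negative
    value means the downstream activation dropped after ablation.
    """
    n1, n2 = len(ablated), len(baseline)
    v1 = statistics.variance(ablated)   # sample variance (n - 1 denominator)
    v2 = statistics.variance(baseline)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(ablated) - statistics.mean(baseline)) / pooled_sd


def strength(d):
    """Bucket an effect using the thresholds named in the Figure 1 caption."""
    magnitude = abs(d)
    if magnitude > 2.0:
        return "very strong"
    if magnitude > 1.0:
        return "strong"
    return "below strong threshold"


# Hypothetical downstream-feature activations over five cells.
baseline = [1.0, 1.2, 0.9, 1.1, 1.3]   # upstream feature intact
ablated = [0.6, 0.9, 0.5, 0.8, 0.7]    # upstream feature zeroed out

d = cohens_d(ablated, baseline)
print(f"d = {d:.2f} ({strength(d)})")
```

In a real pipeline the two lists would be replaced by activations collected from paired forward passes over the same cells, one with the upstream SAE feature intact and one with it ablated.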

Paper Structure (57 sections, 13 figures, 10 tables)

Introduction
Results
Causal circuit tracing reveals dense, predominantly inhibitory computational graphs
Hub features and convergent integration layers
Circuits encode interpretable biological cascades
DNA damage response cascades
Causal effects persist across the full network depth
Statistical co-activation validates causal edges
Multi-tissue SAEs produce more biologically coherent circuits
Biological coherence depends on the SAE lens, not the input cells
scGPT circuit tracing reveals a fundamentally different computational architecture
Stronger individual effects, more balanced dynamics
Biological coherence converges across architectures
Energy metabolism as organizing hub
Interpretable stress response and protein quality control circuits
...and 42 more sections

Figures (13)

Figure 1: Causal effect size comparison across four experimental conditions. (A) Mean and median Cohen's $|d|$ for causal edges. scGPT produces the strongest individual effects ($|d|{=}1.40$), while Tabula Sapiens cells through Geneformer produce the weakest ($|d|{=}0.72$). The dashed line marks $|d|{=}1.0$ (strong effect threshold). (B) Fraction of edges exceeding strong ($|d|{>}1.0$) and very strong ($|d|{>}2.0$) thresholds. scGPT leads with 65.2% strong edges. (C) Inhibitory/excitatory balance. All conditions are predominantly inhibitory (65--89%), with Tabula Sapiens cells producing the most inhibitory circuits (89.4%) and scGPT the most balanced (65.5%).
Figure 2: DNA damage response cascade in Geneformer multi-tissue circuits. A biologically coherent multi-layer circuit progresses from DNA damage detection (L0) through checkpoint activation (L5) to cell cycle arrest (L11, L17). The L5 Centromere Assembly feature provides a parallel pathway through kinetochore organization. The L0 source directly connects to L17 targets ($d = -2.30$), creating a "skip connection" alongside the sequential cascade. Shared biological annotations (text on edges) confirm that each connection reflects established molecular pathways.
Figure 3: Geneformer: Mitotic progression circuit from DNA replication to cytokinesis. Centromere assembly drives spindle checkpoint and G2/M transition across layers 0--15, terminating with cytokinesis feeding back to spindle checkpoint. See Supplementary Note 1 for full description.
Figure 4: Geneformer: Nervous System Development as a computational hub. L0 feature F146 drives 7 targets across layers 1--13 spanning proteostasis, cellular transport, and immune signaling (128--142 shared terms per edge). See Supplementary Note 2.
Figure 5: Geneformer K562/K562 biological circuit architecture. 16 features across layers 0--15 connected by 10 significant causal edges. Two main circuit families emerge: (i) the neurodevelopment$\to$proteostasis pathway (L0 Nervous System Development driving L1 Endosome Organization, L2 Proteasomal Catabolism, and L6 Protein Catabolism), and (ii) the DNA damage$\to$mitotic apparatus cascade (L0 DNA Repair driving L6 Kinetochore at $d = -3.47$, then L5 DNA Metabolic driving L11 Kinetochore at $d = -2.39$).
...and 8 more figures
