MechPert: Mechanistic Consensus as an Inductive Bias for Unseen Perturbation Prediction

Marc Boubnovski Martell, Josefa Lia Stoisser, Lawrence Phillips, Aditya Misra, Robert Kitchen, Jesper Ferkinghoff-Borg, Jialin Yu, Philip Torr, Kaspar Märten

TL;DR

MechPert is a lightweight framework that encourages LLM agents to generate directed regulatory hypotheses rather than relying solely on functional similarity; the anchor genes it selects outperform standard network centrality heuristics in well-characterized cell lines.

Abstract

Predicting transcriptional responses to unseen genetic perturbations is essential for understanding gene regulation and prioritizing large-scale perturbation experiments. Existing approaches either rely on static, potentially incomplete knowledge graphs, or prompt language models for functionally similar genes, retrieving associations shaped by symmetric co-occurrence in scientific text rather than directed regulatory logic. We introduce MechPert, a lightweight framework that encourages LLM agents to generate directed regulatory hypotheses rather than relying solely on functional similarity. Multiple agents independently propose candidate regulators with associated confidence scores; these are aggregated through a consensus mechanism that filters spurious associations, producing weighted neighborhoods for downstream prediction. We evaluate MechPert on Perturb-seq benchmarks across four human cell lines. For perturbation prediction in low-data regimes ($N=50$ observed perturbations), MechPert improves Pearson correlation by up to 10.5\% over similarity-based baselines. For experimental design, MechPert-selected anchor genes outperform standard network centrality heuristics by up to 46\% in well-characterized cell lines.
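The consensus step described above can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so everything here is an assumption: the function names, the `min_support` vote threshold, the mean-confidence scoring, and the weighted-average predictor are hypothetical stand-ins, and the gene names and numbers are toy values rather than results from the paper.

```python
from collections import defaultdict


def consensus_neighborhood(agent_proposals, min_support=2):
    """Aggregate per-agent {gene: confidence} proposals into normalized weights.

    Assumed rule (not from the paper): keep a gene only if at least
    `min_support` agents proposed it, and score it by its mean confidence
    over all agents, counting a missing proposal as 0.
    """
    support = defaultdict(int)
    total = defaultdict(float)
    for proposal in agent_proposals:
        for gene, conf in proposal.items():
            support[gene] += 1
            total[gene] += conf
    n_agents = len(agent_proposals)
    scores = {g: total[g] / n_agents for g in total if support[g] >= min_support}
    norm = sum(scores.values())
    return {g: s / norm for g, s in scores.items()} if norm > 0 else {}


def predict_unseen(weights, observed):
    """Predict an unseen perturbation's profile as a weighted average of the
    observed profiles of its consensus neighbors (hypothetical predictor)."""
    usable = {g: w for g, w in weights.items() if g in observed}
    norm = sum(usable.values())
    if norm == 0:
        return None  # no neighbor has been observed yet
    dim = len(next(iter(observed.values())))
    pred = [0.0] * dim
    for gene, w in usable.items():
        for i, value in enumerate(observed[gene]):
            pred[i] += (w / norm) * value
    return pred


# Toy example: three agents propose regulators for one unseen target gene.
proposals = [
    {"GATA1": 0.9, "TAL1": 0.6},
    {"GATA1": 0.8, "MYC": 0.3},
    {"GATA1": 0.7, "TAL1": 0.5},
]
weights = consensus_neighborhood(proposals)  # MYC has one vote and is dropped
observed = {"GATA1": [1.0, 0.0], "TAL1": [0.0, 1.0]}
prediction = predict_unseen(weights, observed)
```

In this sketch a candidate proposed by a single agent is treated as spurious and filtered out, which is one simple way to read "consensus"; the surviving confidences are renormalized so the neighborhood weights sum to one.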

Paper Structure (30 sections, 4 equations, 3 figures, 5 tables)


Figures (3)

  • Figure 1: MechPert Framework. Problem setup for predicting gene expression vectors of unseen perturbations. MechPert pipeline: LLMs identify functional similarity and GRN regulators to weight and aggregate training data for few-shot prediction ($\hat{y}_X$).
  • Figure 2: Active experiment design. Iterative hub selection to maximize biological pathway diversity in the final anchor set.
  • Figure 3: MechPert improves low-data generalization and experimental design. (a) In Jurkat T-cells, our Causal-Consensus model (red) beats the semantic baseline (blue) by +15.5% at N=50, showing causal priors help when data are scarce. (b) In K562, choosing N=50 experimental anchors via geometrically adjudicated consensus gives a +46% gain over PPI-degree heuristics, indicating agentic reasoning finds effective regulatory hubs.
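
The iterative hub selection summarized in the Figure 2 caption can be pictured as a greedy coverage procedure. The caption does not give the actual algorithm, so the sketch below is an assumption: `select_anchors`, the pathway annotations, and the gene labels `HUB1` to `HUB3` are hypothetical, and plain greedy set coverage stands in for whatever diversity criterion the paper uses.

```python
def select_anchors(candidate_pathways, budget):
    """Greedily pick up to `budget` anchor genes.

    At each step, add the candidate that covers the most pathways not yet
    covered by the current anchor set (ties broken alphabetically).
    This is an illustrative diversity heuristic, not the paper's method.
    """
    covered = set()
    anchors = []
    remaining = dict(candidate_pathways)
    while remaining and len(anchors) < budget:
        # sorted() makes tie-breaking deterministic; max() keeps the first best.
        best = max(sorted(remaining), key=lambda g: len(remaining[g] - covered))
        anchors.append(best)
        covered |= remaining.pop(best)
    return anchors, covered


# Toy example with made-up pathway annotations.
candidates = {
    "HUB1": {"cell_cycle", "apoptosis"},
    "HUB2": {"cell_cycle"},
    "HUB3": {"translation"},
}
anchors, covered = select_anchors(candidates, budget=2)
```

Here `HUB2` is skipped in the second round because its only pathway is already covered by `HUB1`, so the budget goes to a gene that adds a new pathway instead.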