Progressive Multi-Agent Reasoning for Biological Perturbation Prediction
Hyomin Kim, Sang-Yeon Hwang, Jaechang Lim, Yinhua Piao, Yunhak Oh, Woo Youn Kim, Chanyoung Park, Sungsoo Ahn, Junhyeok Jeon
TL;DR
PBio-Agent is a multi-agent, progressive reasoning framework for predicting transcriptional responses to chemical perturbations in bulk-cell data. It is paired with the LINCSQA benchmark, which uses consensus signatures as ground truth and tests biological validity across pharmacologically sensitive and insensitive contexts. The approach uses specialized agents and knowledge graphs to decompose complex causal reasoning, and applies difficulty-aware task ordering and iterative verification to improve robustness. With an 8B model, it achieves state-of-the-art results on LINCS-based perturbation datasets and PerturbQA. By combining structured biological knowledge with agentic inference, the work suggests a scalable path toward mechanistic predictions of transcriptional responses to drugs.
Abstract
Predicting gene regulation responses to biological perturbations requires reasoning about underlying biological causalities. While large language models (LLMs) show promise for such tasks, they are often overwhelmed by the entangled nature of high-dimensional perturbation results. Moreover, recent works have primarily focused on genetic perturbations in single-cell experiments, leaving bulk-cell chemical perturbations, which are central to drug discovery, largely unexplored. Motivated by this gap, we present LINCSQA, a novel benchmark for predicting target gene regulation under complex chemical perturbations in bulk-cell environments. We further propose PBio-Agent, a multi-agent framework that integrates difficulty-aware task sequencing with iterative knowledge refinement. Our key insight is that genes affected by the same perturbation share causal structure, allowing confidently predicted genes to contextualize more challenging cases. The framework employs specialized agents enriched with biological knowledge graphs, while a synthesis agent integrates their outputs and specialized judges ensure logical coherence. PBio-Agent outperforms existing baselines on both LINCSQA and PerturbQA, enabling even smaller models to predict and explain complex biological processes without additional training.
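
The difficulty-aware sequencing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-gene difficulty scores, the confidence threshold of 0.8, the gene identifiers, and the toy predictor standing in for an LLM agent are all assumptions made for illustration. The sketch only shows the ordering principle stated in the abstract, namely that genes are handled from easiest to hardest and that confident predictions are passed on as context for later genes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a prediction is (label, confidence in [0, 1]).
Prediction = Tuple[str, float]
Predictor = Callable[[str, Dict[str, str]], Prediction]


def progressive_predict(
    genes: List[str],
    difficulty: Dict[str, float],
    predict: Predictor,
    confidence_threshold: float = 0.8,  # assumed value, not from the paper
) -> Dict[str, str]:
    """Predict genes from easiest to hardest, reusing confident results as context."""
    context: Dict[str, str] = {}  # confident predictions shared with later genes
    results: Dict[str, str] = {}
    for gene in sorted(genes, key=lambda g: difficulty[g]):
        # Pass a copy so the predictor cannot mutate the shared context.
        label, confidence = predict(gene, dict(context))
        results[gene] = label
        if confidence >= confidence_threshold:
            context[gene] = label
    return results


def toy_predict(gene: str, context: Dict[str, str]) -> Prediction:
    """Stand-in for an agent call; a real system would query an LLM here."""
    if gene == "GENE_A":  # hypothetical gene the toy model is sure about
        return ("up", 0.9)
    if context:
        # Borrow the direction of an already-confident gene.
        return (next(iter(context.values())), 0.6)
    return ("unknown", 0.3)


easy_first = progressive_predict(
    ["GENE_B", "GENE_A"], {"GENE_A": 0.1, "GENE_B": 0.7}, toy_predict
)
hard_first = progressive_predict(
    ["GENE_B", "GENE_A"], {"GENE_A": 0.7, "GENE_B": 0.1}, toy_predict
)
```

In this toy setup, processing the confident gene first lets the harder gene borrow its direction, whereas the reversed order leaves the harder gene without any context. The iterative verification and judging steps described in the abstract are omitted here.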
