Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information
Yulun Wu, Robert A. Barton, Zichen Wang, Vassilis N. Ioannidis, Carlo De Donno, Layne C. Price, Luis F. Voloch, George Karypis
TL;DR
The paper tackles the prediction of cellular responses to perturbations, a key challenge for drug discovery and personalized therapeutics. It introduces GraphVCI, a graph-structured variational causal model that embeds gene regulatory networks (GRNs) into the latent space to model counterfactual gene expressions. Core contributions include a variational objective for counterfactuals, a relational-information mechanism with graph attention, a scalable GRN refinement procedure that improves edge relevance and model performance, and a robust, asymptotically efficient estimator for population-level (marginal) perturbation effects. On three benchmark datasets, GraphVCI achieves state-of-the-art out-of-distribution predictions. Ablations confirm the value of the refined relational information and show that the learned edges align with known biology, supporting more accurate and interpretable single-cell perturbation predictions.
Abstract
Predicting the responses of a cell under perturbations may bring important benefits to drug discovery and personalized therapeutics. In this work, we propose a novel graph variational Bayesian causal inference framework to predict a cell's gene expressions under counterfactual perturbations (perturbations that this cell did not factually receive), leveraging biological knowledge in the form of gene regulatory networks (GRNs) to aid individualized cellular response predictions. Aiming at a data-adaptive GRN, we also develop an adjacency matrix updating technique for graph convolutional networks and use it to refine GRNs during pre-training, which yields more insight into gene relations and enhances model performance. Additionally, we propose a robust estimator within our framework for the asymptotically efficient estimation of the marginal perturbation effect, which has not been carried out in previous work. Through extensive experiments, we demonstrate the advantage of our approach over state-of-the-art deep learning models for individual response prediction.
