Towards generalizable single-cell perturbation modeling via the Conditional Monge Gap
Alice Driessen, Benedek Harsanyi, Marianna Rapsomaniki, Jannis Born
TL;DR
This work introduces the Conditional Monge Gap (CMonge), a single neural optimal transport map conditioned on covariates that predicts single-cell perturbation responses across seen and unseen drugs and dosages. Because one map is learned jointly over many conditions, the method benefits from cross-task learning and generalizes out of distribution (OOD): it matches and sometimes exceeds condition-specific state-of-the-art models, and it outperforms other conditional approaches on scRNA-seq and multiplexed imaging data. The approach uses a two-stage architecture (latent gene embeddings followed by a context-conditioned transport map) and explores conditioning on mode-of-action (MoA) embeddings or RDKit fingerprints, performing well on both in-distribution and OOD perturbations, including unseen drugs and combinations. The results suggest that conditional OT can bridge structure-based drug representations and observed perturbation effects at scale, offering a promising path toward predicting responses to unseen treatments in single-cell data.
Abstract
Learning the response of single cells to various treatments offers great potential to enable targeted therapies. In this context, neural optimal transport (OT) has emerged as a principled methodological framework because it inherently accommodates the challenges of unpaired data induced by cell destruction during data acquisition. However, most existing OT approaches cannot condition on different treatment contexts (e.g., time, drug treatment, drug dosage, or cell type), and we still lack methods that consistently generalize well to unseen treatments. Here, we propose the Conditional Monge Gap, which learns OT maps conditionally on arbitrary covariates. We demonstrate its value in predicting single-cell perturbation responses conditional on one or multiple drugs, a drug dosage, or combinations thereof. We find that our conditional models achieve results comparable and sometimes even superior to the condition-specific state-of-the-art on scRNA-seq as well as multiplexed protein imaging data. Notably, by aggregating data across conditions, we perform cross-task learning, which unlocks remarkable generalization abilities to unseen drugs or drug dosages, widely outperforming other conditional models in capturing heterogeneity (i.e., higher moments) in the perturbed population. Finally, by scaling to hundreds of conditions and testing on unseen drugs, we narrow the gap between structure-based and effect-based drug representations, suggesting a promising path to the successful prediction of perturbation effects for unseen treatments.
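For orientation, the regularizer that gives the method its name can be written out as below. This is a sketch based on the standard definition of the Monge gap for a map $T$, a reference measure $\rho$, and a ground cost $c$. The symbols ($\rho$, $c$, $\mu$, $\nu_k$, $T_\theta$, $\Delta$, $\lambda$) and, in particular, the form of the conditional objective in the second equation are illustrative assumptions, not notation taken from the abstract.

```latex
% Monge gap: average displacement cost of T minus the optimal transport cost
% between \rho and its push-forward T_\sharp\rho. It is non-negative and
% vanishes exactly when T is a c-optimal map between \rho and T_\sharp\rho.
\begin{align}
  \mathcal{M}^{c}_{\rho}(T)
    &= \int c\bigl(x, T(x)\bigr)\,\mathrm{d}\rho(x)
       \;-\; W_c\bigl(\rho,\, T_{\sharp}\rho\bigr) \;\ge\; 0, \\[4pt]
% Assumed conditional objective: one shared map T_\theta(\cdot, k) for all
% conditions k (e.g. drug, dosage), with control cells \mu, perturbed cells
% \nu_k, a fitting term \Delta (e.g. a Sinkhorn divergence), and weight \lambda.
  \min_{\theta}\; \sum_{k}
    &\Bigl[\, \Delta\bigl(T_\theta(\cdot, k)_{\sharp}\mu,\; \nu_k\bigr)
       \;+\; \lambda\, \mathcal{M}^{c}_{\mu}\bigl(T_\theta(\cdot, k)\bigr) \Bigr].
\end{align}
```

Under these assumptions, the fitting term pulls the mapped control cells toward the observed perturbed population for each condition, while the Monge gap term penalizes maps that move cells further than an optimal transport plan would.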
