Season combinatorial intervention predictions with Salt & Peper

Thomas Gaudelet, Alice Del Vecchio, Eli M Carrami, Juliana Cudini, Chantriolnt-Andreas Kapourani, Caroline Uhler, Lindsay Edwards

TL;DR

This work tackles the challenge of predicting the effects of pairwise genetic interventions on cellular transcriptomes, where the combinatorial space of possible pairs is vast. It introduces Salt, a simple additive baseline, and Peper, a neural correction built on Salt, and benchmarks them against state-of-the-art methods on two CRISPR perturbation datasets. The results show that Peper achieves state-of-the-art distributional accuracy, while Salt provides a strong, biologically motivated baseline; however, all models show notable performance drops in out-of-distribution scenarios, underscoring limitations in generalisation and data coverage. The study highlights the need for improved priors, better data acquisition strategies, and methods that can either leverage or reject prior knowledge in order to explore combinatorial genetic interventions more effectively.

Abstract

Interventions play a pivotal role in the study of complex biological systems. In drug discovery, genetic interventions (such as CRISPR base editing) have become central to both identifying potential therapeutic targets and understanding a drug's mechanism of action. With the advancement of CRISPR and the proliferation of genome-scale analyses such as transcriptomics, a new challenge is to navigate the vast combinatorial space of concurrent genetic interventions. Addressing this, our work concentrates on estimating the effects of pairwise genetic combinations on the cellular transcriptome. We introduce two novel contributions: Salt, a biologically-inspired baseline that posits the mostly additive nature of combination effects, and Peper, a deep learning model that extends Salt's additive assumption to achieve unprecedented accuracy. Our comprehensive comparison against existing state-of-the-art methods, grounded in diverse metrics, and our out-of-distribution analysis highlight the limitations of current models in realistic settings. This analysis underscores the necessity for improved modelling techniques and data acquisition strategies, paving the way for more effective exploration of genetic intervention effects.
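
To make the additive assumption behind Salt concrete, here is a minimal sketch of one common way to formalise it: the predicted expression under the combined intervention on genes p and q is the control expression plus the two single-intervention shifts. This is an illustrative assumption rather than the authors' implementation; the function name `additive_baseline`, its arguments, and the toy numbers are hypothetical, and the handling of unseen single interventions is not covered.

```python
from typing import List, Sequence


def additive_baseline(
    control: Sequence[float],
    single_p: Sequence[float],
    single_q: Sequence[float],
) -> List[float]:
    """Predict per-gene expression for the combination p+q.

    Assumption (not taken from the paper's code): each single
    intervention shifts expression relative to control, and the
    combination applies both shifts additively.
    """
    if not (len(control) == len(single_p) == len(single_q)):
        raise ValueError("all expression profiles must cover the same genes")
    # control + (single_p - control) + (single_q - control), gene by gene
    return [c + (p - c) + (q - c) for c, p, q in zip(control, single_p, single_q)]


# Toy mean expression profiles over three genes (made-up numbers).
control = [1.0, 2.0, 3.0]
single_p = [1.5, 2.0, 2.0]   # observed profile when only p is perturbed
single_q = [1.0, 3.0, 3.5]   # observed profile when only q is perturbed

predicted_pq = additive_baseline(control, single_p, single_q)
print(predicted_pq)
```

Under this reading, a model such as Peper would add a learned, non-additive correction on top of `predicted_pq`; the form of that correction is not specified in the abstract and is left out of the sketch.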

Paper Structure

This paper contains 32 sections, 6 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: Illustration of Salt and Peper predictions for the combinations of interventions on genes $p$ and $q$.
  • Figure 2: Score breakdown of GEARS, CPA, Salt, and Peper (lower is better) by the intervention subgroups identified by Norman et al. (2019). The definition of each label can be found in the glossary appendix. Performance varies somewhat with interaction type, and the potentiation interaction type stands out in particular.
  • Figure 3: UMAP visualisation of pseudo-bulked data from the norman-dataset, showing the clustering of interventions, their types, and whether they are in train, validation, or test sets in the different splits we use for the analysis. The last panel shows the interaction labels that Norman et al. (2019) assign to combinations.
  • Figure 4: UMAP visualisation of pseudo-bulked data from the wessels-dataset, showing the clustering of interventions, their types, and whether they are in train, validation, or test sets in the different splits we use for the analysis.