Modeling variable guide efficiency in pooled CRISPR screens with ContrastiveVI+

Ethan Weinberger, Ryan Conrad, Tal Ashuach

TL;DR

Applied to three large-scale Perturb-seq datasets, ContrastiveVI+ recovers known perturbation-induced variations better than previous methods and successfully identifies cells that escaped the functional consequences of guide RNA expression.

Abstract

Genetic screens mediated via CRISPR-Cas9 combined with high-content readouts have emerged as powerful tools for biological discovery. However, computational analyses of these screens come with additional challenges beyond those found with standard scRNA-seq analyses. For example, perturbation-induced variations of interest may be subtle and masked by other dominant sources of variation shared with controls, and variable guide efficiency results in some cells not undergoing genetic perturbation despite expressing a guide RNA. While a number of methods have been developed to address the former problem by explicitly disentangling perturbation-induced variations from those shared with controls, less attention has been paid to the latter problem of noisy perturbation labels. To address this issue, here we propose ContrastiveVI+, a generative modeling framework that both disentangles perturbation-induced from non-perturbation-related variations and infers whether cells truly underwent genomic edits. Applied to three large-scale Perturb-seq datasets, we find that ContrastiveVI+ better recovers known perturbation-induced variations compared to previous methods while successfully identifying cells that escaped the functional consequences of guide RNA expression. An open-source implementation of our model is available at https://github.com/insitro/contrastive_vi_plus.
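
To make the idea of noisy perturbation labels concrete, the following toy simulation sketches one plausible reading of the setup described in the abstract: every cell carries shared (background) variation, while perturbation-specific (salient) variation is present only in cells whose edit actually took effect. It is not the authors' model. The function name `simulate_cells`, the linear decoder, the Poisson likelihood, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_cells(n_cells, has_target_guide, p_perturbed=0.7,
                   n_background=4, n_salient=2, n_genes=10, seed=0):
    """Toy generative sketch (hypothetical, not the published model)."""
    rng = np.random.default_rng(seed)
    # Hypothetical linear "decoder" weights mapping latents to genes.
    w_bg = rng.normal(size=(n_background, n_genes))
    w_sal = rng.normal(size=(n_salient, n_genes))
    # Background latent: variation shared by control and perturbed cells.
    z = rng.normal(size=(n_cells, n_background))
    # Indicator of whether the edit took effect. Cells with control
    # guides are never perturbed; cells with targeting guides may escape.
    if has_target_guide:
        y = rng.binomial(1, p_perturbed, size=n_cells)
    else:
        y = np.zeros(n_cells, dtype=int)
    # Salient latent: zeroed out for control and escaping cells.
    t = rng.normal(size=(n_cells, n_salient)) * y[:, None]
    # Assumed Poisson counts with a log-linear rate.
    rate = np.exp(0.3 * (z @ w_bg + t @ w_sal))
    counts = rng.poisson(rate)
    return counts, y, t
```

In this sketch, a cell with a targeting guide and `y == 0` is generated exactly like a control cell, which is why the guide label alone is a noisy proxy for perturbation status.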

Paper Structure

This paper contains 18 sections, 25 equations, 5 figures.

Figures (5)

  • Figure 1: Graphical representation of the ContrastiveVI+ generative process for cells with non-control guides (left) and control guides (right).
  • Figure 2: a-b, UMAP visualizations of PCA applied to data from Papalexi et al. (2021) colored by cell cycle (a) and replicate (b). c-d, UMAP visualizations of ContrastiveVI+'s salient latent space colored by cell cycle (c) and replicate (d). e-f, Entropy of mixing for the representations of ContrastiveVI+ and baseline methods with respect to cell cycle phase (e) and replicate identity (f).
  • Figure 3: a-b, UMAP plots of ContrastiveVI and ContrastiveVI+'s salient latent representations colored by gene target. c, UMAP plot of ContrastiveVI+'s salient latent representations colored by inferred probability of perturbation. d, PDL1 protein expression for gene perturbations highlighted in ContrastiveVI+'s salient latent space compared to control cells. e-f, Subset of transcriptomic changes for STAT2- and SMAD4-perturbed cells identified by ContrastiveVI+.
  • Figure 4: a-c, UMAP plots of ContrastiveVI+'s salient latent space for data from Replogle et al. (2022) colored by cell cycle phase (a), pathway annotations (b), and inferred probability of perturbation (c). d, Quantitative assessments of invariance with respect to cell cycle phase (entropy of cell cycle phase mixing) and of how well known perturbation-induced variations are captured (pathway ARI) for the salient representations of ContrastiveVI+ and baseline methods. e, Change in MMD between cells labeled with gRNAs targeting a given gene and cells with non-targeting control guides after filtering with ContrastiveVI+ ($y$-axis) or Mixscape ($x$-axis), relative to the MMD without filtering. f, MMD between cells labeled as escaping by ContrastiveVI+ ($y$-axis) or Mixscape ($x$-axis) and control cells.
  • Figure 5: a-c, UMAP visualizations of ContrastiveVI+'s salient latent space for data from Norman et al. (2019) colored by cell cycle phase (a), gene program labels provided by Norman et al. (2019) (b), and inferred probability of perturbation (c). d, Quantitative assessments of the salient representations of ContrastiveVI+ and baseline methods. e-h, ContrastiveVI+'s salient latent representations for cells with perturbations labeled as "granulocyte/apoptosis" by Norman et al. (2019). Plots colored by perturbation labels (e), canonical granulocyte marker genes (f), the pro-apoptotic anti-proliferation factors BTG1 and BTG2 (g), and the anti-apoptotic genes BIRC5 (survivin) and YBX1 (h).
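
Figure 4e-f compare cell populations using the maximum mean discrepancy (MMD). As a rough illustration of that metric, the sketch below computes a biased estimate of the squared MMD between two sets of cell embeddings. The RBF kernel, the default bandwidth of `1 / n_features`, and the function names `rbf_kernel` and `mmd2` are assumptions; the kernel and estimator actually used for the figure are not stated in this summary.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(x, y, gamma=None):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    if gamma is None:
        gamma = 1.0 / x.shape[1]  # assumed default bandwidth
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Usage: hypothetical embeddings for control and guide-labeled cells.
rng = np.random.default_rng(0)
control = rng.normal(size=(200, 5))
perturbed = rng.normal(size=(200, 5)) + 2.0   # shifted population
escaping = rng.normal(size=(200, 5))          # indistinguishable from control
print(mmd2(perturbed, control), mmd2(escaping, control))
```

Under this reading, removing escaping cells from a guide-labeled population should increase its MMD to controls, while the escaping cells themselves should have an MMD to controls that is close to zero.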