Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images

Heming Yao; Phil Hanslovsky; Jan-Christian Huetter; Burkhard Hoeckendorf; David Richmond

TL;DR

This work addresses the challenge of learning robust, biologically meaningful representations from noisy single-cell images in Optical Pooled Screening (OPS) by introducing Set-DINO, a weakly supervised set-consistency learning framework. Set-DINO combines self-supervised learning with weak supervision derived from replicates of the same perturbation across batches, and uses set-level aggregation to stabilize training and mitigate batch confounders, yielding more biologically relevant embeddings. On a large OPS dataset with more than 5000 perturbations, Set-DINO achieves higher reproducibility, reduced batch effects, and higher recall of known gene–gene relationships (e.g., those annotated in CORUM) than engineered features and standard DINO. The approach could improve drug target discovery campaigns and can extend to other single-cell datasets with weak labels; future work will explore richer set-aggregation methods.

Abstract

Optical Pooled Screening (OPS) is a powerful tool combining high-content microscopy with genetic engineering to investigate gene function in disease. The characterization of high-content images remains an active area of research and is currently undergoing rapid innovation through the application of self-supervised learning and vision transformers. In this study, we propose a set-level consistency learning algorithm, Set-DINO, that combines self-supervised learning with weak supervision to improve learned representations of perturbation effects in single-cell images. Our method leverages the replicate structure of OPS experiments (i.e., cells undergoing the same genetic perturbation, both within and across batches) as a form of weak supervision. We conduct extensive experiments on a large-scale OPS dataset with more than 5000 genetic perturbations, and demonstrate that Set-DINO helps mitigate the impact of confounders and encodes more biologically meaningful information. In particular, Set-DINO recalls known biological relationships with higher accuracy compared to commonly used methods for morphological profiling, suggesting that it can generate more reliable insights from drug target discovery campaigns leveraging OPS.
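
The set-level consistency idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: mean pooling as the aggregation step, a single linear projection, the temperature values, the EMA momentum, and the names `set_consistency_loss` and `ema_update` are all assumptions chosen for illustration. The sketch only computes the loss and the teacher update; an actual training loop would need an automatic differentiation framework.

```python
import numpy as np

def softmax(x, temperature):
    """Numerically stable softmax with a temperature."""
    z = x / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def set_consistency_loss(student_set, teacher_set, w_student, w_teacher,
                         t_student=0.1, t_teacher=0.04):
    """Cross-entropy between set-level predictions for two replicate sets.

    student_set, teacher_set: (n_cells, dim) single-cell embeddings of the
    same perturbation, e.g. drawn from two different batches.
    """
    # Aggregate each set into one consensus vector (mean pooling is an
    # assumption; the source does not specify the aggregation here).
    student_logits = student_set.mean(axis=0) @ w_student
    teacher_logits = teacher_set.mean(axis=0) @ w_teacher
    p_student = softmax(student_logits, t_student)
    # The teacher output acts as a fixed target (stop-gradient in training).
    p_teacher = softmax(teacher_logits, t_teacher)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum())

def ema_update(w_teacher, w_student, momentum=0.996):
    """Teacher weights track the student by exponential moving average."""
    return momentum * w_teacher + (1.0 - momentum) * w_student

# Toy data: two sets of 32 cells with 16-dimensional embeddings.
rng = np.random.default_rng(0)
set_a = rng.normal(size=(32, 16))  # perturbation X, batch A
set_b = rng.normal(size=(32, 16))  # perturbation X, batch B
w_s = rng.normal(size=(16, 8))     # hypothetical student projection
w_t = w_s.copy()                   # teacher starts as a copy of the student

loss = set_consistency_loss(set_a, set_b, w_s, w_t)
w_t = ema_update(w_t, w_s)
```

Because the loss compares consensus predictions rather than individual cells, noise in any single cell image has less influence on the training signal, which is the intuition behind using sets of replicate cells as weak supervision.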

Paper Structure (17 sections, 2 equations, 3 figures, 2 tables)

Introduction
Related Work
Deep learning for high-content images
Removing nuisance in morphological profiles
Modeling population heterogeneity
Methods
Dataset
Image preprocessing
Set-DINO framework
Implementation
Representation levels
Evaluation protocols
Results and Discussion
Set-DINO achieves superior performance compared to existing methods
Set-DINO encodes biologically meaningful information
...and 2 more sections

Figures (3)

Figure 1: Overview of the Set-DINO framework. The inputs are two sets of single-cell images undergoing the same perturbation in different batches. Each 4-channel image is processed individually by the Vision Transformer (ViT) to generate a set of embeddings. The projector consists of an aggregation layer, followed by three fully-connected layers. The resulting consensus embeddings from the student and teacher branches are used to calculate the cross-entropy loss to train the model. After the model is trained, the single-cell image embeddings from ViT are used as the cell-level morphological features. SG: stop-gradient, EMA: exponential moving average.
Figure 2: Evaluation of consensus gene profiles in identifying gene-gene relationships. (a) Precision-recall curves for predicting gene-gene relationships according to CORUM and curated CORUM. The curves are drawn by varying cutoff percentiles from top 1% to top 20%. (b) Distribution of cosine similarities between gene pairs. "Gene Sets" contains gene pairs that are involved in the same protein complex, while "Random" contains randomly sampled gene pairs. (c) The first row contains the full adjacency matrices of the ground truth graph (leftmost column) and prediction graphs (other columns). The second row provides a zoomed-in view of the adjacency matrix focused around the exosome. Although self-edges are included in the visualization, they are not considered when calculating biological recall and precision. All DINO and Set-DINO models shown in this figure were trained with NTC z-score normalization.
Figure 3: Performance analysis on an increasing number of principal components. Principal component analysis (PCA) is performed on batch-level gene profiles. Reproducibility and batch effect metrics on an increasing number of principal components are measured.
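
The evaluation summarized in the Figure 2 caption (predicting gene-gene relationships from cosine similarity at a chosen top-percentile cutoff, with self-edges excluded) can be sketched as follows. This is one plausible reading of the caption, not the paper's evaluation code: the function name `precision_recall_at_top`, the toy profiles, and the rule of keeping only the most similar pairs are assumptions.

```python
import numpy as np

def precision_recall_at_top(profiles, known_pairs, top_fraction):
    """Precision and recall of the most similar gene pairs.

    profiles: (n_genes, dim) consensus gene profiles.
    known_pairs: iterable of (i, j) index pairs with a known relationship.
    top_fraction: fraction of all gene pairs to predict, e.g. 0.01 to 0.20.
    """
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    similarity = unit @ unit.T
    # Upper triangle with k=1 skips the diagonal, so self-edges are excluded.
    rows, cols = np.triu_indices(len(profiles), k=1)
    scores = similarity[rows, cols]
    n_predicted = max(1, int(round(top_fraction * len(scores))))
    top = np.argsort(-scores)[:n_predicted]
    predicted = {(int(rows[k]), int(cols[k])) for k in top}
    truth = {tuple(sorted(pair)) for pair in known_pairs}
    hits = len(predicted & truth)
    return hits / len(predicted), hits / len(truth)

# Toy example: genes 0 and 1 point one way, genes 2 and 3 another.
toy_profiles = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_truth = [(0, 1), (2, 3)]
precision, recall = precision_recall_at_top(toy_profiles, toy_truth, 0.5)
```

Sweeping `top_fraction` over a range of cutoffs and collecting the resulting (recall, precision) pairs traces a curve of the kind the caption describes.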
