Self-supervised Representation Learning with Local Aggregation for Image-based Profiling
Siran Dai, Qianqian Xu, Peisong Wen, Yang Liu, Qingming Huang
TL;DR
This work tackles the challenge of learning generalizable cell-image representations with minimal labeling by introducing SSLProfiler, a non-contrastive self-supervised framework that uses ViT-based backbones and a local-aggregation branch to enforce cross-site consistency. It handles domain-specific issues with channel-aware augmentations, microscope-noise simulation, and elastic deformations, and further stabilizes representations with post-processing that merges multi-granularity and multi-position information and aligns features across plates. The approach is validated on the CPJUMP1 dataset, where ablations show that local aggregation, the augmented views, and post-processing each contribute to cross-site transferability, and it won the CVPR 2025 Cell Line Transferability Challenge. The findings underscore the importance of domain-tailored SSL design for dense, multi-channel biological images and offer practical steps toward robust, reagent-agnostic profiling for drug discovery and disease studies.
Abstract
Image-based cell profiling aims to create informative representations of cell images. This technique is critical in drug discovery and has advanced greatly with recent improvements in computer vision. Inspired by recent developments in non-contrastive Self-Supervised Learning (SSL), this paper provides an initial exploration into training a generalizable feature extractor for cell images using such methods. Two major challenges arise: 1) unlike typical scenarios where each representation is based on a single image, cell profiling often involves multiple input images, making it difficult to fuse all available information effectively; and 2) the distributions of cell images and natural images differ substantially, causing the view-generation process in existing SSL methods to fail. To address these issues, we propose a self-supervised framework with local aggregation to improve the cross-site consistency of cell representations. We also introduce data augmentation and representation post-processing methods tailored to cell images, which address both challenges and yield a robust feature extractor. With these improvements, the proposed framework won the Cell Line Transferability Challenge at CVPR 2025.
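To make the first challenge concrete, the sketch below shows one generic way to fuse several site-level embeddings into a single well-level profile and then align profiles plate by plate. It is an illustration only: the function names `aggregate_sites` and `align_per_plate`, the use of mean pooling, and per-plate standardization are assumptions chosen for simplicity, not the aggregation or post-processing procedure described in the paper.

```python
import numpy as np


def aggregate_sites(site_embeddings: np.ndarray) -> np.ndarray:
    """Fuse site-level embeddings of shape (n_sites, dim) into one well-level vector.

    Assumption: plain mean pooling followed by L2 normalization.
    """
    pooled = site_embeddings.mean(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


def align_per_plate(profiles: np.ndarray, plate_ids: np.ndarray,
                    eps: float = 1e-8) -> np.ndarray:
    """Standardize each feature within each plate to reduce plate-level shifts.

    Assumption: per-plate z-scoring as a stand-in for cross-plate alignment.
    """
    aligned = np.empty_like(profiles, dtype=np.float64)
    for plate in np.unique(plate_ids):
        mask = plate_ids == plate
        mu = profiles[mask].mean(axis=0)
        sigma = profiles[mask].std(axis=0)
        aligned[mask] = (profiles[mask] - mu) / (sigma + eps)
    return aligned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 9 imaging sites per well, 16-dimensional embeddings.
    well_profile = aggregate_sites(rng.normal(size=(9, 16)))
    print(well_profile.shape)
```

In this sketch, mean pooling treats every site equally; a learned aggregation could weight sites differently, which is the kind of design decision the framework's local-aggregation branch is meant to address.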
