The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions

Jun Ma, Ronald Xie, Shamini Ayyadhury, Cheng Ge, Anubha Gupta, Ritu Gupta, Song Gu, Yao Zhang, Gihun Lee, Joonkee Kim, Wei Lou, Haofeng Li, Eric Upschulte, Timo Dickscheid, José Guilherme de Almeida, Yixin Wang, Lin Han, Xin Yang, Marco Labagnara, Vojislav Gligorovski, Maxime Scheder, Sahand Jamal Rahi, Carly Kempster, Alice Pollitt, Leon Espinosa, Tâm Mignot, Jan Moritz Middeke, Jan-Niklas Eckardt, Wangkai Li, Zhaoyang Li, Xiaochen Cai, Bizhe Bai, Noah F. Greenwald, David Van Valen, Erin Weisbart, Beth A. Cimini, Trevor Cheung, Oscar Brück, Gary D. Bader, Bo Wang

TL;DR

A Transformer-based deep-learning algorithm is developed that not only exceeds existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments.

Abstract

Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep-learning algorithm that not only exceeds existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging.

Paper Structure (20 sections, 1 equation, 4 figures, 1 table)

Figures (4)

  • Figure 1: Overview of the challenge task and pipeline. a, The challenge aims to facilitate the development of universal cell segmentation algorithms that can segment a wide range of microscopy images without manual intervention. b, The challenge contains two phases. During the development phase, participants develop automatic segmentation algorithms based on 1000 labeled images and 1725 unlabeled images. The algorithms can be evaluated on a tuning set with 101 images, and the online evaluation platform automatically returns the quantitative performance. During the testing phase, each team can submit one algorithm, packaged as a Docker container, as the final solution, which is independently evaluated on the holdout testing set with 422 images to obtain ranking results.
  • Figure 2: Dataset overview. a, The challenge provides a diverse microscopy image dataset that includes tissue cells, cultured cells, label-free cells, stained cells, and different microscopy modalities (i.e., brightfield, fluorescent, phase-contrast (PC), and differential interference contrast (DIC)). b, The geographical distribution of data sources and challenge participants. The red, green, purple, and blue location icons denote the countries or regions where the brightfield, fluorescent, phase-contrast, and differential interference contrast image datasets are from, respectively. The size of the pink circle in each country is proportional to the number of participants from that country. c, The number of images in the training set. d, The number of labeled cells in the training set. e, Randomly selected examples (from left to right: brightfield, fluorescent, PC, and DIC images) from the training set (the 1st row) and testing set (the 2nd row). f, The number of images in the testing set. g, The number of cells in the testing set. There are two fluorescent whole-slide images (WSI) in the testing set.
  • Figure 3: Evaluation results of 28 algorithms on the holdout testing set. a, Dot and box plot of the F1 scores on the testing set (n=422 independent images). The box plots display descriptive statistics across all testing cases, with the median value represented by the horizontal line within the box, the lower and upper quartiles delineating the borders of the box, and the vertical black lines indicating 1.5 times the interquartile range. b, The top algorithms achieve a good trade-off between segmentation accuracy (y-axis) and efficiency (x-axis). The circle size is proportional to GPU memory consumption. c, Pairwise significance test results (one-sided Wilcoxon signed rank test) show that the winning algorithm is significantly better than the other algorithms. d, Blob plot for visualizing ranking stability based on bootstrap sampling. The median area of each blob is proportional to the relative frequency of achieved ranks across 1000 bootstrap samples. The median rank for each algorithm is indicated by a black cross. 95% bootstrap intervals across bootstrap samples are indicated by black lines. e, The winning algorithm holds the first place across five different ranking schemes. f, The high Kendall's tau scores indicate that the ranking results are stable. The violin plot shows descriptive statistics with the median value represented by the horizontal solid line within the box, the mean value represented by the horizontal dashed line, the lower and upper quartiles delineating the borders of the box, and the vertical black lines indicating 1.5 times the interquartile range.
  • Figure 4: Quantitative and qualitative comparison between the top three algorithms and state-of-the-art generalist cell segmentation algorithms: KIT-GE (top solution in the segmentation benchmark of the cell tracking challenge), Cellpose, Omnipose, and their variants under different training strategies. Dot and box plot of the F1 scores on the a, whole testing set (n=422 independent images); b, brightfield images (n=120 independent images); c, fluorescent images (n=122 independent images); d, phase-contrast images (n=120 independent images); e, DIC images (n=60 independent images). The box plots display descriptive statistics with the median value represented by the horizontal line within the box, the lower and upper quartiles delineating the borders of the box, and the vertical black lines indicating 1.5 times the interquartile range. f, Example segmentation results of the four microscopy image modalities: brightfield, fluorescent, phase-contrast, and DIC images (from top to bottom). g, Quantitative comparison on the post-challenge testing set (n=157). The box plot shows descriptive statistics across the post-challenge testing cases, with the median value represented by the horizontal line within the box, the lower and upper quartiles delineating the borders of the box, and the vertical black lines indicating 1.5 times the interquartile range. Cellpose-pretrain: Cellpose pretrained model ("cyto2"). Cellpose-scratch: Cellpose model trained from scratch on the challenge dataset. Cellpose-finetune: Cellpose model fine-tuned on the challenge dataset. Omnipose-pretrain: Omnipose pretrained model ("cyto2"). Omnipose-scratch: Omnipose model trained from scratch on the challenge dataset. Omnipose-finetune: Omnipose model fine-tuned on the challenge dataset.
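
The ranking-stability analysis described in the Figure 3 caption (bootstrap resampling of the testing images, then Kendall's tau between rankings) can be illustrated with a minimal sketch. It is not the challenge's actual evaluation code. It assumes that algorithms are ranked by mean per-image F1 score, which is only one possible scheme (the caption mentions five without specifying them), and it runs on synthetic scores. The function names `rank_algorithms`, `kendall_tau`, and `bootstrap_rank_stability` are hypothetical.

```python
import numpy as np


def rank_algorithms(scores):
    """Rank algorithms by mean score (rank 1 = best).

    scores has shape (n_images, n_algorithms). Ranking by the mean is an
    assumption made for this sketch, not necessarily the challenge's rule.
    """
    means = np.asarray(scores).mean(axis=0)
    order = np.argsort(-means, kind="stable")
    ranks = np.empty(len(means), dtype=int)
    ranks[order] = np.arange(1, len(means) + 1)
    return ranks


def kendall_tau(ranks_a, ranks_b):
    """Kendall's tau-a between two tie-free rankings of the same items."""
    n = len(ranks_a)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair
            total += np.sign(ranks_a[i] - ranks_a[j]) * np.sign(ranks_b[i] - ranks_b[j])
    return float(total) / (n * (n - 1) / 2)


def bootstrap_rank_stability(scores, n_bootstrap=1000, seed=0):
    """Resample images with replacement and re-rank the algorithms each time."""
    scores = np.asarray(scores)
    rng = np.random.default_rng(seed)
    n_images, n_algorithms = scores.shape
    full_ranks = rank_algorithms(scores)
    boot_ranks = np.empty((n_bootstrap, n_algorithms), dtype=int)
    taus = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        idx = rng.integers(0, n_images, size=n_images)
        boot_ranks[b] = rank_algorithms(scores[idx])
        taus[b] = kendall_tau(full_ranks, boot_ranks[b])
    return full_ranks, boot_ranks, taus


# Synthetic per-image F1 scores: 422 images, 4 hypothetical algorithms.
demo_rng = np.random.default_rng(42)
demo_scores = np.clip(
    demo_rng.normal(loc=[0.9, 0.7, 0.5, 0.3], scale=0.05, size=(422, 4)), 0.0, 1.0
)
full_ranks, boot_ranks, taus = bootstrap_rank_stability(demo_scores, n_bootstrap=200)
print("ranks on the full set:", full_ranks)
print("median bootstrap rank:", np.median(boot_ranks, axis=0))
print("mean Kendall's tau:", taus.mean())
```

Tau values close to 1 across bootstrap samples mean the ranking barely changes when the test images are resampled, which is how the caption's "stable" should be read. The per-algorithm columns of `boot_ranks` hold the rank frequencies that a blob plot would visualize.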