Glioma C6: A Novel Dataset for Training and Benchmarking Cell Segmentation

Roman Malashin, Svetlana Pashkevich, Daniil Ilyukhin, Arseniy Volkov, Valeria Yachnaya, Andrey Denisov, Maria Mikhalkova

TL;DR

The paper introduces Glioma C6, a specialized phase-contrast microscopy dataset of glioma C6 cells with soma annotations and two cell-type labels, designed for benchmarking and training instance segmentation models. The corpus contains 75 images and more than 12,000 annotated cells, split into spec and gen subsets. Generalist segmentation models perform poorly on this dataset without fine-tuning, while targeted fine-tuning yields robust performance across varied imaging conditions. The paper also analyzes annotation uncertainty, showing inherent ambiguity in dense, overlapping morphologies and noting that model predictions can sometimes align with expert consensus better than the original ground truth does. Overall, Glioma C6 provides a realistic, challenging benchmark for robust cell segmentation and phenotyping in dense tumor-like environments, and it supports model adaptation studies.

Abstract

We present Glioma C6, a new open dataset for instance segmentation of glioma C6 cells, designed as both a benchmark and a training resource for deep learning models. The dataset comprises 75 high-resolution phase-contrast microscopy images with over 12,000 annotated cells, providing a realistic testbed for biomedical image analysis. It includes soma annotations and morphological cell categorization provided by biologists. Additional categorization of cells, based on morphology, aims to enhance the utilization of image data for cancer cell research. Glioma C6 consists of two parts: the first is curated with controlled parameters for benchmarking, while the second supports generalization testing under varying conditions. We evaluate the performance of several generalist segmentation models, highlighting their limitations on our dataset. Our experiments demonstrate that training on Glioma C6 significantly enhances segmentation performance, reinforcing its value for developing robust and generalizable models. The dataset is publicly available for researchers.
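The abstract does not state which metric is used to compare models, so the following is only a minimal sketch of one common way to score instance segmentation: match predicted and ground-truth instances one-to-one by intersection over union (IoU) and report precision, recall, and F1. The 0.5 threshold, the greedy matching, the set-of-pixels mask representation, and the names `rect`, `iou`, and `match_instances` are all assumptions made for illustration, not the authors' evaluation protocol.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # assumption: one instance = a set of (row, col) pixels


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Hypothetical helper: a filled rectangle with half-open bounds."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_instances(gt: List[Mask], pred: List[Mask], thr: float = 0.5) -> Dict[str, float]:
    """Greedy one-to-one matching of predictions to ground truth by IoU."""
    # All candidate pairs, best IoU first.
    pairs = sorted(
        ((iou(g, p), gi, pi) for gi, g in enumerate(gt) for pi, p in enumerate(pred)),
        reverse=True,
    )
    used_gt: Set[int] = set()
    used_pred: Set[int] = set()
    tp = 0
    for score, gi, pi in pairs:
        if score < thr:
            break  # remaining pairs are all below the threshold
        if gi in used_gt or pi in used_pred:
            continue  # each instance may be matched at most once
        used_gt.add(gi)
        used_pred.add(pi)
        tp += 1
    fp = len(pred) - tp  # predictions with no matching cell
    fn = len(gt) - tp    # annotated cells that were missed
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (gt or pred) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gt_masks = [rect(0, 0, 4, 4), rect(10, 10, 14, 14)]
    pred_masks = [rect(0, 0, 4, 3), rect(20, 20, 22, 22)]
    print(match_instances(gt_masks, pred_masks))
```

In the toy example, the first prediction overlaps a ground-truth cell with IoU 0.75 and counts as a match, while the second prediction hits nothing and one annotated cell is missed. In dense, overlapping cultures such as those described here, the choice of IoU threshold can change the score noticeably, so it is worth reporting alongside any result.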

Paper Structure

This paper contains 12 sections, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Sample images from the Glioma C6 dataset: (a) Type A cells (blue) and Type B cells (yellow), (b) Soma, (c, d) Cells from gen subset captured under varying conditions.
  • Figure 2: Shape distribution. (a) Cell and soma area distribution in the spec subset of Glioma C6, (b) Diagram showing the relation between eccentricity, solidity and size for Type A cells, Type B cells and soma, (c) Distribution of solidity vs. eccentricity and area for glioma C6 cells and soma.
  • Figure 3: Dataset statistics. (a) Distribution of sizes, (b) Eccentricity vs. solidity ratio distribution, and (c) Diagram showing the relation between mean size, solidity and eccentricity of the cell objects in different datasets.
  • Figure 4: Predictions of the models on the Glioma C6 dataset. Ground truth cell borders are shown in light blue, predictions in red, and binary cell masks in dark blue. (a, b) Cells, (c) somata. Panels (a), (b) and (c) correspond to spec subset, (d) depicts images from the gen subset.
  • Figure 5: Predictions of the models on the Glioma C6 dataset. (a, b) Type A cells, (c, d) Type B cells. Ground truth cell borders are shown in light blue, predictions in red, and binary cell masks in dark blue.
  • ...and 1 more figure