
CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting

Abdurahman Ali Mohammed, Catherine Fonder, Ying Wei, Wallapak Tavanapong, Donald S Sakaguchi, Qi Li, Surya K. Mallapragada

TL;DR

Automated cell counting in fluorescence microscopy is challenged by dense, overlapping cells and a lack of large, diverse annotated datasets. The authors introduce CellFMCount, a 3,023-image dataset with over 430,000 dot annotations, and benchmark thirteen counting methods plus a SAM-Counter density-map approach. SAM-Counter achieves $MAE=22.12$, $MSE=2470.71$, and $RMSE=49.71$, with $ACP=48.73\%$, outperforming baselines on the DAPI-stained subset. The work provides a robust benchmark and practical demonstration of adapting foundation models to microscopy, with implications for neural development, cancer research, and regenerative medicine.
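The error metrics quoted above can be illustrated with a minimal sketch. It assumes the conventional definitions over per-image counts (MAE as the mean absolute count difference, MSE as the mean squared difference, RMSE as its square root); the function name `counting_errors` and the sample counts are hypothetical and are not taken from the paper. Under these definitions the reported figures are mutually consistent, since $\sqrt{2470.71} \approx 49.71$. ACP is left out because its definition is not given in this summary.

```python
import math


def counting_errors(true_counts, predicted_counts):
    """Return (MAE, MSE, RMSE) computed over per-image cell counts."""
    if len(true_counts) != len(predicted_counts) or not true_counts:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - t for t, p in zip(true_counts, predicted_counts)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)


# Hypothetical counts for illustration only (not results from the paper).
true_counts = [10, 250, 2126]
predicted_counts = [12, 240, 2100]
mae, mse, rmse = counting_errors(true_counts, predicted_counts)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}")
```

Because MSE squares each difference, a few badly miscounted dense images dominate it, which is one plausible reading of why the reported RMSE is more than twice the MAE on a long-tailed test set.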

Abstract

Accurate cell counting is essential in various biomedical research and clinical applications, including cancer diagnosis, stem cell research, and immunology. Manual counting is labor-intensive and error-prone, motivating automation through deep learning techniques. However, training reliable deep learning models requires large amounts of high-quality annotated data, which is difficult and time-consuming to produce manually. Consequently, existing cell-counting datasets are often limited, frequently containing fewer than $500$ images. In this work, we introduce a large-scale annotated dataset comprising $3{,}023$ images from immunocytochemistry experiments related to cellular differentiation, containing over $430{,}000$ manually annotated cell locations. The dataset presents significant challenges: high cell density, overlapping and morphologically diverse cells, a long-tailed distribution of cell count per image, and variation in staining protocols. We benchmark three categories of existing methods: regression-based, crowd-counting, and cell-counting techniques on a test set with cell counts ranging from $10$ to $2{,}126$ cells per image. We also evaluate how the Segment Anything Model (SAM) can be adapted for microscopy cell counting using only dot-annotated datasets. As a case study, we implement a density-map-based adaptation of SAM (SAM-Counter) and report a mean absolute error (MAE) of $22.12$, which outperforms existing approaches (second-best MAE of $27.46$). Our results underscore the value of the dataset and the benchmarking framework for driving progress in automated cell counting and provide a robust foundation for future research and development.

Paper Structure

This paper contains 23 sections, 5 equations, 5 figures, 11 tables.

Figures (5)

  • Figure 1: Cell-type diversity and morphological variation across immunolabeled samples. Cells were immunolabeled with antibodies against markers of specific cell types: immature neurons (TuJ1; A), maturing neurons (MAP2ab; B), astrocytes (GFAP; C), and oligodendrocytes (RIP; D). Cells were also stained with a cell viability dye to mark dead cells (PI; E) and with a nuclear stain (DAPI; F). The images differ markedly in both cell morphology and density, ranging from sparse fields (e.g., PI, 14 cells) to dense fields (e.g., DAPI, 2546 cells). Scale bar = 50 $\mu m$ for 40x image fields (Row 1) or 100 $\mu m$ for 20x image fields (Row 2). Images were pseudo-colored for visualization.
  • Figure 2: CellFMCount dataset creation pipeline.
  • Figure 3: The architecture of SAM-Counter. The input image is encoded using a fine-tuned ViT encoder from SAM (Kirillov et al., 2023). The density estimation layers then estimate the density map, whose sum yields the cell count. See the feature-map equation (eq:feature_map) for reshaping.
  • Figure 4: Distribution of cell counts per DAPI-stained image in the training and testing sets.
  • Figure 5: Visualization of density-map predictions of the three best-performing models on five representative test samples (one per column).
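
The Figure 3 caption notes that the sum of the density map yields the cell count. A minimal sketch of how dot annotations can be turned into such a map is given below. It assumes a truncated Gaussian kernel that is renormalized per dot so each annotation contributes exactly one unit of mass; the kernel shape, `sigma`, `radius`, and the function name `density_map` are illustrative assumptions, not the settings used by the authors.

```python
import math


def density_map(height, width, dots, sigma=2.0, radius=6):
    """Build a density map from (row, col) dot annotations.

    Each dot is spread with a truncated Gaussian that is renormalized over
    the in-bounds pixels, so every dot adds exactly 1.0 to the map's sum.
    Dots are assumed to lie inside the image (an assumption of this sketch).
    """
    dmap = [[0.0] * width for _ in range(height)]
    for row, col in dots:
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                dist_sq = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-dist_sq / (2.0 * sigma ** 2))
        total = sum(weights.values())
        for (r, c), w in weights.items():
            dmap[r][c] += w / total
    return dmap


# Hypothetical annotations: two overlapping cells, one at the image corner,
# and one isolated cell.
dots = [(5, 5), (5, 7), (0, 0), (20, 30)]
dmap = density_map(32, 48, dots)
estimated_count = sum(sum(row) for row in dmap)
print(round(estimated_count, 6))
```

Renormalizing per dot keeps the total equal to the number of annotations even when cells overlap or sit at the image border, which is why a model trained to regress such a map can be read out as a count by summing its output.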