Unsupervised cell segmentation by fast Gaussian Processes

Laura Baracaldo, Blythe King, Haoran Yan, Yizi Lin, Nina Miolane, Mengyang Gu

TL;DR

This work tackles unsupervised segmentation of cell boundaries in noisy microscopy by formulating a fast Gaussian-process model for images with a separable covariance $\mathbf{R}=\mathbf{R}_2 \otimes \mathbf{R}_1$. It combines GP-based denoising, data-driven automatic thresholding, and watershed segmentation, with a fast eigen-based computation that reduces complexity to $\mathcal{O}(n_1^3+n_2^3)$, enabling processing of large images through sub-images. The method is demonstrated on nuclei and whole-cell images, showing consistent improvement over ImageJ and other unsupervised baselines in IoU/AP metrics, and is supported by publicly available code. Overall, the approach offers a parameter-free, scalable alternative for segmentation of new cell types without requiring annotated training data, with potential extensions to time-lapse tracking and 3D imaging.
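
To illustrate why a separable covariance $\mathbf{R}=\mathbf{R}_2 \otimes \mathbf{R}_1$ makes the computation cheap, the following is a minimal sketch rather than the authors' implementation. It assumes a constant mean, an exponential kernel on a unit-spaced pixel grid, and hand-picked values for the range parameter `gamma` and the noise-to-signal ratio `eta`; these names and values are hypothetical. The predictive mean only needs the eigen-decompositions of the $n_1 \times n_1$ and $n_2 \times n_2$ factors, and a dense version that builds the full Kronecker product is included purely as a check on small images.

```python
import numpy as np

def exp_kernel(n, gamma):
    """Exponential correlation matrix on a unit-spaced 1-D grid of n pixels."""
    x = np.arange(n, dtype=float)
    return np.exp(-np.abs(x[:, None] - x[None, :]) / gamma)

def fast_gp_mean(Y, R1, R2, eta, mu=None):
    """Predictive mean mu + R (R + eta I)^{-1} vec(Y - mu) with R = R2 kron R1.

    Only the two small factors are decomposed, so the eigen step costs
    O(n1^3 + n2^3) and the n1*n2 x n1*n2 matrix is never formed.
    """
    if mu is None:
        mu = Y.mean()                      # assumption: constant mean per image
    lam1, U1 = np.linalg.eigh(R1)          # R1 = U1 diag(lam1) U1^T
    lam2, U2 = np.linalg.eigh(R2)          # R2 = U2 diag(lam2) U2^T
    lam = np.outer(lam1, lam2)             # eigenvalues of R2 kron R1, as an n1 x n2 grid
    Z = U1.T @ (Y - mu) @ U2               # rotate the image into the eigenbasis
    return mu + U1 @ (Z * lam / (lam + eta)) @ U2.T

def dense_gp_mean(Y, R1, R2, eta, mu=None):
    """Same quantity via the full covariance; for checking small cases only."""
    if mu is None:
        mu = Y.mean()
    R = np.kron(R2, R1)
    y = (Y - mu).reshape(-1, order="F")    # column-major vec() matches R2 kron R1
    pred = R @ np.linalg.solve(R + eta * np.eye(R.shape[0]), y)
    return mu + pred.reshape(Y.shape, order="F")
```

Calling `fast_gp_mean(Y, exp_kernel(n1, gamma), exp_kernel(n2, gamma), eta)` on an `n1 x n2` array `Y` returns a smoothed array of the same shape. In practice the kernel and noise parameters would be estimated from the data rather than fixed by hand.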

Abstract

Cell boundary information is crucial for analyzing cell behaviors from time-lapse microscopy videos. Existing cell segmentation tools, such as ImageJ, require tuning various parameters and rely on restrictive assumptions about the shape of the objects. While recent supervised segmentation tools based on convolutional neural networks enhance accuracy, they depend on high-quality labeled images, making them unsuitable for segmenting new types of objects not in the database. We developed a novel unsupervised cell segmentation algorithm based on fast Gaussian processes for noisy microscopy images without the need for parameter tuning or restrictive assumptions about the shape of the object. We derived robust thresholding criteria that adapt to heterogeneous images whose brightness differs across regions, in order to separate objects from the background, and employed watershed segmentation to distinguish touching cell objects. Both simulated studies and real-data analysis of large microscopy images demonstrate the scalability and accuracy of our approach compared with the alternatives.

Paper Structure

This paper contains 14 sections, 1 theorem, 23 equations, 10 figures.

Key Result

Lemma 1

Figures (10)

  • Figure 1: Workflow for segmenting and labeling cell images: (A) Divide the image into sub-images so that the mean and variance parameters can be estimated locally, capturing local properties such as changes in brightness. (B) Compute the predictive mean of the fast GP in Section \ref{subsec:gp_images} for each sub-image, which greatly reduces image noise. (C) Threshold each smoothed sub-image based on the criterion discussed in Section \ref{subsec:crit_1} to produce binary images, separating cells from the background. The optimal threshold is estimated for each sub-image. (D) Recombine the binary sub-images into a single binary image, and apply the watershed algorithm discussed in Section \ref{subsec:watershed_alg_sim} to the image for segmentation and labeling, with each cell marked by a unique color.
  • Figure 2: Comparison of binary image results across thresholds based on the absolute second difference in foreground pixels. The threshold, which ranges from 0 to 1, is the proportion of the maximum value of the predictive mean that is set as the binary cutoff. Images A, B, C, D, and E are the binary images generated by applying the corresponding threshold to the cropped predictive mean image F. Note that Image B corresponds to $\mathop{\mathrm{argmax}}\limits_m \Delta c_{k}(\alpha_m)$ in Equation (\ref{eq:opt_threshold}).
  • Figure 3: (A) Histogram of the predictive mean of the intensity values over all pixels of the microscopy image shown in Figure 2F. The optimal threshold is annotated and lies right after the bulk of the background pixel values. The thresholds that generate images A, B, C, D, and E in Figure 2 are shown as vertical lines in the matching colors. All pixels with intensity values below the optimal threshold are treated as background, and their values have a symmetric distribution. (B) The predictive mean is shown for each pixel, with the optimal threshold plotted as a horizontal plane.
  • Figure 4: An image of two cell nuclei (upper row) and heights of negative distances to the nearest background pixels (lower row). (A) Heights of negative distances along the red straight line in the upper panel are plotted in the lower panel before the watershed algorithm starts. (B) At water level $= -13$, both catch basins are partially filled, as both local minima lie below the current water level. The two separate water sources have unique labels and are not yet touching. (C) At a water level of around $-10.2$, the water sources from the two catch basins flow into each other and the watershed line is formed. (D) At water level $= 0$, the cell objects are filled with water, and the watershed operation is complete. Each cell is labeled and separated.
  • Figure 5: Violin plots of RMSE for five methods applied to the Branin function and the linear diffusion equation across various noise levels. Each experiment is repeated 10 times. Fast-Mat and Fast-Exp denote the fast GPs with Matérn kernels in Equation (\ref{equ:matern_5_2}) and exponential kernels, respectively.
  • ...and 5 more figures
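
Steps (A), (C), and (D) of the workflow in Figure 1 can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names are hypothetical, the grid of sub-images is a simple even split, and the per-block proportions `alphas` are taken as given inputs, whereas the paper estimates the optimal threshold for each sub-image from its own criterion. As in Figure 2, each proportion is multiplied by the maximum of the smoothed sub-image to obtain the binary cutoff.

```python
import numpy as np

def split_blocks(img, rows, cols):
    """Step (A): split a 2-D array into a rows x cols grid of sub-images."""
    return [np.array_split(band, cols, axis=1)
            for band in np.array_split(img, rows, axis=0)]

def binarize(block, alpha):
    """Step (C): foreground = pixels above alpha times the block maximum."""
    return block > alpha * block.max()

def threshold_by_block(smoothed, alphas):
    """Threshold each sub-image separately, then recombine (step D).

    smoothed : 2-D array, e.g. the GP predictive mean of the image.
    alphas   : 2-D array-like of proportions in [0, 1], one per sub-image
               (assumed given here; its shape defines the grid).
    """
    alphas = np.asarray(alphas, dtype=float)
    blocks = split_blocks(smoothed, *alphas.shape)
    binary = [[binarize(b, a) for b, a in zip(block_row, alpha_row)]
              for block_row, alpha_row in zip(blocks, alphas)]
    return np.block(binary)  # stitch the binary sub-images back together
```

Thresholding per block matters when brightness differs across the image: a dim object in one sub-image can fall below a single global cutoff set by a much brighter region elsewhere, while a cutoff relative to its own sub-image still separates it from the local background.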

Theorems & Definitions (4)

  • Lemma 1
  • Example 1
  • Example 2: Cell images
  • proof