Multi-scale attention-based instance segmentation for measuring crystals with large size variation

Theresa Neubauer; Astrid Berg; Maria Wimmer; Dimitrios Lenis; David Major; Philip Matthias Winter; Gaia Romana De Paolis; Johannes Novotny; Daniel Lüftner; Katja Reinharter; Katja Bühler

TL;DR

This work addresses automatic crystal-size measurement in high-resolution images where large intra-image size variation and boundary ambiguity hinder traditional segmentation. It introduces Cellpose+SiMA, an instance-segmentation framework that augments Cellpose with a size-aware multi-scale attention module (SiMA) to fuse predictions across multiple resolutions region-wise. SiMA derives size-aware attention maps from crystal-length thresholds and uses them to compute a weighted fusion of flow maps and foreground predictions, leading to improved segmentation and more accurate crystal-size estimation. On a challenging refractory raw material dataset, Cellpose+SiMA outperforms boundary-based and state-of-the-art instance segmentation methods in both segmentation quality (PQ, AJI) and size accuracy (MAE/MRE of ACS), while maintaining a favorable parameter count and latency. The approach offers a practical, scalable solution for quantitative grain size analysis in materials science and could generalize to other high-resolution datasets with pronounced size variation.
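As a rough illustration of how size-aware attention targets could be derived from crystal-length thresholds, the sketch below bins each labeled crystal by its relative length and writes one binary map per bin. It is not the authors' implementation. The function names, the length measure (longer side of the bounding box), and the reference length `ref_length` that the percentage thresholds are taken against are all assumptions; only the example thresholds (100%, 50%, 25%) come from the Figure 3 caption.

```python
import numpy as np

def instance_length(mask):
    """Length of one instance, assumed here to be the longer bounding-box side."""
    ys, xs = np.nonzero(mask)
    return max(ys.max() - ys.min(), xs.max() - xs.min()) + 1

def size_attention_maps(labels, thresholds=(1.0, 0.5, 0.25), ref_length=None):
    """Build one binary map per length range from an instance label image.

    labels:     2D integer array, 0 = background, k > 0 = crystal instance k.
    thresholds: descending fractions of ref_length (hypothetical convention).
    ref_length: reference length in pixels; defaults to the longer image side.
    """
    labels = np.asarray(labels)
    if ref_length is None:
        ref_length = max(labels.shape)
    maps = np.zeros((len(thresholds),) + labels.shape, dtype=np.uint8)
    for inst_id in np.unique(labels):
        if inst_id == 0:
            continue  # skip background
        mask = labels == inst_id
        rel = instance_length(mask) / ref_length
        # Pick the last (smallest) threshold that still covers this length;
        # instances longer than the first threshold stay in map 0.
        idx = 0
        for i, t in enumerate(thresholds):
            if rel <= t:
                idx = i
        maps[idx][mask] = 1
    return maps
```

With these assumptions, every crystal pixel belongs to exactly one of the maps, so the maps partition the foreground by crystal size.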

Abstract

Quantitative measurement of crystals in high-resolution images allows for important insights into underlying material characteristics. Deep learning has shown great progress in vision-based automatic crystal size measurement, but current instance segmentation methods reach their limits on images with large variation in crystal size or hard-to-detect crystal boundaries. Even small segmentation errors, such as incorrectly fused or separated segments, can significantly lower the accuracy of the measured results. Instead of improving the existing pixel-wise boundary segmentation methods, we propose to use an instance-based segmentation method, which gives more robust segmentation results and thereby improves measurement accuracy. Our novel method enhances flow maps with a size-aware multi-scale attention module. The attention module adaptively fuses information from multiple scales and focuses on the most relevant scale for each segmented image area. We demonstrate that our proposed attention fusion strategy outperforms state-of-the-art instance and boundary segmentation methods, as well as simple average fusion of multi-scale predictions. We evaluate our method on a refractory raw material dataset of high-resolution images with large variation in crystal size and show that our model can be used to calculate the crystal size more accurately than existing methods.
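The difference between attention fusion and simple average fusion can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's exact formulation: it assumes the per-scale flow maps have already been resampled to a common resolution and that the attention weights are normalized to sum to one at each pixel. The names `fuse_flows` and `average_fusion` are hypothetical.

```python
import numpy as np

def fuse_flows(flows, attention, eps=1e-8):
    """Region-wise weighted fusion of N per-scale flow maps.

    flows:     array of shape (N, C, H, W), one flow map per scale,
               resampled to a common resolution (assumption).
    attention: array of shape (N, H, W), non-negative per-scale weights.
    Returns the combined flow of shape (C, H, W).
    """
    flows = np.asarray(flows, dtype=np.float64)
    attention = np.asarray(attention, dtype=np.float64)
    # Normalize the weights per pixel (assumed normalization); pixels with
    # no attention at any scale end up with a zero combined flow.
    weights = attention / (attention.sum(axis=0, keepdims=True) + eps)
    # Broadcast the weights over the channel axis and sum over scales.
    return (weights[:, None, :, :] * flows).sum(axis=0)

def average_fusion(flows):
    """Baseline: every scale contributes equally at every pixel."""
    return np.asarray(flows, dtype=np.float64).mean(axis=0)
```

Where one attention map dominates, the fused flow follows that scale alone, whereas the average baseline always mixes all scales regardless of local crystal size.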

Paper Structure (27 sections, 1 equation, 6 figures, 3 tables)

Introduction
Related Work
Boundary segmentation method
Superpixel segmentation
Instance segmentation – Proposal-based
Instance segmentation – Proposal-free
Multi-scale segmentation models and Attention models
Method
Cellpose
Cellpose+SiMA
SiMA
Multi-scale attention fusion with Cellpose+SiMA
Experimental Evaluation
Dataset
Experimental setup
...and 12 more sections

Figures (6)

Figure 1: Comparison of different segmentation methods and their impact on size measurement accuracy. (1a) Grain with multiple crystals of different sizes. (1b) Ground truth crystal instance segmentation. (2) Boundary prediction and resulting instance segmentation from Yu et al. (2022). The boundary model mispredicts tiny crystals (P1) when the crystal boundary is difficult to identify. Scratched surfaces (P2) are also falsely recognized as boundaries. Gaps (P3) between grain and background need to be closed by post-processing. (3) Intermediate flow map output and final instance segmentation from the instance model (Cellpose) of Stringer et al. (2021). The model cannot correctly identify the center of the large crystal (P4), and it also fails to detect thin boundaries (P5), resulting in missing instance segments. (4) Our proposed multi-scale attention model overcomes these problems and correctly segments crystals of varying sizes, which improves the accuracy of the measured sizes.
Figure 2: (a) Proposed pipeline Cellpose+SiMA, visualized for $N=3$. At training time, the CP module is trained with different scaling augmentations. At inference time, the input image is rescaled and passed through Cellpose $N$ times. At different input resolutions, the flow emphasizes different details. The combination of flows via attention maps helps to produce predictions adapted to different parts of the image with varying characteristics. (b) The magnified image region illustrates the effect of different image resolutions, from low to high. Flow $F_{1}$ shows that important details are lost when the image resolution is too low. Flow $F_{2}$ has an ideal image resolution to detect the left big crystal, but has problems with tiny crystals. The center of the big crystal is not detected correctly in flow $F_{3}$, indicating that the image resolution is too high to fit the big crystal on one patch as model input. The combined flow $F^{c}$ shows the correct region-wise fusion of the flow maps using the weights from the attention maps of SiMA. (c) At test time, the rescaled image is tiled into patches, which are then processed by Cellpose. An advanced stitching algorithm merges the output patches together. (d) Network architecture of the SiMA model.
Figure 3: (a) Crystal instance label with (b) the corresponding attention maps for $N=3$ and $t=(100\%, 50\%, 25\%)$. Each attention map is a binary segmentation of all crystal instances belonging to a certain length range of $t$.
Figure 4: The relationship between the PQ/clDice score and the relative error of the crystal size on the test set. A higher PQ score results in a lower measurement error, while a higher clDice score does not indicate more accurate size measurements.
Figure 5: Visual comparison of different model results. The image shows two crystals with a large difference in size. Our Cellpose+SiMA method accurately segments both crystals, which is enabled by our multi-scale prediction approach. Because the original Cellpose downsamples the image strongly, it cannot recognize the thin boundary between the two crystals and detects the whole area as one crystal. The boundary prediction from Yu et al.'s model (visible as a black line in the instance segmentation) produces a boundary gap (P1) where the boundary is difficult to detect, resulting in merged crystal segments. MaskFormer, OneFormer, and RTMDet-Ins struggle to segment instances that occupy a larger area than the model's input size.
...and 1 more figures
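The PQ score referenced in Figure 4 penalizes merged and split segments directly, which is one plausible reason it tracks the size measurement error. The sketch below follows the standard single-class panoptic quality definition (segments match when their IoU exceeds 0.5, and PQ is the summed IoU of matches divided by TP + 0.5 FP + 0.5 FN). The paper's exact evaluation protocol is not shown here, and the function name is hypothetical.

```python
import numpy as np

def panoptic_quality(gt, pred):
    """Single-class PQ for two instance label images (0 = background)."""
    gt = np.asarray(gt)
    pred = np.asarray(pred)
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    iou_sum = 0.0
    tp = 0
    for g in gt_ids:
        g_mask = gt == g
        # Only predicted segments overlapping this ground-truth segment can match.
        for p in np.unique(pred[g_mask]):
            if p == 0:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union
            if iou > 0.5:  # IoU > 0.5 guarantees a unique match
                iou_sum += iou
                tp += 1
                break
    fp = len(pred_ids) - tp  # predictions without a match
    fn = len(gt_ids) - tp    # ground-truth crystals without a match
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom > 0 else 0.0
```

For example, if two equally sized ground-truth crystals are predicted as a single merged segment, neither reaches an IoU above 0.5, so the score is zero even though every foreground pixel is covered.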
