Context Determines Optimal Architecture in Materials Segmentation

Mingjian Lu, Pawan K. Tripathi, Mark Shteyn, Debargha Ganguly, Roger H. French, Vipin Chaudhary, Yinghui Wu

TL;DR

The paper addresses deployment gaps in materials image segmentation by showing that the optimal architecture varies across imaging modalities. It introduces a cross-modal configuration framework with three modules (Cross-Modal Configuration, Quality Feedback, and Expert Knowledge Integration) that standardize inputs, produce segmentation masks with reliability signals, and generate interpretability heatmaps and counterfactual explanations. Across seven datasets and six encoder-decoder configurations, it demonstrates context-dependent architecture performance: UNet is favored for high-contrast 2D surfaces and DeepLabv3+ for hard, multi-scale volumetric tasks. Forte-based out-of-distribution detection and expert-aligned explanations strengthen deployment confidence. The framework lets researchers select an architecture suited to their imaging setup and assess model trustworthiness on new samples, supporting reliable, automated materials characterization.

Abstract

Segmentation architectures are typically benchmarked on single imaging modalities, obscuring deployment-relevant performance variations: an architecture optimal for one modality may underperform on another. We present a cross-modal evaluation framework for materials image segmentation spanning SEM, AFM, XCT, and optical microscopy. Our evaluation of six encoder-decoder combinations across seven datasets reveals that optimal architectures vary systematically by context: UNet excels for high-contrast 2D imaging while DeepLabv3+ is preferred for the hardest cases. The framework also provides deployment feedback via out-of-distribution detection and counterfactual explanations that reveal which microstructural features drive predictions. Together, the architecture guidance, reliability signals, and interpretability tools address a practical gap in materials characterization, where researchers lack tools to select architectures for their specific imaging setup or assess when models can be trusted on new samples.

Paper Structure (8 sections, 4 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: (a) The three-component evaluation framework maps heterogeneous inputs to standardized profiles, produces segmentation masks with reliability signals, and generates interpretability heatmaps. (b) The quality-controlled evaluation workflow executes this pipeline.
  • Figure 2: Complete feedback loop demonstration for AFM crystallite analysis: (a) Multi-channel input $D$, (b) Segmentation output $Z$, (c) Quality feedback signals $Q$ (AUROC, F1) indicating deployment readiness, (d) Interpretable feedback $H$ enabling expert validation of decision-relevant regions.
  • Figure 3: AFM crystallite segmentation failure cases flagged by OOD-based quality control. Each row shows a representative prediction that was assigned a low Forte score. From left to right: predicted segmentation, ground-truth annotation, pixel-wise difference, and overlay visualization with the corresponding IoU score. Although the IoU values remain moderately high, the predictions exhibit systematic errors in crystallite boundary delineation. These deviations from typical boundary morphology lead to low Forte scores, demonstrating the utility of OOD-based quality control for identifying subtle but semantically meaningful segmentation failures.
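
The Figure 3 caption pairs an IoU score with a pixel-wise difference map and notes that IoU can stay moderately high while the boundary is systematically wrong. The sketch below illustrates that relationship on toy data. It is not the authors' code: the function name `iou_and_difference`, the binary mask encoding (1 = crystallite, 0 = background), and the 6×6 masks are assumptions made for illustration only, and the Forte score itself is not computed here.

```python
from typing import List, Tuple

# Assumed encoding: 1 = crystallite pixel, 0 = background.
Mask = List[List[int]]


def iou_and_difference(pred: Mask, truth: Mask) -> Tuple[float, Mask]:
    """Return (IoU, pixel-wise difference map) for two same-shaped binary masks."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    diff: Mask = []
    for p_row, t_row in zip(pred, truth):
        diff_row = []
        for p, t in zip(p_row, t_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
            # 1 marks a pixel where prediction and annotation disagree.
            diff_row.append(1 if p != t else 0)
        diff.append(diff_row)

    # Two empty masks agree perfectly, so IoU is defined as 1.0 in that case.
    iou = intersection / union if union else 1.0
    return iou, diff


# Toy ground truth: a single 4x4 crystallite (hypothetical data).
truth = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Toy prediction: the right-hand boundary column is consistently missed.
pred = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

iou, diff = iou_and_difference(pred, truth)
print(f"IoU = {iou:.2f}")          # 12 shared pixels / 16 in the union
for row in diff:
    print("".join(str(v) for v in row))
```

In this toy case every disagreeing pixel lies along one edge of the crystallite, so the error is structured rather than random even though the aggregate IoU looks acceptable. That is the kind of failure the caption says a separate OOD-based signal is meant to surface.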