SAM-I-Am: Semantic Boosting for Zero-shot Atomic-Scale Electron Micrograph Segmentation

Waqwoya Abebe, Jan Strube, Luanzheng Guo, Nathan R. Tallent, Oceane Bel, Steven Spurgeon, Christina Doty, Ali Jannesari

Abstract

Image segmentation is a critical enabler for tasks ranging from medical diagnostics to autonomous driving. However, the correct segmentation semantics (where are boundaries located? which segments are logically similar?) change depending on the domain, so state-of-the-art foundation models can generate meaningless or incorrect results. Moreover, in certain domains, fine-tuning and retraining are infeasible: obtaining labels is costly and time-consuming; domain images (micrographs) can be exponentially diverse; and data sharing (for third-party retraining) is restricted. To enable rapid adaptation of the best segmentation technology, we propose the concept of semantic boosting: given a zero-shot foundation model, guide its segmentation and adjust results to match domain expectations. We apply semantic boosting to the Segment Anything Model (SAM) to obtain microstructure segmentation for transmission electron microscopy. Our booster, SAM-I-Am, extracts geometric and textural features of various intermediate masks to perform mask removal and mask merging operations. Compared with vanilla SAM (ViT-L), we demonstrate a zero-shot (absolute) increase of 21.35%, 12.6%, and 5.27% in mean IoU and a decrease of 9.91%, 18.42%, and 4.06% in mean false positive masks across images of three difficulty classes.
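
The abstract reports its gains in mean IoU (intersection over union). As a minimal sketch of how this metric is commonly computed for binary masks, assuming NumPy arrays of equal shape: the function names `mask_iou` and `mean_iou`, the toy masks, and the convention for two empty masks are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np


def mask_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union of two binary masks with the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def mean_iou(preds, targets) -> float:
    """Average IoU over paired predicted and reference masks."""
    return float(np.mean([mask_iou(p, t) for p, t in zip(preds, targets)]))


# Toy 4x4 example (hypothetical data): two horizontal bands that share one row.
pred = np.zeros((4, 4), dtype=np.uint8)
pred[0:2, :] = 1      # rows 0-1, 8 pixels
target = np.zeros((4, 4), dtype=np.uint8)
target[1:3, :] = 1    # rows 1-2, 8 pixels

iou = mask_iou(pred, target)  # intersection 4, union 12
```

An "absolute" gain in this sense is the difference between two such mean IoU values expressed in percentage points, rather than a relative improvement over the baseline.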

Paper Structure (20 sections, 6 equations, 9 figures, 2 algorithms)

Figures (9)

  • Figure 1: Comparing the vanilla SAM pipeline and SAM-I-Am for a cross-sectional image of three layers: Pt / C (top), SrTiO$_3$ (middle) and Ge (bottom). The SAM pipeline conducts the 'segment anything' task that yields ambiguous masks and fails to identify surfaces with similar material makeup. Using semantic boosting, SAM-I-Am delivers the microstructural segmentation task.
  • Figure 2: Semantic boosting: The proposed booster augments SAM+ by performing a mask-in mask-out post-processing procedure for mask removal and mask merging operations.
  • Figure 3: Labeling times in seconds for different difficulty classes of images.
  • Figure 4: Comparing mean performance on entire mask regions and mask boundary regions. In both cases, the SAM-I-Am pipeline outperforms the SAM+ baseline.
  • Figure 5: Comparing mean false positive rate on entire mask regions and mask boundary regions. In both cases, the SAM-I-Am pipeline outperforms the SAM+ baseline.
  • ...and 4 more figures