
Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images

Silvia Seidlitz, Jan Sellner, Alexander Studier-Fischer, Alessandro Motta, Berkin Özdemir, Beat P. Müller-Stich, Felix Nickel, Lena Maier-Hein

TL;DR

This paper investigates how geometric domain shifts (e.g., organ occlusions and resections) affect semantic segmentation in open surgery using RGB and hyperspectral imaging (HSI). It introduces Organ Transplantation, a topology-altering augmentation that transfers organ pixels and their labels between images to simulate unusual geometric contexts and improve generalization. On six out-of-distribution (OOD) datasets comprising 600 RGB and HSI cubes from 33 pigs, annotated with 19 classes, state-of-the-art (SOA) models show substantial DSC/NSD declines that grow with the spatial granularity of the input. Organ Transplantation improves SOA model performance by up to 67% (RGB) and 90% (HSI), reaching in-distribution-level performance on real OOD test data. The method is simple to implement, applies to both modalities independently of the underlying model, and is publicly released, offering a practical tool to mitigate geometric domain shifts in surgical scene segmentation.

Abstract

Robust semantic segmentation of intraoperative image data holds promise for enabling automatic surgical scene understanding and autonomous robotic surgery. While model development and validation are primarily conducted on idealistic scenes, geometric domain shifts, such as occlusions of the situs, are common in real-world open surgeries. To close this gap, we (1) present the first analysis of state-of-the-art (SOA) semantic segmentation models when faced with geometric out-of-distribution (OOD) data, and (2) propose an augmentation technique called "Organ Transplantation" to enhance generalizability. Our comprehensive validation on six different OOD datasets, comprising 600 RGB and hyperspectral imaging (HSI) cubes from 33 pigs, each annotated with 19 classes, reveals a large performance drop in SOA organ segmentation models on geometric OOD data. This performance decline is observed not only in conventional RGB data (with a Dice similarity coefficient (DSC) drop of 46 %) but also in HSI data (with a DSC drop of 45 %), despite the richer spectral information content. The performance decline increases with the spatial granularity of the input data. Our augmentation technique improves SOA model performance by up to 67 % for RGB data and 90 % for HSI data, reaching in-distribution-level performance on real OOD test data. Given the simplicity and effectiveness of our augmentation method, it is a valuable tool for addressing geometric domain shifts in surgical scene segmentation, regardless of the underlying model. Our code and pre-trained models are publicly available at https://github.com/IMSY-DKFZ/htc.
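
The core idea of the augmentation, copying the pixels and labels of a randomly selected organ from one image in a batch into another, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (that is available in the repository linked above); the function name `organ_transplantation`, the parameter `p`, the array layout, and the choice of one donor and one organ per recipient image are all assumptions made for illustration.

```python
import numpy as np


def organ_transplantation(images, labels, rng, p=1.0):
    """Hypothetical sketch of a batch-level organ transplantation.

    images: float array of shape (B, H, W, C), e.g. C=3 for RGB or more for HSI
    labels: integer array of shape (B, H, W) with one class id per pixel
    rng:    numpy.random.Generator
    p:      probability of augmenting each image (assumed parameter)
    """
    out_images = images.copy()
    out_labels = labels.copy()
    batch_size = images.shape[0]
    if batch_size < 2:
        return out_images, out_labels  # no donor available

    for recipient in range(batch_size):
        if rng.random() >= p:
            continue
        # Pick a donor image that differs from the recipient.
        donor = int(rng.integers(batch_size - 1))
        if donor >= recipient:
            donor += 1
        # Pick one class that is present in the (unaugmented) donor image.
        organ = rng.choice(np.unique(labels[donor]))
        mask = labels[donor] == organ
        # Transfer the pixels and the matching labels at the same positions.
        out_images[recipient][mask] = images[donor][mask]
        out_labels[recipient][mask] = organ
    return out_images, out_labels


rng = np.random.default_rng(0)
images = rng.random((2, 4, 4, 3))
labels = np.zeros((2, 4, 4), dtype=np.int64)
labels[0, :, :2] = 1  # image 0: class 1 on the left half
labels[0, :, 2:] = 2  # image 0: class 2 on the right half
labels[1] = 3         # image 1: class 3 everywhere
aug_images, aug_labels = organ_transplantation(images, labels, rng)
```

Donors are always read from the unaugmented batch, so a transplanted organ is never passed on a second time within the same call. Because the mask is applied per pixel across all channels, the same sketch covers both RGB images and HSI cubes.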

Paper Structure (14 sections, 9 figures)


Figures (9)

  • Figure 1: Handling geometric domain shifts in the semantic segmentation of open surgery images. We address the generalizability of surgical scene segmentation algorithms when faced with OOD geometries for two modalities (RGB and HSI) and different spatial granularities of the input data (from pixels to images). Our proposed Organ Transplantation augmentation method outperforms topology-altering augmentation techniques adapted from the general computer vision community and yields performance on par with in-distribution performance.
  • Figure 2: Concept of our Organ Transplantation augmentation and experimental setup. (a) Our Organ Transplantation augmentation involves transferring image features and corresponding segmentations of randomly selected organs (here: stomach and spleen) between images within a batch. (b) We compare the generalization performance of state-of-the-art organ segmentation models under geometric domain shifts upon equipping the models with either the proposed new approach or one of the six adapted data augmentation techniques. Our test datasets encompass the geometric out-of-distribution scenarios (I) organs in isolation, (II) organ resections, and (III) situs occlusions, in addition to in-distribution data. This figure is adapted from Sellner et al. (2023).
  • Figure 3: Performance degradation in response to geometric domain shifts increases with the spatial granularity of the input. The segmentation performance on our in-distribution (highlighted in italic) and OOD scenarios (columns) is shown as a function of the input modality and spatial granularity (rows; top: HSI, bottom: RGB). The numbers denote the average class DSC, with standard deviations across classes indicated in brackets. The color-coding is based on the change in DSC relative to the in-distribution DSC of the same model. Hierarchical aggregation was performed first across all images of a single subject, then across subjects, and finally across class-level DSC scores.
  • Figure 4: Performance decline after organ removal is related to the local neighborhood. (a) Performance difference of the image-level HSI model in response to class removals. The $(i,j)$-th entry displays the change in the average DSC of the $j$-th class when the $i$-th class is replaced with zeros. For clarity, values with $\left| \Delta \mathrm{DSC} \right| < 0.01$ are omitted. (b) Organ neighborhood matrix for the original test dataset. The $(i,j)$-th entry indicates the average proportion of boundary pixels that the observed organ class $j$ shares with the organ class $i$. Values below 0.1% are not shown for clarity. For both subfigures, the aggregation was performed hierarchically by first averaging the proportions/scores across all images of one subject, and subsequently averaging across subjects. The performance matrix is adapted from Sellner et al. (2023).
  • Figure 5: The proposed Organ Transplantation augmentation compensates for geometric domain shifts. The dot and box plots show the segmentation performance of the baseline image model and our Organ Transplantation model for the modalities HSI (top) and RGB (bottom) across six geometric out-of-distribution datasets and two in-distribution datasets (highlighted in italic). Each boxplot displays the interquartile range (IQR) of the DSC with the median (solid line) and mean (dotted line) and whiskers extending up to 1.5 times the IQR. Each point represents the average DSC of one organ class. The aggregation was performed hierarchically by first averaging the DSC values across all images of one subject, and subsequently averaging across subjects. Results for the NSD are presented in a separate figure (fig:task_performance_nsd). This figure is adapted from Sellner et al. (2023).
  • ...and 4 more figures
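
The hierarchical aggregation described in the captions of Figures 3 to 5 (first across the images of one subject, then across subjects, and finally across classes) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' evaluation code: the helper names `dice` and `hierarchical_mean`, the record layout `(subject_id, class_id, dsc)`, and the convention of skipping classes that are absent from both prediction and reference are hypothetical.

```python
from collections import defaultdict

import numpy as np


def dice(pred, ref, cls):
    """DSC of one class between two label maps; None if the class is absent in both."""
    p = pred == cls
    r = ref == cls
    denom = p.sum() + r.sum()
    if denom == 0:
        return None  # assumed convention: skip undefined scores
    return 2.0 * np.logical_and(p, r).sum() / denom


def hierarchical_mean(records):
    """records: iterable of (subject_id, class_id, dsc), one entry per image and class."""
    # Step 1: average over all images of one subject, separately per class.
    per_subject = defaultdict(list)
    for subject, cls, score in records:
        per_subject[(cls, subject)].append(score)
    # Step 2: average the subject-level scores of each class across subjects.
    per_class = defaultdict(list)
    for (cls, _subject), scores in per_subject.items():
        per_class[cls].append(float(np.mean(scores)))
    class_scores = {cls: float(np.mean(v)) for cls, v in per_class.items()}
    # Step 3: average across the class-level scores.
    overall = float(np.mean(list(class_scores.values())))
    return overall, class_scores


pred = np.array([[1, 1], [0, 0]])
ref = np.array([[1, 0], [0, 0]])
example_dsc = dice(pred, ref, cls=1)

records = [
    ("pig_A", 1, 1.0), ("pig_A", 1, 0.5),  # two images of subject A, class 1
    ("pig_B", 1, 0.25),                    # one image of subject B, class 1
    ("pig_A", 2, 1.0),                     # one image of subject A, class 2
]
overall, class_scores = hierarchical_mean(records)
```

Averaging per subject first keeps a subject with many images from dominating the score: in the toy records, class 1 aggregates to 0.5 hierarchically, whereas a flat mean over its three images would give about 0.58.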