RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation

Mansoor Ali; Maksim Richards; Gilberto Ochoa-Ruiz; Sharib Ali

TL;DR

RobustSurg tackles domain generalisation in surgical scene segmentation under cross-centre and cross-modality shifts. It introduces a Domain-invariant Feature Encoder with Style Normalization and Restitution and Instance Selective Whitening to preserve discriminative content while removing style variations, along with a new HeiCholeSeg multicentre dataset. The method achieves state-of-the-art mean IoU on IID CholecSeg8K and significant improvements on OOD HeiCholeSeg, EndoUDA, and cataract datasets, demonstrating robust generalisation with a modest computational cost increase. The work provides a valuable benchmark and practical approach for reliable surgical scene understanding in diverse clinical settings.

Abstract

While recent advances in deep learning for surgical scene segmentation have demonstrated promising results on single-centre and single-imaging modality data, these methods usually do not generalise to unseen distributions (i.e., data from other centres) or unseen modalities. Generalisation to out-of-distribution (OOD) data and domain gaps caused by modality changes have been widely studied, but mostly for natural scene data. These methods cannot be directly applied to surgical scenes, which offer limited visual cues and often far more diverse scenarios than natural scenes. Inspired by these works on natural scenes, we hypothesise that exploiting the style and content information in surgical scenes could minimise appearance variation, making the learned representation less sensitive to sudden changes such as blood or imaging artefacts. This can be achieved by applying instance normalisation and feature covariance mapping techniques to obtain robust and generalisable feature representations. Further, to reduce the risk of removing salient feature representations associated with the objects of interest, we introduce a restitution module within the feature-learning ResNet backbone that enables the retention of useful task-relevant features. To tackle the lack of multiclass and multicentre data for surgical scene segmentation, we also provide a newly curated dataset that can be vital for addressing generalisability in this domain. Our proposed RobustSurg obtained nearly 23% improvement over the baseline DeepLabv3+ and 10-32% improvement over SOTA methods in terms of mean IoU score on the unseen-centre HeiCholSeg dataset when trained on CholecSeg8K. Similarly, RobustSurg obtained nearly 22% improvement over the baseline and nearly 11% improvement over a recent SOTA method on the target set of the EndoUDA polyp dataset.
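The two generic operations named in the abstract, instance normalisation and a feature covariance computation, can be illustrated with a minimal NumPy sketch. This is not the RobustSurg implementation: the function names, the toy feature map, and the modelling of "style" as a per-channel gain and offset are all assumptions made for illustration. The sketch only shows that per-channel normalisation removes such a gain and offset, and how a channel covariance matrix (the quantity that whitening-style losses act on) can be formed.

```python
import numpy as np

def instance_norm(feat, eps=1e-5):
    """Normalise each channel of one feature map (C, H, W) to zero mean, unit variance."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    var = feat.var(axis=(1, 2), keepdims=True)
    return (feat - mean) / np.sqrt(var + eps)

def channel_covariance(feat):
    """Covariance between the channels of one feature map, shape (C, C)."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    flat = flat - flat.mean(axis=1, keepdims=True)
    return flat @ flat.T / flat.shape[1]

# Toy feature map standing in for backbone activations (assumption).
rng = np.random.default_rng(0)
content = rng.normal(size=(4, 8, 8))

# Hypothetical "style" change: a positive per-channel gain and an offset.
gain = np.array([0.5, 2.0, 3.0, 1.5]).reshape(4, 1, 1)
offset = np.array([1.0, -2.0, 0.3, 4.0]).reshape(4, 1, 1)
styled = gain * content + offset

# After instance normalisation the two versions are almost identical.
normed_a = instance_norm(content)
normed_b = instance_norm(styled)
max_gap = float(np.abs(normed_a - normed_b).max())

# Channel covariance of the normalised features; the diagonal is close to 1.
cov = channel_covariance(normed_a)
```

Normalisation of this kind can also discard information that is useful for the task, which is the risk the restitution module described in the abstract is meant to address.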
