SwIPE: Efficient and Robust Medical Image Segmentation with Implicit Patch Embeddings

Yejia Zhang, Pengfei Gu, Nishchal Sapkota, Danny Z. Chen

TL;DR

SwIPE introduces a patch-based implicit representation for medical image segmentation to overcome the limitations of discrete masks and of INRs that operate only at the point or global level. It encodes an image into multi-scale patch embeddings plus a global image embedding, then decodes both patch-level and image-level occupancies, so that local boundaries are accurate while the overall shape stays coherent. Multi-stage Embedding Attention (MEA) fuses features across scales, and Stochastic Patch Overreach (SPO) encourages boundary continuity. On 2D polyp and 3D abdominal organ segmentation, SwIPE improves over recent implicit approaches and outperforms state-of-the-art discrete methods with over 10x fewer parameters, while also showing better data efficiency and robustness to shifts in image resolution and dataset.

Abstract

Modern medical image segmentation methods primarily use discrete representations in the form of rasterized masks to learn features and generate predictions. Although effective, this paradigm is spatially inflexible, scales poorly to higher-resolution images, and lacks direct understanding of object shapes. To address these limitations, some recent works utilized implicit neural representations (INRs) to learn continuous representations for segmentation. However, these methods often directly adopted components designed for 3D shape reconstruction. More importantly, these formulations were also constrained to either point-based or global contexts, lacking contextual understanding or local fine-grained details, respectively--both critical for accurate segmentation. To remedy this, we propose a novel approach, SwIPE (Segmentation with Implicit Patch Embeddings), that leverages the advantages of INRs and predicts shapes at the patch level--rather than at the point level or image level--to enable both accurate local boundary delineation and global shape coherence. Extensive evaluations on two tasks (2D polyp segmentation and 3D abdominal organ segmentation) show that SwIPE significantly improves over recent implicit approaches and outperforms state-of-the-art discrete methods with over 10x fewer parameters. Our method also demonstrates superior data efficiency and improved robustness to data shifts across image resolutions and datasets. Code is available on GitHub (https://github.com/charzharr/miccai23-swipe-implicit-segmentation).

Paper Structure (12 sections, 1 figure, 3 tables)

Figures (1)

  • Figure 1: At a high level, SwIPE first encodes an input image into patch ($\mathbf{z}^{\mathbb{P}}$) and image ($\mathbf{z}^{\mathbb{I}}$) shape embeddings, and then employs these embeddings along with coordinate information $\mathbf{p}$ to predict class occupancy scores via the patch ($\mathbf{D}^{\mathbb{P}}$) and image ($\mathbf{D}^{\mathbb{I}}$) decoders.
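
The data flow in the Figure 1 caption can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (see the linked repository for that). Everything below is an assumption made for illustration: the function names `make_mlp`, `run_mlp`, and `predict_occupancy`, the regular grid of patch embeddings, the normalization of the query point to the patch's local frame, the choice of tiny two-layer MLPs as decoders, and the random values standing in for encoder outputs. It only shows the general idea that a continuous coordinate plus a shape embedding is mapped to class occupancy scores, once at the patch level and once at the image level.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, out_dim, rng):
    """Randomly initialised two-layer MLP weights (stand-in for a trained decoder)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def run_mlp(weights, x):
    """ReLU hidden layer followed by a softmax over classes."""
    w1, w2 = weights
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    logits = [sum(w * v for w, v in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_occupancy(p, z_patch_grid, z_image, dec_patch, dec_image):
    """Return (patch-level, image-level) occupancy scores for a point p in [0, 1)^2."""
    rows, cols = len(z_patch_grid), len(z_patch_grid[0])
    # Assumption: patches tile the image on a regular grid; find the one containing p.
    r = min(int(p[1] * rows), rows - 1)
    c = min(int(p[0] * cols), cols - 1)
    # Assumption: the patch decoder sees p in the patch's local frame.
    p_local = [p[0] * cols - c, p[1] * rows - r]
    occ_patch = run_mlp(dec_patch, p_local + z_patch_grid[r][c])
    # The image decoder sees the global coordinate and the image embedding.
    occ_image = run_mlp(dec_image, list(p) + z_image)
    return occ_patch, occ_image


# Hypothetical sizes: 4-dim embeddings, 2 classes, a 3x3 grid of patches.
rng = random.Random(0)
EMB, CLASSES = 4, 2
z_patch_grid = [[[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(3)]
                for _ in range(3)]
z_image = [rng.uniform(-1, 1) for _ in range(EMB)]
dec_patch = make_mlp(2 + EMB, 8, CLASSES, rng)
dec_image = make_mlp(2 + EMB, 8, CLASSES, rng)

occ_p, occ_i = predict_occupancy((0.37, 0.82), z_patch_grid, z_image,
                                 dec_patch, dec_image)
```

Because the query `p` is a continuous coordinate rather than a pixel index, the same embeddings can be queried on a grid of any density, which is the property the abstract contrasts with rasterized masks.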