WSI-INR: Implicit Neural Representations for Lesion Segmentation in Whole-Slide Images

Yunheng Wu, Wenqi Huang, Liangyi Wang, Masahiro Oda, Yuichiro Hayashi, Daniel Rueckert, Kensaku Mori

TL;DR

WSI-INR enables INRs to segment highly heterogeneous pathological lesions beyond structurally consistent anatomical tissues, offering a fresh perspective for pathological analysis.

Abstract

Whole-slide images (WSIs) are fundamental to computational pathology, where accurate lesion segmentation is critical for clinical decision making. Existing methods partition WSIs into discrete patches, which disrupts spatial continuity and treats multi-resolution views as independent samples; the result is spatially fragmented segmentation and reduced robustness to resolution variations. To address these issues, we propose WSI-INR, a novel patch-free framework based on Implicit Neural Representations (INRs). WSI-INR models the WSI as a continuous implicit function that maps spatial coordinates directly to tissue semantic features, outputting segmentation results while preserving intrinsic spatial information across the entire slide. WSI-INR incorporates multi-resolution hash grid encoding, which treats different resolution levels as varying sampling densities of the same continuous tissue and thereby achieves a consistent feature representation across resolutions. In addition, by jointly training a shared INR decoder, WSI-INR can capture general priors across different cases. Experimental results show that WSI-INR maintains robust segmentation performance across resolutions; at Base/4, our resolution-specific optimization improves the Dice score by 26.11%, while U-Net and TransUNet decrease by 54.28% and 36.18%, respectively. Crucially, this work enables INRs to segment highly heterogeneous pathological lesions beyond structurally consistent anatomical tissues, offering a fresh perspective for pathological analysis.
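
To make the idea of "resolution levels as sampling densities of the same continuous tissue" concrete, here is a minimal sketch of a 2D multi-resolution hash grid queried by normalized coordinates. It is not the authors' implementation: the class name `HashGrid2D`, the helper `pixel_center`, the hashing primes, and all sizes are assumptions chosen for illustration, and the feature tables are random rather than learned.

```python
import random

# Hashing primes in the style commonly used for spatial hash grids
# (an assumption, not a detail stated in the abstract).
PRIMES = (1, 2654435761)


class HashGrid2D:
    """Toy multi-resolution hash grid over normalized slide coordinates in [0, 1]^2."""

    def __init__(self, n_levels=4, base_res=4, growth=2, table_size=1024, n_feats=2, seed=0):
        rng = random.Random(seed)
        self.n_feats = n_feats
        self.table_size = table_size
        # Grid resolution per level: 4, 8, 16, 32 with the defaults.
        self.resolutions = [base_res * growth**level for level in range(n_levels)]
        # One feature table per level; in a real model these entries would be learned.
        self.tables = [
            [[rng.uniform(-0.01, 0.01) for _ in range(n_feats)] for _ in range(table_size)]
            for _ in range(n_levels)
        ]

    def _vertex(self, level, ix, iy):
        """Hash an integer grid vertex to a row of this level's feature table."""
        index = ((ix * PRIMES[0]) ^ (iy * PRIMES[1])) % self.table_size
        return self.tables[level][index]

    def encode(self, x, y):
        """Concatenate bilinearly interpolated features from every level."""
        features = []
        for level, res in enumerate(self.resolutions):
            gx, gy = x * res, y * res
            x0, y0 = int(gx), int(gy)  # floor, valid because coordinates are >= 0
            tx, ty = gx - x0, gy - y0
            f00 = self._vertex(level, x0, y0)
            f10 = self._vertex(level, x0 + 1, y0)
            f01 = self._vertex(level, x0, y0 + 1)
            f11 = self._vertex(level, x0 + 1, y0 + 1)
            for k in range(self.n_feats):
                row0 = f00[k] * (1 - tx) + f10[k] * tx
                row1 = f01[k] * (1 - tx) + f11[k] * tx
                features.append(row0 * (1 - ty) + row1 * ty)
        return features


def pixel_center(index, n_pixels):
    """Normalized coordinate of a pixel center on an axis sampled with n_pixels."""
    return (index + 0.5) / n_pixels


grid = HashGrid2D()
# The same encoder is queried at "Base" (8 pixels per axis here) and at "Base/4"
# (2 pixels per axis); only the sampling density of the coordinates changes.
base_coords = [pixel_center(i, 8) for i in range(8)]
coarse_coords = [pixel_center(i, 2) for i in range(2)]
coarse_features = [grid.encode(x, x) for x in coarse_coords]
```

Because the encoder is a function of position rather than of a pixel grid, a coarser view of the slide simply queries fewer coordinates of the same function, in contrast to a patch-based model that sees each resolution as a separate input.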

Paper Structure

This paper contains 5 sections, 5 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: (a) Patch-based methods. (b) Our WSI-INR. (c) WSI resolution pyramid and multi-resolution hash encoding. (d) Patches across resolutions exhibit inconsistent representations, shown by visualizing the bottleneck features of a U-Net trained on a WSI dataset.
  • Figure 2: Workflow of our WSI-INR. Training: We first optimize the decoder, reconstruction head, and hash encoding by minimizing the reconstruction loss. These modules are then frozen, and only the segmentation head is trained. Inference: For each unseen WSI, network parameters remain fixed, while the hash encoding is optimized via the reconstruction loss. The final segmentation is obtained from the optimized representation.
  • Figure 3: We compare our WSI-INR with U-Net and TransUNet across multiple resolutions, with all models trained at a single base resolution. For WSI-INR, we adopt base-resolution optimization. U-Net performs well at the training resolution, but both U-Net and TransUNet degrade at lower resolutions, with fragmented predictions (black arrows).
  • Figure 4: We visualize the reconstruction and segmentation results under three settings: high-level only (dense sampling), low-level only, and the full hash grid. Columns: (a) global reconstruction, (b) zoomed-in local patch, (c) frequency spectrum (FFT) of local patch, (d) global segmentation prediction, and (e) local segmentation mask.
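
The workflow in the Figure 2 caption amounts to a schedule of which parameter groups are updated in each phase. The sketch below writes that schedule out as plain data, assuming the four groups named in the caption; the phase names, the `frozen_groups` helper, and the label "segmentation" for the second-stage loss are illustrative assumptions, since the caption does not name that loss.

```python
# Parameter groups named in the Figure 2 caption.
PARAMETER_GROUPS = ("hash_encoding", "decoder", "reconstruction_head", "segmentation_head")

# Which groups are updated in each phase, and with which loss.
# The loss label "segmentation" is an assumption; the caption does not name it.
PHASES = {
    "train_reconstruction": {
        "trainable": {"hash_encoding", "decoder", "reconstruction_head"},
        "loss": "reconstruction",
    },
    "train_segmentation": {
        "trainable": {"segmentation_head"},
        "loss": "segmentation",
    },
    "inference_unseen_wsi": {
        "trainable": {"hash_encoding"},
        "loss": "reconstruction",
    },
}


def frozen_groups(phase):
    """Return the parameter groups that are not updated in the given phase."""
    trainable = PHASES[phase]["trainable"]
    return [group for group in PARAMETER_GROUPS if group not in trainable]


for phase, spec in PHASES.items():
    print(f"{phase}: update {sorted(spec['trainable'])}, "
          f"keep fixed {frozen_groups(phase)}, loss={spec['loss']}")
```

Read this way, inference on an unseen slide needs no labels: only the slide's hash encoding is fitted through the reconstruction loss, and the fixed segmentation head is then applied to the resulting representation.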