Unit-Based Histopathology Tissue Segmentation via Multi-Level Feature Representation

Ashkan Shakarami; Azade Farshad; Yousef Yeganeh; Lorenzo Nicole; Peter Schüffler; Stefano Ghidoni; Nassir Navab

TL;DR

This paper tackles the annotation and computational bottlenecks of pixel-wise histopathology segmentation by introducing a unit-based framework (UTS) that classifies fixed-size tiles of $32 \times 32$ pixels. Central to UTS is L-ViT, a Multi-Level Vision Transformer that combines an EfficientNetB3 backbone, MLFF, and attention modules (DAT-SE, D-CBAM) to capture both local morphology and global tissue context. On a large, tile-based breast tissue dataset, the approach outperforms CNN baselines and state-of-the-art pixel-wise models in DSC and IoU while offering substantial efficiency gains. A refinement stage (Neighborhood-Based Smoothing and Class Discretization) improves boundary coherence and interpretability, supporting clinical workflows through WSI overlays and quantitative tissue composition analysis. The work positions unit-based segmentation as a scalable, annotation-efficient paradigm for digital pathology, with practical impact on tumor quantification and surgical margin assessment.
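The summary names a refinement stage (Neighborhood-Based Smoothing and Class Discretization) and a tissue composition analysis, but gives no formulas for them. The sketch below is therefore only one plausible reading, not the authors' method: it assumes smoothing means averaging per-tile class probabilities over the in-bounds 3x3 neighbourhood, and discretization means taking the arg-max class. The names `CLASSES`, `smooth_and_discretize`, and `composition` are hypothetical.

```python
from collections import Counter

# Class order is an assumption; the three tissue types come from the abstract.
CLASSES = ("tumor", "stroma", "fat")


def smooth_and_discretize(probs):
    """Average each tile's class probabilities with its 3x3 neighbours,
    then pick the most likely class (assumed form of the refinement stage).

    probs[r][c] is a list of len(CLASSES) probabilities for tile (r, c).
    """
    rows, cols = len(probs), len(probs[0])
    label_map = []
    for r in range(rows):
        label_row = []
        for c in range(cols):
            acc = [0.0] * len(CLASSES)
            n = 0
            # Only neighbours inside the grid contribute to the average.
            for rr in range(max(0, r - 1), min(rows, r + 2)):
                for cc in range(max(0, c - 1), min(cols, c + 2)):
                    for k, p in enumerate(probs[rr][cc]):
                        acc[k] += p
                    n += 1
            mean = [a / n for a in acc]
            label_row.append(CLASSES[mean.index(max(mean))])
        label_map.append(label_row)
    return label_map


def composition(label_map):
    """Fraction of tiles assigned to each class."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {name: counts[name] / total for name in CLASSES}


# A lone "stroma" tile surrounded by confident "tumor" tiles is smoothed away.
tumor, stroma = [0.9, 0.05, 0.05], [0.1, 0.8, 0.1]
grid = [[tumor, tumor, tumor], [tumor, stroma, tumor], [tumor, tumor, tumor]]
labels = smooth_and_discretize(grid)
print(composition(labels))
```

Averaging probabilities rather than hard labels keeps confident predictions influential; a majority vote over hard labels would be an equally reasonable alternative under the same lack of detail.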

Abstract

We propose UTS, a unit-based tissue segmentation framework for histopathology that classifies each fixed-size $32 \times 32$ tile, rather than each pixel, as the segmentation unit. This approach reduces annotation effort and improves computational efficiency without compromising accuracy. To implement it, we introduce a Multi-Level Vision Transformer (L-ViT), which leverages multi-level feature representations to capture both fine-grained morphology and global tissue context. Trained to segment breast tissue into three categories (infiltrating tumor, non-neoplastic stroma, and fat), UTS supports clinically relevant tasks such as tumor-stroma quantification and surgical margin assessment. Evaluated on 386,371 tiles from 459 H&E-stained regions, it outperforms U-Net variants and transformer-based baselines. The code and dataset will be made available on GitHub.
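To make the unit-based idea concrete, the following minimal sketch splits a region into non-overlapping $32 \times 32$ tiles and assigns one label per tile. It is illustrative only: `tile_origins`, `segment_by_units`, and the toy `brightness_classifier` are hypothetical names, the classifier is a stand-in for L-ViT rather than the model itself, and dropping partial border tiles is an assumption the abstract does not address.

```python
UNIT = 32  # tile side length in pixels, as stated in the abstract


def tile_origins(height, width, unit=UNIT):
    """Row-major grid of (y, x) top-left corners of all full tiles.

    Partial tiles at the right and bottom borders are dropped (assumption).
    """
    rows, cols = height // unit, width // unit
    return [[(r * unit, c * unit) for c in range(cols)] for r in range(rows)]


def segment_by_units(image, classify_tile, unit=UNIT):
    """Label each full tile of `image` (a list of pixel rows) with one class."""
    height, width = len(image), len(image[0])
    label_map = []
    for origin_row in tile_origins(height, width, unit):
        labels = []
        for y, x in origin_row:
            tile = [line[x:x + unit] for line in image[y:y + unit]]
            labels.append(classify_tile(tile))
        label_map.append(labels)
    return label_map


def brightness_classifier(tile):
    """Toy stand-in for L-ViT: bright tiles are 'fat', the rest 'stroma'."""
    mean = sum(sum(line) for line in tile) / (len(tile) * len(tile[0]))
    return "fat" if mean > 200 else "stroma"


# A 64 x 96 grayscale region: bright left third, dark elsewhere.
region = [[255] * 32 + [0] * 64 for _ in range(64)]
print(segment_by_units(region, brightness_classifier))
```

A 64 x 96 region yields a 2 x 3 label map, so the classifier runs 6 times instead of producing 6,144 per-pixel outputs, which illustrates where the efficiency and annotation savings of a tile-level unit come from.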
