Towards Effective and Efficient Context-aware Nucleus Detection in Histopathology Whole Slide Images

Zhongyi Shui, Ruizhe Guo, Honglin Li, Yuxuan Sun, Yunlong Zhang, Chenglu Zhu, Jiatong Cai, Pingyi Chen, Yanzhou Su, Lin Yang

TL;DR

This work tackles efficient context-aware nucleus detection in gigapixel histopathology WSIs by avoiding costly large FoV crops and instead aggregating contextual cues from surrounding patches seen during inference. It introduces a shared encoder for ROI and surrounding patches at the same magnification, uses gradient-free surrounding feature extraction, and applies self-attention to fuse context followed by cross-attention to inject it into ROI representations, with a grid-pooling step to reduce token count. The method achieves notable gains over state-of-the-art baselines in nucleus detection and segmentation on the OCELOT dataset, and introduces OCELOT-seg, a dedicated benchmark for context-aware nucleus segmentation, while delivering substantial speedups (about 3.26×) over previous approaches. These results demonstrate practical impact for rapid and accurate nucleus analysis in clinical histopathology, enabling scalable, context-aware inference on gigapixel WSIs.
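The two-stage fusion described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes single-head scaled dot-product attention with identity projections (no learned weights) and a residual addition onto the ROI tokens, and the names `attend` and `fuse_context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: each query becomes a weighted mean of values."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out


def fuse_context(roi_tokens, context_tokens):
    """Self-attention over context tokens, then cross-attention from ROI tokens.

    Assumption: identity projections and a residual connection; the paper's
    actual layers are learned and may differ.
    """
    # Step 1: context tokens exchange information among themselves.
    mixed = attend(context_tokens, context_tokens, context_tokens)
    # Step 2: ROI tokens act as queries and read from the fused context.
    injected = attend(roi_tokens, mixed, mixed)
    # Residual addition keeps the original ROI representation.
    return [[r + i for r, i in zip(roi, inj)] for roi, inj in zip(roi_tokens, injected)]
```

The output has one token per ROI token, so the detection head downstream sees the same shape whether or not context is injected.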

Abstract

Nucleus detection in histopathology whole slide images (WSIs) is crucial for a broad spectrum of clinical applications. Current approaches for nucleus detection in gigapixel WSIs utilize a sliding window methodology, which overlooks broader contextual information (e.g., tissue structure) and easily leads to inaccurate predictions. To address this problem, recent studies additionally crop a large Field-of-View (FoV) region around each sliding window to extract contextual features. However, such methods substantially increase the inference latency. In this paper, we propose an effective and efficient context-aware nucleus detection algorithm. Specifically, instead of leveraging large FoV regions, we aggregate contextual clues from off-the-shelf features of historically visited sliding windows. This design greatly reduces computational overhead. Moreover, compared to large FoV regions at a low magnification, the sliding window patches have higher magnification and provide finer-grained tissue details, thereby enhancing the detection accuracy. To further improve the efficiency, we propose a grid pooling technique to compress dense feature maps of each patch into a few contextual tokens. Finally, we craft OCELOT-seg, the first benchmark dedicated to context-aware nucleus instance segmentation. Code, dataset, and model checkpoints will be available at https://github.com/windygoo/PathContext.
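To make the grid pooling idea concrete, here is a small sketch of one plausible reading: the dense feature map of a patch is split into a g × g grid and each cell is averaged into one contextual token. The abstract does not state the pooling operator, so average pooling is an assumption, and the function name `grid_pool` and the `[row][col][channel]` layout are hypothetical.

```python
from typing import List

# Dense feature map indexed as [row][col][channel].
FeatureMap = List[List[List[float]]]


def grid_pool(feature_map: FeatureMap, grid: int) -> List[List[float]]:
    """Compress an H x W x C feature map into grid * grid tokens of length C.

    Assumption: each grid cell is summarised by its mean; the paper may use
    a different pooling operator.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    if height % grid or width % grid:
        raise ValueError("H and W must be divisible by the grid size in this sketch")
    cell_h, cell_w = height // grid, width // grid

    tokens = []
    for gy in range(grid):
        for gx in range(grid):
            total = [0.0] * channels
            # Accumulate every feature vector that falls inside this cell.
            for y in range(gy * cell_h, (gy + 1) * cell_h):
                for x in range(gx * cell_w, (gx + 1) * cell_w):
                    for c in range(channels):
                        total[c] += feature_map[y][x][c]
            tokens.append([v / (cell_h * cell_w) for v in total])
    return tokens


# Toy 4 x 4 map with one channel whose value is row * 4 + col.
toy_map = [[[float(r * 4 + c)] for c in range(4)] for r in range(4)]
tokens = grid_pool(toy_map, 2)  # 16 positions -> 4 tokens: [[2.5], [4.5], [10.5], [12.5]]
```

With a small grid, each visited patch contributes only a handful of tokens, so the attention cost over surrounding patches grows with the grid size rather than with the feature-map resolution.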

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: (a) Typical nucleus detection methods operate on gigapixel WSIs in a sliding window manner. The model detects nuclei in each window patch without understanding broader tissue structure, which easily leads to inaccurate predictions. (b) Pathologists first zoom out to examine the tissue context at large FoVs and then zoom in to observe detailed nuclear morphology for accurate assessments (Ryu et al., 2023).
  • Figure 2: Pipeline comparison. (a) Typical nucleus detectors operate on patches of a single FoV without considering tissue context. (b) Previous context-aware nucleus detection approaches leverage a large FoV patch to extract contextual information. (c) Our method aggregates contextual information from off-the-shelf features of historically visited surrounding patches during whole-slide inference.
  • Figure 3: Training details of our proposed context-aware nucleus detection method.
  • Figure 4: Qualitative comparison results.
  • Figure 5: Effect of pooling grid number.