Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images

Zhengrui Guo; Qichen Sun; Jiabo Ma; Lishuang Feng; Jinzhuo Wang; Hao Chen

TL;DR

Querent is a query-aware, dynamic long-context modeling framework for gigapixel whole slide images (WSIs). It retains the expressive power of full self-attention while substantially reducing computation through region-level metadata and selective attention. The method combines min-max region summarization, a region-importance estimator, and query-guided attention to achieve near-linear scaling on long sequences. A theoretical guarantee bounds the approximation error relative to full self-attention, and experiments on biomarker prediction, gene mutation prediction, cancer subtyping, and survival analysis across more than 10 WSI datasets show state-of-the-art performance. The framework offers a practical, scalable approach to deep learning in computational pathology, with potential clinical impact pending further validation.

Abstract

Whole slide image (WSI) analysis presents significant computational challenges due to the massive number of patches in gigapixel images. While transformer architectures excel at modeling long-range correlations through self-attention, their quadratic computational complexity makes them impractical for computational pathology applications. Existing solutions such as local-global or linear self-attention reduce computational cost but compromise the strong modeling capability of full self-attention. In this work, we propose Querent, a query-aware long contextual dynamic modeling framework that achieves a theoretically bounded approximation of full self-attention while delivering practical efficiency. Our method adaptively predicts which surrounding regions are most relevant to each patch, so that attention is computed only over potentially important contexts while remaining unrestricted in where those contexts lie. By using efficient region-wise metadata computation and importance estimation, our approach dramatically reduces computational overhead while preserving the global perception needed to model fine-grained patch correlations. In comprehensive experiments on biomarker prediction, gene mutation prediction, cancer subtyping, and survival analysis across more than 10 WSI datasets, our method outperforms state-of-the-art approaches. Code is available at https://github.com/dddavid4real/Querent.
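The pipeline described in the abstract (region-wise metadata, importance estimation, then attention restricted to the selected regions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; see the linked repository for that. It rests on several assumptions made only for illustration: regions are consecutive chunks of a 1-D patch sequence rather than spatial windows, the metadata is the element-wise min and max of the key vectors in each region, and the importance score is an upper bound on the query-key dot product derived from those min-max values. All function names (`region_minmax`, `region_scores`, `selective_attention`, `full_attention`) are hypothetical.

```python
import numpy as np


def region_minmax(keys, region_size):
    """Summarize each region by the element-wise min and max of its keys."""
    n, d = keys.shape
    assert n % region_size == 0, "sketch assumes n is a multiple of region_size"
    regions = keys.reshape(n // region_size, region_size, d)
    return regions.min(axis=1), regions.max(axis=1)


def region_scores(queries, kmin, kmax):
    """Assumed estimator: an upper bound on q.k over all keys in a region.

    Per dimension, the largest possible q_d * k_d is reached at either the
    region minimum or the region maximum, so summing the larger of the two
    products bounds the true dot product from above.
    """
    lo = queries[:, None, :] * kmin[None, :, :]
    hi = queries[:, None, :] * kmax[None, :, :]
    return np.maximum(lo, hi).sum(axis=-1)  # shape: (n_queries, n_regions)


def full_attention(queries, keys, values):
    """Reference softmax attention over every key (quadratic cost)."""
    logits = queries @ keys.T / np.sqrt(keys.shape[1])
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    return (w / w.sum(axis=1, keepdims=True)) @ values


def selective_attention(queries, keys, values, region_size, top_k):
    """Attend only to the patches inside each query's top_k scored regions."""
    d = keys.shape[1]
    kmin, kmax = region_minmax(keys, region_size)
    scores = region_scores(queries, kmin, kmax)
    top = np.argsort(-scores, axis=1)[:, :top_k]  # region ids per query
    offsets = np.arange(region_size)
    out = np.empty((queries.shape[0], values.shape[1]))
    for i, q in enumerate(queries):
        # Expand the selected region ids into the patch indices they cover.
        idx = (top[i][:, None] * region_size + offsets[None, :]).ravel()
        logits = keys[idx] @ q / np.sqrt(d)
        w = np.exp(logits - logits.max())
        out[i] = (w / w.sum()) @ values[idx]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(32, 8))  # 32 patch features of dimension 8
    sparse = selective_attention(x, x, x, region_size=4, top_k=2)
    dense = full_attention(x, x, x)
    print(np.abs(sparse - dense).max())  # approximation gap for this toy input
```

In this sketch each query scores `n / region_size` regions and then attends to `top_k * region_size` patches, so the per-query cost no longer grows with the full sequence length once `top_k` is fixed. Selecting every region makes the result coincide with full self-attention, which is a convenient sanity check.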
