Beyond Linearity: Squeeze-and-Recalibrate Blocks for Few-Shot Whole Slide Image Classification

Conghao Xiong, Zhengrui Guo, Zhe Xu, Yifei Zhang, Raymond Kai-Yu Tong, Si Yong Yeo, Hao Chen, Joseph J. Y. Sung, Irwin King

TL;DR

The paper tackles few-shot whole-slide image classification where limited annotations lead to overfitting and mischaracterized features. It introduces a Squeeze-and-Recalibrate (SR) block as a drop-in replacement for linear layers in MIL: a trainable low-rank Squeeze Pathway and a frozen recalibration matrix that enriches representations. The authors prove a universal approximation result and geometry-preserving properties for SR, ensuring performance at least matches the linear baseline while offering richer feature directions. Empirically, SR-MIL consistently outperforms state-of-the-art methods on Camelyon16, TCGA-NSCLC, and TCGA-RCC across shot settings with far fewer trainable parameters and no architectural changes. These findings suggest SR blocks enable robust, data-efficient pathology analysis and hold promise for broader applicability in few-shot learning across domains.
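
To make the "far fewer trainable parameters" claim concrete, here is a minimal sketch of the arithmetic behind a low-rank pathway. It compares a dense linear layer with a pair of rank-`r` factors. The layer sizes and rank are hypothetical values chosen for illustration and are not taken from the paper; biases are ignored.

```python
def linear_params(d_in: int, d_out: int) -> int:
    """Trainable weights in a dense d_in x d_out linear map (bias ignored)."""
    return d_in * d_out


def squeeze_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in the factor pair W1 (d_in x rank) and W2 (rank x d_out)."""
    return rank * (d_in + d_out)


# Hypothetical sizes for illustration only (not reported in the source).
d_in, d_out, rank = 1024, 512, 16

full = linear_params(d_in, d_out)
low = squeeze_params(d_in, d_out, rank)

print(f"dense layer:     {full} weights")
print(f"squeeze pathway: {low} weights ({low / full:.2%} of dense)")
```

The frozen recalibration matrix has the same shape as the dense weight but is not trained, so it adds storage without adding trainable parameters.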

Abstract

Deep learning has advanced computational pathology, but expert annotations remain scarce. Few-shot learning mitigates annotation burdens yet suffers from overfitting and discriminative feature mischaracterization. Current few-shot multiple instance learning (MIL) approaches leverage pretrained vision-language models to alleviate these issues, but they require complex preprocessing and incur high computational cost. To address these challenges, we propose a Squeeze-and-Recalibrate (SR) block, a drop-in replacement for linear layers in MIL models. The SR block comprises two core components: a pair of low-rank trainable matrices (squeeze pathway, SP) that reduces parameter count and imposes a bottleneck to prevent spurious feature learning, and a frozen random recalibration matrix that preserves geometric structure, diversifies feature directions, and redefines the optimization objective for the SP. We provide theoretical guarantees that the SR block can approximate any linear mapping to arbitrary precision, thereby ensuring that the performance of a standard MIL model serves as a lower bound for its SR-enhanced counterpart. Extensive experiments demonstrate that our SR-MIL models consistently outperform prior methods while requiring significantly fewer parameters and no architectural changes.
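
The structure described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation. The excerpt does not say how the squeeze pathway and the frozen matrix are combined; the code below assumes an elementwise (Hadamard) product, which fits a frozen matrix of the same shape as the target map. The sizes, the Gaussian draw for the frozen matrix, and the helper names `sr_weight` and `best_squeeze` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d0, d1 = 8, 6  # hypothetical input/output widths

A_star = rng.standard_normal((d0, d1))  # target linear map to approximate
B = rng.standard_normal((d0, d1))       # frozen random matrix (Gaussian entries are sub-Gaussian)


def sr_weight(W1, W2, B):
    """Effective weight of the block.

    ASSUMPTION: the low-rank product and the frozen matrix are combined
    elementwise; the source excerpt does not state the combination rule.
    """
    return (W1 @ W2) * B


def best_squeeze(A, B, r):
    """Rank-r factors of the best Frobenius-norm approximation to A / B.

    Uses a truncated SVD. Entries of B are nonzero with probability one
    for a continuous distribution, so the elementwise division is defined.
    """
    U, s, Vt = np.linalg.svd(A / B, full_matrices=False)
    W1 = U[:, :r] * s[:r]  # shape (d0, r), singular values folded in
    W2 = Vt[:r, :]         # shape (r, d1)
    return W1, W2


# Approximation error of the effective weight for each rank.
errors = []
for r in range(1, min(d0, d1) + 1):
    W1, W2 = best_squeeze(A_star, B, r)
    errors.append(np.linalg.norm(sr_weight(W1, W2, B) - A_star))

# Forward pass on a batch of hypothetical instance features.
X = rng.standard_normal((4, d0))
Y = X @ sr_weight(W1, W2, B)  # uses the factors from the last (full-rank) iteration

for r, err in enumerate(errors, start=1):
    print(f"r={r}: ||(W1 W2) * B - A_star||_F = {err:.3e}")
```

Under this assumption, the full-rank case recovers the target map up to floating-point error, which illustrates why a linear layer's behaviour is reachable by the block. The truncated SVD is only one convenient way to pick the factors; in training they would be learned by gradient descent with `B` held fixed.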

Paper Structure

This paper contains 45 sections, 3 theorems, 49 equations, 4 figures, 9 tables.

Key Result

Theorem 1

Let $\boldsymbol{A}_{*}\in\mathbb{R}^{d_0\times d_1}$ have full rank $r_A=\min\{d_0,d_1\}$. Draw a frozen matrix $\boldsymbol{B}\in\mathbb{R}^{d_0\times d_1}$ with sub-Gaussian entries. Then almost surely: for every $\varepsilon>0$ there exist integers $r\le r_A$ and matrices $\boldsymbol{W}_2$ and

Figures (4)

  • Figure 1: Illustration of the linear transformation (left), our proposed SR block (middle), and the overall pipeline of MIL for WSI classification task and where our SR block takes effect (right).
  • Figure 2: Sensitivity analysis of parameter $r$ across multiple datasets using LCLAM and LTransMIL. Dashed lines indicate the performance of the respective baseline methods (CLAM and TransMIL).
  • Figure 3: SR-CLAM-generated heatmaps for tumor049 (Camelyon16). The left panel shows heatmaps for three biopsy regions, while the right panel displays fine-grained patches with their heatmaps. Red indicates high attention (tumor), whereas blue indicates low attention (normal tissue).
  • Figure 4: Sensitivity analysis of parameter $r$ across multiple datasets using LCLAM and LTransMIL. Dashed lines indicate the performance of the respective baseline methods (CLAM and TransMIL).

Theorems & Definitions (6)

  • Theorem 1: Universal Approximation by Squeeze-and-Recalibrate Block
  • proof
  • Theorem 2
  • proof
  • Theorem 3: Rank of a Low-Rank Product
  • proof