SlideGCD: Slide-based Graph Collaborative Training with Knowledge Distillation for Whole Slide Image Classification
Tong Shu, Jun Shi, Dongdong Sun, Zhiguo Jiang, Yushan Zheng
TL;DR
WSI classification has historically relied on patch-level MIL, which often neglects the relationships between slides. SlideGCD introduces a slide-based graph branch with a rehearsal-based Node Buffer and an adaptive graph generator to capture inter-slide correlations, coupled with a two-branch training regime and knowledge distillation from the MIL backbone to the graph model. The method employs a 2-layer Hypergraph Convolution with Centering-Attention and a distillation loss $L_{KD}=L_{JS}(\hat{y}_G, \hat{y}_{MIL}, \hat{t})$, optimizing $L = L_{CE}(\hat{y}_{MIL}, Y) + L_{CE}(\hat{y}_G, Y) + w \cdot L_{KD}$. Evaluations on TCGA BRCA and NSCLC show consistent improvements across four MIL baselines with modest computational overhead, indicating that leveraging slide-level correlations improves the performance and robustness of WSI classification.
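To make the objective above concrete, the following is a minimal single-sample sketch in plain Python, not the authors' implementation. It assumes that $\hat{t}$ acts as a softmax temperature inside the Jensen-Shannon term and that the outputs $\hat{y}_{MIL}$ and $\hat{y}_G$ are raw logits; the function names (`softmax`, `js_divergence`, `two_branch_loss`) are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; subtracting the max keeps exp() stable.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # L_CE for one sample: negative log-probability of the true class.
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) with a small epsilon to avoid log(0).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def js_divergence(logits_a, logits_b, temperature=1.0):
    # Jensen-Shannon divergence between two softened distributions.
    # ASSUMPTION: the temperature plays the role of t-hat in L_KD.
    p = softmax(logits_a, temperature)
    q = softmax(logits_b, temperature)
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)

def two_branch_loss(logits_mil, logits_graph, label, w=1.0, temperature=1.0):
    # L = L_CE(y_MIL, Y) + L_CE(y_G, Y) + w * L_KD
    l_mil = cross_entropy(logits_mil, label)
    l_graph = cross_entropy(logits_graph, label)
    l_kd = js_divergence(logits_graph, logits_mil, temperature)
    return l_mil + l_graph + w * l_kd

if __name__ == "__main__":
    print(two_branch_loss([2.0, 0.5], [1.0, 1.5], label=0, w=0.5, temperature=2.0))
```

When both branches produce identical logits the distillation term vanishes and the loss reduces to the two cross-entropy terms. In an actual training loop the MIL outputs would usually be detached from the gradient graph when they act as the teacher; whether SlideGCD does so is not stated here.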
Abstract
Existing WSI analysis methods rest on the consensus that the histopathological characteristics of tumors provide significant guidance for cancer diagnostics. In particular, because the evolution of cancer is a continuous process, the correlations and differences across stages, anatomical locations, and patients should be taken into account. However, recent research mainly focuses on the contextual information within a single WSI and ignores the correlations between slides. To verify whether introducing inter-slide correlations can improve WSI representation learning, we propose SlideGCD, a generic WSI analysis pipeline that takes existing multi-instance learning (MIL) methods as the backbone and recasts the WSI classification task as a node classification problem. More specifically, SlideGCD maintains a node buffer that stores previous slide embeddings for the subsequent construction of an extensive slide-based graph, and conducts graph learning to explore the inter-slide correlations implied in that graph. Moreover, we frame the MIL classifier and graph learning as two parallel workflows and employ knowledge distillation to transfer the differentiable information to the graph neural network. SlideGCD consistently improves the performance of four previous state-of-the-art MIL methods on two TCGA benchmark datasets. The code is available at https://github.com/HFUT-miaLab/SlideGCD.
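The node buffer idea can be illustrated with a short Python sketch. This is an assumption-laden toy, not the released code: the buffer is modeled as a fixed-capacity FIFO queue, and the slide graph is built with a plain cosine-similarity k-nearest-neighbor rule, which stands in for the adaptive graph generator used by SlideGCD. The names `NodeBuffer` and `knn_edges` are hypothetical.

```python
import math
from collections import deque

class NodeBuffer:
    """Fixed-capacity FIFO store of slide embeddings from earlier mini-batches."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently evicts the oldest entry when full.
        self.items = deque(maxlen=capacity)

    def push(self, embedding):
        self.items.append(list(embedding))

    def nodes(self):
        return list(self.items)

def cosine(u, v):
    # Cosine similarity with a small epsilon guarding against zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def knn_edges(nodes, k):
    # ASSUMPTION: a simple kNN rule replaces the paper's adaptive generator.
    # Returns directed edges (i, j): node i points to its k most similar nodes.
    edges = []
    for i, u in enumerate(nodes):
        scored = [(cosine(u, v), j) for j, v in enumerate(nodes) if j != i]
        scored.sort(reverse=True)
        edges.extend((i, j) for _, j in scored[:k])
    return edges

if __name__ == "__main__":
    buffer = NodeBuffer(capacity=3)
    for emb in ([5.0, 5.0], [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]):
        buffer.push(emb)  # the first embedding is evicted on the fourth push
    print(knn_edges(buffer.nodes(), k=1))
```

In a training loop, the embeddings of the current mini-batch would be combined with the buffered ones before graph construction, so that each slide is classified as a node with neighbors drawn from previously seen slides.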
