CARMIL: Context-Aware Regularization on Multiple Instance Learning models for Whole Slide Images

Thiziri Nait Saada; Valentina Di Proietto; Benoit Schmauch; Katharina Von Loga; Lucas Fidon

TL;DR

This work tackles the spatial-context gap in Multiple Instance Learning for Whole Slide Images by introducing Context-Aware Regularization (CAR). CAR adds a spatial encoder and decoder to learn embeddings $Z$ that encode tile neighborhood structure and enforces reconstruction of the tile adjacency via a CAR loss, enabling any MIL model to leverage spatial context without redesigning the backbone. A DeltaCon-based metric is proposed to quantify Context-Awareness by comparing spatial and embedding-space adjacencies, providing a tool to assess spatial coherence of tile embeddings. Across glioblastoma and colon cancer survival tasks, CAR improves C-index by up to $4.8$ percentage points and yields more spatially coherent embeddings, with parameter analyses offering guidance for practical deployment.
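The adjacency-reconstruction idea in the TL;DR can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's implementation: the 8-connected grid adjacency, the inner-product decoder with a sigmoid, the binary cross-entropy form of the loss, and the weight `lam` are all hypothetical choices that the text above does not specify.

```python
import numpy as np


def grid_adjacency(coords, radius=1):
    """Binary tile adjacency A (assumed: 8-connected grid neighbors).

    Two tiles are linked when their grid coordinates differ by at most
    `radius` along every axis. Self-loops are excluded.
    """
    coords = np.asarray(coords)
    cheb = np.abs(coords[:, None, :] - coords[None, :, :]).max(axis=-1)
    A = (cheb <= radius).astype(float)
    np.fill_diagonal(A, 0.0)
    return A


def car_loss(Z, A):
    """Reconstruction loss between A and a decoded adjacency (assumed form).

    Assumed decoder: sigmoid(Z @ Z.T). The binary cross-entropy is written
    with logits as softplus(x) - A * x for numerical stability and is
    averaged over all off-diagonal tile pairs.
    """
    logits = Z @ Z.T
    bce = np.logaddexp(0.0, logits) - A * logits
    off_diagonal = ~np.eye(A.shape[0], dtype=bool)
    return bce[off_diagonal].mean()


def total_loss(task_loss, Z, A, lam=0.1):
    """Task loss plus the weighted regularizer; `lam` is a hypothetical weight."""
    return task_loss + lam * car_loss(Z, A)
```

Because the regularizer only needs the embeddings $Z$ and the tile adjacency, a term of this kind can be added to the objective of an existing MIL model without changing its aggregation module, which is the property the TL;DR emphasizes.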

Abstract

Multiple Instance Learning (MIL) models have proven effective for cancer prognosis from Whole Slide Images. However, the original MIL formulation incorrectly assumes that the patches of an image are independent, so spatial context is lost as information flows through the network. Incorporating contextual knowledge into predictions is particularly important given the tendency of cancerous cells to form clusters and the presence of spatial indicators for tumors. State-of-the-art methods often use attention mechanisms, sometimes combined with graphs, to capture spatial knowledge. In this paper, we take a novel and transversal approach, addressing this issue through the lens of regularization. We propose Context-Aware Regularization for Multiple Instance Learning (CARMIL), a versatile regularization scheme designed to integrate spatial knowledge into any MIL model. Additionally, we present a new and generic metric to quantify the Context-Awareness of any MIL model when applied to Whole Slide Images, addressing a previously unexplored gap in the field. The efficacy of our framework is evaluated on two survival analysis tasks, using glioblastoma (TCGA GBM) and colon cancer (TCGA COAD) data.
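The Context-Awareness metric is described in the TL;DR as DeltaCon-based, comparing the spatial adjacency $A$ with an embedding-space adjacency $A(Z)$. A minimal sketch of the standard DeltaCon similarity follows. How $A(Z)$ is built is not stated here, so the symmetric k-nearest-neighbor graph in `knn_adjacency` is an assumption made only for illustration, as are the function names.

```python
import numpy as np


def knn_adjacency(X, k):
    """Symmetric binary k-nearest-neighbor graph over the rows of X.

    Assumed construction for A(Z); the source does not specify one.
    """
    X = np.asarray(X, dtype=float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros_like(d2)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, nearest.ravel()] = 1.0
    return np.maximum(A, A.T)  # symmetrize


def deltacon_similarity(A1, A2):
    """DeltaCon similarity in (0, 1] between two graphs on the same nodes.

    Node affinities: S = (I + eps^2 * D - eps * A)^(-1), eps = 1 / (1 + max degree).
    The affinity matrices are compared with the root Euclidean (Matusita)
    distance d, and the similarity is 1 / (1 + d).
    """
    n = A1.shape[0]

    def affinity(A):
        degrees = A.sum(axis=1)
        eps = 1.0 / (1.0 + degrees.max())
        return np.linalg.inv(np.eye(n) + eps**2 * np.diag(degrees) - eps * A)

    # Clip guards against tiny negative entries caused by round-off.
    S1 = np.clip(affinity(A1), 0.0, None)
    S2 = np.clip(affinity(A2), 0.0, None)
    d = np.sqrt(((np.sqrt(S1) - np.sqrt(S2)) ** 2).sum())
    return 1.0 / (1.0 + d)
```

A value of 1 means the two graphs induce identical node affinities; lower values indicate that neighborhoods in embedding space depart from the spatial layout of the tiles. The dense matrix inverse is only practical for small graphs; a slide with many thousands of tiles would need the approximate variant of DeltaCon or a sparse solver.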

Paper Structure (30 sections, 13 equations, 3 figures, 4 tables)

Introduction
Methods
Background: classical MIL framework for risk prediction
Preprocessing.
MIL model.
Context-Aware Regularization (CAR)
Graph construction.
Context-Aware Regularization.
DeltaCon for quantitative measure of Context-Awareness
Implementation details
Datasets for survival prediction using whole slide images
Glioblastoma data.
Colon cancer data.
Evaluation
Preprocessing
...and 15 more sections

Figures (3)

Figure 1: Our proposed framework, CARMIL, enhances any MIL model by incorporating spatial information through Context-Aware Regularization. In CARMIL, tile embeddings are fine-tuned to allow for graph reconstruction by a spatial decoder.
Figure 2: Spatial information retained in CARMIL models with C-index performance. Context-Awareness is quantified using the DeltaCon similarity between $A$ and $A(Z)$; see the section "DeltaCon for quantitative measure of Context-Awareness". Each point represents an ensemble of CARMIL models from nested cross-validation, with filled areas indicating one standard deviation around the mean. The vertical dotted line shows the average Context-Awareness in the original features.
Figure 3: Heatmap of the mean Euclidean distance between the representation of each tile and those of its $8$ spatial nearest neighbors, with representations given by (a) Phikon, the feature extractor, and (b) the CARABMIL spatial encoder. All mean distances were scaled together to $[0, 1]$. The WSI is taken from TCGA GBM.
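The quantity shown in Figure 3 can be computed along the following lines. This is a sketch under assumptions rather than the authors' code: the function names are hypothetical, tiles are assumed to carry 2D coordinates, and the joint scaling is assumed to be a min-max scaling over both maps.

```python
import numpy as np


def mean_neighbor_distance(coords, feats, k=8):
    """Per-tile mean Euclidean distance, in feature space, to the k tiles
    that are closest in slide coordinates."""
    coords = np.asarray(coords, dtype=float)
    feats = np.asarray(feats, dtype=float)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # exclude the tile itself
    neighbors = np.argsort(d2, axis=1)[:, :k]      # shape (n_tiles, k)
    diffs = feats[:, None, :] - feats[neighbors]   # shape (n_tiles, k, dim)
    return np.linalg.norm(diffs, axis=-1).mean(axis=1)


def scale_jointly(*maps):
    """Min-max scale several maps to [0, 1] using one shared range, so that
    values stay comparable across panels (assumed scaling)."""
    lo = min(m.min() for m in maps)
    hi = max(m.max() for m in maps)
    span = hi - lo if hi > lo else 1.0
    return [(m - lo) / span for m in maps]
```

With `feats` set first to the feature-extractor outputs and then to the spatial-encoder outputs, the two resulting maps can be passed to `scale_jointly` and drawn at the tile coordinates to obtain panels analogous to (a) and (b).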
