Are the Latent Representations of Foundation Models for Pathology Invariant to Rotation?

Matouš Elphick, Samra Turajlic, Guang Yang

TL;DR

This work examines whether latent representations from self-supervised foundation models for digital pathology are invariant to patch rotation. The authors rotate H&E patches, encode them with twelve foundation models, and measure the alignment between non-rotated and rotated representations using mutual $k$-nearest neighbours and cosine distance. Models trained with rotation augmentation show significantly greater invariance. The authors hypothesise that this is because the transformer architecture has no rotational inductive bias, so invariance has to be learned during training. Invariance varies from model to model and across rotation angles, which matters for the robustness of downstream pathology tasks. The results suggest including rotation augmentation when training foundation models for pathology.

Abstract

Self-supervised foundation models for digital pathology encode small patches from H&E whole slide images into latent representations used for downstream tasks. However, the invariance of these representations to patch rotation remains unexplored. This study investigates the rotational invariance of latent representations across twelve foundation models by quantifying the alignment between non-rotated and rotated patches using mutual $k$-nearest neighbours and cosine distance. Models that incorporated rotation augmentation during self-supervised training exhibited significantly greater invariance to rotations. We hypothesise that the absence of rotational inductive bias in the transformer architecture necessitates rotation augmentation during training to achieve learned invariance. Code: https://github.com/MatousE/rot-invariance-analysis.
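
As an illustration of the two alignment measures named in the abstract, the following is a minimal sketch, not the authors' implementation (which lives in the linked repository). It assumes the embeddings of the non-rotated and rotated patches are already available as two NumPy arrays with one row per patch and rows in matching order. The function names, the use of cosine similarity to find neighbours, and the default `k` are assumptions made for this example.

```python
import numpy as np


def l2_normalise(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def knn_indices(x, k):
    """Indices of the k most cosine-similar rows for each row, excluding itself."""
    xn = l2_normalise(x)
    sim = xn @ xn.T
    np.fill_diagonal(sim, -np.inf)  # a patch is never its own neighbour
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn(a, b, k=10):
    """Mean fraction of the k nearest neighbours shared between two embedding sets.

    a[i] and b[i] must describe the same patch (e.g. non-rotated and rotated).
    Returns a value in [0, 1]; higher means the neighbourhoods agree more.
    """
    nn_a = knn_indices(a, k)
    nn_b = knn_indices(b, k)
    overlap = [len(set(ra) & set(rb)) for ra, rb in zip(nn_a, nn_b)]
    return float(np.mean(overlap)) / k


def mean_cosine_distance(a, b):
    """Mean of 1 - cos(angle) between paired rows; lower means closer alignment."""
    an, bn = l2_normalise(a), l2_normalise(b)
    return float(np.mean(1.0 - np.sum(an * bn, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(100, 32))                   # stand-in for non-rotated embeddings
    z_rot = z + 0.1 * rng.normal(size=z.shape)       # stand-in for rotated embeddings
    print("mutual kNN:", mutual_knn(z, z_rot, k=10))
    print("cosine distance:", mean_cosine_distance(z, z_rot))
```

In this sketch a perfectly rotation-invariant encoder would give a mutual kNN score of 1 and a mean cosine distance of 0. The random arrays in the usage section are placeholders for real model outputs.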

Paper Structure

This paper contains 8 sections, 2 equations, 3 figures.

Figures (3)

  • Figure 1: (Left) Representations are compared based on similarity of their $k$ nearest neighbours, here $k = 1$. (Right) Distance is measured by the cosine of the angle between representations.
  • Figure 2: Graphs showing how model characteristics relate to representation invariance under rotation. (Left) Mean mutual $k$-nearest neighbour across rotations, with higher values indicating greater invariance. (Right) Mean cosine distance, where lower values indicate stronger alignment in latent space representations under rotation.
  • Figure 3: (Top) Mutual $k$-nearest neighbour across rotation angles, where higher values indicate greater invariance. (Bottom) Cosine distance across rotations, with lower values indicating greater latent representation alignment.
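
For reference, the two quantities plotted in these figures can be written out as below. This is a sketch under assumptions: the notation is introduced here for illustration ($z_i$ for the representation of patch $i$, $z_i^{\theta}$ for the same patch rotated by angle $\theta$, and $\mathcal{N}_k(\cdot)$ for the index set of the $k$ nearest neighbours within the same set of representations), and the mutual $k$-nearest neighbour form follows the common neighbourhood-overlap definition, which may differ in detail from the paper's own two equations.

```latex
m_{k\mathrm{NN}}\bigl(z_i, z_i^{\theta}\bigr)
  = \frac{1}{k}\,\bigl|\,\mathcal{N}_k(z_i) \cap \mathcal{N}_k(z_i^{\theta})\,\bigr|,
\qquad
d_{\cos}\bigl(z_i, z_i^{\theta}\bigr)
  = 1 - \frac{z_i \cdot z_i^{\theta}}{\lVert z_i \rVert \, \lVert z_i^{\theta} \rVert}
```

Under these definitions $m_{k\mathrm{NN}}$ lies in $[0, 1]$ and $d_{\cos}$ lies in $[0, 2]$, which is consistent with the captions: higher mutual $k$-nearest neighbour values and lower cosine distances both indicate greater invariance.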