GrapHist: Graph Self-Supervised Learning for Histopathology

Sevda Öğüt, Cédric Vincent-Cuaz, Natalia Dubljevic, Carlos Hurtado, Vaishnavi Subramanian, Pascal Frossard, Dorina Thanou

TL;DR

This work introduces GrapHist, a novel graph-based self-supervised learning framework for histopathology. It learns generalizable, structurally informed embeddings that support diverse downstream tasks, and it drastically outperforms fully supervised graph models on cancer subtyping tasks.

Abstract

Self-supervised vision models have achieved notable success in digital pathology. However, their domain-agnostic transformer architectures were not originally designed to account for the fundamental biological elements of histopathology images, namely cells and their complex interactions. In this work, we hypothesize that biologically informed modeling of tissues as cell graphs offers more efficient representation learning. We therefore introduce GrapHist, a novel graph-based self-supervised learning framework for histopathology, which learns generalizable and structurally informed embeddings that support diverse downstream tasks. GrapHist integrates masked autoencoders and heterophilic graph neural networks that are explicitly designed to capture the heterogeneity of tumor microenvironments. We pre-train GrapHist on a large collection of 11 million cell graphs derived from breast tissues and evaluate its transferability across in- and out-of-domain benchmarks. Our results show that GrapHist achieves competitive performance compared to its vision-based counterparts in slide-, region-, and cell-level tasks, while requiring four times fewer parameters. It also drastically outperforms fully supervised graph models on cancer subtyping tasks. Finally, we release the five graph-based digital pathology datasets used in our study at https://huggingface.co/ogutsevda/datasets, establishing the first large-scale graph benchmark in this field. Our code is available at https://github.com/ogutsevda/graphist.

Paper Structure (44 sections, 3 equations, 6 figures, 15 tables, 1 algorithm)

Figures (6)

  • Figure 1: Pre-processing steps in GrapHist. Individual cells are segmented within a digital pathology image. These cells and their spatial arrangement are then converted into a graph. Each cell is a node with morphology, texture, and intensity descriptors. Edges connect neighboring cells and are weighted by their geographic distance. Best viewed in color.
  • Figure 2: Pre-training of GrapHist. We add a virtual node that is connected to all other nodes in the input cell graph and adopt a masked autoencoding strategy, where a subset of input node features is randomly masked. A GNN-based encoder–decoder architecture is then trained to recover the original node features, decoding from a re-masked version of the latent embeddings.
  • Figure 3: Kaplan-Meier curves with Cox risk groups.
  • Figure 4: Quantitative results of sensitivity analysis and patch size robustness.
  • Figure 5: Class label distributions of slide- and region-level datasets.
  • ...and 1 more figure
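
The pre-processing and pre-training steps described in the captions of Figures 1 and 2 can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the k-nearest-neighbor rule for "neighboring cells", the Euclidean edge weights, the zero-valued virtual node and mask token, and the node-level masking are all assumptions made for illustration. The GNN encoder and decoder and the re-masking of latent embeddings are not shown.

```python
import numpy as np


def build_knn_cell_graph(centroids, k=3):
    """Connect each cell to its k nearest cells (assumed neighbor rule).

    Edges are weighted by the Euclidean distance between cell centroids.
    Requires k <= number of cells - 1.
    """
    n = centroids.shape[0]
    # Pairwise Euclidean distances between all cell centroids.
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-loops from the neighbor search
    nbrs = np.argsort(dist, axis=1)[:, :k]  # (n, k) indices of the nearest cells
    src = np.repeat(np.arange(n), k)
    dst = nbrs.reshape(-1)
    edge_index = np.stack([src, dst])  # directed edges, shape (2, n * k)
    edge_weight = dist[src, dst]
    return edge_index, edge_weight


def add_virtual_node(features, edge_index, edge_weight, virtual_weight=0.0):
    """Append one virtual node connected to every cell node in both directions."""
    n, d = features.shape
    # Hypothetical choice: the virtual node starts with an all-zero feature vector.
    features_v = np.vstack([features, np.zeros((1, d))])
    cells = np.arange(n)
    virt = np.full(n, n)  # index of the new virtual node
    extra = np.concatenate(
        [np.stack([cells, virt]), np.stack([virt, cells])], axis=1
    )
    edge_index_v = np.concatenate([edge_index, extra], axis=1)
    edge_weight_v = np.concatenate([edge_weight, np.full(2 * n, virtual_weight)])
    return features_v, edge_index_v, edge_weight_v


def mask_node_features(features, mask_ratio, rng, num_real_nodes):
    """Randomly mask the features of a subset of cell nodes.

    The virtual node (index >= num_real_nodes) is never masked. A zero vector
    stands in for what would typically be a learnable mask token.
    """
    num_masked = int(round(mask_ratio * num_real_nodes))
    masked_idx = rng.choice(num_real_nodes, size=num_masked, replace=False)
    masked = features.copy()
    masked[masked_idx] = 0.0
    return masked, masked_idx


# Toy usage: 6 cells with 2-D centroids and 4 descriptors per cell.
rng = np.random.default_rng(0)
centroids = rng.uniform(0, 100, size=(6, 2))
features = rng.normal(size=(6, 4))

edge_index, edge_weight = build_knn_cell_graph(centroids, k=2)
x, ei, ew = add_virtual_node(features, edge_index, edge_weight)
x_masked, masked_idx = mask_node_features(x, 0.5, rng, num_real_nodes=6)
```

In a masked-autoencoding setup of this kind, the reconstruction loss would be computed only on the rows listed in `masked_idx`, comparing the decoder output against the original rows of `x`.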