Detection of diabetic retinopathy using longitudinal self-supervised learning

Rachid Zeghlache, Pierre-Henri Conze, Mostafa El Habib Daho, Ramin Tadayoni, Pascal Massin, Béatrice Cochener, Gwenolé Quellec, Mathieu Lamard

TL;DR

This work probes longitudinal self-supervised learning to detect early diabetic retinopathy (DR) progression from pairs of consecutive color fundus photographs. It compares three LSSL-based pretext tasks—Longitudinal Siamese, LSSL with a trajectory direction constraint, and Longitudinal neighbourhood embedding—using a shared encoder to produce latent trajectory representations $\Delta z$ that capture disease dynamics. On the OPHDIAT dataset, LSSL variants, particularly Zhao 2021, achieve the best performance (AUC $\approx 0.962$) for early change detection from no/mild DR to more severe DR, significantly outperforming baselines and highlighting the latent space's capacity to encode progression. The results suggest that appropriately aligned longitudinal latent representations can meaningfully reflect DR progression and could enhance early screening and patient-specific management, though challenges remain in hyperparameter sensitivity and latent-space disentanglement. Overall, the study demonstrates the potential of longitudinal self-supervision to extract clinically relevant dynamic information from routinely collected CFPs.

Abstract

Longitudinal imaging can capture both static anatomical structures and dynamic changes in disease progression, enabling earlier and better patient-specific pathology management. However, conventional approaches for detecting diabetic retinopathy (DR) rarely take advantage of longitudinal information to improve DR analysis. In this work, we investigate the benefit of exploiting longitudinal self-supervised learning for DR diagnosis. We compare different longitudinal self-supervised learning (LSSL) methods that model disease progression from longitudinal retinal color fundus photographs (CFP), in order to detect early DR severity changes using a pair of consecutive exams. Experiments were conducted on a longitudinal DR screening dataset, with or without encoders pre-trained on an LSSL pretext task. The baseline (a model trained from scratch) achieves an AUC of 0.875, while early fusion with a simple ResNet-like architecture and frozen LSSL weights achieves an AUC of 0.96 (95% CI: 0.9593-0.9655, DeLong test; p-value < 2.2e-16), suggesting that the LSSL latent space can encode the dynamics of DR progression.
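The AUC values quoted above can be read as the probability that a randomly chosen progressing pair receives a higher score than a randomly chosen non-progressing pair. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `auc_from_scores` and the toy labels and scores are hypothetical and used for illustration only.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive example scores higher; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = severity change between exams, 0 = no change.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_from_scores(labels, scores)
print(toy_auc)  # 3 of the 4 positive/negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of examples, so it suits small illustrations; on a full screening dataset a sorting-based implementation would be the usual choice.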

Paper Structure (10 sections, 4 equations, 5 figures)

Figures (5)

  • Figure 1: Evolution from no DR to severe NPDR for a patient in the OPHDIAT dataset.
  • Figure 2: Panel a) illustrates the longitudinal Siamese network, which takes a pair of consecutive images as input and predicts the time between the examinations. Panel b) represents longitudinal self-supervised learning (LSSL), which is composed of two independent modules: an autoencoder (AE) and dense layers. The AE takes the pair of consecutive images as input and reconstructs the image pair, while the dense layers map a dummy vector to the direction vector $\tau$. Panel c) corresponds to longitudinal neighbourhood embedding (LNE), which takes the consecutive pairs as input and builds a dynamic graph to align, within a neighbourhood, the subject-specific trajectory vector ($\Delta z$) with the pooled trajectory vector ($\Delta h$) that represents the local progression direction in latent space (green circle).
  • Figure 3: ROC Curve Analysis of the compared methods
  • Figure 4: Comparison of the approaches on early change detection with a frozen encoder.
  • Figure 5: Mean of the trajectory vector norm for the different self-supervised methods used.
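
The quantities named in the captions of Figures 2 and 5 can be illustrated with a short sketch. It assumes that the trajectory vector is the difference between the latent codes of two consecutive exams, and that alignment with the direction vector $\tau$ is measured by a cosine-based penalty of the form 1 - cos; the exact loss used in the paper is not stated in this section, so that form is an assumption. All function names and the toy latent codes are hypothetical.

```python
import math


def trajectory(z_first, z_second):
    """Trajectory vector (delta z): latent code of the later exam minus the earlier one."""
    return [b - a for a, b in zip(z_first, z_second)]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu, nv = norm(u), norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def direction_loss(delta_z, tau):
    """Assumed alignment penalty: 0 when delta z points along tau, 2 when opposite."""
    return 1.0 - cosine(delta_z, tau)


def mean_trajectory_norm(pairs):
    """Mean norm of delta z over (z_first, z_second) pairs, the quantity plotted in Figure 5."""
    return sum(norm(trajectory(a, b)) for a, b in pairs) / len(pairs)


# Hypothetical 2-D latent codes for two pairs of consecutive exams.
pairs = [([0.0, 0.0], [3.0, 4.0]), ([1.0, 1.0], [1.0, 2.0])]
tau = [3.0, 4.0]  # hypothetical global progression direction

losses = [direction_loss(trajectory(a, b), tau) for a, b in pairs]
mean_norm = mean_trajectory_norm(pairs)
print(losses, mean_norm)
```

In this toy example the first pair moves exactly along `tau`, so its penalty is zero, while the second pair moves at an angle to it and is penalized. LNE differs in that each $\Delta z$ is aligned with a neighbourhood-pooled $\Delta h$ rather than with a single global direction.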