Multi-style conversion for semantic segmentation of lesions in fundus images by adversarial attacks

Clément Playout, Renaud Duval, Marie Carole Boucher, Farida Cheriet

TL;DR

This paper introduces a novel method, termed adversarial style conversion, to address the lack of standardization in annotation styles across diverse databases, by training a single architecture on combined databases and demonstrating the ability to convert among different labeling styles.

Abstract

The diagnosis of diabetic retinopathy, which relies on fundus images, faces challenges in achieving transparency and interpretability when using a global classification approach. However, segmentation-based databases are significantly more expensive to acquire and combining them is often problematic. This paper introduces a novel method, termed adversarial style conversion, to address the lack of standardization in annotation styles across diverse databases. By training a single architecture on combined databases, the model spontaneously modifies its segmentation style depending on the input, demonstrating the ability to convert among different labeling styles. The proposed methodology adds a linear probe to detect dataset origin based on encoder features and employs adversarial attacks to condition the model's segmentation style. Results indicate significant qualitative and quantitative improvements through dataset combination, offering avenues for improved model generalization, uncertainty estimation and continuous interpolation between annotation styles. Our approach enables training a segmentation model with diverse databases while controlling and leveraging annotation styles for improved retinopathy diagnosis.
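
The probe-and-attack idea described in the abstract can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the authors' implementation: the "encoder" is a hypothetical fixed 1x1 convolution, the probe is a linear layer on spatially average-pooled features, and the attack is a generic targeted signed-gradient loop that keeps the best iterate. All names, dimensions, and hyperparameters (`W_enc`, `W_probe`, `eps`, `step`) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 3 image channels, 8 feature channels, 5 datasets.
C_IN, C_FEAT, N_DATASETS, H, W = 3, 8, 5, 16, 16

# Stand-in encoder (a 1x1 convolution) and linear probe; both are assumptions.
W_enc = rng.normal(size=(C_FEAT, C_IN))
W_probe = rng.normal(size=(N_DATASETS, C_FEAT))
b_probe = np.zeros(N_DATASETS)


def probe_probs(image):
    """Probe probabilities over datasets for an image of shape (C_IN, H, W)."""
    feats = np.einsum("fc,chw->fhw", W_enc, image)  # toy encoder features
    pooled = feats.mean(axis=(1, 2))                # spatial average pooling
    logits = W_probe @ pooled + b_probe
    logits -= logits.max()                          # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()


def probe_loss(image, target):
    """Cross-entropy between the probe output and the target dataset index."""
    return -np.log(probe_probs(image)[target])


def input_gradient(image, target):
    """Analytic gradient of probe_loss with respect to the image."""
    d_logits = probe_probs(image)
    d_logits[target] -= 1.0                         # softmax + cross-entropy
    d_channel = W_enc.T @ (W_probe.T @ d_logits)    # back through probe, encoder
    n_pixels = image.shape[1] * image.shape[2]      # average pooling spreads it
    return d_channel[:, None, None] * np.ones_like(image) / n_pixels


def targeted_attack(image, target, eps=0.05, step=0.01, n_steps=20):
    """Signed-gradient attack pushing the probe toward `target`.

    The perturbation stays within an L-infinity ball of radius eps, and the
    iterate with the lowest probe loss (including the clean image) is returned.
    """
    adv, best = image.copy(), image.copy()
    best_loss = probe_loss(image, target)
    for _ in range(n_steps):
        adv = adv - step * np.sign(input_gradient(adv, target))
        adv = np.clip(adv, image - eps, image + eps)  # stay in the eps ball
        adv = np.clip(adv, 0.0, 1.0)                  # stay a valid image
        loss = probe_loss(adv, target)
        if loss < best_loss:
            best, best_loss = adv.copy(), loss
    return best


image = rng.uniform(0.2, 0.8, size=(C_IN, H, W))
target_style = 3  # hypothetical index of the desired annotation style
adv = targeted_attack(image, target_style)
print("target prob before:", probe_probs(image)[target_style])
print("target prob after: ", probe_probs(adv)[target_style])
```

In the setting the abstract describes, the perturbed image would then be passed to the segmentation model so that its output follows the targeted dataset's annotation style; the toy above only shows the probe side of that loop.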

Paper Structure

This paper contains 32 sections, 8 equations, 12 figures, 10 tables.

Figures (12)

  • Figure 1: Classification of the images in each dataset into three quality levels, as assessed using MCF-Net.
  • Figure 2: Distributions $P^{(i)}(S, Q)$ for each lesion type for the five datasets. The crosses indicate the centroids of each dataset. Note that we use a logarithmic scale to fit the distributions on a single graph: several orders of magnitude separate some centroids.
  • Figure 3: Graphical summary of our style conversion by adversarial attack.
  • Figure 4: Effect of random perturbations of the input images on the segmentations by $\mathcal{M}_\mathcal{S}$ (shown for four test images). Interestingly, the model appears to be robust to most perturbations. Compression artefacts may however partially fool the model toward a new style, as seen in the bottom left image in (c).
  • Figure 5: Accuracy of the probe depending on its position in the model. As the number of channels grows with the depth, the size $f$ of the input latent vector fed to the probe increases (it is extracted by spatial average pooling of the encoder's features).
  • ...and 7 more figures