Is Dataset Quality Still a Concern in Diagnosis Using Large Foundation Model?

Ziqin Lin, Heng Li, Zinan Li, Huazhu Fu, Jiang Liu

TL;DR

The paper investigates whether dataset quality issues, such as image quality and dataset bias, affect diagnosis with large foundation models in fundus imaging. It employs RETFound, a Vision Transformer-based foundation model trained via self-supervision on 1.6 million unlabeled retinal images, and evaluates it on EyeQ and iSee across varying image quality and class distributions. Key findings show that RETFound is more robust to image quality degradation and dataset bias than ResNet, and that overall (full-model) fine-tuning effectively mitigates quality-related degradation, while Tip-Adapter struggles due to medical data scarcity and imbalance. The results underscore that dataset quality remains a concern, but that foundation models offer improved resilience and easier adaptation, which informs deployment and bias mitigation strategies in clinical imaging tasks.

Abstract

Recent advancements in pre-trained large foundation models (LFM) have yielded significant breakthroughs across various domains, including natural language processing and computer vision. These models have been particularly impactful in the domain of medical diagnostic tasks. With abundant unlabeled data, an LFM has been developed for fundus images using the Vision Transformer (ViT) and a self-supervised learning framework. This LFM has shown promising performance in fundus disease diagnosis across multiple datasets. On the other hand, deep learning models have long been challenged by dataset quality issues, such as image quality and dataset bias. To investigate the influence of data quality on LFM, we conducted explorations in two fundus diagnosis tasks using datasets of varying quality. Specifically, we explored the following questions: Is LFM more robust to image quality? Is LFM affected by dataset bias? Can fine-tuning techniques alleviate these effects? Our investigation found that LFM exhibits greater resilience to dataset quality issues, including image quality and dataset bias, compared to typical convolutional networks. Furthermore, we discovered that overall fine-tuning is an effective adapter for LFM to mitigate the impact of dataset quality issues.

Paper Structure (9 sections, 6 figures)

Figures (6)

  • Figure 1: Summary of our EyeQ dataset, where DR-i indicates the presence of diabetic retinopathy at grade i based on the labels in the EyePACS dataset. This image shows the three quality levels of the EyeQ dataset (high, usable, and low) for each level of DR severity. Each entry is given as number (rate), where number is the quantity of images in that category and rate is the proportion of images of that DR category in the corresponding quality subset.
  • Figure 2: Summary of our iSee dataset, which shows the fundus disease distribution. This image displays the two quality levels of the iSee dataset (high and low) for each fundus disease category.
  • Figure 3: RETFound after overall fine-tuning on the high-quality EyeQ subset, evaluated on the low-quality, usable-quality, and high-quality EyeQ subsets. (a) shows the impact of image quality on RETFound in EyeQ, and (b) shows the impact of image quality on RETFound in iSee. The classification performance of RETFound decreases as image quality deteriorates.
  • Figure 4: RETFound and ResNet, each fine-tuned overall on the high-quality training subset of EyeQ and of iSee. (a) and (b) show that RETFound is more stable than ResNet when image quality degrades.
  • Figure 5: RETFound fine-tuned on the high-quality EyeQ subset, then evaluated on the low-quality, usable-quality, and high-quality EyeQ subsets. AUROC is reported separately for each of the five DR grades. For classes with lower proportions, RETFound is more strongly affected by image degradation.
  • ...and 1 more figures
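
The per-grade AUROC breakdown described for Figure 5 can be illustrated with a small sketch. This is not the authors' evaluation code: the record layout `(quality, true_class, probs)` and both function names are hypothetical, and AUROC is computed one-vs-rest with the rank-based (Mann-Whitney) formulation using only the Python standard library.

```python
from collections import defaultdict


def binary_auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when the subset has
    no positives or no negatives, because AUROC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_class_auroc_by_quality(records, num_classes):
    """Group predictions by quality subset, then score each class one-vs-rest.

    records: iterable of (quality, true_class, probs) tuples, where probs is
    a list of per-class scores. This layout is an assumption for illustration.
    """
    by_quality = defaultdict(list)
    for quality, true_class, probs in records:
        by_quality[quality].append((true_class, probs))

    results = {}
    for quality, items in by_quality.items():
        results[quality] = [
            binary_auroc(
                [1 if y == c else 0 for y, _ in items],  # class c vs. rest
                [p[c] for _, p in items],                # score for class c
            )
            for c in range(num_classes)
        ]
    return results
```

Calling `per_class_auroc_by_quality(records, 5)` on predictions tagged with `"high"`, `"usable"`, or `"low"` would give one list of five AUROC values per quality subset. A `None` entry marks a grade that is absent (or the only grade present) in a subset, which is more likely for the rarer DR grades.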