Is Dataset Quality Still a Concern in Diagnosis Using Large Foundation Model?
Ziqin Lin, Heng Li, Zinan Li, Huazhu Fu, Jiang Liu
TL;DR
The paper investigates whether dataset quality issues, such as poor image quality and dataset bias, affect diagnosis with large foundation models in fundus imaging. It employs RETFound, a Vision Transformer-based foundation model trained via self-supervision on 1.6 million unlabeled retinal images, and evaluates it on EyeQ and iSee across varying image quality and class distributions. RETFound is more robust to image quality degradation and dataset bias than ResNet, and overall (full-model) fine-tuning effectively mitigates quality-related degradation, while Tip-Adapter struggles because medical data are scarce and imbalanced. The results indicate that dataset quality remains a concern, but that foundation models offer improved resilience and easier adaptation, which informs deployment and bias mitigation strategies in clinical imaging tasks.
Abstract
Recent advancements in pre-trained large foundation models (LFMs) have yielded significant breakthroughs across various domains, including natural language processing and computer vision. These models have been particularly impactful in medical diagnostic tasks. With abundant unlabeled data, an LFM has been developed for fundus images using the Vision Transformer (ViT) and a self-supervised learning framework. This LFM has shown promising performance in fundus disease diagnosis across multiple datasets. On the other hand, deep learning models have long been challenged by dataset quality issues, such as image quality and dataset bias. To investigate the influence of data quality on the LFM, we conducted explorations in two fundus diagnosis tasks using datasets of varying quality. Specifically, we explored the following questions: Is the LFM more robust to image quality? Is the LFM affected by dataset bias? Can fine-tuning techniques alleviate these effects? Our investigation found that the LFM exhibits greater resilience to dataset quality issues, including image quality and dataset bias, compared to typical convolutional networks. Furthermore, we discovered that overall fine-tuning is an effective adapter for the LFM to mitigate the impact of dataset quality issues.
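To make the first question concrete, one simple way to probe robustness to image quality is to report accuracy separately for each quality grade of the test images. The following is a minimal sketch under stated assumptions, not the evaluation code used in the paper: the function name `accuracy_by_quality`, the grade labels `"good"`, `"usable"`, and `"reject"`, and the toy inputs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_quality(labels, predictions, quality_grades):
    """Return {grade: accuracy}, grouping samples by image-quality grade.

    Hypothetical helper for illustration; not taken from the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for y_true, y_pred, grade in zip(labels, predictions, quality_grades):
        total[grade] += 1
        correct[grade] += int(y_true == y_pred)
    return {grade: correct[grade] / total[grade] for grade in total}


# Toy, made-up values purely for illustration (not results from the paper).
labels = [0, 1, 1, 0, 1, 0]
predictions = [0, 1, 0, 0, 0, 1]
grades = ["good", "good", "usable", "usable", "reject", "reject"]

per_grade = accuracy_by_quality(labels, predictions, grades)
for grade, acc in per_grade.items():
    print(f"{grade}: {acc:.2f}")
```

Running the same breakdown for two models (for example, a foundation model and a convolutional baseline) would show how quickly each one degrades as image quality drops; a flatter profile across grades suggests greater robustness.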
