Distributional Drift Detection in Medical Imaging with Sketching and Fine-Tuned Transformer

Yusen Wu, Phuong Nguyen, Rose Yesha, Yelena Yesha

TL;DR

This work tackles distributional drift in medical imaging by integrating data-sketching with a fine-tuned Vision Transformer to enable real-time drift detection. The approach builds a robust anomaly-detection baseline using MinHash sketches and couples it with a fine-tuned ViT to extract discriminative features, employing KS statistics and cosine similarity for drift assessment. It achieves $99.11\%$ accuracy on breast cancer imaging tasks and elevates cross-dataset cosine similarity from around $50\%$ to $99.1\%$, while proving highly sensitive to minor noise such as 1% salt-and-pepper and speckle disturbances but robust to lighting changes. The method is scalable and suitable for dynamic clinical environments, with potential extensions to other modalities and smoother hospital workflow integration.
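To make the sketching idea more concrete, here is a minimal MinHash sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the tokenization of an image into (position, quantized intensity) strings, the number of hash functions, and every function name below are hypothetical.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def image_tokens(pixels, bins=16):
    """Hypothetical featurization (an assumption for this example):
    turn a flat list of 0-255 pixel values into 'position:bin' tokens."""
    return [f"{i}:{p * bins // 256}" for i, p in enumerate(pixels)]


def minhash_signature(tokens, num_perm=128):
    """MinHash signature: for each seed, keep the minimum hash over the token set."""
    token_set = set(tokens)
    if not token_set:
        raise ValueError("cannot sketch an empty token set")
    return [min(_hash(t, seed) for t in token_set) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

In this sketch, a baseline library would store one signature per reference image, and an incoming image would be compared against those stored signatures rather than against the raw pixels, which is what keeps the comparison cheap.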

Abstract

Distributional drift detection is important in medical applications because it helps ensure the accuracy and reliability of machine learning models by identifying changes in the underlying data distribution that could affect their predictions. However, current drift detection methods have limitations; for example, including abnormal datasets can lead to unfair comparisons. This paper presents an accurate and sensitive approach to detecting distributional drift in CT-scan medical images by leveraging data-sketching and fine-tuning techniques. We developed a robust baseline library model for real-time anomaly detection, allowing efficient comparison of incoming images and identification of anomalies. Additionally, we fine-tuned a pre-trained Vision Transformer model to extract relevant features, using mammography as a case study, which raised model accuracy to 99.11%. With data sketches and fine-tuning combined, our feature extraction evaluation showed that cosine similarity scores between similar datasets improved from around 50% to 99.1%. Finally, the sensitivity evaluation shows that our solution is highly sensitive to even 1% salt-and-pepper and speckle noise, but not sensitive to lighting noise (i.e., lighting conditions have no impact on data drift). The proposed methods offer a scalable and reliable solution for maintaining the accuracy of diagnostic models in dynamic clinical environments.
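The sensitivity evaluation described in the abstract can be illustrated with a small, hedged example: corrupt 1% of the pixels with salt-and-pepper noise and measure how the cosine similarity to the clean input changes. Note the assumptions: the paper computes similarity on features extracted by the fine-tuned Vision Transformer, whereas this sketch uses raw pixel vectors as a stand-in, and the function names and the noise procedure are illustrative only.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def add_salt_and_pepper(pixels, fraction=0.01, seed=0):
    """Set a given fraction of pixels to 0 (pepper) or 255 (salt).

    Assumption for this example: pixels is a flat list of 0-255 values
    and the corrupted positions are chosen uniformly at random.
    """
    rng = random.Random(seed)
    noisy = list(pixels)
    count = max(1, round(fraction * len(noisy)))
    for index in rng.sample(range(len(noisy)), count):
        noisy[index] = rng.choice((0, 255))
    return noisy
```

Comparing `cosine_similarity(clean, clean)` with `cosine_similarity(clean, add_salt_and_pepper(clean))` shows the score dropping below 1 once noise is injected; how large a drop should count as drift is a threshold choice that this sketch does not take from the paper.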

Paper Structure

This paper contains 33 sections, 11 equations, 8 figures, 1 table, and 1 algorithm.

Figures (8)

  • Figure 1: Med-MNIST Benchmarks. We highlight three unqualified images in the benchmarks.
  • Figure 2: Workflow of our data-sketch-based, fine-tuned pre-trained model for drift detection.
  • Figure 3: Baseline 1 (without fine-tuning): KS statistic and p-value trends across 8 Med-MNIST datasets. The p-values are below 0.05 for all datasets, indicating low similarity between the data distributions. The KS statistic trends for the BreastMNIST dataset highlight a critical challenge in identifying data drift within highly variable datasets. Unlike other MNIST variants, the trends in BreastMNIST fluctuate significantly over time, which complicates determining when drift occurs. The instability in the KS statistics suggests that the dataset may be influenced by factors such as inconsistencies in data acquisition processes or intrinsic variability in the imaging data.
  • Figure 4: Baseline 2 (without fine-tuning): cosine similarity scores across 8 Med-MNIST datasets. The scores reveal a fundamental limitation of feature extraction without fine-tuning: they remain consistently low, hovering around 50%, which indicates that the extracted features fail to capture meaningful relationships between datasets. This low level of similarity suggests that the pre-trained model, when used without fine-tuning, is unable to adapt to the specific characteristics of medical imaging data.
  • Figure 5: Real-time image similarity comparison
  • ...and 3 more figures
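
The KS statistic and p-value discussed in the Figure 3 caption can be sketched as follows. This is a minimal two-sample Kolmogorov-Smirnov test in plain Python, assuming one-dimensional samples (for example, a single feature or a flattened intensity distribution). The p-value uses a common asymptotic approximation, so it may differ slightly from what a statistics library reports. The 0.05 threshold follows the caption; the function names are hypothetical.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(sample_a), sorted(sample_b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Advance past every value <= x in both samples (handles ties).
        while i < n and xs[i] <= x:
            i += 1
        while j < m and ys[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (an approximation)."""
    effective = n * m / (n + m)
    root = math.sqrt(effective)
    lam = (root + 0.12 + 0.11 / root) * d
    if lam < 0.2:
        return 1.0  # the series does not converge well here; p is essentially 1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def drift_detected(sample_a, sample_b, alpha=0.05):
    """Flag drift when the p-value falls below alpha (0.05, as in the caption)."""
    d = ks_statistic(sample_a, sample_b)
    return ks_p_value(d, len(sample_a), len(sample_b)) < alpha
```

A small p-value means the two samples are unlikely to come from the same distribution, which is the sense in which the caption reads p-values below 0.05 as low similarity.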