FOSSIL: Regret-Minimizing Curriculum Learning for Metadata-Free and Low-Data Mpox Diagnosis

Sahng-Min Han, Minjae Kim, Jinho Cha, Se-woon Choe, Eunchan Daniel Cha, Jungwon Choi, Kyudong Jung

TL;DR

Mpox diagnosis from small, imbalanced dermatology datasets suffers from unstable optimization and poor calibration. The authors propose FOSSIL, a regret-minimizing, sample-sensitive weighting scheme that assigns each training sample $i$ a weight $w_i = \exp(-d_i/T)$, where $d_i$ is a softmax-based difficulty score and $T$ is a temperature. The scheme is integrated into a four-stage curriculum (Easy to Very Hard) and applied to both CNNs and transformers. Compared with conventional baselines, it improves discrimination (AUC up to $0.9573$) and calibration (ECE $\approx 0.053$), remains robust under real-world perturbations, and holds up in external validation (MCSI AUC $0.963$), all without metadata or synthetic augmentation. The framework is architecture- and modality-agnostic and comes with theoretical guarantees, which makes it a candidate for data-scarce medical imaging and telemedicine, including radiology, histopathology, and longitudinal monitoring. Overall, FOSSIL provides a principled, interpretable path to stable, data-efficient clinical AI under strict data constraints.
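
The weighting rule $w_i = \exp(-d_i/T)$ can be illustrated with a minimal NumPy sketch. The summary does not say exactly how $d_i$ is derived from the softmax output, so the sketch assumes $d_i = 1 - p_i(y_i)$, one minus the softmax probability of the true class. The function names and the default temperature are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def softmax_difficulty(logits, labels):
    # ASSUMPTION: difficulty d_i = 1 - softmax probability of the true class.
    # The source only says the difficulty is "softmax-based".
    probs = softmax(logits)
    return 1.0 - probs[np.arange(len(labels)), labels]

def fossil_weights(difficulty, temperature=1.0):
    # w_i = exp(-d_i / T), as stated in the TL;DR.
    # Smaller T separates easy and hard samples more sharply.
    return np.exp(-difficulty / temperature)

def weighted_cross_entropy(logits, labels, weights):
    # Per-sample cross-entropy scaled by the sample weights, then averaged.
    probs = softmax(logits)
    per_sample = -np.log(probs[np.arange(len(labels)), labels])
    return float(np.mean(weights * per_sample))
```

Because $d_i$ lies in $[0, 1]$ under this assumption, every weight falls in $[\exp(-1/T), 1]$: confidently correct samples keep a weight near 1, while ambiguous or misclassified samples are down-weighted rather than discarded.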

Abstract

Deep learning in small and imbalanced biomedical datasets remains fundamentally constrained by unstable optimization and poor generalization. We present the first biomedical implementation of FOSSIL (Flexible Optimization via Sample-Sensitive Importance Learning), a regret-minimizing weighting framework that adaptively balances training emphasis according to sample difficulty. Using softmax-based uncertainty as a continuous measure of difficulty, we construct a four-stage curriculum (Easy-Very Hard) and integrate FOSSIL into both convolutional and transformer-based architectures for Mpox skin lesion diagnosis. Across all settings, FOSSIL substantially improves discrimination (AUC = 0.9573), calibration (ECE = 0.053), and robustness under real-world perturbations, outperforming conventional baselines without metadata, manual curation, or synthetic augmentation. The results position FOSSIL as a generalizable, data-efficient, and interpretable framework for difficulty-aware learning in medical imaging under data scarcity.
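
As a rough illustration of the four-stage curriculum, the sketch below splits samples into stages by quartiles of their difficulty scores. The figure list describes the stages as quantile-defined, but the equal-quartile cut points and the cumulative schedule (each stage trains on all samples up to that difficulty level) are assumptions made here for illustration. The helper names are hypothetical.

```python
import numpy as np

STAGE_NAMES = ("Easy", "Medium", "Hard", "Very Hard")

def assign_stages(difficulty):
    # ASSUMPTION: stages are the four quartiles of the difficulty scores.
    cuts = np.quantile(difficulty, [0.25, 0.5, 0.75])
    # Stage index 0..3; right=True places values equal to a cut
    # in the lower (easier) stage.
    return np.digitize(difficulty, cuts, right=True)

def curriculum_indices(stages, current_stage):
    # ASSUMPTION: a cumulative schedule, where training at stage k uses
    # every sample whose stage index is <= k.
    return np.flatnonzero(stages <= current_stage)

# Example: eight samples with evenly spaced difficulty scores.
difficulty = np.linspace(0.0, 1.0, 8)
stages = assign_stages(difficulty)
for k, name in enumerate(STAGE_NAMES):
    n = len(curriculum_indices(stages, k))
    print(f"{name}: training on {n} samples")
```

A non-cumulative schedule, training on one stage at a time, would only need `stages == current_stage` in place of the `<=` comparison.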

Paper Structure

This paper contains 26 sections, 12 equations, 10 figures, 9 tables.

Figures (10)

  • Figure 1: Distribution of softmax-based difficulty scores across the four curriculum stages. (a) Histogram and (b) boxplot representations demonstrate clear separation between quantile-defined stages (Easy, Medium, Hard, and Very Hard), confirming the robustness of the stratification procedure.
  • Figure 2: Overview of model selection and curriculum learning stages for Mpox diagnosis. The workflow integrates diverse architectures under a unified weighting scheme.
  • Figure 3: Training curves for DenseNet121 and ConvNeXt-T under baseline and curriculum learning settings. As visualized, curriculum learning consistently led to smoother convergence, reduced overfitting, and improved validation stability for both CNN-based and transformer-based architectures.
  • Figure 4: Score-CAM visualizations of representative Easy and Very Hard samples obtained from the FOSSIL-trained ConvNeXt-T model (seed 123, fold 0). Each row displays the input image, activation heatmap, and overlay. Attention is well localized around the lesion in Easy cases but becomes diffuse under visually ambiguous conditions, indicating higher model uncertainty.
  • Figure 5: Diagnostic performance of the FOSSIL-trained ConvNeXt-T model compared with its baseline counterpart under five real-world perturbation settings. The FOSSIL model maintained high AUC and accuracy across all distortions, whereas the baseline model showed inflated AUC but degraded accuracy—consistent with overfitting and reduced reliability under realistic input shifts.
  • ...and 5 more figures