UCDSC: Open Set UnCertainty aware Deep Simplex Classifier for Medical Image Datasets

Arnav Aditya, Nitin Kumar, Saurabh Shigwan

TL;DR

The paper addresses open-set recognition (OSR) in medical image classification under data scarcity by leveraging Neural Collapse geometry: class centers are fixed at the vertices of a simplex equiangular tight frame (ETF) on a hypersphere. It introduces the Uncertainty Aware Deep Simplex Classifier (UCDSC), trained with three losses: $L_{intra}$ aligns features with their class centers, $L_o$ separates known samples from auxiliary background samples by a margin $m$, and $L_u$ quantifies uncertainty via a distance-ratio metric. The total loss is $L_{total} = L_{intra} + \lambda_o L_o + \lambda_u L_u$. Incorporating a large auxiliary background dataset improves unknown rejection, and extensive experiments on BloodMNIST, OCTMNIST, DermaMNIST, TissueMNIST, and Augmented Skin Conditions (ASC) show competitive or superior AUROC and OSCR compared to state-of-the-art OSR methods. The approach offers safer, calibrated open-set decisions for medical imaging in settings with limited labeled data and has potential for extension to more complex cancer or multi-class scenarios.
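The fixed class centers described above can be illustrated with the standard simplex ETF construction, $\sqrt{K/(K-1)}\,(I_K - \tfrac{1}{K}\mathbf{1}\mathbf{1}^\top)$. The following is a minimal sketch, not the authors' code: the function name `simplex_etf` is hypothetical, and the sketch assumes the centers live in $\mathbb{R}^K$, whereas the paper may embed or scale them differently.

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a (K, K) array whose rows are the K vertices of a simplex ETF.

    Hypothetical helper for illustration only. Each row has unit norm, and
    any two distinct rows have cosine similarity -1 / (K - 1), the most
    negative value that K unit vectors can share pairwise.
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    # Remove the mean direction from the identity, then rescale to unit norm.
    centered = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * centered


if __name__ == "__main__":
    centers = simplex_etf(4)
    gram = centers @ centers.T
    print(np.round(gram, 3))  # diagonal: squared norms, off-diagonal: cosines
```

Because the centers are fixed rather than learned, a training loop built on this sketch would only update the feature extractor so that each feature moves toward its class's row.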

Abstract

Driven by advancements in deep learning, computer-aided diagnosis has made remarkable progress. However, outside controlled laboratory settings, algorithms may encounter several challenges. In the medical domain, these difficulties often stem from limited data availability due to ethical and legal restrictions, as well as the high cost and time required for expert annotations, especially in the face of emerging or rare diseases. In this context, open-set recognition plays a vital role by identifying whether a sample belongs to one of the known classes seen during training or should be rejected as unknown. Recent studies have shown that features learned in the later stages of deep neural networks cluster around their class means, which are themselves arranged as the vertices of a regular simplex [32]. The proposed method introduces a loss function designed to reject samples of unknown classes effectively by penalizing open-space regions using auxiliary datasets. This approach achieves significant performance gains across four MedMNIST datasets (BloodMNIST, OCTMNIST, DermaMNIST, and TissueMNIST) and a publicly available skin dataset [29], outperforming state-of-the-art techniques.

Paper Structure

This paper contains 10 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Our method maps extracted features into a geometric space where class centers (stars) are fixed as the vertices of a Simplex ETF, inscribed on the boundary of a hypersphere. This structure ensures that the class centers are maximally separated. During training, the model learns to have class samples (dots) cluster tightly around their corresponding centers. This arrangement provides an intuitive way to measure uncertainty: a hypothetical sample (central green dot) would be equidistant from all class centers, resulting in a maximum uncertainty score of $U = 1$. Conversely, samples mapped confidently near their true class center have an uncertainty score approaching zero ($U \approx 0$).
  • Figure 2: Sample images from the five datasets: (a) BloodMNIST, (b) TissueMNIST, (c) OCTMNIST, (d) DermaMNIST, (e) Augmented Skin Conditions.
  • Figure 3: Hyperparameter tuning results for all five datasets. These plots illustrate the effect of varying the batch size, expand factor, and margin on the OSCR metric.
  • Figure 4: Receiver Operating Characteristic (ROC) curves for the five datasets.
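
The Figure 1 caption describes an uncertainty score that reaches 1 for a sample equidistant from all class centers and approaches 0 for a sample at its class center. The exact formula is not given here, so the sketch below assumes one simple distance ratio with those two limits: the distance to the nearest center divided by the distance to the second-nearest center. The function name `distance_ratio_uncertainty`, the `eps` parameter, and the choice of ratio are all assumptions for illustration, not the paper's definition.

```python
import numpy as np


def distance_ratio_uncertainty(
    features: np.ndarray, centers: np.ndarray, eps: float = 1e-12
) -> np.ndarray:
    """Assumed score: nearest-center distance over second-nearest distance.

    features: (n_samples, dim) array of embedded samples.
    centers:  (n_classes, dim) array of fixed class centers, n_classes >= 2.
    Returns an (n_samples,) array of scores in [0, 1].
    """
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
    # Partitioning at index 1 places the two smallest distances first, in order.
    two_smallest = np.partition(dists, 1, axis=1)[:, :2]
    nearest, second = two_smallest[:, 0], two_smallest[:, 1]
    return nearest / (second + eps)


if __name__ == "__main__":
    k = 3
    # Standard simplex ETF vertices as rows (unit norm, equal pairwise angles).
    centers = np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)
    samples = np.vstack([np.zeros(k), centers[0]])  # sphere center, class center
    print(distance_ratio_uncertainty(samples, centers))
```

Under this assumed ratio, the first sample (the center of the hypersphere) scores approximately 1 and the second (placed exactly on a class center) scores 0, matching the two limiting cases in the caption.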