Class Distance Weighted Cross Entropy Loss for Classification of Disease Severity

Gorkem Polat, Ümit Mert Çağlar, Alptekin Temizel

TL;DR

This work introduces Class Distance Weighted Cross Entropy (CDW-CE), an ordinal loss that penalizes a misclassification more severely the farther the predicted class lies from the true class, with an optional margin to enforce tighter intra-class clustering. Evaluated on the LIMUC Ulcerative Colitis dataset across three CNN architectures, CDW-CE consistently surpasses standard CE and other ordinal losses in Mayo Endoscopic Score (MES) classification and in remission prediction, as shown by higher quadratic weighted kappa (QWK), F1, accuracy, and AUC, and by lower mean absolute error (MAE). It also yields more explainable Class Activation Maps (CAM) and better-separated latent-space clusters (t-SNE/UMAP). The paper further analyzes the impact of the distance exponent $\alpha$ and margin $m$, showing that higher $\alpha$ values (e.g., 5–7) and margin terms improve performance, with domain experts confirming better alignment of model attention with clinical symptoms. Overall, CDW-CE provides a robust, explainable ordinal loss that enhances both predictive power and clinical interpretability in disease severity classification.

Abstract

Assessing disease severity with ordinal classes, where each class reflects increasing severity levels, benefits from loss functions designed for this ordinal structure. Traditional categorical loss functions, like Cross-Entropy (CE), often perform suboptimally in these scenarios. To address this, we propose a novel loss function, Class Distance Weighted Cross-Entropy (CDW-CE), which penalizes misclassifications more severely when the predicted and actual classes are farther apart. We evaluated CDW-CE using various deep architectures, comparing its performance against several categorical and ordinal loss functions. To assess the quality of latent representations, we used t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) visualizations, quantified the clustering quality using the Silhouette Score, and compared Class Activation Maps (CAM) generated by models trained with CDW-CE and CE loss. Feedback from domain experts was incorporated to evaluate how well model attention aligns with expert opinion. Our results show that CDW-CE consistently improves performance in ordinal image classification tasks. It achieves higher Silhouette Scores, indicating better class discrimination capability, and its CAM visualizations show a stronger focus on clinically significant regions, as validated by domain experts. Receiver operating characteristic (ROC) curves and area under the curve (AUC) scores show that CDW-CE outperforms other loss functions, including prominent ordinal loss functions from the literature.
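The idea of weighting errors by class distance can be illustrated with a minimal single-sample sketch. It assumes the loss takes the form $-\sum_i \log(1 - \hat{y}_i)\,|i - c|^{\alpha}$ over softmax outputs $\hat{y}$ with true class index $c$; the margin term is omitted. The function names, the `eps` clamp, and the two example probability vectors are illustrative assumptions, not taken from the paper or its code.

```python
import math


def cross_entropy(probs, true_class, eps=1e-12):
    """Standard CE for one sample: only the true-class probability matters."""
    return -math.log(max(probs[true_class], eps))


def cdw_ce_loss(probs, true_class, alpha=5.0, eps=1e-12):
    """Sketch of a class-distance-weighted loss for one sample (no margin).

    Assumed form: -sum_i log(1 - p_i) * |i - c|^alpha. Probability mass
    placed on class i is penalized in proportion to its distance from the
    true class c raised to alpha; the true class itself has weight zero.
    """
    loss = 0.0
    for i, p in enumerate(probs):
        weight = abs(i - true_class) ** alpha
        # eps clamp (an added safeguard) avoids log(0) when p is exactly 1
        loss -= math.log(max(1.0 - p, eps)) * weight
    return loss


# Hypothetical softmax outputs over four ordinal classes, true class index 2.
# Both give the true class 0.40, so CE cannot tell them apart.
near = [0.05, 0.25, 0.40, 0.30]  # remaining mass sits next to the true class
far = [0.30, 0.05, 0.40, 0.25]   # more mass sits two classes away

ce_near, ce_far = cross_entropy(near, 2), cross_entropy(far, 2)
cdw_near, cdw_far = cdw_ce_loss(near, 2), cdw_ce_loss(far, 2)
print(ce_near == ce_far)   # CE is identical for both predictions
print(cdw_near < cdw_far)  # the distant error is penalized more
```

The two example vectors mirror the situation described for Figure 1: CE assigns both the same loss, while the distance-weighted form prefers the prediction whose errors stay close to the true class. A larger `alpha` widens that gap.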

Paper Structure

This paper contains 18 sections, 7 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Assuming MES-2 is the true class, the loss calculated with CE is the same for both cases. On the other hand, the unimodal distribution is more intuitive for ordinal classification.
  • Figure 2: The flowchart of the experimental setup.
  • Figure 3: A sample CAM output of two different models, trained with and without CDW-CE, presented to medical experts for their feedback.
  • Figure 4: Mean confusion matrix of each CNN model trained with CE and CDW-CE for full MES classification.
  • Figure 5: ROC curves obtained with the same deep learning architecture trained with different loss functions.
  • ...and 6 more figures