Regularizing CNNs using Confusion Penalty Based Label Smoothing for Histopathology Images
Somenath Kuiry, Alaka Das, Mita Nasipuri, Nibaran Das
TL;DR
The paper tackles overconfidence in CNNs for medical image analysis by addressing a limitation of vanilla Label Smoothing (LS), which mixes the target distribution with a uniform distribution over classes. It introduces Confusion Penalty Label Smoothing (CPLS), which uses the epoch-wise validation confusion matrix to build class-specific soft targets that emphasize confusable classes, combined with a hybrid training objective to stabilize learning. Across multiple CNN architectures on the ColorectalHistology dataset, CPLS improves calibration (lower Expected Calibration Error, ECE) and often improves accuracy, as supported by reliability diagrams and t-SNE visualizations of the feature space. The approach yields better-calibrated models with clearer cluster structure in feature space, and may extend to segmentation tasks.
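For readers unfamiliar with the calibration metric mentioned above, ECE is commonly computed by binning predictions by confidence and taking the sample-weighted average gap between accuracy and confidence in each bin. The sketch below is a minimal NumPy version of that standard definition. The function name, the choice of 10 equal-width bins, and the toy inputs are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Standard binned ECE; `n_bins` is an assumed default, not the paper's setting."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    confidences = probs.max(axis=1)        # top predicted probability per sample
    predictions = probs.argmax(axis=1)     # predicted class per sample
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |accuracy - mean confidence| in this bin, weighted by bin share
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece


# Toy example (hypothetical values): one confident correct and one wrong prediction.
toy_probs = [[0.9, 0.1], [0.6, 0.4]]
toy_labels = [0, 1]
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

A lower value means confidence tracks accuracy more closely; a reliability diagram plots the same per-bin accuracy against confidence.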
Abstract
Deep Learning, particularly Convolutional Neural Networks (CNNs), has been successful in computer vision tasks and medical image analysis. However, modern CNNs can be overconfident, which makes them difficult to deploy in real-world scenarios. Researchers have proposed regularization techniques such as Label Smoothing (LS), which introduces soft labels for the training data. LS captures disagreement or lack of confidence in the training phase, making the classifier more regularized. Although LS is simple and effective, traditional LS techniques use a weighted average between the target distribution and a uniform distribution across the classes, which limits both the objective of LS and its performance. This paper introduces a novel LS technique based on a confusion penalty, which, for each class, gives more importance to the classes the model confuses it with than to the others. We have performed extensive experiments with well-known CNN architectures using this technique on publicly available Colorectal Histology datasets and obtained satisfactory results. We have also compared our findings with the state of the art and shown our method's efficacy with reliability diagrams and t-distributed Stochastic Neighbor Embedding (t-SNE) plots of the feature space.
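To make the contrast concrete, the sketch below builds soft targets two ways: traditional LS, which mixes the one-hot target with a uniform distribution, and a confusion-based variant that puts the smoothing mass on the classes a validation confusion matrix shows are actually confused with the true class. The second function is only one plausible reading of the idea, not the paper's exact formulation: the row normalization, the fallback for classes with no errors, the smoothing factor `alpha`, and all function names are assumptions.

```python
import numpy as np


def uniform_label_smoothing(labels, num_classes, alpha=0.1):
    """Traditional LS: weighted average of the one-hot target and a uniform distribution."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - alpha) * one_hot + alpha / num_classes


def confusion_based_soft_targets(labels, confusion, alpha=0.1):
    """Assumed variant: spread `alpha` over classes in proportion to observed confusion.

    `confusion[i, j]` counts validation samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    num_classes = confusion.shape[0]

    # Keep only the misclassifications (off-diagonal entries) of each true class.
    off_diag = confusion.copy()
    np.fill_diagonal(off_diag, 0.0)
    row_sums = off_diag.sum(axis=1, keepdims=True)

    # Assumed fallback: a class with no errors is smoothed uniformly over the other classes.
    uniform_others = (1.0 - np.eye(num_classes)) / (num_classes - 1)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    penalty = np.where(row_sums > 0, off_diag / safe_sums, uniform_others)

    # One soft-target row per class, then look up the row for each label.
    class_targets = (1.0 - alpha) * np.eye(num_classes) + alpha * penalty
    return class_targets[labels]


# Hypothetical 3-class validation confusion matrix: class 0 is often mistaken for class 1.
toy_confusion = [[8, 2, 0],
                 [1, 9, 0],
                 [0, 0, 10]]
toy_labels = np.array([0, 2])
uniform_targets = uniform_label_smoothing(toy_labels, num_classes=3)
confusion_targets = confusion_based_soft_targets(toy_labels, toy_confusion)
```

In a training loop, the confusion matrix would be recomputed on the validation set after each epoch and the resulting targets used in the cross-entropy loss for the next one. How this is combined with the hard-label loss in the hybrid objective is not specified in this section, so it is left out of the sketch.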
