Dynamic Correlation Learning and Regularization for Multi-Label Confidence Calibration

Tianshui Chen, Weihang Wang, Tao Pu, Jinghui Qin, Zhijing Yang, Jie Liu, Liang Lin

TL;DR

The paper tackles overconfidence in multi-label recognition (MLR) by introducing the Multi-Label Confidence Calibration (MLCC) task and the Dynamic Correlation Learning and Regularization (DCLR) framework. DCLR learns both instance-level and prototype-level category correlations and uses them to produce adaptive soft label vectors that regularize MLR training in a plug-in fashion. A unified MLCC benchmark is established by re-implementing baselines and evaluating them on MS-COCO and Visual Genome across three backbones, using ACE, ECE, MCE, and mAP as metrics. Empirical results show that DCLR consistently improves calibration without sacrificing recognition accuracy. The work also highlights the importance of semantic correlations and provides a foundation for future MLCC research, including holistic correlation modeling and few-shot or partial-label settings.

Abstract

Modern visual recognition models often display overconfidence due to their reliance on complex deep neural networks and one-hot target supervision, resulting in unreliable confidence scores that necessitate calibration. While current confidence calibration techniques primarily address single-label scenarios, there is a lack of focus on more practical and generalizable multi-label contexts. This paper introduces the Multi-Label Confidence Calibration (MLCC) task, aiming to provide well-calibrated confidence scores in multi-label scenarios. Unlike single-label images, multi-label images contain multiple objects, leading to semantic confusion and further unreliability in confidence scores. Existing single-label calibration methods, based on label smoothing, fail to account for category correlations, which are crucial for addressing semantic confusion, thereby yielding sub-optimal performance. To overcome these limitations, we propose the Dynamic Correlation Learning and Regularization (DCLR) algorithm, which leverages multi-grained semantic correlations to better model semantic confusion for adaptive regularization. DCLR learns dynamic instance-level and prototype-level similarities specific to each category, using these to measure semantic correlations across different categories. With this understanding, we construct adaptive label vectors that assign higher values to categories with strong correlations, thereby facilitating more effective regularization. We establish an evaluation benchmark, re-implementing several advanced confidence calibration algorithms and applying them to leading multi-label recognition (MLR) models for fair comparison. Through extensive experiments, we demonstrate the superior performance of DCLR over existing methods in providing reliable confidence scores in multi-label scenarios.
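
To make the idea of adaptive label vectors concrete, the following is a minimal sketch and not the authors' formulation. It assumes one feature vector per category is already available, measures correlation with plain cosine similarity, and uses a hypothetical smoothing budget `eps`. The function names `cosine` and `correlation_soft_labels` are illustrative only. Uniform label smoothing would give every absent category the same small target; here an absent category receives a larger target the more it correlates with a category that is present.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def correlation_soft_labels(features, labels, eps=0.1):
    """Build a softened multi-label target vector.

    features: one feature vector per category (assumed given, e.g. category-specific features)
    labels:   ground-truth 0/1 label per category
    eps:      hypothetical smoothing budget, not a value taken from the paper
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    soft = []
    for j, y in enumerate(labels):
        if y == 1:
            # Present categories are pulled slightly below 1, as in label smoothing.
            soft.append(1.0 - eps)
        else:
            # Absent categories get a target scaled by their strongest
            # (non-negative) correlation with any present category.
            corr = max(
                (max(cosine(features[i], features[j]), 0.0) for i in positives),
                default=0.0,
            )
            soft.append(eps * corr)
    return soft


if __name__ == "__main__":
    # Toy features for three hypothetical categories; only the first is present.
    feats = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
    print(correlation_soft_labels(feats, [1, 0, 0], eps=0.1))
```

In this toy input the second category is similar to the present one and so receives a non-zero target, while the third, orthogonal category stays at zero. DCLR as described in the abstract goes further by learning instance-level and prototype-level similarities dynamically during training.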

Paper Structure

This paper contains 25 sections, 17 equations, 5 figures, 6 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Two examples of scores predicted by current advanced MLR models with and without DCLR calibration. Categories present in the image are highlighted in bold.
  • Figure 2: An overall illustration of the proposed DCLR algorithm and its integration into MLR models. Initially, an input image is processed through a backbone network followed by the SARL module to extract category-specific features. Subsequently, we compute instance-level and prototype-level correlation matrices by calculating the similarities between the extracted features and those of selected instance samples, as well as between the extracted features and prototype representations, to effectively model semantic confusion. Finally, we calculate the instance-level and prototype-level softened label vectors based on the respective correlation matrices and the ground truth labels. The softened label vectors are utilized for MLR model training.
  • Figure 3: Reliability diagrams for the SSGRL, ML-GCN, and C-Tran models on the MS-COCO dataset without calibration, with existing competing algorithms, and with the proposed DCLR algorithm. SSGRL is shown in the first row, ML-GCN in the second row, and C-Tran in the last row.
  • Figure 4: Per-category reliability diagrams with no calibration algorithm (first row), with LS (second row), and with the proposed DCLR algorithm (last row). The evaluations are conducted using the SSGRL model on the MS-COCO dataset.
  • Figure 5: Visualization examples from the MS-COCO dataset, illustrating how DCLR calibrates confidence scores for better predictions.
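
The reliability diagrams in Figures 3 and 4 and the ECE and MCE metrics rest on the same binning idea, which can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's evaluation code. It assumes equal-width bins, treats every (image, category) score as one binary prediction, and uses a hypothetical function name, `calibration_errors`. The paper's exact binning protocol is not given in this summary, and ACE is not sketched here.

```python
def calibration_errors(probs, targets, n_bins=10):
    """Return (ECE, MCE, diagram) for flattened multi-label predictions.

    probs:   predicted probabilities in [0, 1], one per (image, category) pair
    targets: matching 0/1 ground-truth labels
    n_bins:  number of equal-width confidence bins (an assumption of this sketch)
    diagram: list of (mean confidence, observed positive frequency) per non-empty bin
    """
    bins = [[] for _ in range(n_bins)]
    for p, t in zip(probs, targets):
        # Map p to its bin; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, t))

    total = len(probs)
    ece, mce, diagram = 0.0, 0.0, []
    for members in bins:
        if not members:
            continue
        conf = sum(p for p, _ in members) / len(members)
        freq = sum(t for _, t in members) / len(members)
        gap = abs(conf - freq)
        ece += len(members) / total * gap  # gap weighted by bin size
        mce = max(mce, gap)                # worst gap over all bins
        diagram.append((conf, freq))
    return ece, mce, diagram


if __name__ == "__main__":
    ece, mce, diagram = calibration_errors([0.95, 0.95, 0.15, 0.15], [1, 0, 0, 0])
    print(ece, mce, diagram)
```

A perfectly calibrated model would place every `(conf, freq)` point on the diagonal of a reliability diagram. An overconfident model places its points below the diagonal, because its mean confidence exceeds the observed frequency of positives.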