Ordinal Adaptive Correction: A Data-Centric Approach to Ordinal Image Classification with Noisy Labels

Alireza Sedighi Moghaddam, Mohammad Reza Mohammadi

TL;DR

This paper addresses label noise in ordinal image classification by shifting from sample deletion to adaptive label correction. It introduces ORDAC, a data-centric framework that represents each ordinal label as a Gaussian distribution $\mathcal{N}(\mu_i, \sigma_i^2)$ via Label Distribution Learning and iteratively updates the mean $\mu_i$ and uncertainty $\sigma_i$ using cross-fold predictions. The method combines class-wise prediction debiasing with a sample-wise distribution update governed by per-sample correction factors, providing robust performance under asymmetric label noise. Empirical results on Adience and Diabetic Retinopathy demonstrate strong improvements over baselines and state-of-the-art sample-selection methods, and reveal that ORDAC can also correct intrinsic dataset label errors, underscoring the practical value of a data-centric, correction-focused approach for ordinal tasks.

Abstract

Labeled data is a fundamental component in training supervised deep learning models for computer vision tasks. However, the labeling process, especially for ordinal image classification where class boundaries are often ambiguous, is prone to error and noise. Such label noise can significantly degrade the performance and reliability of machine learning models. This paper addresses the problem of detecting and correcting label noise in ordinal image classification tasks. To this end, a novel data-centric method called ORDinal Adaptive Correction (ORDAC) is proposed for adaptive correction of noisy labels. The proposed approach leverages the capabilities of Label Distribution Learning (LDL) to model the inherent ambiguity and uncertainty present in ordinal labels. During training, ORDAC dynamically adjusts the mean and standard deviation of the label distribution for each sample. Rather than discarding potentially noisy samples, this approach aims to correct them and make optimal use of the entire training dataset. The effectiveness of the proposed method is evaluated on benchmark datasets for age estimation (Adience) and disease severity detection (Diabetic Retinopathy) under various asymmetric Gaussian noise scenarios. Results show that ORDAC and its extended versions (ORDAC_C and ORDAC_R) lead to significant improvements in model performance. For instance, on the Adience dataset with 40% noise, ORDAC_R reduced the mean absolute error from 0.86 to 0.62 and increased the recall metric from 0.37 to 0.49. The method also demonstrated its effectiveness in correcting intrinsic noise present in the original datasets. This research indicates that adaptive label correction using label distributions is an effective strategy to enhance the robustness and accuracy of ordinal classification models in the presence of noisy data.
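To make the label-distribution idea concrete, the sketch below discretizes a Gaussian $\mathcal{N}(\mu_i, \sigma_i^2)$ over ordinal ranks and shows one possible way to move a sample's mean toward a model prediction. It is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the choice of 8 classes, and in particular the linear blend in `blend_mean` are hypothetical stand-ins, since the abstract does not state the actual update equations.

```python
import numpy as np


def gaussian_label_distribution(mu, sigma, num_classes):
    """Discretize N(mu, sigma^2) over the ordinal ranks 0..num_classes-1."""
    ranks = np.arange(num_classes, dtype=float)
    log_density = -0.5 * ((ranks - mu) / sigma) ** 2
    weights = np.exp(log_density - log_density.max())  # shift for numerical stability
    return weights / weights.sum()  # normalize so the vector sums to 1


def expected_rank(probabilities):
    """Mean rank implied by a probability vector over ordinal classes."""
    return float(np.dot(probabilities, np.arange(len(probabilities))))


def blend_mean(current_mu, predicted_mu, correction_factor):
    """Assumed update (NOT the paper's rule): linear interpolation toward the prediction."""
    return (1.0 - correction_factor) * current_mu + correction_factor * predicted_mu


# A sample labeled as rank 3 with unit uncertainty, over 8 classes (arbitrary choice).
target = gaussian_label_distribution(mu=3.0, sigma=1.0, num_classes=8)

# Move the mean a quarter of the way toward a hypothetical predicted rank of 5.
new_mu = blend_mean(current_mu=3.0, predicted_mu=5.0, correction_factor=0.25)
```

A larger `sigma` spreads probability mass over neighboring ranks, which is how a distribution of this kind can express that adjacent ordinal classes are plausible for an ambiguous sample.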

Paper Structure

This paper contains 25 sections, 8 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: An overview of the proposed ORDAC framework. Using a K-fold setup, models are trained on training folds and make predictions on a validation fold. These predictions are used to correct the label distributions (mean and standard deviation) of the validation samples, which are then propagated back to the training sets for subsequent epochs.
  • Figure 2: MAE of the proposed methods and baselines on the Adience and DR datasets as a function of the injected noise rate ($\tau$). Lower is better.
  • Figure 3: Examples of successful (green box) and unsuccessful (red box) corrections on the Adience dataset with synthetic noise ($\tau=0.4$).
  • Figure 4: Histogram of label changes made by ORDAC on the original (clean) Adience dataset.
  • Figure 5: Number of samples per class before and after correction on Adience ($\tau=0.4$), with and without the class-wise debiasing step. Debiasing prevents a collapse into the majority class.
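The K-fold flow described in the Figure 1 caption can be sketched as follows. This is a rough illustration only: `cross_fold_predictions` and the toy `nearest_neighbor_fit_predict` are hypothetical names, and the one-dimensional nearest-neighbor rule is a stand-in for the deep models the paper trains. The point it shows is that each sample is predicted by a model that never saw that sample's own (possibly noisy) label.

```python
import numpy as np


def cross_fold_predictions(features, labels, num_folds, fit_predict, seed=0):
    """Return one out-of-fold prediction per sample.

    fit_predict(train_x, train_y, test_x) is any callable that trains on the
    training folds and returns predictions for the held-out fold.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))
    folds = np.array_split(order, num_folds)
    out_of_fold = np.empty(len(labels), dtype=float)
    for k, held_out in enumerate(folds):
        # Train on every fold except the k-th, then predict the k-th.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        out_of_fold[held_out] = fit_predict(
            features[train_idx], labels[train_idx], features[held_out]
        )
    return out_of_fold


def nearest_neighbor_fit_predict(train_x, train_y, test_x):
    """Toy stand-in model: copy the label of the closest training feature."""
    distances = np.abs(test_x[:, None] - train_x[None, :])
    return train_y[np.argmin(distances, axis=1)]


# 20 samples with a scalar feature; the true rank is feature // 5 (ranks 0..3).
features = np.arange(20, dtype=float)
labels = features // 5
labels[7] = 3.0  # inject one wrong label (the true rank is 1)

predictions = cross_fold_predictions(
    features, labels, num_folds=5, fit_predict=nearest_neighbor_fit_predict
)
```

In this toy setup the out-of-fold prediction for sample 7 comes from its correctly labeled neighbors, so it disagrees with the injected label. A correction step could use that kind of disagreement as evidence that the label should move.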