Sample Selection via Contrastive Fragmentation for Noisy Label Regression

Chris Dongjoo Kim, Sangwoo Moon, Jihwan Moon, Dongyeon Woo, Gunhee Kim

TL;DR

ConFrag is a contrastive fragmentation framework for regression with noisy labels. It partitions the label space into fragments, trains specialized experts on contrastive fragment pairs, and selects clean samples through a neighborhood agreement mechanism. A mixture-of-experts model aggregates local consensus across neighboring fragments, and neighborhood jittering regularizes training. The method achieves state-of-the-art performance on six real-world regression benchmarks under symmetric and Gaussian label noise, as measured by the ERR and MRAE metrics and supported by extensive ablations. The results indicate that contrastive fragment pairs help by converting part of the regression label noise into structured, open-set-like noise, which points toward more robust regression in noisy data regimes.

Abstract

As with many other problems, real-world regression is plagued by the presence of noisy labels, an inevitable issue that demands our attention. Fortunately, much real-world data exhibits an intrinsic property of continuously ordered correlations between labels and features, where data points with similar labels are also represented with closely related features. In response, we propose a novel approach named ConFrag, where we collectively model the regression data by transforming them into disjoint yet contrasting fragmentation pairs. This enables the training of more distinctive representations, enhancing the ability to select clean samples. Our ConFrag framework leverages a mixture of neighboring fragments to discern noisy labels through neighborhood agreement among expert feature extractors. We perform extensive experiments on six newly curated benchmark datasets of diverse domains, including age prediction, price prediction, and music production year estimation. We also introduce a metric called Error Residual Ratio (ERR) to better account for varying degrees of label noise. Our approach consistently outperforms fourteen state-of-the-art baselines and is robust against symmetric and random Gaussian label noise.
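
To make the idea of "disjoint yet contrasting fragmentation pairs" concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes equal-width fragments over a known label range and pairs fragment f with fragment f + F/2, a simple rule that happens to reproduce the six-fragment pairing [1, 4], [2, 5], [3, 6] shown in Figure 1. The function names `assign_fragments` and `contrastive_pairs`, the explicit `label_min`/`label_max` arguments, and the binning scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def assign_fragments(labels, num_fragments, label_min, label_max):
    """Map continuous labels to 1-based fragment indices.

    Assumption: fragments are equal-width bins over [label_min, label_max].
    Labels outside that range fall into the first or last fragment.
    """
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(label_min, label_max, num_fragments + 1)
    # Only interior edges are passed, so np.digitize returns 0..num_fragments-1.
    return np.digitize(labels, edges[1:-1]) + 1


def contrastive_pairs(num_fragments):
    """Pair fragment f with fragment f + F/2 (assumed pairing rule)."""
    if num_fragments % 2 != 0:
        raise ValueError("this sketch expects an even number of fragments")
    half = num_fragments // 2
    return [(f, f + half) for f in range(1, half + 1)]


# Hypothetical age labels on a 0-60 range, split into six fragments.
ages = [3, 12, 27, 33, 48, 59]
fragments = assign_fragments(ages, num_fragments=6, label_min=0, label_max=60)
pairs = contrastive_pairs(6)
print(fragments.tolist(), pairs)
```

Under this rule each pair joins two fragments that are far apart in label space, so each expert would see a two-way problem between clearly separated label ranges.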

Paper Structure

This paper contains 43 sections, 6 equations, 23 figures, 15 tables, 1 algorithm.

Figures (23)

  • Figure 1: (a) An example t-SNE illustration of contrastive fragment pairing. The data with label noise are grouped into six fragments ($f\in[1\text{-}6]$) and formed into three contrastive pairs ($f\in[1, 4], [2, 5], [3, 6]$). Contrastive fragment pairing transforms some of the closed-set noise (whose ground truth is within the target label set) into open-set noise (whose ground truth is not within the label set). For example, in the [1,4] panel, label noise whose ground-truth fragment is either 1 or 4 is closed-set noise, and the rest is open-set noise. The t-SNE illustration shows that learned features of open-set noise tend to reside outside the feature clusters of the clean samples. (b) Open-set noise is less harmful, yielding much lower error (MRAE) in the downstream regression. (c) Contrastive pairing ($[1, 4], [2, 5], [3, 6]$) is more effective than using all fragments together ($[1\text{-}6]$), resulting in much lower MRAE scores. All experiments are based on IMDB-Clean-B; more details are in Appendix \ref{subsec:contrast_combination}--\ref{subsec:disruptive_anomaly_noise}.
  • Figure 2: The contrastive fragment pairing algorithm.
  • Figure 3: The Contrastive Fragmentation framework. (a) The overall sequential process of the framework. (b) Fragmentation of the continuous label space to obtain contrasting fragment pairs (§ \ref{subsec:fragmentation}), on which feature extractors are trained. (c) Sample Selection by Mixture of Neighboring Fragments obtains the selection probability from both the prediction and representation perspectives (§ \ref{subsec:mixture_of_contrasing_fragments}). (d) Illustration of Neighborhood Jittering (§ \ref{sec:jittering}).
  • Figure 4: Jittering analysis. (a) When trained without jittering, feature extractors easily overfit the noisy training data (yellow-shaded region), while jittering-regularized feature extractors learn robustly from the noisy training data. (b) Feature extractors that overfit noisy samples (yellow-shaded region) assign those samples a higher likelihood, leading to a higher selection rate and an ERR nearly twice as high (lower is better). (c) Most importantly, jittering regularization improves regression performance.
  • Figure 5: Selection/ERR/MRAE comparison between ConFrag and the strong baselines CNLCU-H, BMM, DY-S, AUX, and Selfie on IMDB-Clean-B. Performance during the warm-up is excluded.
  • ...and 18 more figures
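
The closed-set versus open-set distinction in the Figure 1 caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the paper's code: it assumes each sample is described by a (ground-truth fragment, observed fragment) tuple, that an expert's training set contains the samples whose observed fragment belongs to its pair, and that a sample counts as clean at fragment granularity when the two fragments match. The names `classify_sample` and `noise_breakdown` and the toy samples are hypothetical.

```python
def classify_sample(true_fragment, observed_fragment, fragment_set):
    """Classify one sample in the training set of an expert over fragment_set."""
    if observed_fragment not in fragment_set:
        raise ValueError("sample is not in this expert's training set")
    if true_fragment == observed_fragment:
        return "clean"  # assumption: judged at fragment granularity only
    # Noisy sample: closed-set if its true fragment is still a valid target.
    return "closed-set" if true_fragment in fragment_set else "open-set"


def noise_breakdown(samples, fragment_set):
    """Count clean, closed-set, and open-set samples seen by one expert."""
    counts = {"clean": 0, "closed-set": 0, "open-set": 0}
    for true_f, observed_f in samples:
        if observed_f in fragment_set:
            counts[classify_sample(true_f, observed_f, fragment_set)] += 1
    return counts


# Toy (ground-truth fragment, observed fragment) tuples, made up for illustration.
samples = [(1, 1), (4, 1), (2, 1), (6, 4), (4, 4), (3, 2)]
print(noise_breakdown(samples, (1, 4)))              # expert on the [1, 4] pair
print(noise_breakdown(samples, (1, 2, 3, 4, 5, 6)))  # all fragments together
```

With all six fragments as targets, every noisy sample is closed-set by construction; restricting an expert to the [1, 4] pair turns the samples whose ground truth lies in fragments 2, 3, 5, or 6 into open-set noise, which is the effect the caption describes.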