Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples
Suqin Yuan, Lei Feng, Bo Han, Tongliang Liu
TL;DR
The paper addresses learning with noisy labels and shows that mislabeled examples learned early in training disproportionately harm generalization. It defines Mislabeled Easy Examples (MEEs) as mislabeled samples that the model predicts correctly early in training and that distort the learning of simple patterns. To mitigate MEEs, it proposes Early Cutting, a recalibration step that uses a later-stage model to re-select the confident subset identified early in training. The step filters out suspected MEEs, that is, samples with high loss, high confidence, and low input gradient, using thresholds on the loss $L_i$, confidence $c_i$, and input gradient $g_i$ together with a rate $\gamma$. Empirically, on CIFAR-10/100, WebVision, and full ImageNet-1k under various noise types, Early Cutting improves on state-of-the-art methods with modest overhead and transfers to other setups, including MixMatch-based semi-supervised training. Limitations are that the evaluation is restricted to vision tasks and that the method still lacks theoretical grounding.
Abstract
Sample selection is a prevalent approach in learning with noisy labels, aiming to identify confident samples for training. Although existing sample selection methods have achieved decent results by reducing the noise rate of the selected subset, they often overlook that not all mislabeled examples harm the model's performance equally. In this paper, we demonstrate that mislabeled examples correctly predicted by the model early in the training process are particularly harmful to model performance. We refer to these examples as Mislabeled Easy Examples (MEEs). To address this, we propose Early Cutting, which introduces a recalibration step that employs the model's later training state to re-select the confident subset identified early in training, thereby avoiding misleading confidence from early learning and effectively filtering out MEEs. Experiments on the CIFAR, WebVision, and full ImageNet-1k datasets demonstrate that our method effectively improves sample selection and model performance by reducing MEEs.
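The recalibration step can be illustrated with a minimal sketch. It assumes that the per-sample loss $L_i$, confidence $c_i$, and input-gradient magnitude $g_i$ have already been computed with the later-stage model for every sample in the early confident subset. The function name `early_cutting_keep_mask`, the explicit threshold arguments, and the toy values are hypothetical, and how the rate $\gamma$ determines the thresholds is not specified here, so the thresholds are passed in directly. This is an illustration of the filtering rule as summarized above, not the authors' implementation.

```python
import numpy as np


def early_cutting_keep_mask(loss, conf, grad_norm, loss_thr, conf_thr, grad_thr):
    """Return a boolean mask of samples to keep in the confident subset.

    All inputs are per-sample statistics from the later-stage model
    (assumed precomputed). A sample is flagged as a suspected MEE when it
    has high loss, high confidence, and a low input gradient at once.
    The threshold arguments are hypothetical parameters of this sketch.
    """
    loss = np.asarray(loss, dtype=float)
    conf = np.asarray(conf, dtype=float)
    grad_norm = np.asarray(grad_norm, dtype=float)

    suspected_mee = (loss > loss_thr) & (conf > conf_thr) & (grad_norm < grad_thr)
    return ~suspected_mee


# Toy statistics for four samples from the early confident subset.
loss = np.array([0.1, 2.5, 2.7, 0.2])
conf = np.array([0.95, 0.97, 0.40, 0.60])
grad_norm = np.array([0.5, 0.01, 0.02, 0.9])

keep = early_cutting_keep_mask(
    loss, conf, grad_norm, loss_thr=1.0, conf_thr=0.9, grad_thr=0.1
)
refined_indices = np.flatnonzero(keep)  # indices retained for further training
```

In this toy example only the second sample meets all three conditions, so it is the only one removed. The third sample also has high loss and a low input gradient, but its low confidence keeps it in the subset, which shows that the rule requires the three conditions jointly.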
