Robust Noisy Label Learning via Two-Stream Sample Distillation
Sihan Bai, Sanping Zhou, Zheng Qin, Le Wang, Nanning Zheng
TL;DR
The paper tackles noisy-label learning with Two-Stream Sample Distillation (TSSD), a framework that combines human priors in loss space with sample structure in feature space to select high-quality training samples from noisy data. It comprises two modules. Parallel Sample Division (PSD) partitions the data using cues from both spaces, fitted with Gaussian Mixture Models, into a certain set and an uncertain set. Meta Sample Purification (MSP) trains a meta-classifier on golden data to recover semi-hard samples from the uncertain set. A semi-supervised learning stage then treats the certain set as labeled and the uncertain set as unlabeled, using refined labels and MixMatch-like losses to train a robust model under the overall objective $L = L_C + \lambda_u L_U + \lambda_r L_{reg}$. Experiments on CIFAR-10/100, Tiny-ImageNet, and Clothing-1M show state-of-the-art or competitive performance across noise settings, supporting the value of jointly using loss and feature information for sample distillation. One limitation is that selection relies on only two spaces (feature and loss); incorporating additional selection criteria is left to future work.
Abstract
Noisy label learning aims to learn robust networks under the supervision of noisy labels, which plays a critical role in deep learning. Existing work conducts either sample selection or label correction to deal with noisy labels during model training. In this paper, we design a simple yet effective sample selection framework, termed Two-Stream Sample Distillation (TSSD), for noisy label learning, which extracts more high-quality samples with clean labels to improve the robustness of network training. First, a novel Parallel Sample Division (PSD) module is designed to generate a certain training set with sufficient reliable positive and negative samples by jointly considering the sample structure in feature space and the human prior in loss space. Second, a novel Meta Sample Purification (MSP) module is designed to mine adequate semi-hard samples from the remaining uncertain training set by learning a strong meta classifier with extra golden data. As a result, progressively more high-quality samples are distilled from the noisy training set in every iteration, so that networks are trained robustly. Extensive experiments on four benchmark datasets, namely CIFAR-10, CIFAR-100, Tiny-ImageNet, and Clothing-1M, show that our method achieves state-of-the-art results compared with its competitors.
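To make the division step described above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the loss-space cue is a two-component 1-D Gaussian mixture fitted to per-sample losses (the low-mean component is taken as "clean"), and that the feature-space cue is already available as a per-sample clean probability. The function names `fit_two_gaussians` and `divide`, the threshold `tau`, and the agreement rule (both cues must agree for a sample to enter the certain set) are all hypothetical choices made for illustration.

```python
import math


def _posterior_low(x, means, variances, weights):
    """Posterior probability that x belongs to component 0."""
    dens = [
        w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for m, v, w in zip(means, variances, weights)
    ]
    total = dens[0] + dens[1]
    if total == 0.0:
        # Both densities underflowed: fall back to the nearest mean.
        return 1.0 if abs(x - means[0]) <= abs(x - means[1]) else 0.0
    return dens[0] / total


def fit_two_gaussians(losses, iters=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns, per sample, the posterior of the lower-mean component,
    interpreted here (as an assumption) as the probability of a clean label.
    """
    lo, hi = min(losses), max(losses)
    means = [lo, hi]
    var0 = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    n = len(losses)
    for _ in range(iters):
        # E-step: responsibilities of component 0.
        r = [_posterior_low(x, means, variances, weights) for x in losses]
        n0 = sum(r)
        n1 = n - n0
        if n0 < 1e-8 or n1 < 1e-8:
            break  # one component vanished; keep the last valid parameters
        # M-step: update means, variances (floored), and mixing weights.
        m0 = sum(ri * x for ri, x in zip(r, losses)) / n0
        m1 = sum((1.0 - ri) * x for ri, x in zip(r, losses)) / n1
        v0 = sum(ri * (x - m0) ** 2 for ri, x in zip(r, losses)) / n0
        v1 = sum((1.0 - ri) * (x - m1) ** 2 for ri, x in zip(r, losses)) / n1
        means = [m0, m1]
        variances = [max(v0, 1e-6), max(v1, 1e-6)]
        weights = [n0 / n, n1 / n]
    probs = [_posterior_low(x, means, variances, weights) for x in losses]
    if means[0] > means[1]:
        # Make sure the returned posterior refers to the low-loss component.
        probs = [1.0 - p for p in probs]
    return probs


def divide(loss_clean_prob, feat_clean_prob, tau=0.5):
    """Split sample indices into certain-positive, certain-negative, uncertain.

    A sample is "certain" only when both cues agree (hypothetical rule).
    """
    certain_pos, certain_neg, uncertain = [], [], []
    for i, (pl, pf) in enumerate(zip(loss_clean_prob, feat_clean_prob)):
        if pl > tau and pf > tau:
            certain_pos.append(i)
        elif pl <= tau and pf <= tau:
            certain_neg.append(i)
        else:
            uncertain.append(i)
    return certain_pos, certain_neg, uncertain


# Toy usage: made-up losses and made-up feature-space clean probabilities.
toy_losses = [0.10, 0.12, 0.09, 0.11, 2.00, 2.10, 1.90, 2.05]
toy_feat_prob = [0.90, 0.80, 0.20, 0.95, 0.10, 0.70, 0.05, 0.20]
pos, neg, unc = divide(fit_two_gaussians(toy_losses), toy_feat_prob)
```

In this sketch, samples on which the two cues disagree fall into the uncertain set, which is the pool a purification step such as MSP would then examine. How the paper actually derives the feature-space cue and combines the two streams may differ from this simplified rule.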
