Learning with Noisy Labels: Interconnection of Two Expectation-Maximizations
Heewon Kim, Hyun Sung Chang, Kiho Cho, Jaeyun Lee, Bohyung Han
TL;DR
This work tackles learning with noisy labels (LNL) by formulating a probabilistic objective over a structured data manifold and introducing LNL-flywheel, a framework built on two expectation-maximization (EM) cycles: a main network distinguishes clean from corrupted data, and an auxiliary network refurbishes the corrupted labels. The two EM cycles are interlinked through a single shared objective, and a confidence regularizer prevents collapse, so the two networks optimize cooperatively and learn robustly under diverse label-noise conditions. Empirical results on CIFAR-10/100, Tiny-ImageNet, Clothing1M, and WebVision show state-of-the-art or competitive performance under symmetric, asymmetric, and instance-dependent noise, with the practical advantage of single-model inference. The approach offers a scalable, memory-efficient alternative to ensembles and improves data utilization through label refurbishment and pseudo-labeling.
Abstract
Labor-intensive labeling is a bottleneck in developing computer vision algorithms based on deep learning. For this reason, dealing with imperfect labels has gained increasing attention and has become an active field of study. We address the learning with noisy labels (LNL) problem, which we formalize as the task of finding a structured manifold in the midst of noisy data. Within this framework, we provide a proper objective function and an optimization algorithm based on two expectation-maximization (EM) cycles. The two EM cycles are associated with separate networks that collaborate to optimize the objective function: one model distinguishes clean labels from corrupted ones, while the other refurbishes the corrupted labels. The resulting model, LNL-flywheel, is non-collapsing. Experiments show that our algorithm achieves state-of-the-art performance on multiple standard benchmarks, with substantial margins, under various types of label noise.
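
The abstract does not spell out the E- and M-steps, so the following is only a rough illustration of the division of labor it describes (one component separates clean from corrupted labels, another refurbishes the corrupted ones). It uses a swapped-in technique that is common in the LNL literature, not the authors' objective or algorithm: a two-component Gaussian mixture fitted by EM to per-sample losses, followed by relabeling of the likely-corrupted samples from an auxiliary model's predictions. The function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
import numpy as np


def fit_two_gaussians(losses, n_iter=50, eps=1e-8):
    """EM for a 1-D two-component Gaussian mixture over per-sample losses.

    Returns, for each sample, the posterior probability of belonging to the
    low-mean component, which is ASSUMED here to hold the clean labels.
    """
    x = np.asarray(losses, dtype=float)
    mu = np.array([x.min(), x.max()])       # start the components far apart
    var = np.full(2, x.var() + eps)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
        log_p += np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + eps
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + eps
        pi = nk / nk.sum()
    return resp[:, np.argmin(mu)]


def refurbish_labels(labels, p_clean, aux_probs, threshold=0.5):
    """Replace labels of likely-corrupted samples with the auxiliary
    model's most probable class (hypothetical refurbishing rule)."""
    refurbished = np.asarray(labels).copy()
    noisy = p_clean < threshold
    refurbished[noisy] = np.argmax(aux_probs[noisy], axis=1)
    return refurbished


# Toy data: four low-loss samples and three high-loss samples.
losses = [0.10, 0.12, 0.08, 0.11, 2.0, 2.1, 1.9]
labels = np.array([0, 1, 2, 0, 1, 1, 1])
aux_probs = np.tile(np.array([0.1, 0.1, 0.8]), (7, 1))  # aux model favors class 2

p_clean = fit_two_gaussians(losses)
new_labels = refurbish_labels(labels, p_clean, aux_probs)
```

In this toy run the three high-loss samples fall below the threshold and take the auxiliary model's class, while the four low-loss samples keep their given labels. Unlike this sketch, the paper couples its two EM cycles through one shared objective and adds a confidence regularizer to prevent collapse.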
