MoMBS: Mixed-order minibatch sampling enhances model training from diverse-quality images

Han Li; Hu Han; S. Kevin Zhou

TL;DR

MoMBS tackles learning from diverse-quality images by jointly using loss and uncertainty to assess sample difficulty and by mixing samples of varying difficulty within minibatches. The method introduces an assessor to compute a robust difficulty score and a scheduler that constructs minibatches to maximize positive updates while limiting harmful influence from poorly labeled or overfitted samples. Across universal lesion detection, COVID-19 CT segmentation, long-tailed, and noisy-label CIFAR-100 tasks, MoMBS yields consistent improvements, particularly with limited training data, while remaining architecture-agnostic. This approach enhances training robustness in realistic, quality-variant datasets and can simplify deployment by reducing reliance on specialized network designs.

Abstract

Natural images exhibit label diversity (clean vs. noisy) in noisy-labeled image classification and prevalence diversity (abundant vs. sparse) in long-tailed image classification. Similarly, medical images in universal lesion detection (ULD) vary substantially in quality, including attributes such as clarity and label correctness. How to effectively leverage training images of diverse quality is therefore a problem in learning deep models. Conventional training mechanisms, such as self-paced curriculum learning (SCL) and online hard example mining (OHEM), relieve this problem by reweighting images with high loss values. Despite their success, these methods still face two challenges: (i) the loss-based measure of sample hardness is imprecise, preventing optimal handling of different cases, and (ii) the identified hard samples are under-utilized in SCL or over-utilized in OHEM. To address these issues, this paper revisits minibatch sampling (MBS), a technique widely used in deep network training but largely unexplored with respect to handling diverse-quality training samples. We discover that the samples within a minibatch influence each other during training; thus, we propose a novel Mixed-order Minibatch Sampling (MoMBS) method to optimize the use of training samples with diverse qualities. MoMBS introduces a measure that takes both loss and uncertainty into account, rather than relying on loss alone, which allows a more refined categorization of high-loss samples by distinguishing them as either poorly labeled and underrepresented or well represented and overfitted. We prioritize underrepresented samples as the main gradient contributors in a minibatch and shield them from the negative influence of poorly labeled or overfitted samples with a mixed-order minibatch sampling design.
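The two ideas in the abstract, a difficulty measure built from both loss and uncertainty and a minibatch that mixes samples of different difficulty, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula or the batching rule, so the rank-sum score, the alternating easy/hard pairing, and the names `rank`, `difficulty_scores`, and `mixed_order_batches` are all assumptions made for illustration.

```python
from collections import deque
from typing import List, Sequence


def rank(values: Sequence[float]) -> List[int]:
    """Rank of each value (0 = smallest); ties are broken by index."""
    order = sorted(range(len(values)), key=lambda i: (values[i], i))
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def difficulty_scores(losses: Sequence[float],
                      uncertainties: Sequence[float]) -> List[int]:
    """Hypothetical difficulty score: loss rank plus uncertainty rank.

    Assumption: the paper's actual measure may combine the two differently.
    """
    if len(losses) != len(uncertainties):
        raise ValueError("losses and uncertainties must have the same length")
    return [a + b for a, b in zip(rank(losses), rank(uncertainties))]


def mixed_order_batches(scores: Sequence[float],
                        batch_size: int = 2) -> List[List[int]]:
    """Build minibatches by alternately drawing the easiest and the hardest
    remaining sample, so each batch mixes difficulty levels.

    Assumption: this pairing rule is illustrative, not the paper's scheduler.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Sample indices sorted from lowest to highest difficulty score.
    order = deque(sorted(range(len(scores)), key=lambda i: (scores[i], i)))
    batches = []
    while order:
        batch = []
        take_easiest = True
        while order and len(batch) < batch_size:
            batch.append(order.popleft() if take_easiest else order.pop())
            take_easiest = not take_easiest
        batches.append(batch)
    return batches


if __name__ == "__main__":
    losses = [0.1, 2.0, 0.5, 1.0]          # made-up per-sample losses
    uncertainties = [0.2, 0.9, 0.1, 0.5]   # made-up per-sample uncertainties
    scores = difficulty_scores(losses, uncertainties)
    print(scores)
    print(mixed_order_batches(scores, batch_size=2))
```

In this sketch the first batches pair the extremes of the difficulty ordering, and samples of moderate difficulty end up grouped together in later batches. Using ranks rather than raw values keeps loss and uncertainty on a comparable scale without tuning a weighting factor.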
