Instance-dependent Early Stopping

Suqin Yuan; Runqi Lin; Lei Feng; Bo Han; Tongliang Liu

TL;DR

The paper addresses the inefficiency of conventional early stopping by introducing Instance-dependent Early Stopping (IES), which stops training on each instance individually once the model has mastered it. Mastery is detected via the magnitude of the second-order loss difference, $|\Delta^2 L_i(w^{(t)})|$, compared against a single global threshold $\delta$; mastered instances are then dynamically excluded from backpropagation. Across CIFAR-10/100 and ImageNet-1k, IES reduces the number of backpropagated instances by 10%-50% while maintaining or slightly improving test accuracy and transfer learning performance, with reported wall-time speedups of roughly 1.3×–1.4×, larger gradient norms, and improved loss-landscape sharpness. The approach is robust to variations in the threshold, outperforms several data-efficiency baselines, and extends to high-level vision tasks, though its theoretical guarantees and fairness implications require further study.
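The summary names the quantity $\Delta^2 L_i(w^{(t)})$ without expanding it. One plausible reading, assuming $t$ indexes epochs and that standard backward finite differences are meant (neither is stated in this summary), is:

$$
\Delta L_i(w^{(t)}) = L_i(w^{(t)}) - L_i(w^{(t-1)}), \qquad
\Delta^2 L_i(w^{(t)}) = \Delta L_i(w^{(t)}) - \Delta L_i(w^{(t-1)}) = L_i(w^{(t)}) - 2\,L_i(w^{(t-1)}) + L_i(w^{(t-2)}).
$$

Under that reading, instance $i$ would be treated as mastered at step $t$ when $|\Delta^2 L_i(w^{(t)})| < \delta$.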

Abstract

In machine learning practice, early stopping has been widely used to regularize models and can save computational costs by halting the training process when the model's performance on a validation set stops improving. However, conventional early stopping applies the same stopping criterion to all instances without considering their individual learning statuses, which leads to redundant computations on instances that are already well-learned. To further improve efficiency, we propose an Instance-dependent Early Stopping (IES) method that adapts the early stopping mechanism from the entire training set to the instance level, based on the core principle that once the model has mastered an instance, training on it should stop. IES considers an instance mastered if the second-order differences of its loss value remain within a small range around zero. This offers a more consistent measure of an instance's learning status than the loss value itself, and thus allows a single unified threshold to determine when an instance can be excluded from further backpropagation. We show that excluding mastered instances from backpropagation can increase the gradient norms, thereby accelerating the decrease of the training loss and speeding up the training process. Extensive experiments on benchmarks demonstrate that the IES method can reduce backpropagation instances by 10%-50% while maintaining or even slightly improving the test accuracy and transfer learning performance of a model.
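The mastery criterion described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `second_order_difference` and `mastered_mask`, the `(epochs, instances)` layout of the loss history, and the use of only the three most recent epochs are all assumptions made for the example.

```python
import numpy as np


def second_order_difference(loss_history):
    """Backward second-order difference of per-instance losses.

    loss_history: array-like of shape (T, N), one row per epoch and one
    column per training instance, with T >= 3 (assumed layout).
    Returns an array of shape (N,) computed from the last three epochs.
    """
    h = np.asarray(loss_history, dtype=float)
    if h.ndim != 2 or h.shape[0] < 3:
        raise ValueError("need a (T, N) history with at least 3 epochs")
    # L(t) - 2 * L(t-1) + L(t-2)
    return h[-1] - 2.0 * h[-2] + h[-3]


def mastered_mask(loss_history, delta):
    """Boolean mask: True where |second-order difference| < delta."""
    return np.abs(second_order_difference(loss_history)) < delta


# Toy history: 3 epochs, 2 instances (values are made up for illustration).
history = np.array([
    [0.25, 2.0],   # epoch t-2
    [0.25, 1.5],   # epoch t-1
    [0.25, 0.5],   # epoch t
])
mask = mastered_mask(history, delta=0.1)
active = ~mask  # instances that would still be sent through backpropagation
```

In a training loop, `active` would select which instances contribute to the backward pass in the next epoch, while the forward pass can still be run on every instance so that the loss history keeps being updated. How often the mask is recomputed, and whether an instance can later become active again, are design choices this sketch leaves open.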
