EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python

Eric Ziebell, Ratmir Miftachov, Bernhard Stankewitz, Laura Hucker

TL;DR

This work addresses implicit regularisation of iterative estimators through data-driven in-sample early stopping and introduces the EarlyStopping-package, a Python toolkit for running multiple procedures such as truncated SVD, Landweber, conjugate gradient, $L^{2}$-Boosting, and regression trees. It grounds stopping rules in discrepancy principles and balanced-oracle theory, enabling adaptive stopping times $\widehat{m}$ and $\tau^{\text{DP}}$ while preserving computational efficiency and reproducibility. The package provides modular implementations, oracle-tracking capabilities, and Monte Carlo simulations to replicate foundational results from the literature, including inverse problems and high-dimensional regression settings. Collectively, it offers a unified framework for comparing implicit regularisation via early stopping across diverse iterative methods, with practical impact on both theory-driven experimentation and scalable analysis.

Abstract

Iterative learning procedures are ubiquitous in machine learning and modern statistics. Regularisation is typically required to keep the expected loss of a procedure from inflating in later iterations as the noise inherent in the data propagates. Significant emphasis has been placed on achieving this regularisation implicitly by stopping procedures early. The EarlyStopping-package provides a toolbox of (in-sample) sequential early stopping rules for several well-known iterative estimation procedures, such as truncated SVD, Landweber (gradient descent), conjugate gradient descent, $L^{2}$-boosting and regression trees. A central feature of the package is that the algorithms allow the true data-generating process to be specified and keep track of the relevant theoretical quantities. In this paper, we detail the principles governing the implementation of the EarlyStopping-package and provide a survey of recent foundational advances in the theoretical literature. We demonstrate how to use the EarlyStopping-package to explore core features of implicit regularisation and replicate results from the literature.
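
To make the idea of in-sample early stopping concrete, the following is a minimal NumPy sketch of a Landweber iteration stopped by a residual-based discrepancy rule. It does not use the EarlyStopping-package and is not its API: the function name `landweber_discrepancy`, the threshold choice `kappa = n * delta**2`, and the simulated diagonal design are all assumptions made for illustration.

```python
import numpy as np


def landweber_discrepancy(A, y, kappa, max_iter=10_000):
    """Hypothetical helper: Landweber iteration with a discrepancy-type stop.

    Iterates x_{m+1} = x_m + step * A^T (y - A x_m) from x_0 = 0 and returns
    the first iterate whose squared residual norm is at most kappa, together
    with its iteration index.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step size 1 / ||A||^2 (spectral norm) keeps the iteration stable.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for m in range(max_iter):
        residual = y - A @ x
        if residual @ residual <= kappa:
            return x, m  # stopped early at iteration m
        x = x + step * (A.T @ residual)
    return x, max_iter  # iteration budget exhausted without stopping


# Simulated inverse problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200
idx = np.arange(1, n + 1, dtype=float)
A = np.diag(idx ** -0.5)      # polynomially decaying singular values
signal = idx ** -1.0          # true coefficients, known in the simulation
delta = 0.01                  # noise level
y = A @ signal + delta * rng.standard_normal(n)
kappa = n * delta ** 2        # threshold at the expected squared noise norm

estimate, tau = landweber_discrepancy(A, y, kappa)
print("stopping index:", tau)
print("squared error:", float(np.sum((estimate - signal) ** 2)))
```

Because the signal is known in the simulation, the error of the stopped estimate can be compared with the error at any other iteration, which is the kind of oracle comparison the abstract refers to.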

Paper Structure

This paper contains 15 sections, 60 equations, 12 figures, 1 table, 2 algorithms.

Figures (12)

  • Figure 1: Decomposition of the risk into approximation and stochastic error for two signals $f^{*}$ and $g^{*}$ with slowly and rapidly decaying squared bias, respectively.
  • Figure 2: Weak (a) and strong (b) risk decomposition for the smooth signal, with bias (blue), variance (red), error (black), balanced oracle (orange), and stopping time (green). (c) SVD representation of a super-smooth (blue), a smooth (purple), and a rough (olive) signal. (d) Relative efficiency: \ref{eq:relative_efficiency_tSVD}. https://github.com/EarlyStop/EarlyStopping/blob/main/simulations/TruncatedSVD_Replication.py
  • Figure 3: (a) Relative number of Landweber iterations for $\tau^{\text{DP}}$ divided by the balanced oracle iteration. (b) Relative efficiency: \ref{eq:relative_efficiency_tSVD}. https://github.com/EarlyStop/EarlyStopping/blob/main/simulations/Landweber_Replication.py
  • Figure 4: (a) Relative number of conjugate gradient iterations for $\tau^{\text{DP}}$ divided by the balanced oracle iteration. (b) Relative efficiency: \ref{eq:relative_efficiency_cg}. https://github.com/EarlyStop/EarlyStopping/blob/main/simulations/ConjugateGradients_Replication.py
  • Figure 5: Signals from the simulation in stankewitz2024early.
  • ...and 7 more figures
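
The captions above refer to weak and strong bias-variance decompositions and to a balanced oracle. The sketch below computes these quantities for truncated SVD in a sequence-space model $Y_i = \lambda_i \mu_i + \delta \varepsilon_i$, using the definitions that are common in the early-stopping literature (strong: error in the coefficients; weak: prediction error; balanced oracle: first index at which the squared bias falls to or below the variance). The paper's own definitions may differ in detail, and the function names and the example signal are hypothetical.

```python
import numpy as np


def tsvd_risk_decomposition(lam, mu, delta):
    """Squared bias and variance of truncated SVD for m = 0, ..., D.

    Assumes the model Y_i = lam_i * mu_i + delta * eps_i and the estimator
    that keeps the first m coefficients Y_i / lam_i and sets the rest to 0.
    """
    lam = np.asarray(lam, dtype=float)
    mu = np.asarray(mu, dtype=float)
    D = lam.size

    def tail(v):
        # tail(v)[m] = sum of v over the coefficients not yet included
        return np.concatenate((np.cumsum(v[::-1])[::-1], [0.0]))

    def head(v):
        # head(v)[m] = sum of v over the first m coefficients
        return np.concatenate(([0.0], np.cumsum(v)))

    return {
        "strong_bias2": tail(mu ** 2),
        "strong_var": delta ** 2 * head(lam ** -2.0),
        "weak_bias2": tail((lam * mu) ** 2),
        "weak_var": delta ** 2 * np.arange(D + 1),
    }


def balanced_oracle(bias2, var):
    """First index m with squared bias <= variance (exists since bias2[D] = 0)."""
    return int(np.argmax(bias2 <= var))


# Illustrative signal and singular values (assumed, not from the paper).
idx = np.arange(1, 201, dtype=float)
risk = tsvd_risk_decomposition(lam=idx ** -0.5, mu=idx ** -1.0, delta=0.01)
print("weak balanced oracle:",
      balanced_oracle(risk["weak_bias2"], risk["weak_var"]))
print("strong balanced oracle:",
      balanced_oracle(risk["strong_bias2"], risk["strong_var"]))
```

Since the squared bias decreases and the variance increases in $m$, the balanced oracle marks the crossing point of the two curves shown in a risk-decomposition plot.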

Theorems & Definitions (2)

  • Remark 2.1: Numerical implementation
  • Remark 2.2