Automatic model training under restrictive time constraints

Lukas Cironis, Jan Palczewski, Georgios Aivaliotis

TL;DR

AutoBCT tackles hyperparameter tuning under strict time constraints by formulating it as a stochastic control problem with partial information. It represents the unknown score and cost mappings as linear surrogates H(u)=α^Tφ(u) and T(u)=β^Tψ(u) with Gaussian priors updated via Kalman filtering, and solves a dynamic programming problem with a precomputed value function V that guides stopping and next-hyperparameter decisions. The framework is validated on synthetic tasks, CNN training, and large-scale datasets, showing efficient stopping and competitive accuracy within budget; VI/Map-based implementations generally outperform on-the-fly variants when precomputed maps are available. The approach enables meta-learning and reuse of value-function maps across problems, contributing a principled, eco-efficient AutoML paradigm grounded in stochastic control and regression Monte Carlo methods.
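The Gaussian update behind a linear surrogate such as H(u)=α^Tφ(u) can be illustrated with a minimal sketch. It assumes a scalar hyperparameter u, a polynomial basis for φ, and a known observation noise variance. The names `features`, `kalman_update`, `true_alpha` and `noise_var` are hypothetical and not taken from the paper. With a static coefficient vector, the Kalman filter step reduces to recursive Bayesian linear regression.

```python
import numpy as np

def features(u, degree=2):
    """Hypothetical polynomial basis phi(u) = (1, u, u^2, ...) for scalar u."""
    return np.array([u ** k for k in range(degree + 1)], dtype=float)

def kalman_update(mean, cov, phi, y, noise_var):
    """Update the Gaussian belief N(mean, cov) on the coefficients after
    observing y = alpha^T phi + noise, with noise ~ N(0, noise_var)."""
    innovation = y - phi @ mean               # prediction error
    s = phi @ cov @ phi + noise_var           # innovation variance (scalar)
    gain = cov @ phi / s                      # Kalman gain
    new_mean = mean + gain * innovation
    new_cov = cov - np.outer(gain, phi @ cov)
    return new_mean, new_cov

# Assumed ground truth, used only to simulate noisy score observations.
rng = np.random.default_rng(0)
true_alpha = np.array([0.5, 0.8, -0.6])
noise_var = 0.01

mean, cov = np.zeros(3), np.eye(3)            # Gaussian prior on alpha
for u in rng.uniform(0.0, 1.0, size=50):
    phi = features(u)
    y = true_alpha @ phi + rng.normal(0.0, np.sqrt(noise_var))
    mean, cov = kalman_update(mean, cov, phi, y, noise_var)

# Posterior mean and standard deviation of the surrogate at u = 0.5.
phi_q = features(0.5)
pred_mean = phi_q @ mean
pred_std = np.sqrt(phi_q @ cov @ phi_q)
```

The same update would apply to the cost surrogate T(u)=β^Tψ(u) with its own basis and noise level. The posterior covariance is what quantifies the remaining uncertainty about the learnt mappings.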

Abstract

We develop a hyperparameter optimisation algorithm, Automated Budget Constrained Training (AutoBCT), which balances the quality of a model with the computational cost required to tune it. The relationship between hyperparameters, model quality and computational cost must be learnt, and this learning is incorporated directly into the optimisation problem. At each training epoch, the algorithm decides whether to terminate or continue training, and, in the latter case, what values of hyperparameters to use. This decision optimally weighs potential improvements in quality against the additional training time and the uncertainty about the learnt quantities. The performance of our algorithm is verified on a number of machine learning problems encompassing random forests and neural networks. Our approach is rooted in the theory of Markov decision processes with partial information, and we develop a numerical method to compute the value function and an optimal strategy.
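The stop-or-continue decision described above can be illustrated with a deliberately simplified, one-step heuristic. This is not the paper's method, which relies on a value function obtained by dynamic programming. The function `myopic_decision`, the cost weight `gamma` and the optimism weight `kappa` are hypothetical names introduced only for this sketch. It assumes posterior means and standard deviations of the score, and posterior means of the cost, are available on a finite grid of candidate hyperparameters.

```python
import numpy as np

def myopic_decision(score_mean, score_std, cost_mean, best_score,
                    budget_left, gamma=0.1, kappa=1.0):
    """Return the index of the next candidate to train, or None to stop.

    Simplified stand-in for a value-function policy: each candidate is
    scored by an optimistic improvement over the best score so far,
    penalised by its expected training cost.
    """
    score_mean = np.asarray(score_mean, dtype=float)
    score_std = np.asarray(score_std, dtype=float)
    cost_mean = np.asarray(cost_mean, dtype=float)

    # Optimistic improvement over the incumbent minus the cost penalty.
    gain = score_mean + kappa * score_std - best_score - gamma * cost_mean
    # Candidates whose expected cost exceeds the remaining budget are excluded.
    gain[cost_mean > budget_left] = -np.inf

    best = int(np.argmax(gain))
    # Stop when no admissible candidate has a positive net gain.
    return best if gain[best] > 0 else None

# Two candidates: the second promises a higher score at a higher cost.
choice = myopic_decision(
    score_mean=[0.7, 0.8], score_std=[0.05, 0.05], cost_mean=[1.0, 1.5],
    best_score=0.6, budget_left=10.0,
)
```

A one-step rule like this ignores how an observation improves later decisions, which is the gap a precomputed value function is meant to close.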

Paper Structure

This paper contains 44 sections, 6 theorems, 67 equations, 6 figures, 12 tables, 5 algorithms.

Key Result

Theorem 2.2

Optimisation problem eqn:3 is equivalent to

Figures (6)

  • Figure 1: Function to be predicted in the synthetic example of Section \ref{subsec:syntetic}.
  • Figure 3: One run for the synthetic classification example using AutoBCT (VI/Map $N=2$, $\epsilon=0$).
  • Figure 4: CNN architecture, where the input is a $50\times 50$ tissue image with $3$ colour channels, and the scaling parameter $s=1$ affects the number of filters in the convolution layers and the sizes of the flattened, densely connected arrays. Detailed information about the CNN architecture can be found in Appendix \ref{app:breastcancerarch}.
  • Figure 5: Visualisation of a final posterior distribution of a 1D CNN batch example.
  • Figure 6: Prior and posterior distributions for 1D CNN r example.
  • ...and 1 more figure

Theorems & Definitions (15)

  • Remark 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Definition 3.1: Truth
  • Lemma B.1
  • proof
  • Lemma B.2
  • proof
  • Lemma B.3
  • proof
  • ...and 5 more