Automated Computational Energy Minimization of ML Algorithms using Constrained Bayesian Optimization
Pallavi Mitra, Felix Biessmann
TL;DR
The paper addresses the rising energy cost of training machine learning models by proposing energy-aware hyperparameter optimization. It applies constrained Bayesian optimization (CBO) to minimize $\tau(x)$ subject to a performance constraint on $c(x)$ (classification) or $cr(x)$ (regression), using log-scale surrogates modeled as Gaussian processes with a Matérn 5/2 kernel. The approach uses a joint acquisition function that combines Expected Improvement (EI) for the objective with Probability of Feasibility (PoF) for the constraint, and it is compared against unconstrained BO with a quadratic penalty. Across regression and classification tasks, CBO reduces wall-clock runtime while maintaining the required performance threshold, which demonstrates its practical utility for energy-efficient training and suggests modeling energy-performance interactions as an avenue for future work.
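To make the joint acquisition concrete, here is a minimal sketch of one common way to combine EI and PoF: multiplying them, as in standard constrained expected improvement. The product form, the function names, and the example numbers are assumptions made for illustration; the summary above does not state exactly how the two terms are combined. The inputs stand for the Gaussian process posterior mean and standard deviation at a candidate configuration, which would be on a log scale in the setup described above.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """EI for minimization: expected amount by which the objective
    falls below `best`, the lowest feasible value observed so far."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * norm_cdf(z) + sigma * norm_pdf(z)


def probability_of_feasibility(mu_c, sigma_c, threshold):
    """Posterior probability that the performance metric is at or above `threshold`."""
    if sigma_c <= 0.0:
        return 1.0 if mu_c >= threshold else 0.0
    return norm_cdf((mu_c - threshold) / sigma_c)


def constrained_ei(mu_tau, sigma_tau, best_tau, mu_c, sigma_c, threshold):
    """Assumed joint acquisition: EI on the objective times PoF on the constraint."""
    ei = expected_improvement(mu_tau, sigma_tau, best_tau)
    pof = probability_of_feasibility(mu_c, sigma_c, threshold)
    return ei * pof


# Hypothetical posterior summaries for two candidates:
# (mu_tau, sigma_tau, mu_c, sigma_c)
candidates = {
    "fast_but_risky": (1.0, 0.5, 0.85, 0.05),
    "slower_but_safe": (1.8, 0.5, 0.95, 0.02),
}
scores = {
    name: constrained_ei(m_t, s_t, 2.0, m_c, s_c, threshold=0.90)
    for name, (m_t, s_t, m_c, s_c) in candidates.items()
}
next_candidate = max(scores, key=scores.get)
```

Under this product form, a configuration that promises a large reduction in the objective is still down-weighted when the constraint surrogate considers it unlikely to reach the performance threshold. The quadratic-penalty baseline, by contrast, folds the constraint violation into a single unconstrained objective.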
Abstract
Bayesian optimization (BO) is an efficient framework for optimizing black-box objectives when function evaluations are costly and gradient information is not easily accessible. BO has been successfully applied to automate hyperparameter optimization (HPO) of machine learning (ML) models, with the primary objective of optimizing predictive performance on held-out data. In recent years, however, with ever-growing model sizes, the energy cost associated with model training has become an important factor for ML applications. Here we evaluate Constrained Bayesian Optimization (CBO) with the primary objective of minimizing energy consumption, subject to the constraint that the generalization performance is above some threshold. We evaluate our approach on regression and classification tasks and demonstrate that CBO achieves lower energy consumption without compromising the predictive performance of ML models.
