Lift What You Can: Green Online Learning with Heterogeneous Ensembles

Kirsten Köbschall; Sebastian Buschjäger; Raphael Fischer; Lisa Hartung; Stefan Kramer

TL;DR

The paper tackles sustainable online learning by enabling resource-aware training of heterogeneous online ensembles (HEROS) on data streams. It formalizes training under a resource budget as a Markov decision process and introduces the $\zeta$-policy, a greedy, threshold-based method that prioritizes low-cost models within a near-optimal performance band, with theoretical guarantees and empirical validation on 11 datasets. The theoretical analysis shows that for small $\zeta$, the $\zeta$-policy achieves higher asymptotic performance and lower resource costs than baseline methods, while remaining within $\zeta$ of the best possible performance. In the experiments, the $\zeta$-policy often yields higher AUROC and substantially lower energy consumption than alternatives, and it adapts effectively to concept drift, offering a practical approach to green online learning. Overall, the framework provides a principled, tunable balance between predictive quality and environmental impact in streaming scenarios.

Abstract

Ensemble methods for stream mining necessitate managing multiple models and updating them as data distributions evolve. Despite calls for more sustainability, established methods focus heavily on predictive capabilities and give too little consideration to ensemble members' computational expenses. To address these challenges and enable green online learning, we propose heterogeneous online ensembles (HEROS). At every training step, HEROS chooses, under resource constraints, a subset of models to train from a pool of models initialized with diverse hyperparameter choices. We introduce a Markov decision process to theoretically capture the trade-offs between predictive performance and sustainability constraints. Based on this framework, we present different policies for choosing which models to train on incoming data. Most notably, we propose the novel $\zeta$-policy, which focuses on training near-optimal models at reduced costs. Using a stochastic model, we theoretically prove that our $\zeta$-policy achieves near-optimal performance while using fewer resources than the best-performing policy. In our experiments across 11 benchmark datasets, we find empirical evidence that our $\zeta$-policy is a strong contribution to the state of the art, demonstrating highly accurate performance, in some cases even outperforming competitors, while being much more resource-friendly.
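To make the idea of a threshold-based selection concrete, the following is a minimal Python sketch of one way such a rule could look. It is not the paper's definition of the $\zeta$-policy: the function name `zeta_select`, the per-model `performance` and `cost` estimates, and the hard `budget` cutoff are all assumptions introduced for illustration. The sketch keeps only models whose estimated performance lies within `zeta` of the current best, then greedily picks the cheapest of those until the budget is spent.

```python
from __future__ import annotations

from typing import Sequence


def zeta_select(
    performance: Sequence[float],
    cost: Sequence[float],
    zeta: float,
    budget: float,
) -> list[int]:
    """Pick indices of models to train in one step (hypothetical helper).

    Assumptions, not taken from the paper: performance[i] is a current
    estimate (higher is better), cost[i] is the cost of one training
    step, and budget caps the total cost spent per step.
    """
    if not performance:
        return []
    best = max(performance)
    # Near-optimal band: models within zeta of the best estimate.
    candidates = [i for i, p in enumerate(performance) if p >= best - zeta]
    # Greedy step: cheapest candidates first.
    candidates.sort(key=lambda i: cost[i])
    chosen: list[int] = []
    spent = 0.0
    for i in candidates:
        if spent + cost[i] <= budget:
            chosen.append(i)
            spent += cost[i]
    return chosen


# Illustrative numbers only: four models with made-up estimates.
perf = [0.90, 0.88, 0.70, 0.89]
costs = [5.0, 1.0, 0.5, 2.0]
selected = zeta_select(perf, costs, zeta=0.03, budget=3.0)
```

With these made-up numbers, the most accurate model (index 0) is skipped because it is too expensive for the budget, and the two cheaper models inside the band are trained instead. Setting `zeta` to 0 narrows the band to the single best model, which is how the parameter trades predictive quality against cost in this sketch.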
