The More the Merrier? Navigating Accuracy vs. Energy Efficiency Design Trade-Offs in Ensemble Learning Systems

Rafiullah Omar; Justus Bogner; Henry Muccini; Patricia Lago; Silverio Martínez-Fernández; Xavier Franch

TL;DR

This work investigates the trade-off between accuracy and energy consumption in ensemble learning from a Green AI perspective. In a controlled full factorial experiment covering ensembles built from four classification algorithms, four datasets, two fusion methods, and two partitioning methods, the study measures both F1-score and energy consumption in joules (J) for training and inference. Larger ensembles consume substantially more energy without a meaningful F1 improvement, while majority voting outperforms meta-model fusion in both accuracy and energy efficiency. Subset-based training combined with small ensembles and majority voting saves energy with negligible accuracy loss, which gives practical guidance for sustainable ML deployments.

Abstract

Background: Machine learning (ML) model composition is a popular technique to mitigate shortcomings of a single ML model and to design more effective ML-enabled systems. While ensemble learning, i.e., forwarding the same request to several models and fusing their predictions, has been studied extensively for accuracy, we have insufficient knowledge about how to design energy-efficient ensembles.

Objective: We therefore analyzed three types of design decisions for ensemble learning regarding a potential trade-off between accuracy and energy consumption: a) ensemble size, i.e., the number of models in the ensemble, b) fusion methods (majority voting vs. a meta-model), and c) partitioning methods (whole-dataset vs. subset-based training).

Methods: By combining four popular ML algorithms for classification in different ensembles, we conducted a full factorial experiment with 11 ensembles × 4 datasets × 2 fusion methods × 2 partitioning methods (176 combinations). For each combination, we measured accuracy (F1-score) and energy consumption in J (for both training and inference).

Results: While a larger ensemble size significantly increased energy consumption (size 2 ensembles consumed 37.49% less energy than size 3 ensembles, which in turn consumed 26.96% less energy than the size 4 ensembles), it did not significantly increase accuracy. Furthermore, majority voting outperformed meta-model fusion both in terms of accuracy (Cohen's d of 0.38) and energy consumption (Cohen's d of 0.92). Lastly, subset-based training led to significantly lower energy consumption (Cohen's d of 0.91), while training on the whole dataset did not increase accuracy significantly.

Conclusions: From a Green AI perspective, we recommend designing ensembles of small size (2 or maximum 3 models), using subset-based training, majority voting, and energy-efficient ML algorithms like decision trees, Naive Bayes, or KNN.
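The counts reported in the Methods can be checked with a short sketch. It rests on an assumption that the abstract does not state outright: the 11 ensembles are taken to be all combinations of 2, 3, or 4 distinct algorithms drawn from the four (6 + 4 + 1 = 11). The algorithm and dataset labels are placeholders, and the last step assumes that the two reported percentages are relative reductions that can be chained.

```python
from itertools import combinations, product

# Placeholder labels: the abstract does not enumerate the four algorithms
# or the four datasets, so these names are purely illustrative.
ALGORITHMS = ["algo_1", "algo_2", "algo_3", "algo_4"]
DATASETS = ["dataset_1", "dataset_2", "dataset_3", "dataset_4"]
FUSION_METHODS = ["majority_voting", "meta_model"]
PARTITIONING_METHODS = ["whole_dataset", "subset_based"]

# Assumption: an ensemble is any combination of 2, 3, or 4 distinct algorithms.
ensembles = [
    combo
    for size in (2, 3, 4)
    for combo in combinations(ALGORITHMS, size)
]

# Full factorial design: every ensemble is crossed with every dataset,
# fusion method, and partitioning method.
design = list(product(ensembles, DATASETS, FUSION_METHODS, PARTITIONING_METHODS))

# Chain the two reported relative reductions (assumed to be multiplicative):
# size 2 uses 37.49% less than size 3, size 3 uses 26.96% less than size 4.
size2_vs_size3 = 1 - 0.3749
size3_vs_size4 = 1 - 0.2696
size2_vs_size4 = size2_vs_size3 * size3_vs_size4

print(len(ensembles), len(design))  # 11 176
print(f"{size2_vs_size4:.4f}")      # 0.4566
```

Under these assumptions, a size 2 ensemble would consume roughly 54% less energy than a size 4 ensemble. This is a derived figure, not one reported in the abstract.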

Paper Structure (22 sections, 9 figures, 3 tables)

This paper contains 22 sections, 9 figures, 3 tables.

Introduction
Background and Related Work
Energy Efficiency of ML-Enabled Systems
Ensemble Learning
Related Work
Experiment Design
Objective and Research Questions
ML Algorithms
Datasets
Fusion Methods
Dataset Partitioning Methods
Experiment Variables
Experiment Execution
Data Analysis
Results
...and 7 more sections

Figures (9)

Figure 1: Example of an ensemble of doctors, taken from kunapuli2023ensemble
Figure 2: Basic architecture of ensemble learning, taken from dietterich2000ensemble
Figure 3: Energy consumption of ensembles of different sizes
Figure 4: Accuracy of ensembles of different sizes
Figure 5: Energy consumption of meta-model fusion vs. majority voting fusion
...and 4 more figures
