Towards a Systematic Approach to Design New Ensemble Learning Algorithms

João Mendes-Moreira; Tiago Mendes-Neves

TL;DR

This paper addresses the challenge of designing effective ensemble learning algorithms by reexamining the ensemble error decomposition and proposing SA2DELA, a two-level framework that uses the bias-variance-diversity decomposition to guide the pairing of seven generation strategies for neural-network ensembles in regression. It introduces 21 new ensemble algorithms derived from the 7 strategies and shows, via Level-0 and Level-1 experiments on OpenML CTR23 datasets, that snapshot-based aggregations (especially snapshot with dropout or stacking) achieve strong predictive performance, as assessed by the Friedman test with the Conover post-hoc test. The study contributes a concrete, data-driven process for constructing ensembles and a suite of competitive algorithms, illustrating that the ensemble error decomposition can meaningfully inform algorithm design. The framework is extensible to other base learners and tasks, offering a replicable pathway for the systematic development of ensemble methods in regression and beyond.

Abstract

Ensemble learning has been a focal point of machine learning research due to its potential to improve predictive performance. This study revisits the foundational work on ensemble error decomposition, which since the 1990s has been confined to the bias-variance-covariance analysis for regression problems. Recent advancements introduced a "unified theory of diversity," which proposes a bias-variance-diversity decomposition framework. Leveraging this contemporary understanding, our research systematically explores how this decomposition can guide the creation of new ensemble learning algorithms. Focusing on regression tasks, we employ neural networks as base learners to investigate the practical implications of this theoretical framework. We take 7 simple ensemble methods for neural networks, which we call strategies, and use them to generate 21 new ensemble algorithms. Among these, most of the methods aggregated with the snapshot strategy, one of the 7 strategies, show superior predictive performance across diverse datasets according to the Friedman rank test with the Conover post-hoc test. Our systematic design approach contributes a suite of effective new algorithms and establishes a structured pathway for future ensemble learning algorithm development.
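As a minimal illustration of the diversity term the abstract refers to, the sketch below checks the classic ambiguity identity for squared loss with a simple-average ensemble: on a single example, the ensemble error equals the mean member error minus the spread of the members around the ensemble prediction. This is not the paper's code. The function name `ambiguity_decomposition` and the synthetic predictions are assumptions made for illustration, and the full bias-variance-diversity decomposition additionally takes expectations over training sets, which this sketch does not do.

```python
import random


def ambiguity_decomposition(member_preds, target):
    """Return (ensemble_error, mean_member_error, diversity) for one example.

    Assumes squared loss and a simple-average ensemble (hypothetical helper,
    not taken from the paper).
    """
    m = len(member_preds)
    ensemble_pred = sum(member_preds) / m
    # Squared error of the averaged prediction.
    ensemble_error = (ensemble_pred - target) ** 2
    # Average squared error of the individual members.
    mean_member_error = sum((p - target) ** 2 for p in member_preds) / m
    # Spread of the members around the ensemble prediction.
    diversity = sum((p - ensemble_pred) ** 2 for p in member_preds) / m
    return ensemble_error, mean_member_error, diversity


# Synthetic member predictions: 7 noisy, slightly biased estimates of one target.
random.seed(0)
target = 3.0
preds = [target + random.gauss(0.5, 1.0) for _ in range(7)]

ens_err, avg_err, div = ambiguity_decomposition(preds, target)
print(f"ensemble error     : {ens_err:.6f}")
print(f"mean member error  : {avg_err:.6f}")
print(f"diversity          : {div:.6f}")
print(f"mean error - div   : {avg_err - div:.6f}")
```

Because the diversity term is non-negative, the averaged ensemble is never worse than the average member under squared loss, which is why a design process can reason about trading member accuracy against diversity.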

Paper Structure (19 sections, 5 figures, 2 tables)

Introduction
Related Work
Ensemble error's decomposition
Neural network ensembles
Combining strategies to generate ensembles
SA2DELA: A systematic approach to design ensemble learning algorithms
Level-0 Experiments
Experimental setup
Data Acquisition and Preprocessing
Hyperparameter Optimization
Ensemble Testing
Strategies to generate the ensemble models
Results and discussion
Level-1 Experiments
Experimental Setup
...and 4 more sections

Figures (5)

Figure 1: Level-0 results for the neural network ensembles. Besides the 7 strategies described, an ensemble that uses the simple average as its integration method and a single neural network were used as baselines.
Figure 2: Level-1 results showing the 21 new ensemble methods plus the simple average ensemble and the single neural network model.
Figure 3: Ensemble size sensitivity for the dropout-snapshot aggregated method, ordered by increasing expected risk.
Figure 4: The Friedman-Conover post-hoc test for level-0 experiments.
Figure 5: The Friedman-Conover post-hoc test for level-1 experiments.
