Enhancing Time Series Classification with Diversity-Driven Neural Network Ensembles
Javidan Abdullayev, Maxime Devanne, Cyril Meyer, Ali Ismail-Fawaz, Jonathan Weber, Germain Forestier
TL;DR
The paper addresses redundancy in homogeneous neural network ensembles for time series classification (TSC) by introducing a diversity-driven framework that enforces feature diversity through a feature orthogonality (FO) loss. By training ensemble members sequentially and applying the FO loss to high-level features, the approach yields complementary representations without increasing model complexity, achieving state-of-the-art performance on the UCR archive with fewer models. Key contributions include the formulation of a feature-space orthogonality loss, a sequential training protocol, and extensive analysis showing improved feature diversity via FID and filter visualizations. The work offers a practical, scalable method for more efficient deep ensembles in TSC, with potential extensions to multivariate data and alternative diversity strategies.
Abstract
Ensemble methods have played a crucial role in achieving state-of-the-art (SOTA) performance across various machine learning tasks by leveraging the diversity of features learned by individual models. In Time Series Classification (TSC), ensembles have proven highly effective, whether based on neural networks (NNs) or traditional methods like HIVE-COTE. However, most existing NN-based ensemble methods for TSC train multiple models with identical architectures and configurations. These ensembles aggregate predictions without explicitly promoting diversity, which often leads to redundant feature representations and limits the benefits of ensembling. In this work, we introduce a diversity-driven ensemble learning framework that explicitly encourages feature diversity among neural network ensemble members. Our approach employs a decorrelated learning strategy using a feature orthogonality loss applied directly to the learned feature representations. This ensures that each model in the ensemble captures complementary rather than redundant information. We evaluate our framework on 128 datasets from the UCR archive and show that it achieves SOTA performance with fewer models. This makes our method both efficient and scalable compared to conventional NN-based ensemble approaches.
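
The abstract does not give the exact form of the feature orthogonality loss, so the following is only a minimal sketch of one plausible reading: the features of the model currently being trained are penalized for aligning with the features of previously trained, frozen ensemble members. The function name `feature_orthogonality_loss`, the use of squared cosine similarity, and the averaging over members are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def feature_orthogonality_loss(current, previous, eps=1e-8):
    """Hypothetical FO penalty: mean squared cosine similarity.

    current:  (batch, dim) features of the model being trained.
    previous: list of (batch, dim) feature arrays from already-trained,
              frozen ensemble members (empty for the first model).
    Returns 0.0 when features are orthogonal, close to 1.0 when aligned.
    """
    if not previous:
        # The first model in the sequence has nothing to be diverse from.
        return 0.0
    # L2-normalize each sample's feature vector.
    cur = current / (np.linalg.norm(current, axis=1, keepdims=True) + eps)
    penalties = []
    for feats in previous:
        prev = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
        cos = np.sum(cur * prev, axis=1)   # per-sample cosine similarity
        penalties.append(np.mean(cos ** 2))  # squared so the sign is ignored
    return float(np.mean(penalties))
```

In a sequential training loop, such a penalty would typically be added to the classification loss with a weighting factor (a hypothetical hyperparameter here), and the final ensemble prediction would still aggregate the members' outputs as in a conventional ensemble.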
