SAE: Single Architecture Ensemble Neural Networks
Martin Ferianc, Hongxiang Fan, Miguel Rodrigues
TL;DR
The paper addresses hardware-efficient neural network ensembles by proposing SAE, a framework that automatically searches over early exit (EE) and multi-input multi-output (MIMO) configurations within a single architecture. SAE combines a scalable search space that generalizes EE, MIMO, MIMMO and their in-between configurations with an optimization objective based on variational inference, which learns both the network weights and a distribution over exit depths for each input. In image classification and regression experiments on TinyImageNet, BloodMNIST, PneumoniaMNIST, and RetinaMNIST with diverse backbones, SAE achieves accuracy or confidence calibration competitive with baselines while reducing FLOPs or parameter counts by up to $1.5{\sim}3.7\times$. The results show that no single configuration is best across tasks and that automatic search yields diverse, task-dependent configurations, which makes SAE a practical and flexible tool for hardware-efficient architecture design.
Abstract
Ensembles of separate neural networks (NNs) have shown superior accuracy and confidence calibration over a single NN across tasks. To improve the hardware efficiency of such ensembles, recent methods create ensembles within a single network by adding early exits or by using multi-input multi-output approaches. However, it is unclear which of these methods is the most effective for a given task, so each method currently requires a manual and separate search. Our novel Single Architecture Ensemble (SAE) framework enables an automatic and joint search through the early exit and multi-input multi-output configurations and their previously unobserved in-between combinations. SAE consists of two parts: a scalable search space that generalises the previous methods and their in-between configurations, and an optimisation objective that allows learning the optimal configuration for a given task. Our image classification and regression experiments show that with SAE we can automatically find diverse configurations that fit the task, achieving accuracy or confidence calibration competitive with baselines while reducing the compute operations or parameter count by up to $1.5{\sim}3.7\times$.
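To make the idea of an ensemble inside a single network more concrete, the following is a minimal sketch of how predictions from several early exits might be combined using a learned distribution over exit depths. It is not the authors' implementation: the function names, the softmax parameterisation of the depth distribution, the top-k selection rule, and the toy numbers are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_exits(exit_probs, depth_logits, k):
    """Weighted average of the class probabilities of the k most probable exits.

    exit_probs:   one class-probability list per candidate exit.
    depth_logits: one hypothetical learnable score per candidate exit.
    k:            number of exits kept active (an assumed hyperparameter).
    """
    weights = softmax(depth_logits)
    # Indices of the k exits with the largest depth probability.
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    # Renormalise the kept weights so that they sum to one.
    norm = sum(weights[i] for i in top)
    n_classes = len(exit_probs[0])
    return [
        sum(weights[i] / norm * exit_probs[i][c] for i in top)
        for c in range(n_classes)
    ]


# Toy example: three candidate exits, two classes (made-up numbers).
exit_probs = [[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]
depth_logits = [0.0, 1.0, 1.0]
prediction = combine_exits(exit_probs, depth_logits, k=2)
```

In this toy setting the two deepest exits receive equal weight and the shallowest one is dropped, so the combined prediction is the plain average of the second and third exits. In a multi-input multi-output variant, one such depth distribution per input would be a natural extension, again as an assumption rather than a description of SAE itself.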
