Neural Subnetwork Ensembles
Tim Whitaker
TL;DR
This work proposes Subnetwork Ensembles, a low-cost framework for building neural network ensembles by sampling, perturbing, and optimizing subnetworks from a trained parent model. It formalizes three perturbation families—Noisy, Sparse, and Stochastic—and introduces Neural Partitioning to maximize diversity while reducing parameter overlap. Across ImageNet, CIFAR, and ProcGen benchmarks, the approach achieves consistent generalization gains while dramatically reducing training cost and parameter usage, with sparse and stochastic variants providing further robustness and scalability. The framework enables dynamic ensemble growth, leverages pre-trained models, and offers rich diversity analysis through both output metrics and interpretability-based representations, suggesting practical impact for efficient, robust ensemble learning in large-scale deep networks.
Abstract
Neural network ensembles have been effectively used to improve generalization by combining the predictions of multiple independently trained models. However, the growing scale and complexity of deep neural networks have led to these methods becoming prohibitively expensive and time consuming to implement. Low-cost ensemble methods have become increasingly important as they can alleviate the need to train multiple models from scratch while retaining the generalization benefits that traditional ensemble learning methods afford. This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles, where a collection of child networks are formed by sampling, perturbing, and optimizing subnetworks from a trained parent model. We explore several distinct methodologies for generating child networks and we evaluate their efficacy through a variety of ablation studies and established benchmarks. Our findings reveal that this approach can greatly improve training efficiency, parametric utilization, and generalization performance while minimizing computational cost. Subnetwork Ensembles offer a compelling framework for exploring how we can build better systems by leveraging the unrealized potential of deep neural networks.
