PePR: Performance Per Resource Unit as a Metric to Promote Small-Scale Deep Learning in Medical Image Analysis
Raghavendra Selvan, Bob Pepin, Christian Igel, Gabrielle Samuel, Erik B Dam
TL;DR
This work addresses the growing environmental and equity concerns of resource-intensive deep learning in medical image analysis by introducing PePR, a composite metric defined as $\mathrm{PePR}(R,P)=\frac{P}{1+R}$ that balances performance $P$ against normalized resource cost $R$. The authors evaluate 131 pretrained architectures across three datasets and find that small-scale, pretrained models often yield better performance-per-resource trade-offs, which matters most in resource-constrained settings. They also formalize PePR curves and consider variants that capture different costs (e.g., energy, memory, carbon), showing that PePR can guide resource-aware model selection and promote AI equity in healthcare. The findings support prioritizing small-scale, well-pretrained architectures to reduce compute, data, and energy requirements while maintaining useful predictive performance.
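As a minimal sketch of the score defined above, the following Python snippet computes $P/(1+R)$ for a few hypothetical models. The model names, performance values, and raw costs are made-up illustrative numbers, not results from the paper, and min-max normalization of the cost to $[0, 1]$ is an assumption, since the summary only states that $R$ is normalized.

```python
def min_max_normalize(costs):
    """Rescale raw costs to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(costs), max(costs)
    if hi == lo:
        # All models cost the same, so no model is penalized.
        return [0.0 for _ in costs]
    return [(c - lo) / (hi - lo) for c in costs]


def pepr(performance, resource):
    """Return P / (1 + R) for performance P and normalized cost R."""
    if not 0.0 <= resource <= 1.0:
        raise ValueError("resource cost must be normalized to [0, 1]")
    return performance / (1.0 + resource)


# Hypothetical models: name -> (performance, raw resource cost).
models = {
    "small": (0.90, 2.0),
    "medium": (0.92, 20.0),
    "large": (0.93, 100.0),
}

names = list(models)
normalized = min_max_normalize([models[n][1] for n in names])
scores = {n: pepr(models[n][0], r) for n, r in zip(names, normalized)}
best = max(scores, key=scores.get)

for name in names:
    print(f"{name}: PePR = {scores[name]:.3f}")
print("best trade-off:", best)
```

With $R = 0$ the score equals the performance itself, and with $R = 1$ it is halved, so under these assumptions a model at the expensive end of the range has to deliver a large performance gain to outrank a cheap one.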
Abstract
Recent advances in deep learning (DL) have been accelerated by access to large-scale data and compute. These large-scale resources have been used to train progressively larger models, which are resource intensive in terms of compute, data, energy, and carbon emissions. These costs are becoming a new type of entry barrier for researchers and practitioners with limited access to resources at such scale, particularly in the Global South. In this work, we take a comprehensive look at the landscape of existing DL models for medical image analysis tasks and demonstrate their usefulness in settings where resources are limited. To account for the resource consumption of DL models, we introduce a novel measure to estimate the performance per resource unit, which we call the PePR score. Using a diverse family of 131 unique DL architectures (spanning 1M to 130M trainable parameters) and three medical image datasets, we capture trends in the performance-resource trade-offs. In applications like medical image analysis, we argue that developing small-scale, specialized models is preferable to striving for large-scale models. Furthermore, we show that fine-tuning existing pretrained models on new data can significantly reduce the computational resources and data required compared to training models from scratch. We hope this work will encourage the community to focus on improving AI equity by developing methods and models with smaller resource footprints.
