Operating critical machine learning models in resource constrained regimes
Raghavendra Selvan, Julian Schön, Erik B Dam
TL;DR
This paper addresses the challenge of running deep learning models in resource-constrained clinical environments, where data, compute, and energy requirements are barriers to deployment. It evaluates several resource-efficiency strategies, namely automatic mixed precision (AMP), an 8-bit optimiser, half-precision weights, and gradient/activation quantisation, on the RSNA Mammography and LIDC-IDRI datasets across CNN and transformer architectures. Key findings show that AMP can reduce memory and training time without hurting performance for CNNs, that the 8-bit optimiser often improves convergence and lowers resource use, and that transformer models are more sensitive to low-precision settings. The best configurations include DenseNet with the 8-bit optimiser and Swin Transformer with the 8-bit optimiser plus half precision. The results suggest that resource-efficient techniques should be integrated into standard clinical-deployment pipelines to enable faster, greener, and more accessible medical imaging tools, though limitations remain, such as the lack of NAS exploration and of validation on real edge deployments.
Abstract
The accelerated development of machine learning methods, primarily deep learning, has driven the recent breakthroughs in medical image analysis and computer-aided intervention. The resource consumption of deep learning models, in terms of training data, compute, and energy costs, is known to be massive. These large resource costs can be barriers to deploying these models in clinics globally. To address this, there are concerted efforts within the machine learning community to introduce notions of resource efficiency, for instance, using quantisation to reduce memory consumption. While most of these methods are shown to reduce resource utilisation, they could come at a cost in performance. In this work, we probe the trade-off between resource consumption and performance, specifically for models used in critical settings such as clinics.
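To make the trade-off between memory and fidelity concrete, the following is a minimal sketch of 8-bit quantisation using a single absmax scale. It is an illustrative stand-in, not the quantisation scheme evaluated in this work; the function names and the example weights are hypothetical.

```python
from array import array


def quantise_absmax_int8(values):
    """Map floats to int8 codes with one shared absmax scale (illustrative scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = array("b", (int(round(v / scale)) for v in values))
    return codes, scale


def dequantise(codes, scale):
    """Recover approximate floats from the int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights standing in for one layer of a model (stored as float32).
weights = array("f", [0.5, -1.27, 0.031, 0.9, -0.004, 1.0])

codes, scale = quantise_absmax_int8(weights)
restored = dequantise(codes, scale)

# Memory footprint of the raw buffers before and after quantisation.
bytes_fp32 = len(weights) * weights.itemsize
bytes_int8 = len(codes) * codes.itemsize

# Worst-case reconstruction error: the price paid for the smaller buffer.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("float32 bytes:", bytes_fp32)
print("int8 bytes:", bytes_int8)
print("max abs error:", max_error)
```

In this sketch the int8 buffer is a quarter of the float32 size, while the rounding error per value is bounded by half the scale. Whether an error of that size matters for a clinical model is the kind of question the trade-off analysis has to answer empirically.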
