Double-Exponential Increases in Inference Energy: The Cost of the Race for Accuracy

Zeyu Yang; Karel Adamek; Wesley Armour

TL;DR

This paper addresses the rising energy demands of vision-model inference with a large-scale, real-hardware benchmarking study of 1,200 pretrained ImageNet models, run with both PyTorch and TensorRT on state-of-the-art GPUs. By systematically measuring energy per inference and accuracy across diverse architectures and datasets, it reveals a steep diminishing return: as energy increases by orders of magnitude, accuracy improves only modestly, which calls for a reevaluation of marginal accuracy gains. The authors introduce an energy-efficiency scoring system and an interactive web app for side-by-side comparisons, helping practitioners navigate the trade-offs between energy, throughput, and accuracy in real-world deployments. Their findings, including strong correlations of energy with FLOPs and activations and substantial energy savings from TensorRT, provide actionable guidance for sustainable AI design and deployment, and they advocate a shift toward efficiency-aware reporting and benchmarking in the community. The work closes with practical recommendations and tools that promote energy-conscious model selection and deployment, with a view toward scalable, reproducible, and environmentally responsible AI development.

Abstract

Deep learning models in computer vision have achieved significant success but pose increasing concerns about energy consumption and sustainability. Despite these concerns, there is a lack of comprehensive understanding of their energy efficiency during inference. In this study, we conduct a large-scale analysis of the inference energy consumption of 1,200 ImageNet classification models, the largest evaluation of its kind to date. Our findings reveal a steep diminishing return in accuracy gains relative to the increase in energy usage, highlighting sustainability concerns in the pursuit of marginal improvements. We identify key factors contributing to energy consumption and demonstrate methods to improve energy efficiency. To promote more sustainable AI practices, we introduce an energy efficiency scoring system and develop an interactive web application that allows users to compare models based on accuracy and energy consumption. By providing extensive empirical data and practical tools, we aim to facilitate informed decision-making and encourage collaborative efforts in developing energy-efficient AI technologies.
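
The quantity "energy per inference" can be made concrete with a minimal sketch. It assumes GPU power has been sampled in watts at known timestamps while a fixed number of inferences ran; the function name, parameters, and trapezoidal integration are illustrative assumptions, not the authors' measurement pipeline.

```python
from typing import Sequence


def energy_per_inference(
    timestamps_s: Sequence[float],
    power_w: Sequence[float],
    n_inferences: int,
) -> float:
    """Return joules per inference from sampled power readings.

    Hypothetical helper: integrates power over time with the
    trapezoidal rule, then divides by the number of inferences.
    """
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    if n_inferences <= 0:
        raise ValueError("n_inferences must be positive")

    total_joules = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Average power over the interval times its duration gives energy.
        total_joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total_joules / n_inferences
```

For example, a constant 100 W draw over 2 seconds while 100 images are classified gives 200 J in total, or 2 J per inference. Real measurements would also need to handle idle power and warm-up, which this sketch ignores.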
