Energy-Aware DNN Graph Optimization
Yu Wang, Rong Ge, Shuang Qiu
TL;DR
The paper addresses energy-aware optimization of DNN computation graphs by enabling substitutions that produce equivalent graphs and by assigning node-level algorithms to minimize a cost combining energy and time. It introduces a two-level search: an outer search over graph substitutions with a tunable parameter α and an inner search over per-node algorithm assignments with a neighborhood distance d, guided by a cost model that aggregates node-level energy and time into graph-level metrics. The cost model supports linear and multiplicative combinations of energy and time and can also optimize for power, enabling flexible energy-delay tradeoffs. Evaluations on CNNs (Inception-v3, SqueezeNet, ResNet-50) show up to 24% energy savings with negligible performance impact, validating the approach and its applicability to energy-constrained deployments. The work offers practical methods for real-time or online deployment, leveraging per-node profiling and a scalable search to navigate the expanded graph-and-algorithm space while preserving accuracy.
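As a rough illustration of the cost model summarized above, the following Python sketch aggregates per-node measurements into graph-level totals and combines them in three ways. It is not the authors' implementation. The names (`NodeCost`, `linear_cost`, `product_cost`, `average_power`), the weight `w`, the units, and the assumption that node costs simply add up under sequential execution are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class NodeCost:
    """Profiled cost of one graph node under one algorithm (hypothetical units)."""
    time_ms: float
    energy_mj: float


def graph_totals(nodes: Iterable[NodeCost]) -> Tuple[float, float]:
    """Sum node-level time and energy; assumes nodes execute one after another."""
    nodes = list(nodes)
    total_time = sum(n.time_ms for n in nodes)
    total_energy = sum(n.energy_mj for n in nodes)
    return total_time, total_energy


def linear_cost(nodes: Iterable[NodeCost], w: float = 0.5) -> float:
    """Weighted sum of energy and time; w is an assumed trade-off weight in [0, 1]."""
    t, e = graph_totals(nodes)
    return w * e + (1.0 - w) * t


def product_cost(nodes: Iterable[NodeCost]) -> float:
    """Multiplicative combination (an energy-delay product)."""
    t, e = graph_totals(nodes)
    return e * t


def average_power(nodes: Iterable[NodeCost]) -> float:
    """Average power as total energy over total time (mJ / ms = W)."""
    t, e = graph_totals(nodes)
    return e / t


# Two made-up nodes, for illustration only.
example = [NodeCost(time_ms=2.0, energy_mj=10.0), NodeCost(time_ms=3.0, energy_mj=20.0)]
print(linear_cost(example, w=0.5))  # 17.5
print(product_cost(example))        # 150.0
print(average_power(example))       # 6.0
```

In a real setting the two terms of the linear form would likely need normalization, since time and energy have different units and scales. The sketch leaves that out for brevity.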
Abstract
Unlike existing work on deep neural network (DNN) graph optimization, which targets inference performance, we explore DNN graph optimization for energy awareness and savings on power- and resource-constrained machine learning devices. We present a method that lets users minimize energy consumption or balance energy against inference performance for DNN graphs. The method efficiently searches the space of equivalent graphs and identifies a graph, together with the corresponding per-node algorithms, that incurs the least execution cost. We implement the method and evaluate it with multiple DNN models on a GPU-based machine. Results show that our method achieves significant energy savings, up to 24%, with negligible performance impact.
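The step of choosing algorithms for a fixed graph can be sketched as a small local search. The Python example below is a minimal sketch under stated assumptions, not the paper's algorithm. It interprets the neighborhood distance d as the maximum number of nodes whose algorithm may change in one move, uses an energy-delay product as the cost, and relies on a hypothetical `profiles` table with invented node names, algorithm names, and measurements.

```python
from itertools import combinations, product
from typing import Dict, Tuple

# profiles[node][algorithm] = (time, energy); all values here are hypothetical.
Profiles = Dict[str, Dict[str, Tuple[float, float]]]


def graph_cost(assignment: Dict[str, str], profiles: Profiles) -> float:
    """Energy-delay product of the whole graph, assuming additive node costs."""
    total_time = sum(profiles[n][a][0] for n, a in assignment.items())
    total_energy = sum(profiles[n][a][1] for n, a in assignment.items())
    return total_time * total_energy


def inner_search(profiles: Profiles, d: int = 1) -> Tuple[Dict[str, str], float]:
    """Hill-climb over assignments that differ from the current one in at most d nodes."""
    # Start from the first listed algorithm of every node.
    current = {node: next(iter(algos)) for node, algos in profiles.items()}
    best = graph_cost(current, profiles)
    improved = True
    while improved:
        improved = False
        for k in range(1, d + 1):
            for nodes in combinations(profiles, k):
                # Try every algorithm combination for the chosen k nodes.
                for algos in product(*(profiles[n] for n in nodes)):
                    candidate = {**current, **dict(zip(nodes, algos))}
                    cost = graph_cost(candidate, profiles)
                    if cost < best:
                        current, best, improved = candidate, cost, True
    return current, best


profiles = {
    "conv1": {"gemm": (2.0, 10.0), "winograd": (1.0, 14.0)},
    "conv2": {"gemm": (3.0, 20.0), "fft": (4.0, 12.0)},
}
assignment, cost = inner_search(profiles, d=1)
print(assignment, cost)  # {'conv1': 'winograd', 'conv2': 'fft'} 130.0
```

Because a multiplicative cost does not decompose into independent per-node choices, a neighborhood search of this kind is one plausible way to explore assignments. An outer loop over equivalent graphs would call `inner_search` once per candidate graph and keep the cheapest result.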
