Can Adjusting Hyperparameters Lead to Green Deep Learning: An Empirical Study on Correlations between Hyperparameters and Energy Consumption of Deep Learning Models

Taoran Wang, Yanhui Li, Mingliang Ma, Lin Chen, Yuming Zhou

TL;DR

Hyperparameters deserve more attention when developing DL models, because appropriately adjusting them can make the models greener.

Abstract

Context: As deep learning (DL) models evolve, they use larger datasets and more complex model structures, which drives up computing resource usage and energy consumption and signals that green DL models deserve more attention. Objective: This paper takes a novel view of DL energy consumption: the effect of hyperparameters on the energy cost of DL models. Method: We use mutation operators to simulate how practitioners adjust hyperparameters such as epochs and learning rate. We train the original and mutated models separately and collect energy information and run-time performance metrics. We also study the parallel scenario, in which multiple DL models are trained at the same time. Results: To examine the effect of hyperparameters on energy consumption, we conducted extensive experiments on five real-world DL models. The results show that (1) many of the hyperparameters studied have a positive or negative correlation with energy consumption, (2) adjusting hyperparameters can make DL models greener, i.e., reduce energy consumption without harming performance, and (3) in a parallel environment, energy consumption becomes more susceptible to change. Conclusions: We suggest that hyperparameters need more attention in developing DL models, as appropriately adjusting them can yield greener DL models.
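The mutation step described in the Method can be pictured with a minimal sketch. It is not the authors' implementation: the `Hyperparameters` class, the `mutate` helper, the default values, and the scaling factors are all hypothetical and chosen only for illustration. The abstract does not say how each operator changes a value, so simple multiplicative scaling is assumed here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hyperparameters:
    # Hypothetical baseline values; the abstract does not give the real ones.
    epochs: int = 10
    learning_rate: float = 0.01
    gamma: float = 0.7


def mutate(base: Hyperparameters, name: str, factor: float) -> Hyperparameters:
    """Return a copy of `base` with one hyperparameter scaled by `factor`.

    Assumption: a mutation is a multiplicative change to a single field.
    Integer fields (e.g. epochs) are rounded and kept at 1 or above.
    """
    old = getattr(base, name)
    new = old * factor
    if isinstance(old, int):
        new = max(1, round(new))
    return replace(base, **{name: new})


base = Hyperparameters()
# One mutated configuration per (hyperparameter, factor) pair; each would be
# trained separately from the original so energy and accuracy can be compared.
mutants = [
    mutate(base, "epochs", 0.5),
    mutate(base, "learning_rate", 2.0),
    mutate(base, "gamma", 0.5),
]
```

Each mutated configuration differs from the original in exactly one field, so any change in measured energy or accuracy can be attributed to that hyperparameter.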

Paper Structure (33 sections, 11 figures, 18 tables)

Figures (11)

  • Figure 1: An example of how the learning rate can lead to a greener network training phase. The left plot shows the total GPU energy consumption of the training phase; the right plot shows the accuracy of the model on the test dataset. The example is a Siamese network based on ResNet20, trained on the MNIST dataset with different learning rates. The red line shows the medians, and the green line shows the averages.
  • Figure 2: The flowchart of our approach.
  • Figure 3: Trade-off between energy consumption and performance when mutating epochs. The number in each grid cell is how many mutated models fall into that "win-tie-loss" combination of energy consumption and network performance. For example, the 1 in panel (a) means that only one model "wins" in package energy consumption and "loses" in network performance.
  • Figure 4: Trade-off between energy consumption and performance when mutating learning rate.
  • Figure 5: Trade-off between energy consumption and performance when mutating gamma.
  • ...and 6 more figures
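
The "win-tie-loss" grids described in the Figure 3 caption can be illustrated with a small tally. This is a sketch under stated assumptions, not the paper's procedure: the `outcome` and `tally` helpers, the 1% relative tolerance used to declare a tie, and the sample numbers are all hypothetical. The paper may well decide wins and losses differently, for example with a statistical test.

```python
from collections import Counter


def outcome(mutant: float, baseline: float, lower_is_better: bool, tol: float = 0.01) -> str:
    """Classify a mutant metric against the baseline as win, tie, or loss.

    Assumption: values within a relative tolerance `tol` count as a tie.
    """
    diff = (mutant - baseline) / baseline
    if abs(diff) <= tol:
        return "tie"
    improved = diff < 0 if lower_is_better else diff > 0
    return "win" if improved else "loss"


def tally(baseline, mutants, tol: float = 0.01) -> Counter:
    """Count mutants per (energy outcome, performance outcome) grid cell."""
    base_energy, base_accuracy = baseline
    grid = Counter()
    for energy, accuracy in mutants:
        cell = (
            outcome(energy, base_energy, lower_is_better=True, tol=tol),
            outcome(accuracy, base_accuracy, lower_is_better=False, tol=tol),
        )
        grid[cell] += 1
    return grid


# Made-up (energy, accuracy) pairs, purely for illustration.
baseline = (100.0, 0.90)
mutants = [(80.0, 0.90), (120.0, 0.95), (100.5, 0.80)]
grid = tally(baseline, mutants)
```

A cell such as `("win", "loss")` then holds the number of mutated models that used less energy but performed worse, which is the kind of count the caption's example refers to.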