How green is continual learning, really? Analyzing the energy consumption in continual training of vision foundation models

Tomaso Trinci, Simone Magistri, Roberto Verdecchia, Andrew D. Bagdanov

TL;DR

An extensive set of empirical experiments compares the energy consumption of recent representation-, prompt-, and exemplar-based continual learning algorithms and two standard baselines (fine-tuning and joint training) when they are used to continually adapt a pre-trained ViT-B/16 foundation model, with the goal of gaining a systematic understanding of the energy efficiency of continual learning algorithms.

Abstract

With the ever-growing adoption of AI, its impact on the environment is no longer negligible. Despite the potential that continual learning could have towards Green AI, its environmental sustainability remains relatively uncharted. In this work we aim to gain a systematic understanding of the energy efficiency of continual learning algorithms. To that end, we conducted an extensive set of empirical experiments comparing the energy consumption of recent representation-, prompt-, and exemplar-based continual learning algorithms and two standard baselines (fine-tuning and joint training) when used to continually adapt a pre-trained ViT-B/16 foundation model. We performed our experiments on three standard datasets: CIFAR-100, ImageNet-R, and DomainNet. Additionally, we propose a novel metric, the Energy NetScore, which we use to measure the efficiency of an algorithm in terms of its energy-accuracy trade-off. Through numerous evaluations varying the number and size of the incremental learning steps, our experiments demonstrate that different types of continual learning algorithms have very different impacts on energy consumption during both training and inference. Although often overlooked in the continual learning literature, we found that the energy consumed during the inference phase is crucial for evaluating the environmental sustainability of continual learning models.
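To make the idea of an energy-accuracy trade-off score concrete, the sketch below computes a NetScore-style quantity. The abstract does not give the formula of the Energy NetScore, so the form used here, $20\log_{10}(A^{\alpha}/E^{\beta})$, is an assumption patterned on the original NetScore metric, with energy standing in for its parameter and operation counts. The function name `energy_netscore`, the default exponents, and the units (accuracy in percent, energy in kWh) are all hypothetical choices for illustration, and the numbers in the usage lines are made up rather than taken from the paper.

```python
import math


def energy_netscore(accuracy_pct: float, energy_kwh: float,
                    alpha: float = 2.0, beta: float = 0.5) -> float:
    """Assumed NetScore-style score: 20 * log10(accuracy^alpha / energy^beta).

    This is an illustrative form, not the paper's confirmed definition.
    Higher accuracy raises the score; higher energy lowers it.
    """
    if accuracy_pct <= 0 or energy_kwh <= 0:
        raise ValueError("accuracy and energy must be positive")
    # Work in log space: log10(a^alpha / e^beta) = alpha*log10(a) - beta*log10(e)
    return 20.0 * (alpha * math.log10(accuracy_pct)
                   - beta * math.log10(energy_kwh))


# Hypothetical values: two methods with equal accuracy, different energy.
frugal = energy_netscore(accuracy_pct=80.0, energy_kwh=1.0)
costly = energy_netscore(accuracy_pct=80.0, energy_kwh=4.0)
print(frugal > costly)  # the cheaper method scores higher under this form
```

Under this assumed form, the exponents control how strongly accuracy is rewarded relative to how strongly energy is penalized, so a method that is slightly less accurate but much cheaper to train can still come out ahead.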

Paper Structure (12 sections, 3 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: In continual learning, a model $\mathcal{M}_k$ (in green) learns and adapts using only current data without forgetting previous information. Conversely, the joint incremental training strategy (in red) uses both previous and current data, leading to comprehensive learning but higher computational and storage costs. In this work we aim to understand how the energy consumption of model $\mathcal{M}_k$ is affected when it is trained following different CL approaches, and how these approaches compare to joint incremental training.
  • Figure 2: Overview of our experimental methodology. PILOT [sun2023pilot] is the framework that implements the CL approaches and measures accuracy over the incremental training steps, while CodeCarbon [benoit_courty_2024_11171501] measures energy consumption during training and inference.
  • Figure 2: Results on DN4IL. $A_K$, $E_K$, and $\Omega_K$ represent the accuracy, the energy consumed, and the Energy NetScore at the end of the task sequence, respectively.
  • Figure 3: Comparison in terms of training energy consumption ($x$-axis) and accuracy after the final incremental step ($y$-axis) across benchmarks and task sequence lengths.
  • Figure 4: Trainable parameters versus energy consumption. SimpleCIL has zero trainable parameters, while for RanPAC we report the number of trainable parameters for the first task.
  • ...and 4 more figures