Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations

Adrián Sánchez-Mompó, Ioannis Mavromatis, Peizheng Li, Konstantinos Katsaros, Aftab Khan

TL;DR

The paper empirically examines energy consumption across discriminative and generative AI within real-world MLOps/GenOps pipelines using software-based power measurements. It demonstrates that the energy consumption of discriminative models can be significantly reduced through architectural, hyperparameter, and hardware optimizations, while generative AI energy efficiency depends on balancing model size, reasoning complexity, and request handling; larger models are not necessarily more energy-intensive at low utilization. The study introduces metrics such as energy_per_sample and correlations between multiply-accumulate operations (MACs) and parameter counts to predict energy use, and provides practical guidelines for designing greener MLOps/GenOps workflows. By offering a replicable framework and a benchmarking perspective, it lays groundwork for energy-conscious deployment and future hardware-software co-design in AI systems.

Abstract

This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. For Discriminative models, we examine various architectures and hyperparameters during training and inference and identify energy-efficient practices. For Generative AI, Large Language Models (LLMs) are assessed, focusing primarily on energy consumption across different model sizes and varying service requests. Our study employs software-based power measurements, ensuring ease of replication across diverse configurations, models, and datasets. We analyse multiple models and hardware setups to uncover correlations among various metrics, identifying key contributors to energy consumption. The results indicate that for Discriminative models, optimising architectures, hyperparameters, and hardware can significantly reduce energy consumption without sacrificing performance. For LLMs, energy efficiency depends on balancing model size, reasoning complexity, and request-handling capacity, as larger models do not necessarily consume more energy when utilisation remains low. This analysis provides practical guidelines for designing green and sustainable ML operations, emphasising energy consumption and carbon footprint reductions while maintaining performance. This paper can serve as a benchmark for accurately estimating total energy use across different types of AI models.
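As a rough illustration of what a software-based measurement of this kind might reduce to, the sketch below integrates sampled power readings over time and normalises the result by the number of processed samples. It is not the authors' implementation: the function names, the trapezoidal integration, and the exact definition of the per-sample metric are assumptions made here for illustration, since the summary names an energy_per_sample metric without defining it.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Approximate energy (J) from power samples (W) taken at the given times (s).

    Uses the trapezoidal rule; this integration scheme is an assumption,
    not something stated in the paper summary.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power readings must have equal length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive readings.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def energy_per_sample(
    timestamps_s: Sequence[float], power_w: Sequence[float], n_samples: int
) -> float:
    """Hypothetical per-sample metric: total energy divided by samples processed."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return energy_joules(timestamps_s, power_w) / n_samples
```

In practice the power readings would come from a software interface polled at a fixed interval during training or inference; the polling source and interval are left open here because the summary does not specify them.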

Paper Structure

This paper contains 20 sections, 5 equations, 11 figures, 6 tables.

Figures (11)

  • Figure S1: ML model development and deployment phase and the associated MLOps and GenOps life cycles.
  • Figure S2: Training and inference duration (for 50k samples).
  • Figure S3: Average power usage with HC-2.
  • Figure S4: Utilisation and power consumption (considering the GPU RAM usage) - HC-1.
  • Figure S5: Loss, energy and accuracy per epoch, averaged across all models - the shaded areas show the range of values - HC-3.
  • ...and 6 more figures