Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations
Adrián Sánchez-Mompó, Ioannis Mavromatis, Peizheng Li, Konstantinos Katsaros, Aftab Khan
TL;DR
The paper empirically examines energy consumption across discriminative and generative AI within real-world MLOps/GenOps pipelines using software-based power measurements. It demonstrates that the energy consumption of discriminative models can be significantly reduced through architectural, hyperparameter, and hardware optimizations, while generative AI energy efficiency depends on balancing model size, reasoning complexity, and request handling, and larger models are not necessarily more energy-intensive at low utilization. The study introduces metrics such as energy_per_sample and MACs-to-parameters correlations to predict energy use and provides practical guidelines for designing greener MLOps/GenOps workflows. By offering a replicable framework and a benchmarking perspective, it lays groundwork for energy-conscious deployment and future hardware-software co-design in AI systems.
Abstract
This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. For Discriminative models, we examine various architectures and hyperparameters during training and inference and identify energy-efficient practices. For Generative AI, Large Language Models (LLMs) are assessed, focusing primarily on energy consumption across different model sizes and varying service requests. Our study employs software-based power measurements, ensuring ease of replication across diverse configurations, models, and datasets. We analyse multiple models and hardware setups to uncover correlations among various metrics, identifying key contributors to energy consumption. The results indicate that for Discriminative models, optimising architectures, hyperparameters, and hardware can significantly reduce energy consumption without sacrificing performance. For LLMs, energy efficiency depends on balancing model size, reasoning complexity, and request-handling capacity, as larger models do not necessarily consume more energy when utilisation remains low. This analysis provides practical guidelines for designing green and sustainable ML operations, emphasising energy consumption and carbon footprint reductions while maintaining performance. This paper can serve as a benchmark for accurately estimating total energy use across different types of AI models.
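To make the software-based measurement idea concrete, the following is a minimal sketch of how an energy-per-sample figure might be derived from periodically sampled power readings. It is an illustration under stated assumptions, not the authors' implementation: the function names `energy_joules` and `energy_per_sample`, the trapezoidal integration, and the input format (timestamps in seconds, power in watts) are all hypothetical choices, since this section does not specify how the metric is computed.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (W) over time (s) with the trapezoidal rule.

    Assumption: readings come from some software power meter polled at
    increasing timestamps; the source of the readings is not modelled here.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power readings must have equal length")
    if len(timestamps_s) < 2:
        raise ValueError("at least two readings are needed to integrate")

    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Area of the trapezoid between two consecutive readings.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def energy_per_sample(
    timestamps_s: Sequence[float], power_w: Sequence[float], num_samples: int
) -> float:
    """Hypothetical metric: total energy (J) divided by samples processed."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return energy_joules(timestamps_s, power_w) / num_samples
```

For example, a constant 100 W draw over 2 seconds while processing 100 samples gives 200 J in total, or 2 J per sample. A finer sampling interval reduces the integration error when power fluctuates between readings.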
