Good for the Planet, Bad for Me? Intended and Unintended Consequences of AI Energy Consumption Disclosure

Michael Klesel; Uwe Messer

Abstract

To address the high energy consumption of artificial intelligence, energy consumption disclosure (ECD) has been proposed to steer users toward more sustainable practices, such as choosing efficient small language models (SLMs) over large language models (LLMs). This presents a performance-sustainability trade-off for users. In an experiment with 365 participants, we explore the impact of ECD and the perceptual and behavioral consequences of choosing an SLM over an LLM. Our findings reveal that ECD is a highly effective measure to nudge individuals toward a pro-environmental choice, increasing the odds of choosing an energy-efficient SLM over an LLM by a factor of more than 12. Interestingly, this choice did not significantly impact subsequent behavior, as individuals who selected an SLM and those who selected an LLM demonstrated similar prompting behavior. Nevertheless, the choice created a perceptual bias. A placebo effect emerged, with individuals who selected the "eco-friendly" SLM reporting significantly lower satisfaction and perceived quality. These results highlight the double-edged nature of ECD, which holds critical implications for the design of sustainable human-computer interactions.
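As a reading aid for the odds statement in the abstract, the following minimal sketch shows how an odds ratio is computed from a 2x2 table of condition by model choice. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the function names are assumptions of this sketch.

```python
def odds(chosen: int, not_chosen: int) -> float:
    """Odds of an outcome: how many chose it per one who did not."""
    return chosen / not_chosen


def odds_ratio(treat_slm: int, treat_llm: int, ctrl_slm: int, ctrl_llm: int) -> float:
    """Odds of choosing the SLM with disclosure, relative to the odds without it."""
    return odds(treat_slm, treat_llm) / odds(ctrl_slm, ctrl_llm)


# Hypothetical counts (NOT taken from the paper).
ratio = odds_ratio(treat_slm=60, treat_llm=40, ctrl_slm=10, ctrl_llm=90)
print(round(ratio, 1))  # 13.5
```

An odds ratio above 12 therefore means that the odds of picking the SLM (SLM choosers divided by LLM choosers) are more than twelve times higher in the disclosure condition than in the control condition. It does not mean that twelve times as many participants chose the SLM.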

Paper Structure (25 sections, 7 figures, 4 tables)

Introduction
Related Work
The Resource Cost of Large-Scale AI
Supporting User Decisions Through Energy Disclosure
The Challenge of Energy Disclosure: Performance Trade-offs in AI
Theoretical Foundation
Conceptualization and Research Model
Effect of Energy Consumption Disclosure on Pro-Environmental Behavior (Model Choice)
Consequences of Model Choice on Use Behavior
Consequences of Model Choice on Perception
Methodology
Experimental Design
Participants
Implementation of the Experiment
Results
...and 10 more sections

Figures (7)

Figure 1: This figure presents an analogy between the luminous efficiency of lightbulbs and the trade-offs in language models. On the left, different lightbulbs produce an identical output (e.g., 1000 lumens) but vary significantly in their energy efficiency. On the right, language models (LLMs and SLMs) differ not only in energy efficiency but also in absolute performance (e.g., MMLU scores of 43.9 vs. 25.9). Critically, unlike the standardized lumen, most users lack the expertise to accurately estimate the true performance difference ($\Delta$) between these models. Error bars indicate variance in energy consumption due to differences in hardware and infrastructure used for inference (Jegham et al., 2025).
Figure 2: Research Model
Figure 3: The experimental manipulation shown to participants. The control condition (left) displayed only performance ratings, while the treatment condition (right) included an energy efficiency score (A--G), creating a choice between performance and sustainability.
Figure 4: ECD altered the choices made by participants. Count plot illustrating the number of individuals in the control (n = 192) and treatment (n = 173) groups who selected either the LLM or the SLM. The distribution of choices differed significantly between the two groups ($\chi^2(1, N = 365) = 269.63$, $p < .0001$).
Figure 5: Differences in behavior and perception based on model choice. The results are based on N = 173 observations that received the treatment. The figure shows four boxplots comparing key dependent variables for participants grouped by their choice of the LLM (n = 105) versus the SLM (n = 68). The variables displayed are (A) Average Tokens per Prompt, (B) Number of Prompts, (C) Perceived Satisfaction, and (D) Perceived Quality. Pairwise comparisons were conducted using Mann--Whitney U tests with Holm--Bonferroni correction. Significance levels are denoted as: $^*p \leq 0.05$, $^{**}p \leq 0.01$, $^{***}p \leq 0.001$, $^{****}p \leq 0.0001$.
...and 2 more figures
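The Figure 5 caption mentions Mann--Whitney U tests with Holm--Bonferroni correction. As a minimal sketch of the correction step only, the function below computes Holm-adjusted p-values in plain Python. The raw p-values are hypothetical placeholders for the four panels (A to D), not values reported in the paper, and the helper name is an assumption of this sketch.

```python
def holm_bonferroni(p_values: list[float]) -> list[float]:
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for panels A-D (NOT from the paper).
raw = [0.30, 0.04, 0.001, 0.012]
adjusted = holm_bonferroni(raw)
print([round(p, 3) for p in adjusted])  # [0.3, 0.08, 0.004, 0.036]
```

With these placeholder values, the comparison with raw p = 0.04 is no longer significant at the 0.05 level after correction (adjusted 0.08). This is the kind of shift the correction is meant to guard against when several outcomes are tested at once.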
