On the Carbon Footprint of Economic Research in the Age of Generative AI

Andres Alonso-Robisco, Carlos Esparcia, Francisco Jareño

Abstract

Generative artificial intelligence (AI) is increasingly used to write and refactor research code, expanding computational workflows. At the same time, Green AI research has largely measured the footprint of models rather than the downstream workflows in which GenAI is a tool. We shift the unit of analysis from models to workflows and treat prompts as decision policies that allocate discretion between researcher and system, governing what is executed and when iteration stops. We contribute in two ways. First, we map the recent Green AI literature into seven themes: training footprint is the largest cluster, while inference efficiency and system-level optimisation are growing rapidly, alongside measurement protocols, green algorithms, governance, and security and efficiency trade-offs. Second, we benchmark a modern economic survey workflow, an LDA-based literature mapping implemented with GenAI-assisted coding and executed in a fixed cloud notebook, measuring runtime and estimated CO2e with CodeCarbon. Injecting generic green language into prompts has no reliable effect, whereas operational constraints and decision-rule prompts deliver large and stable footprint reductions while preserving decision-equivalent topic outputs. The results identify human-in-the-loop governance as a practical lever to align GenAI productivity with environmental efficiency.

Paper Structure

This paper contains 18 sections, 8 equations, 14 figures, 13 tables.

Figures (14)

  • Figure 1: Identification and screening pipeline for the Green AI literature map. Counts reflect our arXiv retrieval and filtering protocol.
  • Figure 2: Model-selection diagnostics for the Green AI corpus: topic coherence ($c_v$, higher is better) and held-out perplexity (lower is better) for LDA models estimated over $K\in\{5,\ldots,15\}$. Each point corresponds to one fitted model on the preprocessed abstract corpus; the main specification uses $K=7$ based on the coherence plateau and interpretability checks.
  • Figure 3: Topic co-occurrence in the Green AI corpus. Cells report the frequency with which two topics appear within the same document (based on document-level topic mixtures from the $K=7$ LDA model), highlighting which themes tend to co-appear in abstracts.
  • Figure 4: Yearly evolution of Green AI themes. For each year, we count documents by their dominant LDA topic (highest topic share within the document) using the $K=7$ model, showing how the composition of the retained arXiv corpus shifts over time.
  • Figure 5: Conceptual mechanism: prompt design allocates discretion between researcher and system, shaping search scope, stopping, and which outputs are computed, which determines executed compute and therefore runtime and CO$_2$e. Output equivalence is required to interpret footprint reductions as efficiency gains.
  • ...and 9 more figures
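
The dominant-topic counting described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the input layout (a year paired with a list of topic shares), and the toy mixtures over three topics are all assumptions made for the example, whereas the paper's main specification uses $K=7$.

```python
from collections import Counter, defaultdict


def dominant_topic(shares):
    """Return the index of the largest topic share (ties go to the lowest index)."""
    return max(range(len(shares)), key=lambda k: shares[k])


def yearly_topic_counts(docs):
    """Count documents per year by their dominant topic.

    docs: iterable of (year, topic_shares) pairs, where topic_shares is the
    document-level topic mixture (hypothetical input layout).
    """
    counts = defaultdict(Counter)
    for year, shares in docs:
        counts[year][dominant_topic(shares)] += 1
    # Convert to plain dicts, ordered by year, for easy inspection.
    return {year: dict(c) for year, c in sorted(counts.items())}


# Toy, made-up topic mixtures over 3 topics (illustrative only).
toy_docs = [
    (2022, [0.7, 0.2, 0.1]),
    (2022, [0.1, 0.8, 0.1]),
    (2023, [0.2, 0.3, 0.5]),
    (2023, [0.6, 0.3, 0.1]),
    (2023, [0.5, 0.4, 0.1]),
]

result = yearly_topic_counts(toy_docs)
print(result)
```

Assigning each document to a single dominant topic discards the rest of its mixture, so a count of this kind complements rather than replaces the co-occurrence view described for Figure 3.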