Green Distributed AI Training: Orchestrating Compute Across Renewable-Powered Micro Datacenters

Giuseppe Tomei, Andrea Mayer, Giuseppe Alcini, Stefano Salsano

TL;DR

This work develops a quantitative feasibility-domain model for migratory AI training across renewable-powered micro-datacenters. It shows that the energy cost of a migration is typically recovered within minutes, so time rather than energy is the dominant constraint, and that a feasibility-aware orchestrator can substantially reduce non-renewable energy use while improving job completion time. The approach confines migrations to a feasible domain defined by checkpoint size and WAN bandwidth, and it outperforms energy-only strategies in trace-based evaluations. The study also outlines future extensions to edge hardware, grid integration, federated setups, and economic mechanisms to broaden the viability of renewable-aligned distributed AI.

Abstract

The accelerating expansion of AI workloads is colliding with an energy landscape increasingly dominated by intermittent renewable generation. While vast quantities of zero-carbon energy are routinely curtailed, today's centralized datacenter architectures remain poorly matched to this reality in both energy proportionality and geographic flexibility. This work envisions a shift toward a distributed fabric of renewable-powered micro-datacenters that dynamically follow the availability of surplus green energy through live workload migration. At the core of this vision lies a formal feasibility-domain model that delineates when migratory AI computation is practically achievable. By explicitly linking checkpoint size, wide-area bandwidth, and renewable-window duration, the model reveals that migration is almost always energetically justified, and that time, not energy, is the dominant constraint shaping feasibility. This insight enables the design of a feasibility-aware orchestration framework that transforms migration from a best-effort heuristic into a principled control mechanism. Trace-driven evaluation shows that such orchestration can simultaneously reduce non-renewable energy use and improve performance stability, overcoming the tradeoffs of purely energy-driven strategies. Beyond the immediate feasibility analysis, the extended version explores the architectural horizon of renewable-aware AI infrastructures. It examines the role of emerging ultra-efficient GPU-enabled edge platforms, anticipates integration with grid-level control and demand-response ecosystems, and outlines paths toward supporting partially migratable and distributed workloads. The work positions feasibility-aware migration as a foundational building block for a future computing paradigm in which AI execution becomes fluid, geographically adaptive, and aligned with renewable energy availability.

Paper Structure

This paper contains 56 sections, 16 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Energy breakeven curves for checkpoint sizes from 1 to 100 GB. All breakeven points occur within minutes, confirming that migration's energetic cost is negligible relative to multi-hour renewable windows.
  • Figure 2: Feasibility domain: transfer-time isolines show that only sub-20 GB states migrate efficiently on 1 to 10 Gbps links.
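
The transfer-time isolines in Figure 2 follow from simple arithmetic relating checkpoint size and link bandwidth. The sketch below is illustrative only and is not the paper's model: it assumes an ideal link with no protocol overhead and decimal units (1 GB = 8 Gb). The function names and the `max_fraction` threshold are hypothetical, introduced here to show how a transfer time might be compared against a renewable-window duration.

```python
def transfer_time_s(checkpoint_gb: float, bandwidth_gbps: float) -> float:
    """Ideal time in seconds to move a checkpoint over a WAN link.

    Assumes decimal units (1 GB = 8 Gb) and no protocol overhead.
    """
    if checkpoint_gb < 0 or bandwidth_gbps <= 0:
        raise ValueError("checkpoint size must be >= 0 and bandwidth > 0")
    return checkpoint_gb * 8.0 / bandwidth_gbps


def is_feasible(checkpoint_gb: float, bandwidth_gbps: float,
                window_s: float, max_fraction: float = 0.05) -> bool:
    """Hypothetical rule: a migration is feasible if the transfer consumes
    at most max_fraction of the renewable window. The 5% default is an
    assumption for illustration, not a value taken from the paper."""
    return transfer_time_s(checkpoint_gb, bandwidth_gbps) <= max_fraction * window_s


if __name__ == "__main__":
    # Tabulate ideal transfer times across the ranges shown in the figures.
    for size_gb in (1, 20, 100):
        for bw_gbps in (1, 10):
            t = transfer_time_s(size_gb, bw_gbps)
            print(f"{size_gb:>3} GB at {bw_gbps:>2} Gbps: {t:6.1f} s")
```

Under these assumptions a 20 GB checkpoint takes 160 s on a 1 Gbps link and 16 s on a 10 Gbps link, while a 100 GB checkpoint takes 800 s at 1 Gbps. Real transfers would be slower once serialization, protocol overhead, and link contention are included.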