Carbon-Aware Computing for Data Centers with Probabilistic Performance Guarantees
Sophie Hall, Francesco Micheli, Giuseppe Belgioioso, Ana Radovanović, Florian Dörfler
TL;DR
The paper addresses how to reduce the carbon footprint and peak power of data centers by coordinating when and where flexible compute jobs run across a fleet. It introduces a two-layer control approach: day-ahead planning via distributionally robust optimization, using a $d_W$-based ambiguity set and a $CVaR$-based constraint, followed by real-time placement that tracks the planned schedule. Jointly optimizing the virtual capacity curves (VCCs) and the scheduling policy $Y$ yields provable probabilistic guarantees and exploits the spatial and temporal flexibility of jobs, including in response to demand response (DR) signals. Simulations on normalized load profiles from Google clusters show substantial reductions in carbon cost and peak power compared with myopic greedy policies, with a tunable trade-off between robustness and performance. The results point to practical viability and to potential for DR participation and long-term grid planning, supported by a scalable LP reformulation and receding-horizon extensions.
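To make the $CVaR$-based constraint more concrete, the following is a minimal sketch, assuming the constraint limits how far a random load may exceed a capacity. The function name `empirical_cvar`, the sample loads, and the level `alpha` are illustrative assumptions and are not taken from the paper. The sketch uses the standard Rockafellar-Uryasev formula on a fixed set of samples and does not model the ambiguity set.

```python
def empirical_cvar(samples, alpha):
    """Empirical CVaR at level alpha via the Rockafellar-Uryasev formula.

    CVaR_alpha(X) = min_t  t + E[max(X - t, 0)] / (1 - alpha).
    For an empirical distribution the objective is piecewise linear and
    convex in t, so the minimum is attained at one of the sample points.
    When (1 - alpha) * n is a whole number, this equals the mean of the
    worst (1 - alpha) fraction of the samples.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(samples)
    return min(
        t + sum(max(x - t, 0.0) for x in samples) / ((1.0 - alpha) * n)
        for t in samples
    )


# Hypothetical load samples (arbitrary units) and confidence level.
loads = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
alpha = 0.8

# CVaR is translation invariant, so CVaR(load - c) <= 0 holds exactly when
# c >= CVaR(load). The smallest capacity satisfying the constraint is therefore:
capacity = empirical_cvar(loads, alpha)

# Because CVaR upper-bounds the alpha-quantile, the chosen capacity covers
# at least a fraction alpha of the samples.
coverage = sum(1 for x in loads if x <= capacity) / len(loads)

print(f"capacity = {capacity:.2f}, empirical coverage = {coverage:.2f}")
```

In this sketch the CVaR constraint acts as a conservative stand-in for a chance constraint: any capacity that satisfies it also keeps the load within capacity with probability at least `alpha` on the sample distribution.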
Abstract
Data centers are significant contributors to carbon emissions and can strain power systems due to their high electricity consumption. To mitigate this impact and to participate in demand response programs, cloud computing companies strive to balance and optimize operations across their global fleets by making strategic decisions about when and where to place compute jobs for execution. In this paper, we introduce a load-shaping scheme that reacts to time-varying grid signals by leveraging both the temporal and the spatial flexibility of compute jobs. Based on distributionally robust optimization, it provides risk-aware management guidelines and job placement with provable performance guarantees. Our approach divides the problem into two key components: (i) day-ahead planning, which generates an optimal scheduling strategy based on historical load data, and (ii) real-time job placement and (time) scheduling, which dynamically tracks the optimal strategy generated in (i). We validate our method in simulation using normalized load profiles from randomly selected Google clusters, incorporating time-varying grid signals. Compared to myopic greedy policies, our approach achieves significant reductions in carbon cost and peak power while maintaining computational efficiency and abiding by system and grid constraints.
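The split between day-ahead planning and real-time tracking can be illustrated with a toy, single-cluster sketch. It is an assumption-laden illustration and not the paper's formulation: the plan here is a deterministic fill of the cleanest hours rather than a distributionally robust program, spatial flexibility is ignored, and the carbon intensities, arrivals, capacity, and function names are all hypothetical.

```python
def day_ahead_plan(carbon, capacity, total_load):
    """Fill the lowest-carbon hours first until the forecast load is placed."""
    plan = [0.0] * len(carbon)
    remaining = total_load
    for t in sorted(range(len(carbon)), key=lambda h: carbon[h]):
        plan[t] = min(capacity, remaining)
        remaining -= plan[t]
    if remaining > 1e-9:
        raise ValueError("forecast load exceeds total capacity")
    return plan


def run_realtime(arrivals, budget):
    """Queue arriving load and run at most budget[t] of it in hour t.

    Returns the executed load per hour and the backlog left at the end.
    """
    queue, executed = 0.0, []
    for arrived, allowed in zip(arrivals, budget):
        queue += arrived
        run = min(queue, allowed)
        executed.append(run)
        queue -= run
    return executed, queue


def carbon_cost(executed, carbon):
    """Total carbon cost: load executed in each hour times that hour's intensity."""
    return sum(x * c for x, c in zip(executed, carbon))


# Hypothetical hourly carbon intensities and flexible-load arrivals.
carbon = [500.0, 450.0, 200.0, 150.0, 300.0, 480.0]
arrivals = [3.0, 3.0, 2.0, 1.0, 1.0, 0.0]
capacity = 3.0

# Day-ahead: plan against the forecast total (here, the true total).
plan = day_ahead_plan(carbon, capacity, sum(arrivals))

# Real time: a myopic greedy policy runs load as soon as capacity allows,
# while the tracking policy only runs what the plan budgets for each hour.
greedy_run, greedy_backlog = run_realtime(arrivals, [capacity] * len(carbon))
tracked_run, tracked_backlog = run_realtime(arrivals, plan)

print("greedy cost  :", carbon_cost(greedy_run, carbon))
print("tracking cost:", carbon_cost(tracked_run, carbon))
```

With these made-up numbers the tracking policy defers early arrivals to the cleaner midday hours and finishes with no backlog. In general a plan that ignores arrival times can leave work unserved, which is one reason a real scheme has to account for uncertainty in the load.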
