Carbon and Reliability-Aware Computing for Heterogeneous Data Centers
Yichao Zhang, Yubo Song, Subham Sahoo
TL;DR
This paper addresses carbon- and reliability-aware spatio-temporal workload migration across distributed data centers (DCs). It introduces a mixed-integer linear programming (MILP) framework that jointly minimizes operational and embodied carbon while accounting for server aging, heterogeneity, and backup resource provisioning to meet service-level agreements (SLAs). The embodied emissions model links the manufacturing footprint to server lifetime and utilization, and the optimization covers interactive and batch workloads, server dispatch with redundancy, and a linearization scheme for tractability. Numerical results on two interconnected DCs show total carbon reductions of up to 21% and SLA violation rates below 1%, with an optimal server utilization of around 0.6 that balances energy efficiency and reliability. The work provides a practical, degradation-aware approach to sustainable and dependable DC operations in heterogeneous, geo-distributed environments.
Abstract
The rapid expansion of data centers (DCs) has intensified their energy consumption and carbon footprint, imposing a massive environmental cost of computing. While carbon-aware workload migration strategies have been examined, existing approaches often overlook reliability metrics, such as server lifetime degradation and quality of service (QoS), that substantially affect both the carbon and operational efficiency of DCs. This paper therefore proposes a comprehensive optimization framework for spatio-temporal workload migration across distributed DCs that jointly minimizes operational and embodied carbon emissions while complying with service-level agreements (SLAs). A key contribution is an embodied carbon emission model based on an analysis of servers' expected lifetimes, which explicitly considers the server heterogeneity that results from aging and utilization conditions. This heterogeneity is handled by new server dispatch strategies and a backup resource allocation model that accounts for hardware, software, and workload-induced failures. The overall model is formulated as a mixed-integer optimization problem, with multiple linearization techniques to ensure computational tractability. Numerical case studies demonstrate that the proposed method reduces total carbon emissions by up to 21%, offering a pragmatic approach to sustainable DC operations.
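To make the idea of spatio-temporal migration concrete, the following is a minimal sketch and not the paper's MILP. It places deferrable batch work into the (data center, hour) slots with the lowest combined operational and amortized embodied carbon per unit of work, subject to a per-hour capacity. It deliberately omits interactive workloads, server aging, backup provisioning, and SLA constraints. The function name, the parameter names, and every number are hypothetical and chosen only for illustration; in particular, the constant embodied term per data center is an assumption standing in for the lifetime-based model described above.

```python
def schedule_batch(total_work, intensity, capacity, energy_per_unit, embodied_per_unit):
    """Greedily place deferrable batch work across (data center, hour) slots.

    intensity:         {dc: [grid carbon intensity per hour, gCO2/kWh]}
    capacity:          {dc: max work units per hour}
    energy_per_unit:   kWh consumed per work unit (assumed equal for all DCs)
    embodied_per_unit: {dc: amortized embodied gCO2 per work unit} (assumed constant)
    """
    # Carbon cost of one work unit in each slot: operational + embodied.
    slots = []
    for dc, hourly in intensity.items():
        for hour, ci in enumerate(hourly):
            cost = energy_per_unit * ci + embodied_per_unit[dc]
            slots.append((cost, dc, hour))
    slots.sort()  # cheapest slots first

    plan = {}
    total_carbon = 0.0
    remaining = total_work
    for cost, dc, hour in slots:
        if remaining <= 0:
            break
        amount = min(capacity[dc], remaining)
        plan[(dc, hour)] = amount
        total_carbon += amount * cost
        remaining -= amount

    if remaining > 1e-9:
        raise ValueError("not enough capacity for the requested work")
    return plan, total_carbon


if __name__ == "__main__":
    # Hypothetical inputs: two DCs, three hours.
    intensity = {"dc_a": [400.0, 300.0, 200.0], "dc_b": [250.0, 350.0, 450.0]}
    capacity = {"dc_a": 10.0, "dc_b": 10.0}
    embodied = {"dc_a": 20.0, "dc_b": 10.0}
    plan, carbon = schedule_batch(25.0, intensity, capacity, 0.5, embodied)
    print(plan)    # {('dc_a', 2): 10.0, ('dc_b', 0): 10.0, ('dc_a', 1): 5.0}
    print(carbon)  # 3400.0
```

Because each slot has a fixed per-unit cost and an independent capacity, filling the cheapest slots first is optimal for this toy problem. Once utilization-dependent lifetimes, redundancy, and integer server dispatch enter the model, that property no longer holds, which is why the paper resorts to a mixed-integer formulation.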
