The Unpaid Toll: Quantifying and Addressing the Public Health Impact of Data Centers

Yuelin Han, Zhifeng Wu, Pengfei Li, Adam Wierman, Shaolei Ren

TL;DR

This paper addresses the overlooked public health burden of data centers by developing an end-to-end framework to quantify lifecycle pollutant emissions and health costs across three scopes (on-site, power-supply, and supply chain). It combines dispersion modeling (COBRA/PCAPS/InMAP) with exposure–response economics to translate emissions into monetized health impacts, projecting an annual U.S. health cost of over $20B in 2028, driven by AI demand and infrastructure. Two key findings are that health impacts vary widely across locations, with disadvantaged communities bearing disproportionate costs, and that health costs can diverge significantly from carbon-centric metrics. The authors propose health-informed computing (HI-GLB) and standard reporting to reduce health burdens while preserving energy and carbon goals, underscoring the need for equity-driven, data-driven decision-making in AI infrastructure deployment.

Abstract

The surging demand for AI has led to a rapid expansion of energy-intensive data centers, impacting the environment through escalating carbon emissions and water consumption. While significant attention has been paid to data centers' growing environmental footprint, the public health burden, a hidden toll of data centers, has been largely overlooked. Specifically, data centers' lifecycle, from chip manufacturing to operation, can significantly degrade air quality through emissions of criteria air pollutants such as fine particulate matter, substantially impacting public health. This paper introduces a principled methodology to model lifecycle pollutant emissions for data centers and computing tasks, quantifying the public health impacts. Our findings reveal that training a large AI model comparable to the Llama-3.1 scale can produce air pollutants equivalent to more than 10,000 round trips by car between Los Angeles and New York City. The growing demand for AI is projected to push the total annual public health burden of U.S. data centers to more than $20 billion in 2028, rivaling that of California's on-road emissions. Further, the public health costs fall more heavily on disadvantaged communities, where the per-household health burden could be 200x that in less-impacted communities. Finally, we propose a health-informed computing framework that explicitly incorporates public health risk as a key metric for scheduling data center workloads across space and time, which can effectively mitigate adverse health impacts while advancing environmental sustainability. More broadly, we also recommend adopting a standard reporting protocol for the public health impacts of data centers and paying attention to all impacted communities.
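
To make the idea of scheduling workloads across space with public health risk as a metric more concrete, the following is a minimal Python sketch, not the paper's method. It assumes each region has a fixed capacity and constant per-MWh health and carbon costs for a single time slot, and it fills the cheapest regions first. The names `Region` and `allocate`, the cost fields, the weights, and all numbers are hypothetical and are not taken from the paper; the authors' framework (HI-GLB) may be formulated quite differently.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical description of one data center region for one time slot."""
    name: str
    capacity_mwh: float          # energy the site can serve in this slot
    health_cost_per_mwh: float   # assumed marginal health cost ($/MWh)
    carbon_cost_per_mwh: float   # assumed monetized carbon cost ($/MWh)


def allocate(demand_mwh, regions, health_weight=1.0, carbon_weight=1.0):
    """Greedily assign demand to regions in order of weighted unit cost.

    With constant unit costs and only capacity limits, filling the
    cheapest regions first minimizes the total weighted cost.
    """
    def unit_cost(region):
        return (health_weight * region.health_cost_per_mwh
                + carbon_weight * region.carbon_cost_per_mwh)

    remaining = demand_mwh
    plan = {}
    for region in sorted(regions, key=unit_cost):
        if remaining <= 0:
            break
        share = min(region.capacity_mwh, remaining)
        plan[region.name] = share
        remaining -= share
    if remaining > 1e-9:
        raise ValueError("demand exceeds total regional capacity")
    return plan


# Illustrative (made-up) inputs.
regions = [
    Region("A", capacity_mwh=50, health_cost_per_mwh=10, carbon_cost_per_mwh=5),
    Region("B", capacity_mwh=100, health_cost_per_mwh=2, carbon_cost_per_mwh=8),
    Region("C", capacity_mwh=100, health_cost_per_mwh=30, carbon_cost_per_mwh=1),
]

health_informed_plan = allocate(120, regions)                 # health + carbon
carbon_only_plan = allocate(120, regions, health_weight=0.0)  # carbon only
```

In this toy setup the carbon-only plan routes most load to region C, which has the lowest carbon cost but the highest assumed health cost, while the health-informed plan prefers region B. This mirrors, in miniature, the abstract's point that health costs can diverge from carbon-centric metrics.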

Paper Structure

This paper contains 36 sections, 4 equations, 10 figures, and 9 tables.

Figures (10)

  • Figure 1: The overview of data centers' contribution to air pollutants and public health impacts. Scope-1 and scope-2 impacts occur during the operation of data centers ("operational"), whereas scope-3 impacts arise from activities across the supply chain ("embodied").
  • Figure 2: The county-level total scope-1 health cost of data center backup generators operated in Virginia (mostly in Loudoun County, Fairfax County, and Prince William County) [Air_DataCenter_Diesel_Generator_PiedmontEnvironmentalCouncil_Web_Map]. The backup generators are assumed to emit air pollutants at 10% of the permitted levels per year. The total annual public health cost is $220-300 million, including $190-260 million incurred in Virginia, West Virginia, Maryland, Pennsylvania, New York, New Jersey, Delaware, and Washington D.C. (a) County-level health cost in Virginia, West Virginia, Maryland, Pennsylvania, New York, New Jersey, Delaware, and Washington D.C. Counties with data centers are marked in orange, except for Loudoun County (marked in yellow). (b) CDF of the county-level cost. (c) Top-10 counties by the total health cost.
  • Figure 3: Public health costs of electricity generation and on-road emissions in the contiguous U.S. in 2023 and 2028 [Health_COBRA_EPA_Website]. The error bars represent the high and low estimates returned by COBRA using two different exposure-response functions.
  • Figure 4: The public health costs of U.S. data centers and of on-road emissions in the top-3 states from 2019 to 2023, with the 2028 projection based on the Lawrence Berkeley National Lab's report [DoE_DataCenter_EnergyReport_US_2024]. The cost for U.S. data centers includes scope-1 and scope-2 impacts. "High" and "Low" denote the high and low growth rates considered in [DoE_DataCenter_EnergyReport_US_2024].
  • Figure 5: The county-level total health cost of U.S. data centers from 2019 to 2023. (a) Health cost map; (b) CDF of county-level health cost; (c) Top-10 counties by total health cost.
  • ...and 5 more figures
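
Several of the figures above summarize county-level costs as a CDF and a top-10 ranking. As a rough illustration of how such summaries can be derived from a county-to-cost mapping, here is a minimal Python sketch. The helper names `empirical_cdf` and `top_k` and the sample values are hypothetical placeholders, not data or code from the paper.

```python
def empirical_cdf(costs):
    """Return (cost, cumulative fraction of counties) pairs, sorted by cost."""
    values = sorted(costs.values())
    n = len(values)
    return [(value, (i + 1) / n) for i, value in enumerate(values)]


def top_k(costs, k=10):
    """Return the k (county, cost) pairs with the largest cost, descending."""
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)[:k]


# Placeholder values in arbitrary units; NOT figures from the paper.
sample_costs = {"county_a": 1.0, "county_b": 3.0, "county_c": 2.0, "county_d": 4.0}

cdf_points = empirical_cdf(sample_costs)
largest_two = top_k(sample_costs, k=2)
```

A steep rise at the low end of the CDF followed by a long tail would indicate that a small number of counties carry most of the cost, which is the kind of spatial concentration the captions point to.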