A Water Efficiency Dataset for African Data Centers
Noah Shumba, Opelo Tshekiso, Pengfei Li, Giulia Fanti, Shaolei Ren
TL;DR
The paper addresses the gap in understanding data-center water footprints in Africa by constructing a first-of-its-kind dataset of onsite $WUE$ and offsite water intensity for 41 African countries, grouped into five climate regions, with hourly estimates over one year derived from WeatherAPI weather data and national fuel mixes. It builds on the Gupta et al. methodology to model onsite and offsite water use, defining water consumption as water withdrawal minus discharge and deriving both components from region-specific weather and energy data. As a demonstration, the authors estimate the water use of AI model inference for Llama-3-70B and GPT-4 in 11 countries, finding about 0.7 L for Llama-3-70B versus up to about 60 L for GPT-4 to write a 10-page report. Eight of the 11 countries fall below the global average because of their fuel mix, while some steppe-region countries approach or exceed it. The dataset, publicly available on Hugging Face, supports region-specific cooling and energy planning, informs sustainable AI deployment, and highlights the need for transparency about water usage in African data-center operations amid local water-stress concerns.
Abstract
AI computing and data centers consume a large amount of freshwater, both directly for cooling and indirectly for electricity generation. While most attention has been paid to developed countries such as the U.S., this paper presents a first-of-its-kind dataset that combines nation-level weather and electricity generation data to estimate water usage efficiency for data centers in 41 African countries across five different climate regions. We also use our dataset to estimate the water consumption of inference on two large language models (i.e., Llama-3-70B and GPT-4) in 11 selected African countries. Our findings show that writing a 10-page report using Llama-3-70B could consume about 0.7 liters of water, while the water consumption by GPT-4 for the same task may go up to about 60 liters. For writing a medium-length email of 120-200 words, Llama-3-70B and GPT-4 could consume about 0.13 liters and 3 liters of water, respectively. Interestingly, given the same AI model, water consumption in 8 out of the 11 selected African countries is lower than the global average, mainly because of lower water intensities for electricity generation. However, water consumption can be substantially higher in some African countries with a steppe climate than the U.S. and global averages, which calls for more attention when deploying AI computing in these countries. Our dataset is publicly available on Hugging Face: https://huggingface.co/datasets/masterlion/WaterEfficientDatasetForAfricanCountries/tree/main
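To make the direct and indirect components mentioned in the abstract concrete, the following is a minimal sketch of how an onsite WUE value and an offsite water intensity could be combined into a per-task water estimate. The accounting form used here (onsite water scales with server energy, offsite water scales with server energy times PUE) is a commonly used convention and is assumed for illustration, not taken from the paper. The function name `water_liters`, the PUE of 1.2, the energy per task, and all hourly values are hypothetical and do not come from the dataset.

```python
from statistics import mean

def water_liters(server_energy_kwh, onsite_wue, offsite_intensity, pue=1.2):
    """Estimate water consumed (liters) by a workload.

    server_energy_kwh: energy drawn by the servers for the task (kWh)
    onsite_wue: cooling water consumed per kWh of server energy (L/kWh)
    offsite_intensity: water consumed per kWh of electricity generated (L/kWh)
    pue: power usage effectiveness, facility energy / server energy (assumed value)
    """
    onsite = server_energy_kwh * onsite_wue                # direct: cooling
    offsite = server_energy_kwh * pue * offsite_intensity  # indirect: electricity
    return onsite + offsite

# Hypothetical hourly values for one country (illustrative only).
hourly_onsite_wue = [0.5, 1.0, 1.5]          # L/kWh
hourly_offsite_intensity = [2.0, 3.0, 4.0]   # L/kWh

# Average the hourly series, then apply the estimate to one task.
avg_onsite = mean(hourly_onsite_wue)
avg_offsite = mean(hourly_offsite_intensity)

estimate = water_liters(0.01, avg_onsite, avg_offsite)  # 0.01 kWh per task is assumed
print(f"{estimate:.3f} L")
```

Under this form, a country with a low offsite water intensity can have a lower total even when its onsite WUE is high, which is consistent with the abstract's observation that electricity generation drives much of the cross-country difference.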
