GroundHog: Revolutionizing GLDAS Groundwater Storage Downscaling for Enhanced Recharge Estimation in Bangladesh
Saleh Sakib Ahmed, Rashed Uz Zzaman, Saifur Rahman Jony, Faizur Rahman Himel, Afroza Sharmin, A. H. M. Khalequr Rahman, M. Sohel Rahman, Sara Nowreen
TL;DR
GroundHog addresses the challenge of scarce in-situ groundwater level data in Bangladesh by combining two trainable components: (i) a pseudo-ground-truth pipeline that predicts high-resolution 2 km Max/Min GWL from sparse observations using 17 hydro-geological factors (HGFs) and year as input, and (ii) a year-agnostic Upsampling Model that downscales coarse GLDAS GWS data (25 km) to 2 km GWL by leveraging 2 km HGFs and a grid-representative hydro-geological context. The approach delivers strong quantitative performance ($R^2$ up to 0.96 for downscaling; $R^2$ of 0.855/0.963 for Max/Min GWL pseudo-ground-truth) and robust year-generalization (LOYO $R^2 \approx 0.93$ on average). Temporal analysis of the downscaled time series reveals rising GWL with declining recharge across 2003–2022, signaling over-extraction and stressing the need for policies such as Managed Aquifer Recharge (MAR) and stricter extraction controls. The work provides a practical, globally applicable framework to upsample GLDAS-derived storage to high-resolution groundwater levels and offers a public web portal for stakeholder access, enabling data-driven groundwater management decisions in Bangladesh and beyond.
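To make the year-generalization figure concrete, the following is a minimal sketch of a leave-one-year-out evaluation, assuming that is what LOYO stands for here. Everything in it is illustrative rather than the authors' pipeline: the helper names (`r2_score`, `leave_one_year_out`, `fit_line`), the one-feature linear model standing in for the real learner, and the toy data are all assumptions.

```python
from collections import defaultdict


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def leave_one_year_out(years):
    """Yield (held_out_year, train_idx, test_idx) for each distinct year."""
    by_year = defaultdict(list)
    for i, year in enumerate(years):
        by_year[year].append(i)
    for held_out in sorted(by_year):
        test_idx = by_year[held_out]
        train_idx = [i for yr, idx in by_year.items() if yr != held_out for i in idx]
        yield held_out, train_idx, test_idx


def fit_line(xs, ys):
    """Ordinary least squares for one feature (hypothetical stand-in model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((v - mean_x) ** 2 for v in xs)
    sxy = sum((v - mean_x) * (w - mean_y) for v, w in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data (made up): one feature, a target, and the year of each sample.
years = [2003, 2003, 2004, 2004, 2005, 2005]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * v + 1.0 for v in x]

# Train on all years except one, score on the held-out year, then average.
scores = {}
for held_out, train_idx, test_idx in leave_one_year_out(years):
    slope, intercept = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
    preds = [slope * x[i] + intercept for i in test_idx]
    scores[held_out] = r2_score([y[i] for i in test_idx], preds)

mean_r2 = sum(scores.values()) / len(scores)
```

Holding out whole years, rather than random samples, keeps every observation from the test year out of training, which is what makes the averaged score a measure of generalization to unseen years.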
Abstract
Long-term groundwater level (GWL) measurement is vital for effective policymaking and for recharge estimation, which relies on annual GWL maxima and minima. However, current methods prioritize short-term predictions and lack multi-year applicability, limiting their utility. Moreover, sparse in-situ measurements lead to reliance on low-resolution satellite data like GLDAS as the ground truth for machine learning (ML) models, further constraining accuracy. To overcome these challenges, we first develop an ML model to mitigate data gaps, achieving $R^2$ scores of 0.855 and 0.963 for maximum and minimum GWL predictions, respectively. Subsequently, using these predictions and well observations as ground truth, we train an Upsampling Model that uses low-resolution (25 km) GLDAS data as input to produce high-resolution (2 km) GWLs, achieving an $R^2$ score of 0.96. Our approach successfully upsamples GLDAS data for 2003–2024, allowing high-resolution recharge estimations and revealing critical trends for proactive resource management. Our method allows upsampling of groundwater storage (GWS) from GLDAS to high-resolution GWLs at any point, independently of officially curated piezometer data, making it a valuable tool for decision-making.
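As an illustration of how annual maxima and minima can feed a recharge estimate, here is a minimal sketch based on the water-table fluctuation method, in which recharge is the specific yield times the annual rise in the water table. The abstract does not state which formula is used, so treat the method itself, the function names, the specific-yield value, and the sample levels as assumptions.

```python
def annual_recharge(max_gwl_m, min_gwl_m, specific_yield):
    """Water-table fluctuation estimate: recharge = Sy * |max - min|, in metres.

    The absolute difference is used so the result does not depend on whether
    levels are recorded as elevations or as depths below ground (assumption).
    """
    if not 0.0 < specific_yield <= 1.0:
        raise ValueError("specific_yield must be in (0, 1]")
    return specific_yield * abs(max_gwl_m - min_gwl_m)


def recharge_series(max_by_year, min_by_year, specific_yield):
    """Apply annual_recharge to every year present in both inputs."""
    return {
        year: annual_recharge(max_by_year[year], min_by_year[year], specific_yield)
        for year in sorted(max_by_year)
        if year in min_by_year
    }


# Hypothetical values for a single 2 km cell; not taken from the paper.
max_gwl = {2003: 8.0, 2004: 8.5}
min_gwl = {2003: 3.0, 2004: 4.5}
recharge = recharge_series(max_gwl, min_gwl, specific_yield=0.1)
```

Applied per cell and per year to the downscaled Max/Min GWL grids, a calculation of this kind would give the high-resolution recharge time series whose trends the abstract refers to.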
