GroundHog: Revolutionizing GLDAS Groundwater Storage Downscaling for Enhanced Recharge Estimation in Bangladesh

Saleh Sakib Ahmed, Rashed Uz Zzaman, Saifur Rahman Jony, Faizur Rahman Himel, Afroza Sharmin, A. H. M. Khalequr Rahman, M. Sohel Rahman, Sara Nowreen

TL;DR

GroundHog addresses the challenge of scarce in-situ groundwater level data in Bangladesh by combining two trainable components: (i) a pseudo-ground-truth pipeline that predicts high-resolution 2 km Max/Min GWL from sparse observations using 17 hydro-geological factors and year as input, and (ii) a year-agnostic Upsampling Model that downscales coarse GLDAS GWS data (25 km) to 2 km GWL by leveraging 2 km HGFs and a grid-representative hydro-geological context. The approach delivers strong quantitative performance ($R^2$ up to 0.96 for downscaling; $R^2$ 0.855/0.963 for Max/Min GWL pseudo-ground-truth) and robust year-generalization (LOYO $R^2 \approx 0.93$ on average). Temporal analysis of the downscaled time series reveals rising GWL with declining recharge across 2003–2022, signaling over-extraction and stressing the need for policies such as Managed Aquifer Recharge (MAR) and stricter extraction controls. The work provides a practical, globally applicable framework to upsample GLDAS-derived storage to high-resolution groundwater levels and offers a public web portal for stakeholder access, enabling data-driven groundwater management decisions in Bangladesh and beyond.
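As a rough illustration of how 2 km cells might be paired with their coarse GLDAS value before training the Upsampling Model, the sketch below joins fine-resolution records to coarse records on a (grid ID, year) key. All field names (`serial_id`, `year`, `hgfs`, `gwl`) and the function `build_training_rows` are hypothetical and not taken from the paper. Using year only as a join key and leaving it out of the feature vector is one possible reading of "year-agnostic", not a confirmed detail of the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical key: (coarse GLDAS grid ID, year). Not the paper's schema.
CoarseKey = Tuple[int, int]


def build_training_rows(
    coarse_gws: Dict[CoarseKey, float],
    fine_cells: List[dict],
) -> List[dict]:
    """Pair every fine (2 km) cell with the GWS value of its coarse parent cell.

    Each fine cell is a dict with the assumed keys:
      serial_id - ID of the coarse grid cell that contains it
      year      - observation year (used only for the join)
      hgfs      - list of hydro-geological factor values
      gwl       - target groundwater level for that cell and year
    """
    rows = []
    for cell in fine_cells:
        key = (cell["serial_id"], cell["year"])
        if key not in coarse_gws:
            # No coarse value for this grid cell and year: skip the record.
            continue
        # Feature vector = coarse GWS followed by the fine-scale factors.
        # Year is deliberately not included as a feature.
        features = [coarse_gws[key]] + list(cell["hgfs"])
        rows.append({"features": features, "target": cell["gwl"]})
    return rows
```

Any regressor could then be fitted on the `features` and `target` columns; the choice of model is outside the scope of this sketch.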

Abstract

Long-term groundwater level (GWL) measurement is vital for effective policymaking and recharge estimation using annual maxima and minima. However, current methods prioritize short-term predictions and lack multi-year applicability, limiting their utility. Moreover, sparse in-situ measurements lead to reliance on low-resolution satellite data like GLDAS as the ground truth for Machine Learning models, further constraining accuracy. To overcome these challenges, we first develop an ML model to mitigate data gaps, achieving $R^2$ scores of 0.855 and 0.963 for maximum and minimum GWL predictions, respectively. Subsequently, using these predictions and well observations as ground truth, we train an Upsampling Model that uses low-resolution (25 km) GLDAS data as input to produce high-resolution (2 km) GWLs, achieving an excellent $R^2$ score of 0.96. Our approach successfully upscales GLDAS data for 2003–2024, allowing high-resolution recharge estimations and revealing critical trends for proactive resource management. Our method allows upsampling of groundwater storage (GWS) from GLDAS to high-resolution GWLs at any point, independently of officially curated piezometer data, making it a valuable tool for decision-making.
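The abstract ties recharge estimation to annual GWL maxima and minima but does not give the formula here. A minimal sketch, assuming a water-table-fluctuation style estimate in which recharge equals specific yield times the annual swing in depth to water, is shown below. The function name, the specific-yield parameter, and the convention that GWL is a depth below ground level are assumptions for illustration and may differ from the paper's own equation.

```python
def annual_recharge(max_depth_m: float, min_depth_m: float,
                    specific_yield: float) -> float:
    """Estimate annual recharge (in meters of water) from yearly GWL extremes.

    Assumes GWL is expressed as depth below ground level, so max_depth_m is
    the deepest (driest) level of the year and min_depth_m the shallowest.
    The water-table-fluctuation form used here is an assumption:
        recharge = specific_yield * (max_depth_m - min_depth_m)
    """
    if not 0.0 < specific_yield <= 1.0:
        raise ValueError("specific_yield must be in (0, 1]")
    if min_depth_m > max_depth_m:
        # The inconsistency an unconditioned Min GWL model could produce.
        raise ValueError("min depth exceeds max depth")
    return specific_yield * (max_depth_m - min_depth_m)
```

For example, a cell whose water table moves between 3.0 m and 7.0 m below ground level, with an assumed specific yield of 0.1, gives an estimate of about 0.4 m. The consistency check also shows why a Min GWL prediction that exceeds the Max GWL prediction would make the estimate meaningless.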

Paper Structure

This paper contains 14 sections, 1 equation, 10 figures.

Figures (10)

  • Figure 1: Methodology: A. Phase 1: Predicting Max GWL using 17 HGFs and year as inputs. B. Phase 2: Predicting Min GWL conditioned on Max GWL to ensure consistency, with inputs including 17 HGFs, year, and Max GWL. C. Year-agnostic Upsampling Model combining low-resolution (Max/Min GWS, 17 HGFs) and high-resolution data, merged by GLDAS SerialID and year, to predict high-resolution Max/Min GWL for future use.
  • Figure 2: Comparison of original, IDW-interpolated, pseudo-ground truth, GLDAS, and downscaled GLDAS results for 2008 (chosen as an example year): A. Original yearly GWL (top: Max, bottom: Min) obtained from various organizations, displayed with increased size for better visualization. B. IDW-interpolated GWL (top: Max, bottom: Min). The results are oversimplified and in many areas the minimum GWL exceeds the maximum GWL, which contradicts reality. C. Predicted GWL (top: Max, bottom: Min), ensuring consistency due to conditioning of the Min GWL Model on the maximum GWL. D. GLDAS Maximum GWS at 25 km resolution, capturing broad patterns. E. Downscaled maximum GWL from GLDAS Maximum GWS. F. GLDAS Minimum GWS, showing slightly lower values than the maximum. G. Downscaled minimum GWL from GLDAS Minimum GWS.
  • Figure 3: Metrics and Effects of Variable Changes on GWL: A. $R^2$ scores of the pseudo-ground truth for each year. B. Leave-one-out analysis confirms robust upsampling with an average MSE of 0.7286 and $R^2$ of 0.9275. C. Partial Dependence Plot (PDP) for the maximum GWL regressor illustrates feature impacts while holding others constant. For lithology_clay_thickness, greater clay thickness increases GWL (measured below ground level, BGL), consistent with findings on recharge obstruction. Drainage_density shows GWL peaking at 0, dropping at 0.2, and rising again, indicating efficient water transport. Increased TRI, slope, and lower SPI correlate with deeper GWL due to reduced infiltration and increased runoff. D. Feature correlation heatmap highlights key relationships. NDWI and NDVI are strongly negatively correlated; higher NDWI corresponds to lower GWL, while higher NDVI aligns with increased GWL. SPI and TWI show positive correlation, with increased TWI raising GWL. These findings validate the model's ability to capture real-world groundwater dynamics. E. 3D Partial Dependence Plot of SPI and TWI.
  • Figure 4: Feature importance and SHAP values: For the SHAP plot, red values indicate higher feature values, while blue values indicate lower feature values. The X-axis represents SHAP values, showing how each feature influences the output, either positively (predicting higher output values) or negatively (predicting lower output values). A. & B. Max GWL Model; C. & D. Min GWL Model; E., F. & G. Upsampling Model.
  • Figure 5: Temporal comparison of Downscaled Results: A. Downscaled minimum groundwater levels (GWL) in Bangladesh (top: 2003, bottom: 2022). In 2003, most regions have water levels from 0 to 5.3 meters. However, the central areas (including Dhaka, the densely populated capital), the arid northwest, and the hilly southeast exhibit significantly deeper levels. In 2022, groundwater levels increase notably in the areas surrounding these regions, including the southeastern hilly regions of Khagrachari and Bandarban. B. Downscaled maximum groundwater levels (GWL) (top: 2003, bottom: 2022). In 2003, depths range from 0 to 5.3 meters in the southern central areas and the northernmost parts, while most other regions fall between 5.3 and 7.8 meters or 7.8 and 9.8 meters. The arid northwest, central Dhaka region, and southeastern hill tracts record deeper levels of 11.3 to 15 meters. In 2022, these regions exhibit even greater depths of 15 to 26 meters, reflecting a substantial overall rise in both maximum and minimum GWLs. C. Recharge levels (top: 2003, bottom: 2022). These show a significant decline, particularly in regions with notable changes in minimum and maximum GWLs. Overall, recharge rates have decreased noticeably across the country, with a maximum negative change of $\sim -27$ cm at a point and an average change (also negative) of $\sim -1.48$ cm with a standard deviation of 2.83 cm.
  • ...and 5 more figures
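
The leave-one-out analysis reported for Figure 3B holds out one year at a time and scores the model on that year. A minimal sketch of the split and of the $R^2$ metric, in plain Python, could look like the following. The function names are illustrative, and the actual model fitting step used by the authors is not shown or assumed.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_year_out(
    years: Sequence[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_year, train_indices, test_indices) for each distinct year."""
    for held_out in sorted(set(years)):
        train = [i for i, y in enumerate(years) if y != held_out]
        test = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train, test


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

In use, a model would be fitted on the `train` indices of each split and scored with `r2_score` on the `test` indices; averaging the per-year scores gives a single summary figure of the kind quoted in the caption.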