High-Resolution Flood Probability Mapping Using Generative Machine Learning with Large-Scale Synthetic Precipitation and Inundation Data
Lipai Huang, Federico Antolini, Ali Mostafavi, Russell Blessing, Matthew Garcia, Samuel D. Brody
TL;DR
The paper tackles the challenge of producing high-resolution probabilistic flood maps in data-limited settings by introducing the Precipitation-Flood Depth Generative Pipeline, a surrogate ML framework that generates large-scale synthetic inundation data from CTGAN-generated precipitation conditioned on region-specific features. A cell-wise depth estimator (MaxFloodCast V2) trained on physics-based flood scenarios, combined with a structured sampling and smoothing pipeline, yields thousands of synthetic rainfall events and corresponding flood depths, enabling probabilistic maps across multiple depth thresholds. Key contributions include the cell-wise depth-estimator approach, a constrained CTGAN for precipitation generation, and an all-to-one event-sampling strategy that preserves nonlinearity while scaling to many events; validation shows the synthetic depth distributions closely resemble training data, and the resulting maps reveal meaningful spatial patterns of flood risk. This framework offers a scalable, region-adaptable tool for flood risk assessment and planning, with potential extensions to other regions and real-time forecasting contexts.
Abstract
High-resolution flood probability maps are instrumental for assessing flood risk, but their production is often limited by the availability of historical data. Additionally, producing the simulated data needed for probabilistic flood maps with physics-based models demands significant computation and time, which limits its feasibility. To address this gap, this study introduces the Precipitation-Flood Depth Generative Pipeline, a novel methodology that leverages generative machine learning to generate large-scale synthetic inundation data for producing probabilistic flood maps. With a focus on Harris County, Texas, the pipeline begins by training a cell-wise depth estimator on a number of precipitation-flood events simulated with a physics-based model. This cell-wise depth estimator, which emphasizes precipitation-based features, outperforms universal models. Subsequently, a Conditional Tabular Generative Adversarial Network (CTGAN) is used to conditionally generate synthetic precipitation point clouds, which are filtered using strategic thresholds to align with realistic precipitation patterns. From the filtered points, a precipitation feature pool is constructed for each cell, enabling strategic sampling and the generation of synthetic precipitation events. After generating 10,000 synthetic events, flood probability maps are created for various inundation depths. Validation using similarity and correlation metrics confirms the accuracy of the synthetic depth distributions. The Precipitation-Flood Depth Generative Pipeline provides a scalable solution for generating the synthetic flood depth data needed for high-resolution flood probability maps, which can enhance flood mitigation planning.
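The final step of the abstract, turning a large set of synthetic flood depths into probability maps for several inundation depths, can be illustrated with a minimal sketch. It assumes that the probability in each cell is estimated as the fraction of synthetic events whose depth exceeds a given threshold; the abstract does not state the exact estimator. The function name `flood_probability_maps`, the array layout, the threshold values, and the gamma-distributed placeholder depths are all hypothetical and stand in for the output of the depth estimator.

```python
import numpy as np

def flood_probability_maps(depths, thresholds):
    """Estimate per-cell exceedance probabilities from synthetic events.

    depths: array of shape (n_events, n_rows, n_cols) holding one flood
            depth per synthetic event and grid cell (assumed layout).
    thresholds: iterable of depth thresholds, in the same unit as depths.
    Returns a dict mapping each threshold to an (n_rows, n_cols) array
    with the fraction of events whose depth exceeds that threshold.
    """
    depths = np.asarray(depths, dtype=float)
    # Averaging the boolean exceedance mask over the event axis gives
    # the empirical exceedance probability for every cell.
    return {t: (depths > t).mean(axis=0) for t in thresholds}

# Placeholder data: 10,000 synthetic events on a tiny 4 x 5 grid.
# Real depths would come from the cell-wise depth estimator instead.
rng = np.random.default_rng(0)
synthetic_depths = rng.gamma(shape=0.5, scale=0.4, size=(10_000, 4, 5))

# Hypothetical depth thresholds, chosen only for illustration.
maps = flood_probability_maps(synthetic_depths, thresholds=(0.1, 0.5, 1.0))
for threshold, prob_map in maps.items():
    print(threshold, prob_map.shape, float(prob_map.max()))
```

Because each map is an empirical frequency over the same set of events, the probability in any cell can only stay the same or fall as the threshold rises, which is a quick sanity check on maps built this way.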
