Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection
Matt Franchi, Nikhil Garg, Wendy Ju, Emma Pierson
TL;DR
BayFlood is a two-stage framework: zero-shot vision-language models first detect urban flooding in unlabeled street images, and a Bayesian spatial model then quantifies uncertainty, smooths across space, and incorporates external flood-risk covariates. The approach is validated on over 1 million dashcam images from NYC storms, supplemented with external data such as 311 reports, FloodNet sensors, stormwater maps, a digital elevation model (DEM), and American Community Survey (ACS) demographics. Results show strong zero-shot flood signals, improved out-of-sample predictions, and robust performance with few ground-truth annotations; inferred flood risk correlates with known risk markers and reveals biases in public reporting. Applied to New York City, BayFlood identifies flood-prone areas overlooked by existing methods and suggests sensor-placement strategies, illustrating its practical value for urban resilience and policy. The work demonstrates a general paradigm for combining foundation-model annotations with Bayesian inference to obtain calibrated uncertainty and actionable insights without large labeled datasets.
Abstract
Street scene datasets, collected from Street View or dashboard cameras, offer a promising means of detecting urban objects and incidents like street flooding. However, a major challenge in using these datasets is their lack of reliable labels: there are myriad types of incidents, many types occur rarely, and ground-truth measures of where incidents occur are lacking. Here, we propose BayFlood, a two-stage approach that circumvents this difficulty. First, we perform zero-shot classification of where incidents occur using a pretrained vision-language model (VLM). Second, we fit a spatial Bayesian model on the VLM classifications. The zero-shot approach avoids the need to annotate large training sets, and the Bayesian model provides frequent desiderata in urban settings: principled measures of uncertainty, smoothing across locations, and incorporation of external data like stormwater accumulation zones. We comprehensively validate this two-stage approach, showing that VLMs provide strong zero-shot signal for floods across multiple cities and time periods, that the Bayesian model improves out-of-sample prediction relative to baseline methods, and that our inferred flood risk correlates with known external predictors of risk. Having validated our approach, we show it can be used to improve urban flood detection: our analysis reveals 113,738 people who are at high risk of flooding but overlooked by current methods, identifies demographic biases in existing methods, and suggests locations for new flood sensors. More broadly, our results showcase how Bayesian modeling of zero-shot LM annotations represents a promising paradigm: it avoids the need to collect large labeled datasets and leverages the power of foundation models while providing the expressiveness and uncertainty quantification of Bayesian models.
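To make the second stage concrete, the following is a minimal sketch of how noisy zero-shot classifications in a single area could be turned into a posterior over that area's true flood rate. It is not the BayFlood model itself: it omits the spatial smoothing and external covariates described above, and the function name `posterior_flood_rate`, the uniform prior, and the classifier true-positive and false-positive rates (`tpr`, `fpr`) are all illustrative assumptions rather than values from the paper.

```python
import math


def posterior_flood_rate(n_images, n_flagged, tpr=0.8, fpr=0.01, grid_size=1001):
    """Grid posterior over the true flood rate p in one area.

    Assumes (hypothetically) a uniform prior on p and a classifier that
    flags a flooded image with probability tpr and a non-flooded image
    with probability fpr. Returns (posterior mean, 2.5% and 97.5% quantiles).
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for p in grid:
        # Probability that a randomly drawn image is flagged by the classifier.
        q = p * tpr + (1 - p) * fpr
        log_post.append(
            n_flagged * math.log(q) + (n_images - n_flagged) * math.log(1 - q)
        )

    # Normalize in a numerically stable way.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    probs = [w / total for w in weights]

    mean = sum(p * w for p, w in zip(grid, probs))

    # Read the central 95% credible interval off the cumulative distribution.
    lower = upper = None
    cdf = 0.0
    for p, w in zip(grid, probs):
        cdf += w
        if lower is None and cdf >= 0.025:
            lower = p
        if upper is None and cdf >= 0.975:
            upper = p
    if upper is None:  # guard against floating-point shortfall in the sum
        upper = grid[-1]
    return mean, lower, upper


if __name__ == "__main__":
    # Two hypothetical areas with the same 5% flag rate but different coverage.
    for n, k in [(20, 1), (1000, 50)]:
        mean, lower, upper = posterior_flood_rate(n, k)
        print(f"n={n:5d} flagged={k:3d} mean={mean:.3f} 95% CI=({lower:.3f}, {upper:.3f})")
```

Both hypothetical areas have the same share of flagged images, but the sparsely imaged one yields a much wider credible interval. This is the kind of uncertainty that a raw flag rate hides, and that a spatial model could reduce further by borrowing strength from neighboring areas.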
