
Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter

Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang

TL;DR

The paper tackles urban waterlogging detection under adverse conditions with limited labeled data by introducing UW-Bench, a challenging, real-world surveillance dataset. It proposes a SAM-guided Large-Small Model co-adapter (LSM-Adapter) that fuses a large-model branch (with a Histogram Equalization Adapter) and a task-specific small-model branch through a Triple-S Prompt Adapter and a Dynamic Prompt Combiner, enabling robust segmentation despite challenging lighting and reflections. Key contributions include the UW-Bench dataset, HE-Adapt, TSP-Adapt (spatial, semantic, style prompts), and a two-stage training strategy that yields state-of-the-art performance over baselines including SAM-Adapter across general and hard samples. The approach demonstrates the practical potential of leveraging vision foundation models for real-world waterlogging detection and provides a scalable pathway for deployment under data scarcity, with insights on prompt design, training strategy, and hyper-parameter effects.

Abstract

Urban waterlogging poses a major risk to public safety and infrastructure. Conventional methods based on water-level sensors require high maintenance and can hardly achieve full coverage. Recent approaches employ surveillance camera imagery and deep learning for detection, yet they struggle with scarce data and adverse environmental conditions. In this paper, we establish a challenging Urban Waterlogging Benchmark (UW-Bench) under diverse adverse conditions to advance real-world applications. We propose a Large-Small Model co-adapter paradigm (LSM-Adapter), which harnesses the substantial generic segmentation potential of a large model and the specific task-directed guidance of a small model. Specifically, a Triple-S Prompt Adapter module and a Dynamic Prompt Combiner are proposed to generate and then merge multiple prompts for mask decoder adaptation. Meanwhile, a Histogram Equalization Adapter module is designed to infuse image-specific information for image encoder adaptation. Results and analysis demonstrate the difficulty of the developed benchmark and the superiority of the proposed algorithm. Project page: https://github.com/zhang-chenxu/LSM-Adapter
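
For readers unfamiliar with the operation that gives the Histogram Equalization Adapter its name, the following is a minimal sketch of classical global histogram equalization on an 8-bit grayscale image. It is not the paper's adapter module, whose internals are not described in this summary; the function name `equalize_histogram` and the grayscale `uint8` input are assumptions made for illustration only.

```python
import numpy as np


def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Classical global histogram equalization (illustrative, not HE-Adapt).

    Assumes `gray` is a uint8 array of any shape holding intensities 0..255.
    """
    if gray.dtype != np.uint8:
        raise ValueError("expected a uint8 image")

    # Count how many pixels take each of the 256 possible intensities.
    hist = np.bincount(gray.ravel(), minlength=256)

    # Cumulative distribution of intensities.
    cdf = hist.cumsum()

    # CDF value at the darkest intensity actually present in the image.
    cdf_min = cdf[cdf > 0][0]
    total = gray.size

    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return gray.copy()

    # Map intensities so that the output CDF is approximately linear.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)

    # Apply the lookup table pixel-wise.
    return lut[gray]
```

Applied to a low-contrast frame, such as a dim night-time surveillance image, this mapping spreads the occupied intensities over the full 0 to 255 range. That suggests why equalized inputs might carry useful image-specific cues under poor lighting, though how the paper feeds them to the image encoder is not stated here.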

Paper Structure (29 sections, 10 equations, 7 figures, 8 tables, 1 algorithm)

Figures (7)

  • Figure 1: Waterlogging detection under general and hard conditions, such as strong-light reflection, low-light conditions, and clear water. The first four rows show general samples and the last four rows show hard samples, illustrating the practical difficulty of the task.
  • Figure 2: The proposed Large-Small Model Co-adapter paradigm, which includes a histogram equalization adapter, a triple-S prompt adapter, and a dynamic prompt combiner. All components except the image encoder of SAM are trained for prompt generation, learning, and adaptation, targeting waterlogging detection under adverse conditions.
  • Figure 3: Details of the proposed histogram equalization adapter and the prototype-learning-based semantic prompter.
  • Figure 4: One-stage and Two-stage training strategies of the proposed large-small model paradigm for collaborative optimization.
  • Figure 5: Training and testing examples in the developed UW-Bench. To objectively evaluate the model's capability in real-world applications, we consider both general-sample and hard-sample cases in the test set.
  • ...and 2 more figures