WeatherDG: LLM-assisted Diffusion Model for Procedural Weather Generation in Domain-Generalized Semantic Segmentation
Chenghao Qian, Yuhu Guo, Yuhong Mo, Wenjing Li
TL;DR
WeatherDG tackles domain generalization for semantic segmentation under adverse weather by combining a fine-tuned Stable Diffusion (SD) model with a chain of LLM agents to generate realistic, weather-diverse driving scenes. The approach has three parts: SD fine-tuning to inject driving-scene priors; procedural prompt generation in three steps (instance sampling, scene composition, and scene description); and UDA-based training that exploits the synthetic data. Key contributions are the alignment of SD with driving-scene priors, a three-agent prompt-generation framework with a balanced sampling strategy for tail classes, and demonstrated improvements when generalizing from Cityscapes to ACDC, BDD100K, and DarkZurich. The framework is agnostic to the segmentation model; in the Cityscapes to ACDC setting it improves the HRDA baseline by 13.9% in mIoU, which points to its practical value for making autonomous driving perception more robust in varied weather conditions.
Abstract
In this work, we propose a novel approach, namely WeatherDG, that can generate realistic, weather-diverse, driving-scene images based on the cooperation of two foundation models, i.e., Stable Diffusion (SD) and a Large Language Model (LLM). Specifically, we first fine-tune SD with source data, aligning the content and layout of generated samples with real-world driving scenarios. Then, we propose a procedural prompt generation method based on the LLM, which enriches scenario descriptions and helps SD automatically generate more diverse, detailed images. In addition, we introduce a balanced generation strategy, which encourages SD to generate high-quality objects of tail classes, such as riders and motorcycles, under various weather conditions. This segmentation-model-agnostic method can improve the generalization ability of existing models by additionally adapting them with the generated synthetic data. Experiments on three challenging datasets show that our method can significantly improve the segmentation performance of different state-of-the-art models on target domains. Notably, in the "Cityscapes to ACDC" setting, our method improves the baseline HRDA by 13.9% in mIoU.
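To make the balanced generation idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that balancing means sampling object classes with probability inversely proportional to their frequency in the source data, and it replaces the LLM agents with a fixed string template. The class counts, the weather list, and all function names are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical per-class instance counts (illustrative numbers, not from the paper).
CLASS_COUNTS = {"car": 1000, "person": 500, "rider": 50, "motorcycle": 25}
# Hypothetical set of weather conditions to vary across prompts.
WEATHERS = ["rain", "snow", "fog", "night"]


def balanced_weights(counts):
    """Return inverse-frequency sampling weights that sum to 1."""
    inverse = {name: 1.0 / n for name, n in counts.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


def sample_instances(counts, k, rng):
    """Draw k class names, favoring rare (tail) classes."""
    weights = balanced_weights(counts)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


def compose_prompt(instances, weather):
    """Template stand-in for the LLM-based scene composition and description."""
    objects = ", ".join(sorted(set(instances)))
    return f"a driving scene in {weather}, featuring {objects}"


def generate_prompts(n, k=2, seed=0):
    """Produce n text prompts that could be passed to a text-to-image model."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(n):
        instances = sample_instances(CLASS_COUNTS, k, rng)
        prompts.append(compose_prompt(instances, rng.choice(WEATHERS)))
    return prompts


if __name__ == "__main__":
    for prompt in generate_prompts(3):
        print(prompt)
```

With these assumed counts, "motorcycle" receives the largest sampling weight and "car" the smallest, so tail classes appear in prompts far more often than they do in the source data. In the method described above, an LLM rather than a template would expand each sampled instance into a richer scene description.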
