WeatherDiffusion: Controllable Weather Editing in Intrinsic Space

Yixin Zhu, Zuoliang Zhu, Jian Yang, Miloš Hašan, Jin Xie, Beibei Wang

TL;DR

WeatherDiffusion tackles robust, controllable weather editing in outdoor driving scenes by operating in intrinsic space. It jointly learns an inverse renderer to infer weather-invariant maps (albedo, normals, etc.) and a forward renderer that re-synthesizes weather-conditioned images guided by text prompts, enhanced by intrinsic-map-aware attention and CLIP-space interpolation. The authors introduce WeatherSynthetic and WeatherReal datasets with intrinsic maps to support learning and evaluation, and demonstrate advantages over pixel-space editing, weather restoration, and rendering-based methods, including measurable gains in downstream perception tasks. This intrinsic-space approach enables fine-grained, physically grounded control of weather effects while preserving scene geometry and material integrity, with potential impact on autonomous driving robustness under adverse conditions.

Abstract

We present WeatherDiffusion, a diffusion-based framework for controllable weather editing in intrinsic space. Our framework includes two components based on diffusion priors: an inverse renderer that estimates material properties, scene geometry, and lighting as intrinsic maps from an input image, and a forward renderer that utilizes these geometry and material maps along with a text prompt that describes specific weather conditions to generate a final image. The intrinsic maps enhance controllability compared to traditional pixel-space editing approaches. We propose an intrinsic map-aware attention mechanism that improves spatial correspondence and decomposition quality in large outdoor scenes. For forward rendering, we leverage CLIP-space interpolation of weather prompts to achieve fine-grained weather control. We also introduce a synthetic and a real-world dataset, containing 38k and 18k images under various weather conditions, each with intrinsic map annotations. WeatherDiffusion outperforms state-of-the-art pixel-space editing approaches, weather restoration methods, and rendering-based methods, showing promise for downstream tasks such as autonomous driving, enhancing the robustness of detection and segmentation in challenging weather scenarios.
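The CLIP-space interpolation of weather prompts mentioned above can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, so plain linear blending between two prompt embeddings is an assumption here. The function name `interpolate_embeddings` and the four-dimensional toy vectors are hypothetical stand-ins for real CLIP text-encoder outputs.

```python
from typing import List, Sequence


def interpolate_embeddings(
    e_src: Sequence[float], e_tgt: Sequence[float], alpha: float
) -> List[float]:
    """Blend two prompt embeddings: (1 - alpha) * e_src + alpha * e_tgt.

    Assumption: linear interpolation. The paper may use a different scheme.
    """
    if len(e_src) != len(e_tgt):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(e_src, e_tgt)]


# Toy stand-ins for the embeddings of a "sunny" and a "rainy" prompt.
sunny = [1.0, 0.0, 0.5, -0.25]
rainy = [0.0, 1.0, -0.5, 0.75]

# Sweep alpha over 0, 0.25, 0.5, 0.75, 1 to get transitional conditions.
sweep = [interpolate_embeddings(sunny, rainy, step / 4) for step in range(5)]
```

In this sketch, each entry of `sweep` would be passed to the forward renderer in place of a single prompt embedding, so that intermediate values of alpha correspond to intermediate weather intensity.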

Paper Structure

This paper contains 22 sections, 5 equations, 12 figures, 3 tables.

Figures (12)

  • Figure 1: We introduce WeatherDiffusion, a framework for controllable weather editing in intrinsic space. Our framework includes two components, an inverse renderer and a forward renderer. The inverse renderer decomposes an input image into intrinsic maps, including weather-invariant material maps (albedo, roughness, metallicity), a normal map, and an irradiance map that captures illumination and weather effects. The forward renderer then combines these maps with a prompt specifying the target weather to synthesize a new image. By disentangling materials, geometry, and illumination, WeatherDiffusion enables realistic and controllable weather manipulation.
  • Figure 2: Overview of WeatherDiffusion. We propose a diffusion-based framework for controllable weather editing for autonomous driving in intrinsic space. The weather-aware inverse renderer decomposes images into weather-invariant and weather-variant maps, while the weather-conditioned forward renderer re-renders images based on given decomposed maps and text prompts that specify the target condition. For the inverse renderer, we design intrinsic map-aware attention to help the inverse renderer focus on important regions corresponding to target intrinsic maps, where the learned map embeddings filter patch tokens via a gating mechanism. For the forward renderer, we design an alpha interpolation in the CLIP semantic space to achieve fine-grained weather control, leveraging the prior in the original Stable Diffusion. By sampling different alpha values, the forward renderer can render natural transitional weather conditions.
  • Figure 3: Attention guidance helps recover distant small objects and fine geometry details.
  • Figure 4: IMAA visualization. Normal estimation primarily concerns the geometry details, especially in regions with sharp variations in surface normals. Metallicity predictions need to selectively attend to metallic objects such as vehicles, poles, and railings. IMAA provides attention guidance for the diffusion model, ensuring spatial correspondence between input and maps.
  • Figure 5: Example of our WeatherSynthetic (the first row) and WeatherReal (the second row).
  • ...and 7 more figures
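
The Figure 2 caption says that learned map embeddings filter patch tokens via a gating mechanism, without giving the formula. The sketch below shows one plausible form, assuming a sigmoid gate on the scaled dot product between each patch token and the map embedding. The function `gate_patch_tokens`, the scaling by the square root of the dimension, and the toy values are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_patch_tokens(
    tokens: Sequence[Sequence[float]], map_embedding: Sequence[float]
) -> Tuple[List[List[float]], List[float]]:
    """Scale each patch token by a gate in (0, 1).

    Assumed gate: sigmoid(<token, map_embedding> / sqrt(d)).
    Returns the gated tokens and the per-token gate values.
    """
    d = len(map_embedding)
    gated: List[List[float]] = []
    gates: List[float] = []
    for tok in tokens:
        if len(tok) != d:
            raise ValueError("token and map embedding dimensions differ")
        score = sum(t * m for t, m in zip(tok, map_embedding)) / math.sqrt(d)
        g = sigmoid(score)
        gates.append(g)
        gated.append([g * t for t in tok])
    return gated, gates


# Three toy patch tokens: aligned with, orthogonal to, and opposed to
# a toy embedding for one intrinsic map (for example, metallicity).
tokens = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]]
map_embedding = [1.0, 0.0]
gated, gates = gate_patch_tokens(tokens, map_embedding)
```

Under these assumptions, tokens aligned with the map embedding keep most of their magnitude while opposed tokens are suppressed, which matches the caption's description of focusing on regions relevant to the target intrinsic map.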