VL-UR: Vision-Language-guided Universal Restoration of Images Degraded by Adverse Weather Conditions

Ziyan Liu; Yuxu Lu; Huashan Yu; Dong Yang

TL;DR

This work introduces VL-UR, a universal image restoration framework that combines CLIP-based vision-language priors with a degradation-aware scene classifier to restore images degraded by diverse weather conditions. The system pairs a frozen CLIP-based scene classifier (SC) with a Transformer-based restoration network (SR), using Cross-Transformer Aggregation and a prompt-guided attention mechanism to fuse semantic text and image cues across eleven degradation types. A hybrid loss combining Smooth L1, MS-SSIM, and a CDRL-like feature separation term drives optimization at the pixel, structural, and feature levels. VL-UR achieves state-of-the-art results on the CDD-11 dataset while running efficiently at near real-time speed. The approach enables robust, adaptive restoration for real-world applications such as autonomous driving and surveillance, where degradations are often complex and non-uniform.
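To make the three-part loss concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Several things here are assumptions made for illustration: the loss weights, the use of a single-window SSIM as a simplified stand-in for MS-SSIM, the ratio form of the feature separation term, and the identity function used in place of a learned feature extractor. Inputs are flat lists of pixel values in [0, 1].

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss: quadratic below beta, linear above it."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM computed over one global window.

    Simplified stand-in for MS-SSIM (no Gaussian windows, no scales).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den


def feature_separation(pred_feat, clean_feat, degraded_feat, eps=1e-8):
    """Assumed contrastive-style term: pull toward clean, push from degraded."""
    n = len(pred_feat)
    d_pos = sum(abs(a - b) for a, b in zip(pred_feat, clean_feat)) / n
    d_neg = sum(abs(a - b) for a, b in zip(pred_feat, degraded_feat)) / n
    return d_pos / (d_neg + eps)


def hybrid_loss(pred, clean, degraded, weights=(1.0, 1.0, 0.1)):
    """Weighted sum of pixel, structural, and feature-level terms.

    The weights are hypothetical; features are the raw pixels (identity
    extractor) purely to keep the sketch self-contained.
    """
    w_pix, w_struct, w_feat = weights
    pixel = smooth_l1(pred, clean)
    structural = 1.0 - global_ssim(pred, clean)
    feature = feature_separation(pred, clean, degraded)
    return w_pix * pixel + w_struct * structural + w_feat * feature


clean = [0.2, 0.4, 0.6, 0.8]
degraded = [0.5, 0.6, 0.7, 0.9]
restored = [0.25, 0.4, 0.6, 0.8]
loss_value = hybrid_loss(restored, clean, degraded)
```

In this sketch a perfect restoration drives all three terms to zero, while any residual error contributes through each term at a different level of abstraction.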

Abstract

Image restoration is critical for improving the quality of degraded images, which is vital for applications like autonomous driving, security surveillance, and digital content enhancement. However, existing methods are often tailored to specific degradation scenarios, limiting their adaptability to the diverse and complex challenges in real-world environments. Moreover, real-world degradations are typically non-uniform, highlighting the need for adaptive and intelligent solutions. To address these issues, we propose a novel vision-language-guided universal restoration (VL-UR) framework. VL-UR leverages a zero-shot contrastive language-image pre-training (CLIP) model to enhance image restoration by integrating visual and semantic information. A scene classifier is introduced to adapt CLIP, generating high-quality language embeddings aligned with degraded images while predicting degradation types for complex scenarios. Extensive experiments across eleven diverse degradation settings demonstrate VL-UR's state-of-the-art performance, robustness, and adaptability. This positions VL-UR as a transformative solution for modern image restoration challenges in dynamic, real-world environments.
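The zero-shot degradation prediction mentioned above can be illustrated with a small sketch of CLIP-style scoring: an image embedding is compared against one text embedding per degradation prompt by cosine similarity, and a temperature-scaled softmax turns the similarities into probabilities. This is an assumption-laden illustration rather than the paper's classifier: the embeddings are toy vectors, the labels and the temperature value are hypothetical, and no real CLIP encoder is called.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_scores(image_emb, text_embs, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities.

    text_embs maps a label to the embedding of its text prompt.
    """
    logits = {k: cosine(image_emb, v) / temperature for k, v in text_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - top) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


# Toy embeddings standing in for encoder outputs (hypothetical values).
text_embs = {
    "haze": [1.0, 0.0, 0.0],
    "rain": [0.0, 1.0, 0.0],
    "snow": [0.0, 0.0, 1.0],
}
image_emb = [0.1, 0.9, 0.2]

probs = zero_shot_scores(image_emb, text_embs)
predicted = max(probs, key=probs.get)
```

In a real pipeline the vectors would come from the CLIP image and text encoders, and composite degradations would need either additional prompts or a multi-label formulation.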
