VLM-Augmented Degradation Modeling for Image Restoration Under Adverse Weather Conditions
Qianyi Shao, Yuanfan Zhang, Renxiang Xiao, Liang Hu
TL;DR
The paper tackles reliable image restoration under diverse adverse weather by introducing MVLR, a compact encoder-decoder framework augmented with a Visual-Language Model (VLM) that generates degradation priors and an Implicit Memory Bank (IMB) that stores degradation prototypes. The VLM prior guides a transformer-based encoder through cross-attention, and it also queries the IMB, which retrieves the most relevant prototypes by cosine similarity; the retrieved prototypes are then fused with the visual features to refine them before decoding. On four severe-weather benchmarks, MVLR outperforms single-branch and Mixture-of-Experts baselines in PSNR and SSIM while remaining efficient enough for real-time deployment. These results suggest MVLR's practical value for robust outdoor perception in autonomous systems and robotics.
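The prototype retrieval step can be illustrated with a minimal sketch. Only the use of cosine similarity comes from the summary above; the top-k selection, the softmax weighting with a temperature, and all names (`retrieve_prototypes`, `memory_bank`, `top_k`, `temperature`) are assumptions made for illustration and are not claimed to match the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_a * norm_b)


def retrieve_prototypes(query, memory_bank, top_k=2, temperature=0.1):
    """Return indices, weights, and a weighted blend of the top-k prototypes.

    query       -- degradation prior vector (stands in for the VLM output)
    memory_bank -- list of prototype vectors (stands in for the IMB)
    top_k, temperature -- hypothetical hyperparameters, not from the paper
    """
    sims = [cosine_similarity(query, proto) for proto in memory_bank]
    # Indices of the k prototypes most similar to the query.
    top = sorted(range(len(memory_bank)), key=lambda i: sims[i], reverse=True)[:top_k]
    # Softmax over the selected similarities (shifted by the max for stability).
    best = max(sims[i] for i in top)
    exps = [math.exp((sims[i] - best) / temperature) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the retrieved prototypes, one value per dimension.
    fused = [
        sum(w * memory_bank[i][d] for w, i in zip(weights, top))
        for d in range(len(query))
    ]
    return top, weights, fused


if __name__ == "__main__":
    bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
    indices, weights, fused = retrieve_prototypes([1.0, 0.0, 0.0], bank)
    print(indices, weights, fused)
```

In a real model the query and prototypes would be learned tensors and the blend would feed a cross-attention layer; plain Python lists are used here only to keep the sketch dependency-free.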
Abstract
Reliable visual perception under adverse weather conditions, such as rain, haze, snow, or a mixture of them, is desirable yet challenging for autonomous driving and outdoor robots. In this paper, we propose a unified Memory-Enhanced Visual-Language Recovery (MVLR) model that restores images across different degradation levels and weather conditions. MVLR couples a lightweight encoder-decoder backbone with a Visual-Language Model (VLM) and an Implicit Memory Bank (IMB). The VLM performs chain-of-thought inference to encode weather degradation priors, and the IMB stores continuous latent representations of degradation patterns. The VLM-generated priors query the IMB to retrieve fine-grained degradation prototypes. These prototypes are then adaptively fused with multi-scale visual features via dynamic cross-attention mechanisms, enhancing restoration accuracy while maintaining computational efficiency. Extensive experiments on four severe-weather benchmarks show that MVLR surpasses single-branch and Mixture-of-Experts baselines in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM). These results indicate that MVLR offers a practical balance between model compactness and expressiveness for real-time deployment in diverse outdoor conditions.
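For readers unfamiliar with the first reported metric, the following sketch computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE). The function name `psnr`, the flat-list image representation, and the default peak value of 255 (8-bit images) are assumptions for illustration; the abstract does not specify the evaluation code or value range used.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two flattened images.

    reference, restored -- equal-length sequences of pixel intensities
    max_value           -- peak intensity (255 assumes 8-bit images)
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [100.0, 100.0, 100.0, 100.0]
    restored = [110.0, 90.0, 110.0, 90.0]
    print(f"PSNR: {psnr(clean, restored):.2f} dB")
```

Higher PSNR means the restored image is closer to the clean reference in a pixel-wise sense. SSIM complements it by comparing local luminance, contrast, and structure, which is why restoration papers typically report both.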
