UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance

Shuning Sun, Yu Zhang, Chen Wu, Dianjie Lu, Dianjie Lu, Guijuan Zhan, Yang Weng, Zhuoran Zheng

TL;DR

UniFlowRestore is a unified video restoration framework that handles diverse degradations by treating restoration as a time-continuous evolution under a prompt-guided vector field. It combines a physics-aware backbone, PhysicsUNet, which encodes degradation priors as potential energy, with a learnable PromptGenerator that supplies momentum. Together they form a Hamiltonian system that is integrated with a fixed-step ODE solver for efficiency and stability. The approach generalizes across four tasks (dehazing, deraining, denoising, deblurring) and achieves state-of-the-art denoising results (PSNR $33.89$ dB, SSIM $0.97$) while remaining competitive on the other tasks. The framework is demonstrated on a large all-in-one dataset and validated through ablations, and it offers a scalable, interpretable, and resource-efficient path toward universal video restoration in real-world video pipelines.

Abstract

Video imaging is often affected by complex degradations such as blur, noise, and compression artifacts. Traditional restoration methods follow a "single-task single-model" paradigm, resulting in poor generalization and high computational cost, which limits their applicability in real-world scenarios with diverse degradation types. We propose UniFlowRestore, a general video restoration framework that models restoration as a time-continuous evolution under a prompt-guided and physics-informed vector field. A physics-aware backbone PhysicsUNet encodes degradation priors as potential energy, while PromptGenerator produces task-relevant prompts as momentum. These components define a Hamiltonian system whose vector field integrates inertial dynamics, decaying physical gradients, and prompt-based guidance. The system is optimized via a fixed-step ODE solver to achieve efficient and unified restoration across tasks. Experiments show that UniFlowRestore delivers state-of-the-art performance with strong generalization and efficiency: it attains the highest PSNR (33.89 dB) and SSIM (0.97) on the video denoising task, while maintaining top or second-best scores across all evaluated tasks.
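
To make the idea of a fixed-step ODE evolution concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy quadratic potential in place of PhysicsUNet, plain lists of floats in place of video frames, and a hypothetical vector field that sums an inertial (momentum) term, a physical gradient with exponential time decay, and a prompt term. The coefficients `alpha`, `beta`, `gamma`, `delta` and all function names are illustrative assumptions.

```python
import math


def grad_potential(x, clean_estimate):
    """Gradient of a toy potential U(x) = 0.5 * sum((x - c)^2).

    Assumption: a stand-in for the degradation prior that the paper
    obtains from PhysicsUNet.
    """
    return [xi - ci for xi, ci in zip(x, clean_estimate)]


def vector_field(t, x, momentum, prompt, clean_estimate,
                 alpha=0.1, beta=1.0, gamma=0.5, delta=0.1):
    """Hypothetical field: inertia - decaying gradient + prompt guidance."""
    grad_u = grad_potential(x, clean_estimate)
    decay = math.exp(-gamma * t)  # the physical gradient fades as t grows
    return [alpha * p - beta * decay * g + delta * z
            for p, g, z in zip(momentum, grad_u, prompt)]


def euler_restore(x0, momentum, prompt, clean_estimate,
                  num_steps=10, t_end=1.0):
    """Integrate dx/dt = f(t, x) with a fixed-step explicit Euler scheme."""
    h = t_end / num_steps  # constant step size
    x = list(x0)
    for k in range(num_steps):
        t = k * h
        v = vector_field(t, x, momentum, prompt, clean_estimate)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


# Toy usage: a 3-value "frame" drifting toward a clean estimate.
degraded = [0.9, 0.1, 0.5]
clean_estimate = [0.5, 0.5, 0.5]
zeros = [0.0, 0.0, 0.0]  # no momentum or prompt in this toy run
restored = euler_restore(degraded, zeros, zeros, clean_estimate)
print(restored)
```

A fixed step count keeps the cost per frame constant and predictable, which is one plausible reading of the efficiency claim; an adaptive solver would trade that predictability for error control.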

Paper Structure

This paper contains 15 sections, 9 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Comparison of different video restoration paradigms. (a) Single-task models train a dedicated network for each degradation type, resulting in high maintenance cost and poor generalization. (b) PromptIR introduces task labels and prompt tuning to enable multi-task restoration, but relies heavily on task annotations and lacks temporal modeling. (c) Our method formulates video restoration as a continuous evolution under a prompt-guided vector field, enabling unified and label-free restoration across multiple degradations with improved temporal coherence.
  • Figure 2: Overview of the UniFlowRestore framework. Given a degraded frame $X_i$, the PromptGenerator extracts a task-aware prompt $Z$ to modulate the PhysicsUNet backbone. The network predicts a clean intermediate frame $\tilde{X}_i$ and constructs the Hamiltonian energy with potential $U(X)$ and momentum $P$. These are injected into a time-continuous vector field $f(X_i)$, which governs the image evolution through fixed-step ODE integration. By sampling along the flow $f(t, x(t); \theta)$, the degraded input is progressively restored toward a clean state.
  • Figure 3: Visual comparison of different models on the all-in-one task. Our approach reaches state-of-the-art performance on most tasks.
  • Figure 4: Qualitative results on real-world video frames with complex degradations. Top: degraded inputs. Bottom: restored outputs by UniFlowRestore. All methods were evaluated on these real-world frames using the no-reference NIQE metric (lower is better). Our method averages 4.3, while the other methods average between 4.6 and 5.1.
  • Figure 5: t-SNE visualization of prompt embeddings across four restoration tasks. Each point corresponds to a prompt vector extracted from one video frame. The clusters indicate that task semantics are effectively encoded by the PromptGenerator.
  • ...and 1 more figure