Plug-and-Play Versatile Compressed Video Enhancement
Huimin Zeng, Jiacheng Li, Zhiwei Xiong
TL;DR
This work addresses visual quality degradation in compressed videos across varied compression levels while supporting multiple downstream vision tasks. It introduces a codec-aware enhancement framework with two networks: a Compression-Aware Adaptation (CAA) network that hierarchically adapts the enhancement parameters conditioned on $CRF_s$ and $CRF_i$, and a Bitstream-Aware Enhancement (BAE) network that uses motion vectors and partition maps from the bitstream for motion alignment and region-aware refinement. The method reports superior quality enhancement (e.g., PSNR gains of up to ~1.2 dB at $CRF_s=15$) and versatility across video super-resolution, optical flow estimation, video object segmentation, and inpainting, while remaining computationally competitive. Because it reuses information the codec already produces, the approach works as a plug-and-play module in real-world pipelines and handles different compression settings without a dedicated model for each one. This makes it relevant to real-time cloud-based analytics and other downstream vision systems that operate on compressed streams.
Abstract
As a widely adopted technique in data transmission, video compression effectively reduces file size, making real-time cloud computing possible. However, it comes at the cost of visual quality, posing challenges to the robustness of downstream vision models. In this work, we present a versatile codec-aware enhancement framework that reuses codec information to adaptively enhance videos under different compression settings, assisting various downstream vision tasks without introducing a computational bottleneck. Specifically, the proposed codec-aware framework consists of a compression-aware adaptation (CAA) network that employs a hierarchical adaptation mechanism to estimate the parameters of the frame-wise enhancement network, namely the bitstream-aware enhancement (BAE) network. The BAE network further leverages temporal and spatial priors embedded in the bitstream to effectively improve the quality of compressed input frames. Extensive experimental results demonstrate the superior quality enhancement performance of our framework over existing enhancement methods, as well as its versatility in assisting multiple downstream tasks on compressed videos as a plug-and-play module. Code and models are available at https://huimin-zeng.github.io/PnP-VCVE/.
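To make the idea of reusing temporal priors from the bitstream concrete, the following is a minimal sketch of block-wise alignment with decoded motion vectors. It is not the BAE network itself: the function name `align_with_motion_vectors`, the fixed block size, the integer-pixel vectors, and the single reference frame are all simplifying assumptions made for illustration. Real codecs use variable partition sizes and sub-pixel motion, and a learned model would more likely align features than raw pixels.

```python
from typing import List, Tuple

Frame = List[List[float]]                      # H x W grid of pixel values
MotionField = List[List[Tuple[int, int]]]      # one (dy, dx) vector per block


def align_with_motion_vectors(prev_frame: Frame,
                              motion_vectors: MotionField,
                              block_size: int) -> Frame:
    """Warp prev_frame toward the current frame, one motion vector per block.

    Hypothetical helper: assumes fixed-size blocks and integer-pixel vectors.
    """
    height, width = len(prev_frame), len(prev_frame[0])
    aligned = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Look up the vector of the block that contains this pixel.
            dy, dx = motion_vectors[y // block_size][x // block_size]
            # Clamp so that vectors pointing outside the frame reuse
            # the nearest border pixel.
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            aligned[y][x] = prev_frame[src_y][src_x]
    return aligned


# Toy 4x4 frame whose pixel value encodes its position (10 * row + col).
prev = [[float(10 * r + c) for c in range(4)] for r in range(4)]
# One vector per 2x2 block: the top-left block references content one pixel
# to the right; the other three blocks are static.
mvs = [[(0, 1), (0, 0)],
       [(0, 0), (0, 0)]]
aligned = align_with_motion_vectors(prev, mvs, block_size=2)
```

Because the vectors are read from the bitstream rather than estimated, this kind of alignment needs no separate optical flow computation, which is consistent with the stated goal of avoiding a computational bottleneck.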
