Plug-and-Play Tri-Branch Invertible Block for Image Rescaling
Jingwei Bao, Jinhua Hao, Pengcheng Xu, Ming Sun, Chao Zhou, Shuyuan Zhu
TL;DR
This work tackles image rescaling by learning a bijective mapping between HR and LR images with invertible neural networks. It introduces a plug-and-play tri-branch invertible block (T-InvBlock) that splits the low-frequency content, expressed in YCbCr, into a luminance branch $x_y$ and a chrominance branch $x_c$, and processes them alongside a dedicated high-frequency branch $x_h$, thereby reducing inter-channel redundancy. During training, the method adopts an all-zero high-frequency mapping ($z=0$) for upscaling, which shifts the burden of detail reconstruction to the LR content and enables robust HR restoration. The T-InvBlock is integrated into IRN and SAIN to form T-IRN and T-SAIN, respectively. Experiments across general rescaling and lossy-compression scenarios show state-of-the-art or competitive results with minimal architectural changes.
Abstract
High-resolution (HR) images are commonly downscaled to low-resolution (LR) to reduce bandwidth and later upscaled to restore their original details. Recent image rescaling algorithms employ invertible neural networks (INNs) to create a unified framework for downscaling and upscaling, ensuring a one-to-one mapping between LR and HR images. Existing methods, built on vanilla dual-branch invertible blocks, process high-frequency and low-frequency information separately and often rely on specific distributions to model the high-frequency components. However, processing the low-frequency component directly in the RGB domain introduces channel redundancy, limiting the efficiency of image reconstruction. To address these challenges, we propose a plug-and-play tri-branch invertible block (T-InvBlock) that decomposes the low-frequency branch into luminance (Y) and chrominance (CbCr) components, reducing redundancy and enhancing feature processing. Additionally, we adopt an all-zero mapping strategy for high-frequency components during upscaling, concentrating the essential rescaling information within the LR image. T-InvBlocks can be seamlessly integrated into existing rescaling models, improving performance in both general rescaling tasks and scenarios involving lossy compression. Extensive experiments confirm that our method advances the state of the art in HR image reconstruction.
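
To make the three-way decomposition concrete, the following is a minimal NumPy sketch of how an image could be split into a luminance branch, a chrominance branch, and a high-frequency branch, and then merged back exactly. It is an illustration under stated assumptions, not the authors' implementation: the 2x2 Haar split, the offset-free BT.601 YCbCr matrix, and the helper names `haar_split`, `haar_merge`, `tri_branch_split`, and `tri_branch_merge` are all hypothetical choices made here. The learned invertible layers that would operate on the three branches are omitted.

```python
import numpy as np

# Assumption: BT.601 full-range RGB -> YCbCr matrix, with the usual offsets
# dropped so that the colour transform is linear and exactly invertible.
RGB2YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])
YCBCR2RGB = np.linalg.inv(RGB2YCBCR)


def haar_split(img):
    """2x2 Haar transform: (H, W, C) -> low (H/2, W/2, C), high (H/2, W/2, 3C)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 4.0
    h1 = (a - b + c - d) / 4.0
    h2 = (a + b - c - d) / 4.0
    h3 = (a - b - c + d) / 4.0
    return low, np.concatenate([h1, h2, h3], axis=-1)


def haar_merge(low, high):
    """Exact inverse of haar_split."""
    h1, h2, h3 = np.split(high, 3, axis=-1)
    h, w, ch = low.shape
    img = np.empty((2 * h, 2 * w, ch), dtype=low.dtype)
    img[0::2, 0::2] = low + h1 + h2 + h3
    img[0::2, 1::2] = low - h1 + h2 - h3
    img[1::2, 0::2] = low + h1 - h2 - h3
    img[1::2, 1::2] = low - h1 - h2 + h3
    return img


def tri_branch_split(img_rgb):
    """Split an RGB image into luminance x_y, chrominance x_c, high-frequency x_h."""
    low_rgb, x_h = haar_split(np.asarray(img_rgb, dtype=np.float64))
    low_ycbcr = low_rgb @ RGB2YCBCR.T
    return low_ycbcr[..., :1], low_ycbcr[..., 1:], x_h


def tri_branch_merge(x_y, x_c, x_h):
    """Inverse of tri_branch_split."""
    low_rgb = np.concatenate([x_y, x_c], axis=-1) @ YCBCR2RGB.T
    return haar_merge(low_rgb, x_h)


rng = np.random.default_rng(0)
hr = rng.random((8, 8, 3))  # stand-in for an HR image with values in [0, 1)

x_y, x_c, x_h = tri_branch_split(hr)
print(x_y.shape, x_c.shape, x_h.shape)  # (4, 4, 1) (4, 4, 2) (4, 4, 9)

# The split is a bijection: merging the three branches restores the input.
restored = tri_branch_merge(x_y, x_c, x_h)
print(np.allclose(restored, hr))  # True

# Zeroing the high-frequency branch only mimics the idea of z = 0.
zero_restored = tri_branch_merge(x_y, x_c, np.zeros_like(x_h))
```

In this untrained sketch, replacing `x_h` with zeros simply yields a block-averaged image, since nothing has been learned. The all-zero mapping described above applies to a trained network, where the LR image is optimized to carry the information needed for detail reconstruction.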
