HarmoQ: Harmonized Post-Training Quantization for High-Fidelity Image
Hongjun Wang, Jiyuan Chen, Xuan Song, Yinqiang Zheng
TL;DR
HarmoQ targets efficient, high-fidelity image super-resolution (SR) under post-training quantization. It starts from an observed asymmetry: weight quantization mainly degrades structural similarity, while activation quantization mainly harms pixel-level accuracy. To mitigate these coupled errors jointly, it introduces a unified three-step framework: Structural Residual Calibration, Harmonized Scale Optimization, and Adaptive Boundary Refinement. The approach provides closed-form solutions for weight calibration and scale, plus gradient-based boundary refinement. It reports substantial PSNR/SSIM gains at 2-bit and 3-bit quantization (0.46 dB over prior art on Set5 at 2-bit), along with a 3.2x speedup and 4x memory reduction on A100 GPUs. By systematically analyzing weight-activation coupling and proposing a principled optimization flow, HarmoQ aims to enable robust, efficient deployment of high-quality SR models on resource-constrained devices.
Abstract
Post-training quantization offers an efficient pathway to deploy super-resolution models, yet existing methods treat weight and activation quantization independently, missing their critical interplay. Through controlled experiments on SwinIR, we uncover a striking asymmetry: weight quantization primarily degrades structural similarity, while activation quantization disproportionately affects pixel-level accuracy. This stems from their distinct roles: weights encode learned restoration priors for textures and edges, whereas activations carry input-specific intensity information. Building on this insight, we propose HarmoQ, a unified framework that harmonizes quantization across components through three synergistic steps: structural residual calibration proactively adjusts weights to compensate for activation-induced detail loss, harmonized scale optimization analytically balances quantization difficulty via closed-form solutions, and adaptive boundary refinement iteratively maintains this balance during optimization. Experiments show HarmoQ achieves substantial gains under aggressive compression, outperforming prior art by 0.46 dB on Set5 at 2-bit while delivering 3.2x speedup and 4x memory reduction on A100 GPUs. This work provides the first systematic analysis of weight-activation coupling in super-resolution quantization and establishes a principled solution for efficient high-quality image restoration.
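
To make the weight-versus-activation distinction concrete, the following is a minimal sketch of low-bit "fake" quantization applied separately to the weights and the input of a toy linear layer. It is not HarmoQ and does not reproduce any of its three steps: the symmetric max-based quantizer, the random toy layer, and all function names (`fake_quantize`, `linear`, `psnr`, `output_errors`) are assumptions chosen for illustration only.

```python
import math
import random


def fake_quantize(values, bits):
    """Symmetric uniform quantize-then-dequantize (illustrative, not HarmoQ)."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1              # 1 for 2-bit, 3 for 3-bit
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax                  # step size from the largest magnitude
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def linear(weights, x):
    """y = W x, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with the given peak value."""
    err = mse(reference, estimate)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def output_errors(bits, dim=16, seed=0):
    """Output MSE of a toy layer when weights, input, or both are quantized."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    x = [rng.random() for _ in range(dim)]                  # toy "activations"
    w_q = [fake_quantize(row, bits) for row in weights]     # one scale per row
    x_q = fake_quantize(x, bits)                            # one scale per tensor
    y = linear(weights, x)                                  # full-precision output
    return {
        "weight_only": mse(y, linear(w_q, x)),
        "activation_only": mse(y, linear(weights, x_q)),
        "both": mse(y, linear(w_q, x_q)),
    }
```

Calling `output_errors(2)` and `output_errors(3)` returns the three error terms at each bit-width for this toy layer. A single random linear layer says nothing about SSIM or about SwinIR. The sketch only shows how the two error sources can be isolated and measured independently, which is the kind of controlled comparison the abstract describes.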
