Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models
Jiwan Seo, Joonhyuk Kang
TL;DR
RAQ addresses the rigidity of fixed-rate VQ-based generative models by enabling dynamic, multi-rate codebooks without retraining. It introduces a Seq2Seq-based rate adaptation module that autoregressively generates adapted codebooks of size $\tilde{K}$ from an original codebook of size $K$, with cross-forcing to stabilize training. A competitive model-based alternative using differentiable $k$-means (DKM) and inverse functional DKM (IKM) offers a parameter-free fallback for rate adjustment. Experiments across CIFAR10, CelebA, and ImageNet show that a single RAQ-enabled model matches or surpasses fixed-rate baselines across rates, with favorable trade-offs in reconstruction quality and perceptual metrics. The work broadens the applicability of VQ-based models to real-time, bandwidth-varying scenarios by reducing the need for multiple, separately trained models.
Abstract
Learning discrete representations with vector quantization (VQ) has emerged as a powerful approach in various generative models. However, most VQ-based models rely on a single, fixed-rate codebook, requiring extensive retraining for new bitrates or efficiency requirements. We introduce Rate-Adaptive Quantization (RAQ), a multi-rate codebook adaptation framework for VQ-based generative models. RAQ applies a data-driven approach to generate variable-rate codebooks from a single baseline VQ model, enabling flexible tradeoffs between compression and reconstruction fidelity. Additionally, we provide a simple clustering-based procedure for pre-trained VQ models, offering an alternative when retraining is infeasible. Our experiments show that RAQ performs effectively across multiple rates, often outperforming conventional fixed-rate VQ baselines. By enabling a single system to seamlessly handle diverse bitrate requirements, RAQ extends the adaptability of VQ-based generative models and broadens their applicability to data compression, reconstruction, and generation tasks.
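To make the clustering-based idea concrete, the following is a minimal sketch of shrinking a pre-trained codebook of size $K$ to a smaller size $\tilde{K}$. It uses plain (non-differentiable) Lloyd's $k$-means in NumPy as a stand-in and is not the paper's DKM/IKM procedure or its Seq2Seq module. The function names `reduce_codebook` and `bits_per_index`, the iteration count, and the random codebook are assumptions made for illustration only.

```python
import math

import numpy as np


def reduce_codebook(codebook, k_tilde, n_iters=50, seed=0):
    """Cluster K codewords into k_tilde centroids with plain Lloyd's k-means.

    Hypothetical helper: a non-differentiable stand-in for a clustering-based
    rate reduction, not the DKM/IKM method described in the paper.
    """
    codebook = np.asarray(codebook, dtype=np.float64)
    num_codes, _ = codebook.shape
    if not 0 < k_tilde <= num_codes:
        raise ValueError("k_tilde must satisfy 0 < k_tilde <= K")

    rng = np.random.default_rng(seed)
    # Initialise centroids from k_tilde distinct codewords.
    init_idx = rng.choice(num_codes, size=k_tilde, replace=False)
    centroids = codebook[init_idx].copy()

    for _ in range(n_iters):
        # Squared Euclidean distance from every codeword to every centroid.
        diffs = codebook[:, None, :] - centroids[None, :, :]
        dists = (diffs ** 2).sum(axis=-1)          # shape (K, k_tilde)
        assign = dists.argmin(axis=1)              # nearest centroid per codeword

        new_centroids = centroids.copy()
        for j in range(k_tilde):
            members = codebook[assign == j]
            if len(members) > 0:                   # keep old centroid if cluster is empty
                new_centroids[j] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):  # converged
            break
        centroids = new_centroids

    return centroids


def bits_per_index(codebook_size):
    """Bits needed to transmit one code index with fixed-length coding."""
    return math.log2(codebook_size)


if __name__ == "__main__":
    # Assumed toy setup: a random "pre-trained" codebook with K=512, dim=64.
    original = np.random.default_rng(0).normal(size=(512, 64))
    adapted = reduce_codebook(original, k_tilde=128)
    print(adapted.shape, bits_per_index(512), bits_per_index(128))
```

Under fixed-length index coding, going from $K = 512$ to $\tilde{K} = 128$ lowers the cost per code index from 9 bits to 7 bits, which is the kind of rate adjustment the adapted codebook is meant to expose. A differentiable variant would be needed to train the clustering jointly with the encoder and decoder.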
