Training-Free Adaptive Quantization for Variable Rate Image Coding for Machines

Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe

TL;DR

This work introduces a training-free adaptive quantization controller for Image Coding for Machines (ICM). It uses the scale parameter predicted by the hyperprior network to modulate quantization strength across channels and spatial positions, which enables continuous bitrate control through a single parameter without retraining. Semantically important regions are preserved for machine vision tasks, while less critical areas are quantized more coarsely. Experiments report up to 11.07% BD-rate savings over the non-adaptive variable rate baseline, with only minor encoding/decoding overhead. The method is designed to integrate with existing LIC/ICM architectures, notably SA-ICM with Ch-ARM, and lays groundwork for scalable human–machine coding.

Abstract

Image Coding for Machines (ICM) has become increasingly important with the rapid integration of computer vision technology into real-world applications. However, most neural network-based ICM frameworks operate at a fixed rate, thus requiring individual training for each target bitrate. This limitation may restrict their practical usage. Existing variable rate image compression approaches mitigate this issue but often rely on additional training, which increases computational costs and complicates deployment. Moreover, variable rate control has not been thoroughly explored for ICM. To address these challenges, we propose a training-free quantization strength control scheme that enables flexible bitrate adjustment. By exploiting the scale parameter predicted by the hyperprior network, the proposed method adaptively modulates quantization step sizes across both channel and spatial dimensions. This allows the model to preserve semantically important regions while coarsely quantizing less critical areas. Our architectural design further enables continuous bitrate control through a single parameter. Experimental results demonstrate the effectiveness of our proposed method, achieving up to 11.07% BD-rate savings over the non-adaptive variable rate baseline.
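
As a rough illustration of the idea described above, the sketch below derives a per-element quantization step from a hyperprior scale tensor and a single control parameter. It is not the authors' formulation: the function names (`adaptive_step`, `quantize`, `dequantize`), the min-max normalization, the linear mapping `1 + q * (1 - s)`, and the assumption that larger scale values mark more important latents are all hypothetical choices made only for this example.

```python
import numpy as np

def adaptive_step(scale, q, eps=1e-9):
    """Hypothetical step-size rule, NOT the paper's exact formula.

    scale: hyperprior scale tensor of shape (C, H, W).
    q:     single non-negative control parameter; q = 0 leaves the codec unchanged.
    """
    # Normalize scale to [0, 1] over all channels and spatial positions.
    s = (scale - scale.min()) / (scale.max() - scale.min() + eps)
    # Assumption: large scale marks important latents, so they keep a step near 1,
    # while low-scale latents get a coarser step of up to 1 + q.
    return 1.0 + q * (1.0 - s)

def quantize(y, mean, step):
    # Round the mean-removed latent on a grid whose spacing varies per element.
    return np.round((y - mean) / step)

def dequantize(symbols, mean, step):
    # Invert the scaling; the decoder can recompute `step` from the same scale and q.
    return symbols * step + mean

# Usage with synthetic data standing in for a real latent and its hyperprior output.
rng = np.random.default_rng(0)
scale = rng.uniform(0.1, 4.0, size=(4, 8, 8))
y = rng.normal(0.0, scale)
mean = np.zeros_like(y)
step = adaptive_step(scale, q=2.0)
y_hat = dequantize(quantize(y, mean, step), mean, step)
```

Because the step depends only on the scale tensor and `q`, both of which are available at the decoder in this sketch, no extra side information or retraining would be needed to change the rate; sweeping `q` continuously moves the operating point.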

Paper Structure

This paper contains 16 sections, 10 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Visualization of average scale values of each (a) channel and (b) slice. Deeper colors represent larger values.
  • Figure 2: Visualization of scale parameters in the first channel. Brighter colors indicate larger values.
  • Figure 3: Image coding process with the proposed method. The figure illustrates $N=2$ slices, whereas our experiments use $N=5$.
  • Figure 4: Examples of reconstructed images with different quantization control methods. (a) Original image, (b) Non-adaptive quantization which uniformly scales the step size across all latents, and (c) Adaptive quantization step size control by the proposed method.
  • Figure 5: Image compression performance for different recognition tasks. (a) Object detection by YOLOv5, (b) Object detection by Mask R-CNN, and (c) Instance segmentation by Mask R-CNN.
  • ...and 1 more figure
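
The BD-rate figure quoted in the abstract summarizes the gap between two rate-accuracy curves such as those in Figure 5. A minimal sketch of the commonly used Bjøntegaard procedure is given below, assuming a cubic fit of log-rate against the quality metric; the function name `bd_rate` and the sample numbers are illustrative only, and the page does not state which exact variant the authors used.

```python
import numpy as np

def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Average bitrate difference (%) of the test codec relative to the anchor.

    Negative values mean the test codec needs fewer bits at equal quality.
    Each curve should have at least four rate/quality points for the cubic fit.
    """
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))

    # Fit log10(rate) as a cubic polynomial of quality for each curve.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Integrate both fits over the quality range the two curves share.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Convert the mean log-rate difference back to a percentage.
    return (10.0 ** (avg_t - avg_a) - 1.0) * 100.0

# Illustrative numbers only: bits per pixel and a task accuracy in percent.
anchor_bpp = [0.2, 0.4, 0.6, 0.8]
test_bpp = [0.18, 0.36, 0.54, 0.72]
accuracy = [30.0, 35.0, 38.0, 40.0]
saving = bd_rate(anchor_bpp, accuracy, test_bpp, accuracy)
```

In this toy input the test curve uses 10% fewer bits at every accuracy level, so the function returns a value of about -10, that is, a 10% BD-rate saving.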