HQOD: Harmonious Quantization for Object Detection

Long Huang, Zhiwei Dong, Song-Lu Chen, Ruiyao Zhang, Shutong Ti, Feng Chen, Xu-Cheng Yin

TL;DR

The paper addresses task inharmony between classification and localization in multi-task object detectors, which is worsened by low-bit quantization during QAT. It proposes HQOD, a framework introducing Task-Correlated (TCorr) loss and Harmonious IoU (HIoU) loss to focus optimization on low-harmony samples and balance regression across IoU levels, integrated into the QAT process. Empirical results on VOC and MS COCO show strong gains, including 4-bit ATSS with ResNet-50 achieving 39.6% mAP on COCO and even exceeding some full-precision baselines in certain setups, with notable improvements at 2-bit constraints as well. The approach is lightweight to integrate with existing QAT methods and detectors, offering a practical route to deploy high-performance quantized detectors on resource-constrained devices and potentially extendable to other vision tasks.

Abstract

The task inharmony problem commonly occurs in modern object detectors, leading to inconsistent quality between the classification and regression tasks. Predicted boxes with high classification scores but poor localization, or with low classification scores but accurate localization, degrade detector performance after Non-Maximum Suppression (NMS). Furthermore, when object detectors are trained with Quantization-Aware Training (QAT), we observe that the task inharmony problem is further exacerbated, which we consider one of the main causes of the performance degradation of quantized detectors. To tackle this issue, we propose the Harmonious Quantization for Object Detection (HQOD) framework, which consists of two components. First, we propose a task-correlated loss that encourages detectors to focus on improving samples with lower task harmony quality during QAT. Second, a harmonious Intersection over Union (IoU) loss is incorporated to balance the optimization of the regression branch across different IoU levels. The proposed HQOD can be easily integrated into different QAT algorithms and detectors. Remarkably, on the MS COCO dataset, our 4-bit ATSS with a ResNet-50 backbone achieves a state-of-the-art mAP of 39.6%, even surpassing its full-precision counterpart.
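To make the NMS effect described in the abstract concrete, the following minimal sketch uses a generic greedy NMS (not the HQOD losses, which are not specified in this summary). The box coordinates, scores, and the 0.5 IoU threshold are illustrative assumptions, not values from the paper. It shows a high-scoring but poorly localized box suppressing a better-localized, lower-scoring one.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Generic greedy NMS: visit boxes by descending classification score and
    keep a box only if it does not overlap an already kept box above iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Hypothetical example: one ground-truth box and two predictions for it.
gt = (10, 10, 110, 110)
boxes = [
    (30, 30, 130, 130),  # high score, poor localization
    (12, 12, 112, 112),  # lower score, accurate localization
]
scores = [0.90, 0.60]

keep = greedy_nms(boxes, scores)
kept_iou = iou(boxes[keep[0]], gt)           # IoU of the surviving box
best_iou = max(iou(b, gt) for b in boxes)    # best IoU among all predictions
```

With these assumed inputs, only the first box survives NMS. Its IoU with the ground truth is about 0.47, while the suppressed box reaches about 0.92, so the ranking by classification score discards the better localization.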

Paper Structure (16 sections, 9 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Object detection results of the baseline LSQ and our proposed framework. The ground truth is represented by the red bounding box, and the classification scores of the predicted boxes are explicitly indicated. Our framework enhances the harmonious relationship between the classification task and the regression task, producing more accurate bounding boxes.
  • Figure 2: The statistical results for true positive (TP) samples after NMS. The red dashed ellipse signifies the ideal numerical distribution, where the classification scores and IoU values are both at high levels. Note that FP32 represents a full-precision setting. INT4/2 represents that models are quantized under the 4/2-bit constraints.
  • Figure 3: Visualization of the task correlation indicator $c$ with different settings of $\beta_{\mathrm{cls}}$ and $\beta_{\mathrm{reg}}$ in the equation defining the task correlation indicator.
  • Figure 4: The average improvement compared to the baseline LSQ for positive samples from different IoU intervals.
  • Figure 5: The distribution of the gap value between IoU and classification score (Cls) based on RetinaNet with ResNet-18.
  • ...and 1 more figure