HQOD: Harmonious Quantization for Object Detection
Long Huang, Zhiwei Dong, Song-Lu Chen, Ruiyao Zhang, Shutong Ti, Feng Chen, Xu-Cheng Yin
TL;DR
The paper addresses task inharmony between classification and localization in multi-task object detectors, which is worsened by low-bit quantization during QAT. It proposes HQOD, a framework introducing Task-Correlated (TCorr) loss and Harmonious IoU (HIoU) loss to focus optimization on low-harmony samples and balance regression across IoU levels, integrated into the QAT process. Empirical results on VOC and MS COCO show strong gains, including 4-bit ATSS with ResNet-50 achieving 39.6% mAP on COCO and even exceeding some full-precision baselines in certain setups, with notable improvements at 2-bit constraints as well. The approach is lightweight to integrate with existing QAT methods and detectors, offering a practical route to deploy high-performance quantized detectors on resource-constrained devices and potentially extendable to other vision tasks.
Abstract
The task inharmony problem commonly occurs in modern object detectors, leading to inconsistent quality between the classification and regression tasks. Predicted boxes with high classification scores but poor localization, or with low classification scores but accurate localization, degrade detector performance after Non-Maximum Suppression (NMS). Furthermore, when object detectors are trained with Quantization-Aware Training (QAT), we observe that the task inharmony problem is further exacerbated, which is considered one of the main causes of the performance degradation of quantized detectors. To tackle this issue, we propose the Harmonious Quantization for Object Detection (HQOD) framework, which consists of two components. First, we propose a task-correlated loss that encourages detectors to focus on improving samples with lower task harmony quality during QAT. Second, we incorporate a harmonious Intersection over Union (IoU) loss to balance the optimization of the regression branch across different IoU levels. The proposed HQOD can be easily integrated into different QAT algorithms and detectors. Remarkably, on the MS COCO dataset, our 4-bit ATSS with a ResNet-50 backbone achieves a state-of-the-art mAP of 39.6%, even surpassing its full-precision counterpart.
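To make the effect of task inharmony on NMS concrete, the following is a minimal illustrative sketch and not code from the paper. It uses a plain greedy NMS, and the box coordinates, scores, the 0.5 IoU threshold, and the helper names `iou` and `nms` are all assumptions chosen for illustration. A confidently scored but poorly localized box suppresses a lower-scored box that fits the ground truth much better.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep boxes in descending score order, dropping any box
    whose IoU with an already kept box exceeds iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Hypothetical ground-truth box and two inharmonious predictions.
gt = (0.0, 0.0, 10.0, 10.0)
boxes = [
    (3.0, 0.0, 13.0, 10.0),  # high score, poor localization
    (0.0, 0.0, 10.0, 10.5),  # low score, accurate localization
]
scores = [0.9, 0.6]

kept = nms(boxes, scores)
for i in kept:
    print(f"kept box {i}: score={scores[i]:.2f}, IoU with GT={iou(boxes[i], gt):.3f}")
```

In this toy case NMS ranks by classification score alone, so the better-localized box is discarded even though its IoU with the ground truth is far higher. This is the kind of mismatch between the two tasks that the abstract describes.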
