CornerNet-Lite: Efficient Keypoint Based Object Detection
Hei Law, Yun Teng, Olga Russakovsky, Jia Deng
TL;DR
This paper tackles the slow inference of keypoint-based detectors such as CornerNet by introducing CornerNet-Lite, which comprises two efficient variants: CornerNet-Saccade (attention-guided, aimed at offline processing) and CornerNet-Squeeze (compact backbone, aimed at real-time detection). CornerNet-Saccade uses multi-scale attention maps to crop and process only high-resolution regions of interest, achieving a 6.0× speed-up over CornerNet with a 1.0% AP gain on COCO. CornerNet-Squeeze employs SqueezeNet-inspired fire modules and depthwise separable convolutions to surpass YOLOv3 in both speed (30 ms versus 39 ms) and accuracy (34.4% versus 33.0% AP on COCO). Ablation studies show that saccades help only when the attention maps are accurate and the network has sufficient capacity, and that combining Squeeze with Saccade is not beneficial under tight budgets. Overall, CornerNet-Lite demonstrates that keypoint-based detection can meet practical efficiency and real-time constraints, expanding its applicability to time-sensitive tasks.
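To give a rough sense of why depthwise separable convolutions shrink a backbone, the sketch below counts the weights in a standard convolution and in its depthwise separable counterpart. The function names and the layer shape (3×3 kernel, 256 input and output channels) are illustrative assumptions, not values taken from the paper, and biases are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    k, c_in, c_out = 3, 256, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For this assumed layer shape the separable version needs roughly 8.7 times fewer weights. The actual savings in CornerNet-Squeeze depend on its real layer configuration, which this sketch does not attempt to reproduce.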
Abstract
Keypoint-based methods are a relatively new paradigm in object detection, eliminating the need for anchor boxes and offering a simplified detection framework. Keypoint-based CornerNet achieves state-of-the-art accuracy among single-stage detectors. However, this accuracy comes at a high processing cost. In this work, we tackle the problem of efficient keypoint-based object detection and introduce CornerNet-Lite. CornerNet-Lite is a combination of two efficient variants of CornerNet: CornerNet-Saccade, which uses an attention mechanism to eliminate the need for exhaustively processing all pixels of the image, and CornerNet-Squeeze, which introduces a new compact backbone architecture. Together, these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency. CornerNet-Saccade is suitable for offline processing, improving the efficiency of CornerNet by 6.0x and the AP by 1.0% on COCO. CornerNet-Squeeze is suitable for real-time detection, improving both the efficiency and accuracy of the popular real-time detector YOLOv3 (34.4% AP at 30ms for CornerNet-Squeeze compared to 33.0% AP at 39ms for YOLOv3 on COCO). Together, these contributions reveal for the first time the potential of keypoint-based detection to be useful for applications requiring processing efficiency.
