CornerNet: Detecting Objects as Paired Keypoints
Hei Law, Jia Deng
TL;DR
CornerNet reframes object detection as detecting a pair of keypoints per object (the top-left and bottom-right corners of its bounding box) and grouping them with associative embeddings, which eliminates anchor boxes. It introduces corner pooling to localize corners more accurately and uses an hourglass backbone to capture multi-scale context, reaching 42.2% AP on MS COCO, the best reported result among one-stage detectors at the time of publication. Ablations show that corner pooling is critical to this result and that the predicted bounding boxes are of high quality, especially at large IoU thresholds. The approach simplifies detector design, is competitive with state-of-the-art two-stage methods, and has practical implications for anchor-free detection systems.
Abstract
We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors. In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners. Experiments show that CornerNet achieves a 42.2% AP on MS COCO, outperforming all existing one-stage detectors.
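To make corner pooling more concrete, here is a minimal pure-Python sketch of the top-left variant. It assumes the formulation given in the body of the paper rather than in the abstract: each location takes the maximum of one feature map over everything below it, the maximum of a second feature map over everything to its right, and adds the two. The function name `top_left_corner_pool` and the list-of-lists layout are illustrative assumptions, not the authors' implementation, which operates on batched multi-channel tensors.

```python
def top_left_corner_pool(f_t, f_l):
    """Sketch of top-left corner pooling on two H x W feature maps.

    f_t and f_l are lists of lists of numbers (hypothetical layout).
    Output[y][x] = max(f_t[y:, x]) + max(f_l[y, x:]).
    """
    h, w = len(f_t), len(f_t[0])

    # Running max scanning upward, so each cell sees every value below it.
    top = [row[:] for row in f_t]
    for y in range(h - 2, -1, -1):
        for x in range(w):
            top[y][x] = max(top[y][x], top[y + 1][x])

    # Running max scanning leftward, so each cell sees every value to its right.
    left = [row[:] for row in f_l]
    for y in range(h):
        for x in range(w - 2, -1, -1):
            left[y][x] = max(left[y][x], left[y][x + 1])

    # Sum the two pooled maps element-wise.
    return [[top[y][x] + left[y][x] for x in range(w)] for y in range(h)]


f_t = [[0, 0, 0],
       [0, 1, 0],
       [0, 2, 0]]
f_l = [[0, 0, 0],
       [0, 3, 4],
       [0, 0, 0]]
pooled = top_left_corner_pool(f_t, f_l)
print(pooled)  # [[0, 2, 0], [4, 6, 4], [0, 2, 0]]
```

The intuition is that a top-left corner usually lies outside the object itself, so evidence for it has to be gathered by looking down and to the right along the object's boundaries. The bottom-right variant would mirror the two scan directions.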
