CornerNet: Detecting Objects as Paired Keypoints

Hei Law, Jia Deng

TL;DR

CornerNet reframes object detection as detecting paired keypoints (the top-left and bottom-right corners of each bounding box) and grouping them with associative embeddings, which eliminates anchor boxes. It introduces corner pooling to localize corners more accurately and uses an hourglass backbone to capture multi-scale context, reaching 42.2% AP on MS COCO and outperforming all one-stage detectors existing at the time. Ablations confirm that corner pooling is critical and show that the predicted bounding boxes are high quality, especially at large IoU thresholds. The approach simplifies detector design and is competitive with state-of-the-art two-stage methods, with practical implications for anchor-free detection systems.

Abstract

We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors. In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners. Experiments show that CornerNet achieves a 42.2% AP on MS COCO, outperforming all existing one-stage detectors.

Paper Structure

This paper contains 23 sections, 8 equations, 11 figures, 7 tables.

Figures (11)

  • Figure 1: We detect an object as a pair of bounding box corners grouped together. A convolutional network outputs a heatmap for all top-left corners, a heatmap for all bottom-right corners, and an embedding vector for each detected corner. The network is trained to predict similar embeddings for corners that belong to the same object.
  • Figure 2: Often there is no local evidence to determine the location of a bounding box corner. We address this issue by proposing a new type of pooling layer.
  • Figure 3: Corner pooling: for each channel, we take the maximum values (red dots) in two directions (red lines), each from a separate feature map, and add the two maximums together (blue dot).
  • Figure 4: Overview of CornerNet. The backbone network is followed by two prediction modules, one for the top-left corners and the other for the bottom-right corners. Using the predictions from both modules, we locate and group the corners.
  • Figure 5: "Ground-truth" heatmaps for training. Boxes (green dotted rectangles) whose corners are within the radii of the positive locations (orange circles) still have large overlaps with the ground-truth annotations (red solid rectangles).
  • ...and 6 more figures
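
The corner pooling operation described in the Figure 3 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the top-left case, in which one feature map is max-pooled downward along each column and the other is max-pooled rightward along each row before the two results are added. The function name `top_left_corner_pool` and the argument names `f_t` and `f_l` are hypothetical, chosen only for this example.

```python
import numpy as np


def top_left_corner_pool(f_t, f_l):
    """Sketch of corner pooling for top-left corners (assumed directions).

    f_t, f_l: arrays of shape (H, W) or (C, H, W), one feature map per
    pooling direction. Each channel is processed independently.
    """
    # For every location, take the max of f_t over that location and
    # everything below it in the same column (running max from the bottom).
    t = np.flip(
        np.maximum.accumulate(np.flip(f_t, axis=-2), axis=-2), axis=-2
    )
    # For every location, take the max of f_l over that location and
    # everything to its right in the same row (running max from the right).
    l = np.flip(
        np.maximum.accumulate(np.flip(f_l, axis=-1), axis=-1), axis=-1
    )
    # Add the two directional maxima together.
    return t + l


if __name__ == "__main__":
    f_t = np.array([[1, 0],
                    [0, 3]])
    f_l = np.array([[0, 2],
                    [5, 0]])
    print(top_left_corner_pool(f_t, f_l))
```

Using a running maximum keeps the cost linear in the number of pixels per direction, instead of rescanning a full row and column at every location. A bottom-right variant would, under the same assumptions, accumulate in the opposite directions (upward and leftward) without the flips.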