Center-based 3D Object Detection and Tracking

Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl

TL;DR

CenterPoint introduces a center-based approach to 3D object detection and tracking from LiDAR, representing objects as center points rather than axis-aligned anchor boxes. A two-stage architecture first detects centers and regresses the remaining 3D properties (size, orientation, and velocity), then refines these estimates with point features taken from the faces of the predicted box; tracking uses the velocity estimates and greedy closest-point association, avoiding heavy motion models. The method achieves state-of-the-art results on Waymo and nuScenes with a single model at near real-time speed, showing strong gains from the center-based representation and the lightweight refinement stage. The approach simplifies 3D perception pipelines while delivering the accuracy and tracking robustness needed for autonomous driving.

Abstract

Three-dimensional objects are commonly represented as 3D boxes in a point-cloud. This representation mimics the well-studied image-based 2D bounding-box detection but comes with additional challenges. Objects in a 3D world do not follow any particular orientation, and box-based detectors have difficulties enumerating all orientations or fitting an axis-aligned bounding box to rotated objects. In this paper, we instead propose to represent, detect, and track 3D objects as points. Our framework, CenterPoint, first detects centers of objects using a keypoint detector and regresses to other attributes, including 3D size, 3D orientation, and velocity. In a second stage, it refines these estimates using additional point features on the object. In CenterPoint, 3D object tracking simplifies to greedy closest-point matching. The resulting detection and tracking algorithm is simple, efficient, and effective. CenterPoint achieved state-of-the-art performance on the nuScenes benchmark for both 3D detection and tracking, with 65.5 NDS and 63.8 AMOTA for a single model. On the Waymo Open Dataset, CenterPoint outperforms all previous single-model methods by a large margin and ranks first among all Lidar-only submissions. The code and pretrained models are available at https://github.com/tianweiy/CenterPoint.
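
The greedy closest-point matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_closest_point_match`, the 2D ground-plane centers, the confidence-ordered processing, and the `max_dist` gating threshold are all assumptions made for illustration. Each detection is projected back to the previous frame using its predicted velocity and then claims the nearest track that has not been taken yet.

```python
import numpy as np


def greedy_closest_point_match(track_centers, det_centers, det_velocities,
                               det_scores, dt, max_dist):
    """Hypothetical helper: greedy closest-point association.

    track_centers:  (T, 2) track centers from the previous frame.
    det_centers:    (N, 2) detected centers in the current frame.
    det_velocities: (N, 2) predicted velocities of the detections.
    det_scores:     (N,)   detection confidences (assumed ordering criterion).
    dt:             time elapsed between the two frames.
    max_dist:       assumed gating threshold; farther matches are rejected.

    Returns a list of length N holding the matched track index for each
    detection, or -1 when the detection should start a new track.
    """
    track_centers = np.asarray(track_centers, dtype=float).reshape(-1, 2)
    det_centers = np.asarray(det_centers, dtype=float).reshape(-1, 2)
    det_velocities = np.asarray(det_velocities, dtype=float).reshape(-1, 2)
    det_scores = np.asarray(det_scores, dtype=float).reshape(-1)

    # Move each detection back to where it would have been one frame ago.
    projected = det_centers - det_velocities * dt

    # Pairwise Euclidean distances, shape (N, T).
    diffs = projected[:, None, :] - track_centers[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    assignment = [-1] * len(det_centers)
    taken = set()
    # Visit detections from most to least confident.
    for d in np.argsort(-det_scores):
        best_track, best_dist = -1, max_dist
        for t in range(len(track_centers)):
            if t not in taken and dists[d, t] < best_dist:
                best_track, best_dist = t, dists[d, t]
        if best_track >= 0:
            taken.add(best_track)
        assignment[d] = best_track
    return assignment
```

A greedy pass like this gives up the optimality of a global assignment solver in exchange for simplicity, which matches the abstract's emphasis on a simple and efficient tracker.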

Paper Structure

This paper contains 17 sections, 2 equations, 3 figures, 14 tables, 1 algorithm.

Figures (3)

  • Figure 1: We present a center-based framework to represent, detect, and track objects. Previous anchor-based methods use anchors that are axis-aligned with respect to the ego-vehicle coordinates. When the vehicle is driving on straight roads, both anchor-based methods and our center-based method are able to detect objects accurately (top). However, during a safety-critical left turn (bottom), anchor-based methods have difficulty fitting axis-aligned bounding boxes to rotated objects. Our center-based model accurately detects objects through rotationally invariant points. Best viewed in color.
  • Figure 2: Overview of our CenterPoint framework. We rely on a standard 3D backbone that extracts a map-view feature representation from Lidar point-clouds. Then, a 2D CNN detection head finds object centers and regresses to full 3D bounding boxes using center features. This box prediction is used to extract point features at the 3D centers of each face of the estimated 3D bounding box, which are passed into an MLP to predict an IoU-guided confidence score and a box regression refinement. Best viewed in color.
  • Figure 3: Example qualitative results of CenterPoint on the Waymo validation set. We show the raw point-cloud in blue, our detected objects in green bounding boxes, and Lidar points inside bounding boxes in red. Best viewed on screen.
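
The second-stage sampling locations described in the Figure 2 caption, the 3D centers of each face of the estimated box, can be computed with a short sketch. The box parameterization used here (center, length/width/height, and a yaw angle about the vertical axis) and the function name `box_face_centers` are assumptions for illustration, not taken from the released code.

```python
import numpy as np


def box_face_centers(center, size, yaw):
    """Hypothetical helper: 3D centers of the six faces of a rotated box.

    center: (x, y, z) of the box center.
    size:   (length, width, height), with length along the box's local x axis.
    yaw:    rotation about the vertical (z) axis, in radians.

    Returns a (6, 3) array ordered front, back, left, right, top, bottom.
    """
    length, width, height = size
    # Face-center offsets expressed in the box's own coordinate frame.
    offsets = np.array([
        [length / 2, 0.0, 0.0],
        [-length / 2, 0.0, 0.0],
        [0.0, width / 2, 0.0],
        [0.0, -width / 2, 0.0],
        [0.0, 0.0, height / 2],
        [0.0, 0.0, -height / 2],
    ])
    c, s = np.cos(yaw), np.sin(yaw)
    # Rotation about the z axis by the yaw angle.
    rot = np.array([
        [c, -s, 0.0],
        [s, c, 0.0],
        [0.0, 0.0, 1.0],
    ])
    # Rotate the offsets into the world frame, then shift to the box center.
    return offsets @ rot.T + np.asarray(center, dtype=float)
```

When these points are projected onto a map-view feature map, the top and bottom face centers land on the same location as the box center, so only the four side faces and the center contribute distinct sampling positions there.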