3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules
Maxence Bideaux, Alice Phe, Mohamed Chaouch, Bertrand Luvison, Quoc-Cuong Pham
TL;DR
3D-COCO tackles the lack of datasets linking 2D object detection with 3D CAD models by extending MS-COCO with a large, aligned set of 3D shapes from ShapeNet and Objaverse. It introduces an IoU-based automatic 2D-3D matching pipeline and renders 62 views per model to support both 2D detection and 3D reconstruction tasks. The result is an openly available dataset of about 27,760 CAD models across 80 COCO classes, with alignments for hundreds of thousands of annotations, enabling 3D-configurable detection and single-view or multi-view reconstruction research. The dataset paves the way for integrating real images into 3D reconstruction pipelines and invites future improvements in alignment and model coverage.
Abstract
We introduce 3D-COCO, an extension of the original MS-COCO dataset providing 3D models and 2D-3D alignment annotations. 3D-COCO was designed to support computer vision tasks such as 3D reconstruction or image detection configurable with textual, 2D image, and 3D CAD model queries. We complete the existing MS-COCO dataset with 28K 3D models collected from ShapeNet and Objaverse. Using an IoU-based method, we match each MS-COCO annotation with the best 3D models to provide a 2D-3D alignment. The open-source nature of 3D-COCO is a first that should pave the way for new research on 3D-related topics. The dataset and its source code are available at https://kalisteo.cea.fr/index.php/coco3d-object-detection-and-reconstruction/
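To make the IoU-based matching idea concrete, the following is a minimal sketch rather than the released pipeline. It assumes that each MS-COCO annotation is available as a binary mask and that each 3D model has been rendered into binary silhouettes of the same size, already cropped and scaled to a common frame. The names `mask_iou`, `rank_models`, and `top_k` are hypothetical and do not come from the 3D-COCO source code. A model is scored by its best-matching view, and the highest-scoring models are kept.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two binary masks with the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Both masks are empty: define the IoU as 0 to avoid dividing by zero.
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def rank_models(annotation_mask, model_views, top_k=3):
    """Rank 3D models by the best IoU over their rendered silhouettes.

    annotation_mask: binary mask of one 2D annotation (assumed input format).
    model_views: dict mapping a model id to a list of binary silhouettes,
                 one per rendered view (assumed input format).
    Returns up to top_k (model_id, score) pairs, best score first.
    """
    scores = {}
    for model_id, views in model_views.items():
        # A model's score is the IoU of its best-matching view.
        scores[model_id] = max(
            (mask_iou(annotation_mask, view) for view in views), default=0.0
        )
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Scoring by the maximum over views is one simple design choice under these assumptions; it lets a single well-aligned viewpoint decide the match, which also identifies the view that best explains the annotation.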
