UnCommon Objects in 3D
Xingchen Liu, Piyush Tayal, Jianyuan Wang, Jesus Zarzar, Tom Monnier, Konstantinos Tertikas, Jiali Duan, Antoine Toisoul, Jason Y. Zhang, Natalia Neverova, Andrea Vedaldi, Roman Shapovalov, David Novotny
TL;DR
UnCommon Objects in 3D (uCO3D) is a high-quality, real-world, object-centric dataset of 360-degree videos with extensive 3D annotations, captions, and Gaussian Splat reconstructions, spanning more than 1,000 categories and 170,000 scenes. It combines VGGSfM-based camera poses, depth maps, sparse and dense point clouds, and canonical-view 3DGS reconstructions to support robust learning and re-rendering. Models trained on uCO3D outperform those trained on prior datasets (MVImgNet, CO3Dv2) on few-view 3D reconstruction, novel-view diffusion, and text-to-3D tasks, and the 3DGS reconstructions can be re-shot from canonical views to adapt real data for Instant3D-like pipelines. The dataset thus provides a practical, scalable resource for real-world 3D deep learning and generative modelling, with applications in digital twinning and 3D content creation.
Abstract
We introduce Uncommon Objects in 3D (uCO3D), a new object-centric dataset for 3D deep learning and 3D generative AI. uCO3D is the largest publicly available collection of high-resolution videos of objects with 3D annotations that ensures full 360$^{\circ}$ coverage. uCO3D is significantly more diverse than MVImgNet and CO3Dv2, covering more than 1,000 object categories. It is also of higher quality, owing to extensive quality checks of both the collected videos and the 3D annotations. Like analogous datasets, uCO3D contains annotations for 3D camera poses, depth maps, and sparse point clouds. In addition, each object is equipped with a caption and a 3D Gaussian Splat reconstruction. We train several large 3D models on MVImgNet, CO3Dv2, and uCO3D and obtain superior results using the latter, showing that uCO3D is better for learning applications.
