Simple multi-dataset detection
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl
TL;DR
The paper tackles fragmentation in object detection across large, diverse datasets by training a partitioned detector with dataset-specific outputs and losses, and by using a fully automatic method based on integer linear programming (ILP) to unify the label spaces into a single taxonomy. This unified detector, evaluated on COCO, Objects365, and OpenImages, matches dataset-specific models on training domains and generalizes to unseen domains without fine-tuning, outperforming expert-designed taxonomies in many cases. Key contributions include the partitioned training framework, automatic taxonomy learning, and extensive cross-dataset and scale-up experiments demonstrating strong generalization and practicality. The approach enables a single, deployable detector across multiple domains, with potential for easy expansion to new datasets and tighter integration with language-aware cues in future work.
Abstract
How do we build a general and broad object detection system? We use all labels of all concepts ever annotated. These labels span diverse datasets with potentially inconsistent taxonomies. In this paper, we present a simple method for training a unified detector on multiple large-scale datasets. We use dataset-specific training protocols and losses, but share a common detection architecture with dataset-specific outputs. We show how to automatically integrate these dataset-specific outputs into a common semantic taxonomy. In contrast to prior work, our approach does not require manual taxonomy reconciliation. Experiments show our learned taxonomy outperforms an expert-designed taxonomy on all datasets. Our multi-dataset detector performs as well as dataset-specific models on each training domain, and can generalize to new unseen datasets without fine-tuning on them. Code is available at https://github.com/xingyizhou/UniDet.
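To make the idea of integrating dataset-specific outputs into a common taxonomy concrete, the following is a minimal sketch, not the authors' implementation. The toy label spaces, the `UNIFIED` mapping, and the `unify_scores` helper are all hypothetical; the mapping is hand-written here purely for illustration, whereas the paper obtains it automatically. Averaging the merged scores is likewise an assumption made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy label spaces; real datasets have hundreds of classes.
LABEL_SPACES: Dict[str, List[str]] = {
    "coco": ["person", "car"],
    "objects365": ["human", "car", "lamp"],
}

# Hypothetical unified taxonomy: each unified concept lists the
# (dataset, label) pairs it merges. Hand-written for illustration only;
# the paper learns this mapping automatically.
UNIFIED: Dict[str, List[Tuple[str, str]]] = {
    "person": [("coco", "person"), ("objects365", "human")],
    "car": [("coco", "car"), ("objects365", "car")],
    "lamp": [("objects365", "lamp")],
}


def unify_scores(per_dataset_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Combine dataset-specific class scores into unified-concept scores.

    Assumption: merged outputs are combined by simple averaging.
    """
    unified: Dict[str, float] = {}
    for concept, members in UNIFIED.items():
        values = [
            per_dataset_scores[dataset][LABEL_SPACES[dataset].index(label)]
            for dataset, label in members
        ]
        unified[concept] = sum(values) / len(values)
    return unified


# Scores one detection might receive from each dataset-specific output.
scores = {"coco": [0.9, 0.1], "objects365": [0.7, 0.2, 0.05]}
result = unify_scores(scores)
```

In this sketch, "person" from COCO and "human" from Objects365 collapse into one unified concept, while "lamp" remains a concept backed by a single dataset.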
