CPR++: Object Localization via Single Coarse Point Supervision
Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao
TL;DR
This work tackles semantic variance in point-based object localization (POL) by introducing CPR, which refines coarse point annotations into semantic centers using multiple instance learning (MIL) over neighbourhoods. Building on CPR, CPR++ adds a dynamic, cascade-based sampling regime and variance regularization to handle multi-scale objects, achieving state-of-the-art results on COCO, DOTA, SeaPerson, and VOC without requiring strict annotation rules. The combination of MIL-driven refinement, adaptive region estimation, and cascade optimization significantly reduces training ambiguity and improves localization accuracy, particularly for larger objects. Overall, CPR and CPR++ demonstrate that algorithmic refinement of weak supervision can rival or surpass center-keypoint annotations, broadening the practicality of POL in diverse real-world settings.
Abstract
Point-based object localization (POL), which pursues high-performance object sensing under low-cost data annotation, has attracted increasing attention. However, the point annotation mode inevitably introduces semantic variance due to the inconsistency of annotated points. Existing POL methods rely heavily on strict annotation rules, which are difficult to define and apply, to handle this problem. In this study, we propose coarse point refinement (CPR), which to the best of our knowledge is the first attempt to alleviate semantic variance from an algorithmic perspective. CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point. Furthermore, we design a sampling region estimation module to dynamically compute a sampling region for each object and use a cascaded structure to achieve end-to-end optimization. We further integrate a variance regularization into the structure to concentrate the predicted scores, yielding CPR++. We observe that CPR++ can obtain scale information and further reduce the semantic variance in a global region, thus guaranteeing high-performance object localization. Extensive experiments on four challenging datasets validate the effectiveness of both CPR and CPR++. We hope our work can inspire more research on designing algorithms rather than annotation rules to address the semantic variance problem in POL. The dataset and code will be made public at github.com/ucas-vg/PointTinyBenchmark.
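The refinement step described above, replacing an annotated point with a semantic centre chosen from its neighbourhood, can be illustrated with a minimal sketch. This is not the released implementation. The function name `refine_point`, the fixed `radius`, the score `threshold`, and the score-weighted averaging are all assumptions made for illustration, and `score_map` is a hypothetical stand-in for the per-pixel scores that a trained MIL-based network would produce.

```python
import math


def refine_point(score_map, point, radius=2, threshold=0.5):
    """Replace a coarse annotated point with a score-weighted centre.

    score_map: 2D list of per-pixel scores in [0, 1]; a hypothetical
               stand-in for the output of a trained scoring network.
    point:     (x, y) integer coordinates of the coarse annotation.
    radius:    assumed fixed sampling radius around the annotation.
    threshold: assumed minimum score for a neighbour to contribute.
    """
    height, width = len(score_map), len(score_map[0])
    px, py = point
    total = sum_x = sum_y = 0.0
    for y in range(max(0, py - radius), min(height, py + radius + 1)):
        for x in range(max(0, px - radius), min(width, px + radius + 1)):
            # Keep only neighbours inside the circular sampling region.
            if math.hypot(x - px, y - py) > radius:
                continue
            score = score_map[y][x]
            if score >= threshold:
                total += score
                sum_x += score * x
                sum_y += score * y
    if total == 0.0:
        # No confident neighbour: fall back to the original annotation.
        return (float(px), float(py))
    return (sum_x / total, sum_y / total)


# Usage: a single confident pixel to the right of the annotation
# pulls the refined point toward it.
score_map = [[0.0] * 5 for _ in range(5)]
score_map[2][3] = 1.0
refined = refine_point(score_map, (2, 2))
```

In this sketch the sampling radius is fixed by hand. The abstract states that CPR++ instead estimates a sampling region per object, which this example does not attempt to model.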
