Smart Feature is What You Need
Zhaoxin Hu, Keyan Ren
TL;DR
This work tackles the lack of shape guidance and the label jitter in 3D weakly-supervised object detection by introducing MMA, a plug-in point-cloud feature representation with two modules, AAA and FDC, that encode shape tendencies and object existence areas across multiple densities. By applying MMA to existing detectors and BR-based systems, the authors demonstrate substantial improvements in accuracy and robustness, often approaching fully supervised performance with only a fraction of the annotation effort and with a reduced model size. The approach yields anti-jitter effects, turns noisy weak labels into a source of data enhancement, and provides a versatile, low-coupling way to integrate weak and full supervision in 3D detection. These results suggest MMA as a practical path to more data-efficient, reliable indoor 3D detection in real-world applications.
Abstract
Lack of shape guidance and label jitter, both caused by the information deficiency of weak labels, are the main problems in 3D weakly-supervised object detection. Current weakly-supervised models often rely on heuristic or assumption-based methods to infer information from weak labels, without taking advantage of the inherent clues of weakly-supervised and fully-supervised methods, so it is difficult to find a method that combines data utilization efficiency with model accuracy. To address these issues, we propose a novel plug-in point cloud feature representation network called Multi-scale Mixed Attention (MMA). MMA uses adjacency attention within neighborhoods and disparity attention across different density scales to build a feature representation network. The smart feature representation obtained from MMA carries shape tendency and object existence area inference, which can constrain the region of the detection boxes and thereby alleviate the problems caused by the information deficiency of weak labels. Extensive experiments show that, in indoor weak-label scenarios, a fully-supervised network can perform close to a weakly-supervised network merely through the improvement of point features by MMA. At the same time, MMA can turn waste into treasure: the label jitter that originally interfered with weakly-supervised detection becomes a source of data enhancement, strengthening the performance of existing weakly-supervised detection methods. Our code is available at https://github.com/hzx-9894/MMA.
