Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis
Hongyu Sun, Qiuhong Ke, Yongcai Wang, Wang Chen, Kang Yang, Deying Li, Jianfei Cai
TL;DR
The paper tackles 3D domain generalization (3DDG) for large multi-modal point cloud models by introducing Point-PRC, a regulation framework that couples lightweight prompt learning with pre-trained 3D knowledge. It comprises three constraints, the Mutual Agreement Constraint (MAC), the Text Diversity Constraint (TDC), and the Model Ensemble Constraint (MEC), and optimizes a joint objective $\mathcal{L}_{RC} = \alpha L_p + \beta L_t + \gamma L_D$. The authors also curate three new 3DDG benchmarks (base-to-new, cross-dataset, few-shot) and demonstrate consistent improvements in both generalization and task performance across ULIP/ULIP-2 and PointCLIP-based models, which supports the claim that the approach is model-agnostic. Overall, Point-PRC advances open-vocabulary 3D recognition by letting prompts interact with the general knowledge in large 3D models without overfitting, which matters for robust deployment of 3D vision systems.
Abstract
This paper investigates the 3D domain generalization (3DDG) ability of large 3D models based on prevalent prompt learning. Recent works demonstrate that the performance of 3D point cloud recognition can be boosted remarkably by parameter-efficient prompt tuning. However, we observe that the improvement on downstream tasks comes at the expense of a severe drop in 3D domain generalization. To resolve this challenge, we present a comprehensive regulation framework that allows the learnable prompts to actively interact with the well-learned general knowledge in large 3D models to maintain good generalization. Specifically, the proposed framework imposes multiple explicit constraints on the prompt learning trajectory by maximizing the mutual agreement between task-specific predictions and task-agnostic knowledge. We design the regulation framework as a plug-and-play module that can be embedded into existing representative large 3D models. Surprisingly, our method not only consistently improves generalization ability but also enhances task-specific 3D recognition performance across various 3DDG benchmarks by a clear margin. Considering the lack of study and evaluation on 3DDG, we also create three new benchmarks, namely base-to-new, cross-dataset and few-shot generalization benchmarks, to enrich the field and inspire future research. Code and benchmarks are available at https://github.com/auniquesun/Point-PRC.
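To make the idea of regulating prompts against task-agnostic knowledge more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that agreement between the prompted model and the frozen pre-trained model is measured with a KL divergence over class probabilities; the abstract does not specify the measure, so this choice is an assumption. The function names (`agreement_penalty`, `regulation_loss`) are hypothetical, and the three loss arguments simply mirror the symbols $L_p$, $L_t$, $L_D$ and the weights $\alpha$, $\beta$, $\gamma$ from the TL;DR formula without claiming how each term is computed in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def agreement_penalty(prompted_logits, frozen_logits):
    """Hypothetical agreement term (an assumption, not the paper's definition):
    KL divergence from the frozen model's predictions to the prompted model's.
    It is zero when both models predict the same class distribution."""
    p_frozen = softmax(frozen_logits)
    p_prompted = softmax(prompted_logits)
    return kl_divergence(p_frozen, p_prompted)


def regulation_loss(l_p, l_t, l_d, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum alpha * L_p + beta * L_t + gamma * L_D, as in the TL;DR.
    The default weights are placeholders, not values from the paper."""
    return alpha * l_p + beta * l_t + gamma * l_d


if __name__ == "__main__":
    # Toy logits for a 3-class problem (made-up numbers for illustration).
    frozen = [2.0, 0.5, -1.0]
    prompted = [1.0, 1.5, -0.5]
    print("agreement penalty:", agreement_penalty(prompted, frozen))
    print("joint objective:", regulation_loss(0.3, 0.1, 0.2, alpha=1.0, beta=0.5, gamma=0.5))
```

In a sketch like this, a penalty near zero means the learned prompts have not drifted from the frozen model's predictions, while a large penalty signals that task-specific tuning is overriding the pre-trained knowledge.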
