DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang
TL;DR
DG-PIC tackles cross-domain generalization in multi-task point cloud understanding by unifying Domain Generalization with Point-In-Context Learning. It introduces dual-level source prototypes (global and local) and a dual-level test-time feature shifting mechanism (macro-level domain semantics and micro-level patch relations) that aligns unseen target data with known source domains without any model updates at test time. The model is trained with a PIC-based Masked Point Modeling (MPM) Transformer framework and, at test time, selects its prompt from the closest source-domain sample, so that reconstruction, denoising, and registration are handled within a single model. On a new multi-domain multi-task benchmark spanning four datasets (ModelNet40, ShapeNet, ScanNet, ScanObjectNN), DG-PIC generalizes better and is more robust than the baselines and ablated variants, which underlines its practical value for real-world 3D understanding under distribution shifts.
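The test-time prompt selection mentioned above can be illustrated with a minimal sketch. It assumes every sample has already been encoded as a fixed-length feature vector and that "closest" means smallest Euclidean distance; the function name `select_prompt` and the toy features are hypothetical and are not taken from the DG-PIC code.

```python
import numpy as np

def select_prompt(target_feat, source_feats):
    """Return the index of the source sample closest to the target feature.

    Assumption: closeness is measured by Euclidean distance in feature space.
    """
    dists = np.linalg.norm(source_feats - target_feat, axis=1)
    return int(np.argmin(dists))

# Toy 2-D features standing in for encoder outputs (hypothetical values).
source_feats = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
target_feat = np.array([0.9, 1.2])

prompt_idx = select_prompt(target_feat, source_feats)
print(prompt_idx)
```

The selected source sample would then serve as the in-context prompt for the target query, with the model weights left unchanged.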
Abstract
Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Although In-Context Learning (ICL) has shown multi-task learning capability, it usually relies on high-quality, context-rich data, considers a single dataset, and has rarely been studied in point cloud understanding. In this paper, we introduce a novel, practical, multi-domain multi-task setting, handling multiple domains and multiple tasks within one unified model for domain generalized point cloud understanding. To this end, we propose Domain Generalized Point-In-Context Learning (DG-PIC), which boosts generalizability across various tasks and domains at testing time. In particular, we develop a dual-level source prototype estimation that considers both global-level shape context and local-level geometric structures for representing source domains, and a dual-level test-time feature shifting mechanism that leverages both macro-level domain semantic information and micro-level patch positional relationships to pull the target data closer to the source data during testing. DG-PIC does not require any model updates during testing and can handle unseen domains and multiple tasks, i.e., point cloud reconstruction, denoising, and registration, within one unified model. We also introduce a benchmark for this new setting. Comprehensive experiments demonstrate that DG-PIC outperforms state-of-the-art techniques significantly.
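To make the idea of prototype estimation and macro-level feature shifting concrete, here is a minimal sketch under strong simplifying assumptions: each source domain's global prototype is taken as the mean of its sample features, and a target feature is shifted by linear interpolation toward the nearest prototype. The abstract does not specify the actual shifting rule, so the interpolation, the weight `alpha`, and all names below are hypothetical. The local-level prototypes and the micro-level patch shifting are not covered.

```python
import numpy as np

def estimate_global_prototypes(feats_by_domain):
    """One prototype per source domain: the mean feature (an assumption)."""
    return {name: feats.mean(axis=0) for name, feats in feats_by_domain.items()}

def shift_toward_nearest(target_feat, prototypes, alpha=0.5):
    """Blend a target feature with its nearest source prototype.

    alpha is a hypothetical blending weight: 0 keeps the target feature
    unchanged, 1 replaces it with the prototype. No parameters are updated.
    """
    nearest = min(
        prototypes,
        key=lambda name: np.linalg.norm(target_feat - prototypes[name]),
    )
    shifted = (1.0 - alpha) * target_feat + alpha * prototypes[nearest]
    return nearest, shifted

# Toy 2-D features for two hypothetical source domains.
feats_by_domain = {
    "domain_a": np.array([[0.0, 0.0], [2.0, 0.0]]),
    "domain_b": np.array([[10.0, 10.0], [12.0, 10.0]]),
}
prototypes = estimate_global_prototypes(feats_by_domain)
nearest, shifted = shift_toward_nearest(np.array([3.0, 2.0]), prototypes)
print(nearest, shifted)
```

Because the shift is a pure function of fixed prototypes and the incoming feature, it is compatible with the requirement that no model updates happen during testing.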
