DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Yinghui Li, Qianyu Zhou, Jingyu Gong, Ye Zhu, Richard Dazeley, Xinkui Zhao, Xuequan Lu

TL;DR

This work tackles unsupervised domain adaptation for point cloud completion (PCC) by augmenting a Transformer-based PCC framework (PoinTr) with Domain Query-based Feature Alignment (DQFA), Point Token-wise Feature Alignment (PTFA), and Voted Prediction Consistency (VPC). DQFA aligns global, sequence-level features and PTFA aligns local, token-level features so that both become domain-invariant, while VPC ensembles the predictions of multiple decoders and generates pseudo-labels to boost transferability. With CRN as the source domain and KITTI, ScanNet, MatterPort3D, 3D-FUTURE, and ModelNet as target domains, the method achieves state-of-the-art cross-domain PCC performance, with improvements in both quantitative metrics and qualitative reconstructions. The approach combines Transformer sequence modeling with adversarial alignment and ensemble-based prediction, offering practical gains for real-world 3D completion tasks where domain shifts are prevalent.

Abstract

Point Transformers (PoinTr) have recently shown great potential in point cloud completion. Nevertheless, effective domain adaptation that improves transferability toward target domains remains unexplored. In this paper, we delve into this topic and empirically discover that direct feature alignment on the point Transformer's CNN backbone brings only limited improvements, since it cannot guarantee sequence-wise domain-invariant features in the Transformer. To this end, we propose a pioneering Domain Adaptive Point Transformer (DAPoinTr) framework for point cloud completion. DAPoinTr consists of three key components: Domain Query-based Feature Alignment (DQFA), Point Token-wise Feature Alignment (PTFA), and Voted Prediction Consistency (VPC). In particular, DQFA narrows the global domain gaps at the sequence level via a domain proxy at the Transformer encoder and a domain query at the Transformer decoder. PTFA closes the local domain shifts by aligning the tokens, i.e., the point proxies at the Transformer encoder and the dynamic queries at the Transformer decoder. VPC treats different Transformer decoders as a mixture of experts (MoE) for ensembled prediction voting and pseudo-label generation. Extensive experiments with visualization on several domain adaptation benchmarks demonstrate the effectiveness and superiority of our DAPoinTr compared with state-of-the-art methods. Code will be publicly available at: https://github.com/Yinghui-Li-New/DAPoinTr
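The voting idea behind VPC can be illustrated with a small sketch. The abstract does not specify the voting rule, so everything below is an assumption made for illustration: each decoder's output is treated as a plain list of 3D points, agreement is measured with a symmetric Chamfer distance, and the prediction that disagrees least with the others is picked as the pseudo-label. The names `chamfer_distance` and `vote_pseudo_label` are hypothetical and do not come from the authors' code.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Cloud = Sequence[Point]


def chamfer_distance(a: Cloud, b: Cloud) -> float:
    """Symmetric Chamfer distance: mean nearest-neighbour distance, summed over both directions."""
    def one_way(src: Cloud, dst: Cloud) -> float:
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def vote_pseudo_label(predictions: List[Cloud]) -> Tuple[int, List[float]]:
    """Pick the prediction that agrees most with the other experts.

    Assumed rule (not taken from the paper): each prediction is scored by its
    mean Chamfer distance to all other predictions, and the lowest score wins.
    Returns the winning index and the per-prediction disagreement scores.
    """
    if len(predictions) < 2:
        raise ValueError("voting needs at least two predictions")
    scores = []
    for i, pred in enumerate(predictions):
        dists = [chamfer_distance(pred, other)
                 for j, other in enumerate(predictions) if j != i]
        scores.append(sum(dists) / len(dists))
    winner = min(range(len(predictions)), key=scores.__getitem__)
    return winner, scores


# Toy example: two decoders roughly agree, the third is an outlier.
pred_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred_b = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
pred_c = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]
winner, scores = vote_pseudo_label([pred_a, pred_b, pred_c])
pseudo_label = [pred_a, pred_b, pred_c][winner]
```

In this toy setup the outlying prediction receives the highest disagreement score and is never selected. A real pipeline would operate on batched tensors and might average or filter predictions instead of picking a single one.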

Paper Structure

This paper contains 18 sections, 8 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: (a) Visualization of the domain discrepancy among objects of the same category from different domains, which show significant variations in topology, geometric patterns, and feature distribution. We observe that directly applying adversarial alignment to the CNN backbone features of PoinTr (Yu et al., 2021) does not align the source and target domains well, since it does not ensure that domain-invariant sequence features of the point cloud are learned. (b) In contrast, we propose a Domain Adaptive Point Transformer (DAPoinTr) framework for point cloud completion that aligns the cross-domain distributions well (visualization of Transformer encoder features) and generates complete shapes from partial point cloud input.
  • Figure 2: The framework of DAPoinTr for domain adaptive point cloud completion, including three key components: (a) Point Token-wise Feature Alignment (PTFA) closes the local domain shifts by aligning the tokens, i.e., the point proxies at the Transformer encoder and the dynamic queries at the Transformer decoder. (b) Domain Query-based Feature Alignment (DQFA) narrows the global domain gaps at the sequence level via a domain proxy at the Transformer encoder and a domain query at the Transformer decoder. Finally, Voted Prediction Consistency (VPC) treats different Transformer decoders as a mixture of experts (MoE) that vote for the ensembled prediction and pseudo-label generation.
  • Figure 3: Visual comparison with state-of-the-art PCC methods on the ModelNet dataset.
  • Figure 4: Visual comparison with state-of-the-art PCC methods on the 3D-FUTURE dataset.
  • Figure 5: t-SNE visualization of the feature distributions of PoinTr and DAPoinTr.
  • ...and 3 more figures