OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds

Jialu Sui, Rui Liu, Hongsheng Zhang

TL;DR

OCCDiff tackles high-fidelity 3D building reconstruction from noisy, variably dense LiDAR data by placing diffusion in the occupancy-function space. It combines a condition-aware function autoencoder with a latent diffusion model guided by a point encoder, using flow matching to learn smooth, global latent features and multi-task training to preserve geometric fidelity. The approach produces complete occupancy functions that can be decoded into meshes via MISE, showing state-of-the-art performance on Building3D and Building-PCC with robustness to noise and varying density. This framework offers a scalable, flexible alternative to end-to-end encoder-decoder methods for outdoor building reconstruction and completion.
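As a rough illustration of the flow matching idea mentioned above, the following minimal sketch uses a straight-line path between a noise sample and a data latent. The linear path, the function names, and the latent size are assumptions made for illustration; the summary does not state which path or parameterization OCCDiff uses, and a trained network would take the place of the oracle velocity field.

```python
import numpy as np

def interpolate(z0, z1, t):
    """Point on the straight path from noise z0 (t=0) to data latent z1 (t=1)."""
    return (1.0 - t) * z0 + t * z1

def target_velocity(z0, z1):
    """Velocity of the straight path; it is constant in t."""
    return z1 - z0

def flow_matching_loss(predicted_velocity, z0, z1):
    """Mean squared error between a predicted velocity and the target velocity."""
    return float(np.mean((predicted_velocity - target_velocity(z0, z1)) ** 2))

def euler_sample(velocity_fn, z0, num_steps=10):
    """Integrate dz/dt = velocity_fn(z, t) from t=0 to t=1 with Euler steps."""
    z = np.array(z0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        z = z + dt * velocity_fn(z, k * dt)
    return z

rng = np.random.default_rng(0)
z0 = rng.standard_normal(8)  # noise sample (hypothetical latent size of 8)
z1 = rng.standard_normal(8)  # stand-in for a data latent

z_half = interpolate(z0, z1, 0.5)

# Oracle velocity field for this single pair (assumption: replaces a trained network).
z_end = euler_sample(lambda z, t: target_velocity(z0, z1), z0)
```

With the oracle field, Euler integration lands on the data latent; during training, the loss would instead compare a network's prediction at `interpolate(z0, z1, t)` against `target_velocity(z0, z1)`.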

Abstract

A major challenge in reconstructing buildings from LiDAR point clouds lies in accurately capturing building surfaces under varying point densities and noise interference. To flexibly recover high-quality 3D building profiles at diverse resolutions, we propose OCCDiff, which applies latent diffusion in the occupancy function space. OCCDiff combines a latent diffusion process with a function autoencoder architecture to generate continuous occupancy functions that can be evaluated at arbitrary locations. Moreover, a point encoder is proposed to provide condition features for diffusion learning, constrain the final occupancy prediction of the occupancy decoder, and supply multi-modal features to the latent encoder for latent generation. To further enhance model performance, a multi-task training strategy is employed, ensuring that the point encoder learns diverse and robust feature representations. Empirical results show that our method generates physically consistent samples with high fidelity to the target distribution and exhibits robustness to noisy data.
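To make "continuous occupancy functions evaluable at arbitrary locations" concrete, here is a minimal sketch in which a toy analytic box stands in for a learned occupancy decoder. The names `box_occupancy` and `query_grid`, the box dimensions, and the grid bounds are all hypothetical; the point is only that the same function can be queried at any resolution without retraining or resampling.

```python
import numpy as np

def box_occupancy(points, half_extent=(0.5, 0.3, 0.4)):
    """Toy occupancy function: 1.0 inside an axis-aligned box at the origin, else 0.0.

    Stands in for a learned occupancy decoder; `points` has shape (N, 3).
    """
    points = np.asarray(points, dtype=float)
    inside = np.all(np.abs(points) <= np.asarray(half_extent), axis=1)
    return inside.astype(float)

def query_grid(occupancy_fn, resolution, bound=1.0):
    """Evaluate occupancy_fn on a resolution^3 grid spanning [-bound, bound]^3."""
    axis = np.linspace(-bound, bound, resolution)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    return occupancy_fn(points).reshape(resolution, resolution, resolution)

# The same function is queried at two different resolutions.
coarse = query_grid(box_occupancy, resolution=9)
fine = query_grid(box_occupancy, resolution=33)
```

This sketch uses dense grid queries for simplicity; an adaptive extraction scheme would evaluate the function only where the surface is likely to lie.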

Paper Structure

This paper contains 18 sections, 10 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: The structure of our proposed OCCDiff model.
  • Figure 2: The architecture of function autoencoder and point encoder under partial point cloud condition.
  • Figure 3: The structure of the network in the diffusion model. Since our input feature $z_t$ is a one-dimensional feature, we use a simple MLP as the input embedding and add a learnable position embedding.
  • Figure 4: The process of training and inference. Point Encoder is omitted in each figure. (a) Training of function autoencoder. (b) Training of the diffusion model. (c) Inference.
  • Figure 5: Four visual examples of shape completion results by different approaches on the Building3D and Building-PCC datasets. All point clouds are uniformly sampled from the generated meshes. Different colors denote the point clouds reconstructed by different approaches.
  • ...and 1 more figure
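The input embedding described in the Figure 3 caption (a simple MLP followed by a learnable position embedding) could look roughly like the sketch below. Everything here is an assumption made for illustration: the latent $z_t$ is treated as a short sequence of tokens, and the sizes, the two-layer MLP, and the ReLU activation are hypothetical choices, since the caption does not give these details.

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, latent_dim, hidden_dim = 16, 8, 32  # hypothetical sizes

# Parameters that would be learned during training (randomly initialized here).
w1 = rng.standard_normal((latent_dim, hidden_dim)) * 0.02
b1 = np.zeros(hidden_dim)
w2 = rng.standard_normal((hidden_dim, hidden_dim)) * 0.02
b2 = np.zeros(hidden_dim)
position_embedding = rng.standard_normal((num_tokens, hidden_dim)) * 0.02

def embed_latent(z_t):
    """Apply a two-layer MLP to each token, then add one learnable vector per position."""
    hidden = np.maximum(z_t @ w1 + b1, 0.0)  # ReLU activation (assumed)
    tokens = hidden @ w2 + b2
    return tokens + position_embedding

z_t = rng.standard_normal((num_tokens, latent_dim))  # stand-in for a noisy latent
tokens = embed_latent(z_t)
```

Unlike image diffusion backbones, no patchification step is needed in this sketch because the latent is already a flat sequence; the position embedding is what tells later layers which slot each token occupies.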