
Uni-Fusion: Universal Continuous Mapping

Yijun Yuan, Andreas Nuechter

TL;DR

Uni-Fusion introduces a universal, training-free framework for continuous mapping that encodes geometry and arbitrary surface properties into Latent Implicit Maps (LIMs) using per-voxel latent features and Nyström-based kernel approximation. By decoupling regression into a position encoder and a content encoder, the method builds compact voxel latents (l≈20) that fuse incrementally into a global LIM, enabling real-time surface reconstruction, color/infrared fields, and high-dimensional feature fields such as CLIP embeddings. The approach supports derivative-based and sample-based GPIS for surface inference, and extends to surface property and feature fields with a flexible fusion scheme, demonstrated across incremental reconstruction, 2D-to-3D property transfer, and open-vocabulary scene understanding. Experiments on ScanNet, TUM RGB-D, Replica, and segmentation datasets show competitive or superior performance with a significantly reduced memory footprint and real-time capability, while retaining the flexibility to incorporate new properties without training. The work lays a foundation for universal 3D mapping and CLIP-based scene understanding, with potential extensions to loop closure, bundle adjustment, and visual-language navigation.

Abstract

We present Uni-Fusion, a universal continuous mapping framework for surfaces, surface properties (color, infrared, etc.), and more (latent features in CLIP embedding space, etc.). We propose the first universal implicit encoding model that supports encoding of both geometry and different types of properties (RGB, infrared, features, etc.) without requiring any training. Based on this, our framework divides the point cloud into regular grid voxels and generates a latent feature in each voxel to form a Latent Implicit Map (LIM) for geometries and arbitrary properties. Then, by fusing each frame's local LIM into a global LIM, incremental reconstruction is achieved. Encoded with the corresponding types of data, our Latent Implicit Map is capable of generating continuous surfaces, surface property fields, surface feature fields, and other fields of this kind. To demonstrate the capabilities of our model, we implement three applications: (1) incremental reconstruction of surfaces and color, (2) 2D-to-3D transfer of fabricated properties, and (3) open-vocabulary scene understanding by creating a text CLIP feature field on surfaces. We evaluate Uni-Fusion by comparison in each of these applications, where it shows high flexibility while performing best or competitively. The project page of Uni-Fusion is available at https://jarrome.github.io/Uni-Fusion/.
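
The voxel-wise encoding and frame-wise fusion summarized above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a squared-exponential kernel, a 4x4x4 grid of anchor points per voxel, a ridge-regression content encoder, and count-weighted averaging for fusion. All function names (make_position_encoder, encode_frame, decode_voxel, fuse) and parameters are hypothetical, and the GPIS-based surface inference is not covered. It only shows how a training-free Nyström feature map can turn the points in each voxel into a small latent for an arbitrary property.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    # Squared-exponential kernel between point sets (n, 3) and (m, 3).
    # The kernel choice and length scale are assumptions of this sketch.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def make_position_encoder(anchors, rank=20, length_scale=0.5):
    # Nystrom feature map: phi(x) = Lambda^{-1/2} U^T k(anchors, x),
    # keeping the `rank` largest eigenpairs of the anchor kernel matrix.
    K_a = rbf_kernel(anchors, anchors, length_scale)
    eigvals, eigvecs = np.linalg.eigh(K_a)          # ascending order
    lam = np.maximum(eigvals[::-1][:rank], 1e-10)   # guard tiny eigenvalues
    U = eigvecs[:, ::-1][:, :rank]
    proj = U / np.sqrt(lam)                         # (n_anchors, rank)

    def encode(x):
        return rbf_kernel(x, anchors, length_scale) @ proj  # (n, rank)

    return encode

def encode_frame(points, values, encode, voxel_size=1.0, ridge=1e-6):
    # Build a local map {voxel key: (latent F, point count)} from one frame.
    # `values` holds any per-point property with shape (n, channels).
    keys = np.floor(points / voxel_size).astype(int)
    lim = {}
    for key in set(map(tuple, keys.tolist())):
        mask = (keys == np.array(key)).all(axis=1)
        # Voxel-local coordinates in [-0.5, 0.5).
        local = points[mask] / voxel_size - (np.array(key) + 0.5)
        Phi = encode(local)
        A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
        F = np.linalg.solve(A, Phi.T @ values[mask])  # (rank, channels)
        lim[key] = (F, int(mask.sum()))
    return lim

def decode_voxel(F, local_points, encode):
    # Evaluate the continuous field inside one voxel: y(x) = phi(x)^T F.
    return encode(local_points) @ F

def fuse(global_map, local_map):
    # Count-weighted average of latents (an assumed fusion rule).
    for key, (F, n) in local_map.items():
        if key in global_map:
            G, m = global_map[key]
            global_map[key] = ((m * G + n * F) / (m + n), m + n)
        else:
            global_map[key] = (F, n)
    return global_map

if __name__ == "__main__":
    g = np.linspace(-0.5, 0.5, 4)
    anchors = np.stack(np.meshgrid(g, g, g, indexing="ij"), -1).reshape(-1, 3)
    encode = make_position_encoder(anchors, rank=20)
    rng = np.random.default_rng(0)
    pts = rng.uniform(0.0, 2.0, size=(1000, 3))
    color = np.stack([pts[:, 0] / 2, pts[:, 1] / 2, pts[:, 2] / 2], axis=1)
    global_map = fuse({}, encode_frame(pts, color, encode))
    print(len(global_map), "voxels, latent shape", global_map[(0, 0, 0)][0].shape)
```

Because the latent is linear in the property values, the same position encoder serves color, infrared, or high-dimensional features alike; only the number of channels in F changes.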

Paper Structure (51 sections, 20 equations, 18 figures, 8 tables)


Figures (18)

  • Figure 1: One Universal Continuous Mapping for all reconstructions, such as surfaces, properties including RGB, saliency, style and $...$, and even high-dimensional features (CLIP embeddings, etc.). A rendered video is available on YouTube.
  • Figure 2: Uniform, sparse voxels. Each voxel is encoded into one feature vector $\mathbf F_m$ [huang2021di].
  • Figure 3: Interpreting the formula as a graph that is consistent with the encoder-decoder structure in Neural Implicit Maps [huang2021di].
  • Figure 4: Sorted eigenvalues for $\mathbf K_{a}$'s eigendecomposition.
  • Figure 5: Inheritance graph for the class of Latent Implicit Maps (LIM).
  • ...and 13 more figures