RaLD: Generating High-Resolution 3D Radar Point Clouds with Latent Diffusion
Ruijie Zhang, Bixin Zeng, Shengpeng Wang, Fuhui Zhou, Wei Wang
TL;DR
RaLD tackles the challenge of sparse, low-resolution radar point clouds by generating dense LiDAR-like 3D scenes from raw radar spectra. It introduces a latent-diffusion framework operating in a compact latent space learned by a frustum-based LiDAR autoencoder, augmented with order-invariant latent encoding and radar spectrum guidance to condition generation. The approach yields high-fidelity 3D reconstructions and surpasses state-of-the-art baselines on ColoRadar, with ablations validating the contributions of radar conditioning, query initialization, and frustum-aligned occupancy. This work enables robust, high-resolution 3D perception from radar in challenging environments and demonstrates scalable, transferable performance across indoor and outdoor scenarios.
Abstract
Millimeter-wave radar offers a promising sensing modality for autonomous systems thanks to its robustness in adverse conditions and low cost. However, its utility is significantly limited by the sparsity and low resolution of radar point clouds, which pose challenges for tasks requiring dense and accurate 3D perception. Although recent generative approaches have shown great potential for addressing this issue, they often rely on dense voxel representations that are inefficient and struggle to preserve structural detail. We make the key observation that latent diffusion models (LDMs), though successful in other modalities, have not been effectively leveraged for radar-based 3D generation due to a lack of compatible representations and conditioning strategies. We introduce RaLD, a framework that bridges this gap by integrating scene-level frustum-based LiDAR autoencoding, order-invariant latent representations, and direct radar spectrum conditioning. Together, these components lead to a more compact and expressive generation process. Experiments show that RaLD produces dense and accurate 3D point clouds from raw radar spectra, offering a promising solution for robust perception in challenging environments.
