Radar2Shape: 3D Shape Reconstruction from High-Frequency Radar using Multiresolution Signed Distance Functions

Neel Sortur, Justin Goodwin, Purvik Patel, Luis Enrique Martinez, Tzofi Klinghoffer, Rajmonda S. Caceres, Robin Walters

TL;DR

Radar2Shape addresses the challenging problem of 3D shape reconstruction from high-frequency radar under partial observability. It introduces a two-stage framework that first builds a hierarchical latent space for signed distance functions from multi-resolution features and then applies a radar-conditioned diffusion process to generate full 3D geometries in a coarse-to-fine manner, guided by the radar signal $F(\boldsymbol{u}, f)$ across frequencies. Key contributions include the multiresolution SDF representation via projected triplanes, a Transformer-based radar-conditioned diffusion model, and two public benchmark datasets (Manifold40-PO and Manifold40-PO-SBR) plus real monoconic radar data for zero-shot evaluation. The method demonstrates superior reconstruction accuracy and robustness to partial observability compared with baselines, and it generalizes to unseen radar data, moving the field toward practical radar-based 3D reconstruction.

Abstract

Determining the shape of 3D objects from high-frequency radar signals is analytically complex but critical for commercial and aerospace applications. Previous deep learning methods have been applied to radar modeling; however, they often fail to represent arbitrary shapes or have difficulty with real-world radar signals which are collected over limited viewing angles. Existing methods in optical 3D reconstruction can generate arbitrary shapes from limited camera views, but struggle when they naively treat the radar signal as a camera view. In this work, we present Radar2Shape, a denoising diffusion model that handles a partially observable radar signal for 3D reconstruction by correlating its frequencies with multiresolution shape features. Our method consists of a two-stage approach: first, Radar2Shape learns a regularized latent space with hierarchical resolutions of shape features, and second, it diffuses into this latent space by conditioning on the frequencies of the radar signal in an analogous coarse-to-fine manner. We demonstrate that Radar2Shape can successfully reconstruct arbitrary 3D shapes even from partially-observed radar signals, and we show robust generalization to two different simulation methods and real-world data. Additionally, we release two synthetic benchmark datasets to encourage future research in the high-frequency radar domain so that models like Radar2Shape can safely be adapted into real-world radar systems.
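
For readers unfamiliar with the shape representation named in the title, the following is a minimal sketch of a signed distance function (SDF) for a sphere. It is illustrative only and not taken from the paper: the function name `sphere_sdf` is hypothetical, and the sign convention (negative inside, positive outside) is a common choice that is assumed here rather than confirmed by the source.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance from each query point to a sphere.

    Assumed convention: negative inside, zero on the surface, positive outside.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    # Distance to the center minus the radius gives the signed distance to the surface.
    return np.linalg.norm(points - center, axis=-1) - radius

# Three query points: at the center, on the surface, and outside the unit sphere.
queries = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
distances = sphere_sdf(queries, center=[0.0, 0.0, 0.0], radius=1.0)
```

The surface is the zero level set of this function, so a model that predicts SDF values at arbitrary query points implicitly defines a mesh that can be extracted afterwards.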

Paper Structure

This paper contains 25 sections, 2 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Overview. Radar2Shape solves the challenging task of 3D shape reconstruction from radar captured at limited viewing angles. (a) Limited views cause self-occlusion, resulting in missing information in the measurement. (b) Our approach overcomes this ambiguity by using a data-driven diffusion prior with a novel coarse-to-fine refinement technique in signed distance function space. This method accurately generates occluded geometries based on partial radar measurements, leading to better performance than (c) existing domain-adapted methods that can fail with limited views and struggle even in full observability.
  • Figure 2: Method. Radar2Shape consists of two stages: 1) learning a multi-resolution, hierarchical latent space for 3D shapes, and 2) training a diffusion model to denoise in this space by conditioning on radar responses. In this figure, three hierarchical levels ($L=3$) are shown. (a) In Stage 1, we learn per-point multiresolution features from a point cloud that are projected onto triplanes of $L$ different grid resolutions. (b) A VAE then reconstructs each triplane independently to keep feature resolutions separate in its latent space. (c) Features are combined across resolutions to reconstruct the 3D geometry. (d) In Stage 2, a Transformer learns a sequence of $L$ multiresolution radar embeddings from a radar response interleaved with the VAE's multiresolution latent shape features. This enables coarse-to-fine prediction in a conditional diffusion process. Green and purple modules represent parameters trained during Stage 1 and Stage 2, respectively.
  • Figure 3: Ablation. Reconstruction of learned hierarchical latent codes with mixed coarse and fine features. For chairs, the model learns that the fine features correspond to arms and legs, because coarse features maintain the overall shape while the arms and legs are added or removed. This interpretability experiment demonstrates that our hierarchical SDF training method does indeed capture these coarse and fine features geometrically.
  • Figure 4: Qualitative Results. Comparison of select reconstructions from held-out fully-observed radar responses of Manifold40-PO. Radar2Shape consistently outperforms all baselines across a diverse set of meshes. TMNet and LIST exhibit mode collapse, showing the difficulty of the radar-based 3D reconstruction problem when deterministic single/multi-view image-based reconstruction methods are adapted to it. Diffusion-SDF performs best among the baselines, but often fails at reconstructing low-level features (shown with the chairs, table legs, and number of airplane engines).
  • Figure 5: Qualitative Results on Real Data. Reconstructions of a monoconic object from its real radar response, using Radar2Shape trained on Manifold40-PO. Radar2Shape predicts a wider tip, but is able to correctly predict the overall shape, base width, height, and angle near the base with low variance.
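
The triplane projection described in the Figure 2 caption, step (a), can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `project_to_triplanes`, the mean-pooling of features per grid cell, the coordinate range $[-1, 1]^3$, and the resolutions 8, 16, and 32 are all hypothetical choices. The sketch also reuses one feature set at every resolution, whereas the caption describes learned per-point features for each level.

```python
import numpy as np

def project_to_triplanes(points, features, resolution):
    """Average per-point features into three axis-aligned planes (xy, xz, yz).

    points:   (N, 3) coordinates, assumed to lie in [-1, 1]^3.
    features: (N, C) per-point feature vectors.
    Returns an array of shape (3, resolution, resolution, C).
    """
    # Map coordinates from [-1, 1] to integer cell indices in [0, resolution - 1].
    cells = np.clip(((points + 1.0) / 2.0 * resolution).astype(int), 0, resolution - 1)
    channels = features.shape[1]
    planes = []
    for a, b in ((0, 1), (0, 2), (1, 2)):  # xy, xz, yz planes
        flat = cells[:, a] * resolution + cells[:, b]
        sums = np.zeros((resolution * resolution, channels))
        counts = np.zeros(resolution * resolution)
        # Unbuffered accumulation so points sharing a cell are all counted.
        np.add.at(sums, flat, features)
        np.add.at(counts, flat, 1.0)
        mean = sums / np.maximum(counts, 1.0)[:, None]  # empty cells stay zero
        planes.append(mean.reshape(resolution, resolution, channels))
    return np.stack(planes)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(1024, 3))
features = rng.normal(size=(1024, 8))
resolutions = (8, 16, 32)  # L = 3 levels, coarse to fine (assumed values)
pyramid = [project_to_triplanes(points, features, r) for r in resolutions]
```

Keeping one set of planes per resolution, rather than merging them, mirrors the caption's point that the VAE reconstructs each triplane independently so that coarse and fine features stay separate in the latent space.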