Can NeRFs See without Cameras?

Chaitanya Amballa, Sattwik Basu, Yu-Lin Wei, Zhijian Yang, Mehmet Ergezer, Romit Roy Choudhury

TL;DR

This work tackles the problem of inferring indoor floorplans from ambient wireless multipath signals without cameras. It introduces EchoNeRF, a physics-informed extension of NeRF that represents each voxel by an opacity $δ$ and an orientation $ω$, and learns from Tx/Rx locations and RSSI to recover an implicit 2D floorplan. Training proceeds in two stages: the first models line-of-sight power, and the second models first-order reflections using a discretized set of reflection orientations. Quantitative results on Zillow floorplans show that EchoNeRF improves wall-related metrics over baselines and supports forward tasks such as RSSI prediction and basic ray tracing, demonstrating the feasibility of neural wireless imaging. The approach advances RF-based scene understanding, with potential applications in indoor localization and wireless channel prediction, and points to future directions such as higher-order reflections and 3D floorplan extensions.
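To make the line-of-sight stage more concrete, the following is a minimal sketch of how a per-voxel opacity grid could attenuate the direct Tx-to-Rx path. It is not the paper's model: the $1/d^2$ free-space decay, the product of $(1 - δ)$ over crossed voxels, and the names `los_power`, `opacity`, `gain`, and `n_samples` are all assumptions made for illustration.

```python
import numpy as np

def los_power(opacity, tx, rx, n_samples=256, gain=1.0):
    """Hypothetical LoS power: free-space 1/d^2 decay multiplied by the
    transmittance of the voxels crossed by the Tx->Rx segment.

    opacity: 2D array indexed as opacity[x, y], values in [0, 1] (assumed layout).
    tx, rx:  (x, y) positions in voxel units.
    """
    tx, rx = np.asarray(tx, float), np.asarray(rx, float)
    d = np.linalg.norm(rx - tx)
    # Sample points along the segment and map them to voxel indices.
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    pts = tx + t * (rx - tx)
    idx = np.clip(np.floor(pts).astype(int), 0, np.array(opacity.shape) - 1)
    # Count each crossed voxel once so oversampling does not compound attenuation.
    idx = np.unique(idx, axis=0)
    transmittance = np.prod(1.0 - opacity[idx[:, 0], idx[:, 1]])
    return gain * transmittance / max(d, 1e-6) ** 2

# Usage: an empty 10x10 room, then a half-opaque voxel placed on the path.
grid = np.zeros((10, 10))
free = los_power(grid, (1.5, 5.5), (8.5, 5.5))
grid[5, 5] = 0.5
blocked = los_power(grid, (1.5, 5.5), (8.5, 5.5))
```

In this sketch a fully opaque voxel ($δ = 1$) on the path drives the line-of-sight power to zero, which is the kind of signal a first training stage could use to localize walls before reflections are modeled.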

Abstract

Neural Radiance Fields (NeRFs) have been remarkably successful at synthesizing novel views of 3D scenes by optimizing a volumetric scene function. This scene function models how optical rays bring color information from a 3D object to the camera pixels. Radio frequency (RF) or audio signals can also be viewed as a vehicle for delivering information about the environment to a sensor. However, unlike camera pixels, an RF/audio sensor receives a mixture of signals that contain many environmental reflections (also called "multipath"). Is it still possible to infer the environment using such multipath signals? We show that with redesign, NeRFs can be taught to learn from multipath signals, and thereby "see" the environment. As a grounding application, we aim to infer the indoor floorplan of a home from sparse WiFi measurements made at multiple locations inside the home. Although this is a difficult inverse problem, our implicitly learnt floorplans look promising and enable forward applications, such as indoor signal prediction and basic ray tracing.

Paper Structure

This paper contains 23 sections, 25 equations, 14 figures, 4 tables.

Figures (14)

  • Figure 1: LoS and correct multipath reflections.
  • Figure 2: Estimated Wall_IoU at various levels of injected noise $\sigma$.
  • Figure 3: Colored stripes define the manifold from which reflections are plausible between $\langle Tx, Rx \rangle$. Voxels located on the manifold form the plausible set $\mathcal{V}$. Dashed lines show plausible reflections.
  • Figure 4: EchoNeRF's two-stage training approach: In Stage 1, the LoS model is trained using known $Rx$ locations and signal power. This provides a warm-start to the reflection model in Stage 2 which refines the learned voxel densities and orientation.
  • Figure 5: Qualitative comparison of ground truth floorplans against baselines. In the first row, red stars denote Tx locations and light gray dots denote Rx measurement locations. The bottom two rows show floorplans learnt by EchoNeRF_LoS (i.e., Stage 1) and EchoNeRF (i.e., Stage 2) with sharper walls and boundaries. More visualizations are available at https://echonerf.github.io/
  • ...and 9 more figures