RISE: Single Static Radar-based Indoor Scene Understanding
Kaichen Zhou, Laura Dodds, Sayed Saad Afzal, Fadel Adib
TL;DR
RISE tackles privacy-preserving indoor scene understanding with a single static mmWave radar by treating multipath ghosts as geometric cues rather than noise. The pipeline has three parts: Bi-Angular Multipath Enhancement, which models Angle-of-Arrival and Angle-of-Departure to reveal off-diagonal ghosts; multipath inversion, which recovers reflector geometry; and a Sim2Real Hierarchical Diffusion model, which produces complete wall layouts and furniture detections from partial radar data. The authors introduce the RISE-Indoor Benchmark (50k frames, 100 trajectories across 11 environments) and report state-of-the-art layout accuracy (Chamfer Distance of about 16 cm, a 60% reduction over prior work) and the first mmWave-based furniture detection (about 58% IoU). The work lays a foundation for geometry-aware, privacy-preserving indoor perception with an inexpensive sensor and points to future extensions to dynamic or multi-person scenes in real-world deployments.
Abstract
Robust and privacy-preserving indoor scene understanding remains a fundamental open problem. While optical sensors such as RGB and LiDAR offer high spatial fidelity, they suffer from severe occlusions and introduce privacy risks in indoor environments. In contrast, millimeter-wave (mmWave) radar preserves privacy and penetrates obstacles, but its inherently low spatial resolution makes reliable geometric reasoning difficult. We introduce RISE, the first benchmark and system for single-static-radar indoor scene understanding, jointly targeting layout reconstruction and object detection. RISE is built upon the key insight that multipath reflections, traditionally treated as noise, encode rich geometric cues. To exploit this, we propose a Bi-Angular Multipath Enhancement that explicitly models Angle-of-Arrival and Angle-of-Departure to recover secondary (ghost) reflections and reveal invisible structures. On top of these enhanced observations, a simulation-to-reality Hierarchical Diffusion framework transforms fragmented radar responses into complete layout reconstruction and object detection. Our benchmark contains 50,000 frames collected across 100 real indoor trajectories, forming the first large-scale dataset dedicated to radar-based indoor scene understanding. Extensive experiments show that RISE reduces the Chamfer Distance by 60% (down to 16 cm) compared to the state of the art in layout reconstruction, and delivers the first mmWave-based object detection, achieving 58% IoU. These results establish RISE as a new foundation for geometry-aware and privacy-preserving indoor scene understanding using a single static radar.
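The two headline metrics, Chamfer Distance for layouts and IoU for detections, can be illustrated with a minimal sketch. It assumes 2D point sets in meters, axis-aligned boxes, and the averaged, unsquared form of the symmetric Chamfer Distance. The abstract does not specify the exact variant or box parameterization used in RISE, so the function names and conventions below are illustrative only.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 2) and b (M, 2).

    Assumed convention: mean nearest-neighbor Euclidean distance in each
    direction, averaged over the two directions. Other papers use sums or
    squared distances instead.
    """
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def box_iou(box_a, box_b) -> float:
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical example: a predicted wall offset by 0.16 m from the true wall.
true_wall = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
pred_wall = true_wall + np.array([0.0, 0.16])
cd = chamfer_distance(pred_wall, true_wall)

# Hypothetical example: two 2 m x 2 m furniture boxes overlapping in a 1 m x 1 m corner.
iou = box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

Under this convention, a predicted wall that is uniformly shifted by 16 cm yields a Chamfer Distance of 0.16 m, which gives a rough sense of scale for the reported layout error.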
