PanoDP: Learning Collision-Free Navigation with Panoramic Depth and Differentiable Physics

Hao Zhong, Pei Chi, Jiang Zhao, Shenghai Yuan, Xuyang Gao, Thien-Minh Nguyen, Lihua Xie

TL;DR

PanoDP is a communication-free learning framework that combines four-view panoramic depth perception with differentiable-physics-based training signals. It optimizes policies with dense differentiable collision and motion-feasibility terms, which improves training stability relative to learning from sparse terminal collision signals alone.

Abstract

Autonomous collision-free navigation in cluttered environments requires safe decision-making under partial observability with both static structure and dynamic obstacles. We present PanoDP, a communication-free learning framework that combines four-view panoramic depth perception with differentiable-physics-based training signals. PanoDP encodes panoramic depth using a lightweight CNN and optimizes policies with dense differentiable collision and motion-feasibility terms, improving training stability beyond sparse terminal collisions. We evaluate PanoDP on a controlled ring-to-center benchmark with systematic sweeps over agent count, obstacle density/layout, and dynamic behaviors, and further test out-of-distribution generalization in an external simulator (e.g., AirSim). Across settings, PanoDP increases collision-free and completion rates over single-view and non-physics-guided baselines under matched training budgets, and ablations (view masking, rotation augmentation) confirm the policy leverages 360-degree information. Code will be open-sourced upon acceptance.
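To illustrate why a dense differentiable collision term can give a more useful training signal than a sparse terminal collision flag, the following minimal sketch compares the two on a single scalar obstacle distance. The softplus form, the function names, and the `margin` and `temperature` parameters are illustrative assumptions; the abstract does not specify the loss terms PanoDP actually uses.

```python
import math


def sparse_collision_cost(distance, radius):
    """Indicator-style cost: 1 on contact, 0 otherwise.

    Its derivative with respect to distance is zero almost everywhere,
    so it gives no gradient before a collision happens.
    """
    return 1.0 if distance < radius else 0.0


def dense_collision_cost(distance, margin, temperature=0.1):
    """Hypothetical dense cost: temperature * softplus((margin - distance) / temperature).

    Smooth in distance and already positive while the agent is still
    outside the safety margin.
    """
    z = (margin - distance) / temperature
    # Numerically stable softplus: max(z, 0) + log(1 + exp(-|z|)).
    return temperature * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))


def dense_collision_grad(distance, margin, temperature=0.1):
    """Analytic derivative of the dense cost: -sigmoid((margin - distance) / temperature)."""
    z = (margin - distance) / temperature
    if z >= 0.0:
        sigmoid = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        sigmoid = e / (1.0 + e)
    return -sigmoid


if __name__ == "__main__":
    d, margin = 0.6, 0.5  # agent is 0.1 outside the assumed safety margin
    print("sparse cost:", sparse_collision_cost(d, margin))
    print("dense cost:", dense_collision_cost(d, margin))
    print("dense gradient:", dense_collision_grad(d, margin))
```

In this sketch the sparse cost is zero until contact, whereas the dense cost has a negative derivative with respect to distance before contact, so a gradient-based optimizer is pushed toward larger clearance at every step of a rollout.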

Paper Structure (15 sections, 12 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Panoramic depth construction. (a) Four cameras with $100^{\circ}$ FOV cover the full azimuth with overlapping regions. (b) Cosine-squared blend weights $w_k(\theta)$ ensure smooth transitions. (c) The resulting equirectangular panorama $\mathbf{P}^i_t$ wraps continuously at $0^{\circ}/360^{\circ}$.
  • Figure 2: PanoDP training pipeline. At each time step the four onboard depth images are stitched into a $360^{\circ}$ panorama, encoded by the circular CNN, fused with the body-frame state vector in a GRU cell, and mapped to an acceleration command. Because the dynamics integrator (Eqs. \ref{eq:pos_update}--\ref{eq:vel_update}) is fully differentiable, the trajectory loss $\mathcal{L}$ can be back-propagated through the entire $T$-step unrolled graph (dashed arrows), yielding exact gradients $\partial\mathcal{L}/\partial\theta$ that update all learnable parameters in a single pass.
  • Figure 3: Panorama and forward-depth training comparison over 50 K iterations (6 metrics). Insets zoom into the converged tail (40--50 K steps). PanoDP (blue, solid) consistently outperforms the forward-only baseline DPD$^\dagger$ (grey, dashed) across all metrics.
  • Figure 4: Ablation training curves over 50 K iterations (6 metrics), with tail zoom (40--50 K). Full PanoDP converges most stably with lowest variance. Removing the GRU raises jerk and snap; removing circular convolutions or using a flat MLP yields higher plateaus.
  • Figure 5: Ablation bar chart (tail-mean at 50 K iterations) across eight training metrics. PanoDP (Ours, red border) achieves the best or near-best value on every metric. Stars ($\star$) mark the best method per subplot.
  • ...and 3 more figures
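The cosine-squared blending described in the Figure 1 caption can be sketched as follows. The $100^{\circ}$ FOV and the four-camera layout come from the caption; the camera yaw angles ($0^{\circ}$, $90^{\circ}$, $180^{\circ}$, $270^{\circ}$), the exact form of $w_k(\theta)$, and the normalization step are assumptions made for illustration and may differ from the paper's definition.

```python
import math

FOV_DEG = 100.0                               # per-camera FOV (from the Figure 1 caption)
CAMERA_YAWS_DEG = (0.0, 90.0, 180.0, 270.0)   # assumed mounting directions


def wrapped_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def raw_weight(theta_deg, yaw_deg, fov_deg=FOV_DEG):
    """Assumed cosine-squared window: 1 at the optical axis, 0 at the FOV edge."""
    half = fov_deg / 2.0
    delta = abs(wrapped_diff_deg(theta_deg, yaw_deg))
    if delta >= half:
        return 0.0
    return math.cos(0.5 * math.pi * delta / half) ** 2


def blend_weights(theta_deg):
    """Normalized weights w_k(theta) over the four cameras; they sum to 1."""
    raw = [raw_weight(theta_deg, yaw) for yaw in CAMERA_YAWS_DEG]
    total = sum(raw)  # always positive: the nearest camera is at most 45 degrees away
    return [w / total for w in raw]


if __name__ == "__main__":
    for theta in (0.0, 40.0, 45.0, 50.0, 315.0, 360.0):
        print(theta, [round(w, 3) for w in blend_weights(theta)])
```

With these assumptions each azimuth outside the $10^{\circ}$ overlap bands is served by a single camera, the two adjacent cameras share the weight equally at the midpoint of an overlap (for example $45^{\circ}$), and the weights at $0^{\circ}$ and $360^{\circ}$ coincide, which matches the continuous wrap-around described in panel (c).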