E(2)-Equivariant Graph Planning for Navigation

Linfeng Zhao, Hongyu Li, Taskin Padir, Huaizu Jiang, Lawson L. S. Wong

TL;DR

This work tackles data-efficient 2D robot navigation by leveraging Euclidean symmetry in planning. It formulates navigation on geometric graphs and develops an $E(2)$-equivariant differentiable planner (MP-VIN), augmented with a learnable $C_K$-equivariant lifting layer for multi-camera inputs. The approach achieves $G$-equivariance across inputs and outputs, enabling continuous actions in ${\mathbb{R}}^2$ and improved training efficiency, stability, and generalization across grid, graph, Miniworld, and semantic navigation tasks. The results highlight the practical impact of symmetry-aware planning for robust, scalable navigation in unstructured environments.

Abstract

Learning for robot navigation is a critical and challenging task. Real-world datasets are scarce and costly to collect, which calls for data-efficient learning approaches. In this letter, we exploit Euclidean symmetry in planning for 2D navigation, which originates from Euclidean transformations between reference frames and enables parameter sharing. To address the challenges of unstructured environments, we formulate the navigation problem as planning on a geometric graph and develop an equivariant message passing network to perform value iteration. Furthermore, to handle multi-camera input, we propose a learnable equivariant layer to lift features to a desired space. We conduct comprehensive evaluations across five diverse tasks covering structured and unstructured environments, known and unknown maps, and point goals or semantic goals. Our experiments confirm substantial benefits in training efficiency, stability, and generalization. More details can be found at the project website: https://lhy.xyz/e2-planning/.
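To make the idea of value iteration as message passing on a geometric graph concrete, the following is a minimal sketch. It is not the learned MP-VIN planner: it uses a classical Bellman backup with edge lengths as costs and a min aggregation, and the names `value_iteration` and `rigid_transform` as well as the toy graph are assumptions made for illustration. It shows that the values are unchanged by a rotation and translation of the node positions, while the per-node relative moves rotate with the frame.

```python
import math

def value_iteration(pos, edges, goal, k=20):
    """Classical cost-to-go value iteration written as message passing.

    pos:   list of (x, y) node positions
    edges: list of directed (receiver, sender) index pairs
    goal:  index of the goal node
    Returns per-node values and per-node relative moves in R^2.
    """
    n = len(pos)
    values = [math.inf] * n
    values[goal] = 0.0
    best = list(range(n))  # best next node; each node starts by pointing at itself
    for _ in range(k):
        new_values = list(values)
        for i, j in edges:
            # Message from neighbor j to node i: edge length plus j's current value.
            q = math.dist(pos[i], pos[j]) + values[j]
            if q < new_values[i]:  # min aggregation over incoming messages
                new_values[i], best[i] = q, j
        values = new_values
    actions = [(pos[j][0] - pos[i][0], pos[j][1] - pos[i][1])
               for i, j in enumerate(best)]
    return values, actions

def rigid_transform(points, theta, shift):
    """Rotate 2D points by theta (radians), then translate by shift."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1])
            for x, y in points]

# Hypothetical toy graph: five nodes, bidirectional edges, goal at node 4.
pos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (1.0, 1.5), (3.0, 2.0)]
undirected = [(0, 1), (1, 2), (0, 3), (3, 4), (2, 4)]
edges = undirected + [(j, i) for i, j in undirected]
goal = 4
theta = 0.7

values, actions = value_iteration(pos, edges, goal)
values_t, actions_t = value_iteration(
    rigid_transform(pos, theta, (5.0, -2.0)), edges, goal)

# Actions are displacements, so they rotate but do not translate.
rotated_actions = rigid_transform(actions, theta, (0.0, 0.0))
```

Here `values` and `values_t` agree up to floating-point error, and `actions_t` matches `rotated_actions`. This planner has the property for free because it only uses distances; a learned planner needs the equivariance built into its layers.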

Paper Structure (34 sections, 23 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Illustration of rotation equivariance. We provide a side-by-side comparison with SymVIN (Zhao et al., 2022). We use the blue arrow to show the orientation of the robot. Rotating the robot $\circlearrowright 90^{\circ}$ is equivalent to rotating the world frame $\circlearrowleft 90^{\circ}$. When camera views are cyclically permuted, the action output (red arrow) is transformed by a rotation matrix. The state space of SymVIN (left) is confined to the grid, and it only produces discrete actions. Our approach acts on continuous 2D space and produces ${\mathbb{R}}^2$ actions.
  • Figure 2: Overview of the message passing planner network (MP-VIN). It takes the map $M$ as input, which contains the node positions ${\bm{x}} \in {\mathbb{R}}^2$ and is optionally augmented with goal information (the red node is the goal node) or observations, depending on the navigation task. Value iteration is then applied $k$ times, during which the state value map $h_V$ and Q-value map $h_Q$ are updated. The final output is an action map $\Pi$ that assigns each node a continuous relative movement $\Delta x \in {\mathbb{R}}^2$.
  • Figure 3: Our proposed $\texttt{lift}$ layer and its equivariance.
  • Figure 4: Learning curves on the Grid World experiments (left two) and the Graph World experiments (right two). The shaded area shows the standard error. Dashed lines are for non-MP-VIN methods (VIN, SymVIN, GCN-VIN, and GAT-VIN).
  • Figure 5: Data efficiency and size generalization. We demonstrate data efficiency across 100, 256, and 512 training samples. For models trained on each dataset, we show size generalization by training them on the smallest size and directly testing them on larger ones.
  • ...and 3 more figures
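
The property described in the Figure 1 caption, where cyclically permuting the camera views rotates the action output, can be checked on a toy example. The sketch below is not the paper's learnable $\texttt{lift}$ layer: it assumes one scalar feature per camera, $K$ cameras evenly spaced around the robot, and a fixed shared scoring function, and the names `directional_readout` and `cyclic_shift` are hypothetical.

```python
import math

def directional_readout(features, score=math.tanh):
    """Map K per-camera scalars to a vector in R^2.

    The same scoring function is applied to every camera (parameter sharing),
    and each score weights the unit vector along that camera's assumed
    viewing direction 2*pi*k/K.
    """
    K = len(features)
    ax = sum(score(f) * math.cos(2 * math.pi * k / K)
             for k, f in enumerate(features))
    ay = sum(score(f) * math.sin(2 * math.pi * k / K)
             for k, f in enumerate(features))
    return ax, ay

def cyclic_shift(features, steps=1):
    """Cyclically permute camera features: new[k] = old[k - steps]."""
    steps %= len(features)
    return features[-steps:] + features[:-steps]

# Hypothetical features for K = 4 cameras.
features = [0.2, -1.0, 0.7, 1.5]
ax, ay = directional_readout(features)
bx, by = directional_readout(cyclic_shift(features))

# A one-step shift with K = 4 rotates the output by 90 degrees
# counterclockwise, i.e. (ax, ay) becomes (-ay, ax).
print((bx, by), (-ay, ax))
```

The two printed vectors agree up to floating-point error. In general, a shift by one camera rotates the output by $2\pi/K$, which is the $C_K$-equivariance a multi-camera input layer is meant to preserve.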