X-IONet: Cross-Platform Inertial Odometry Network with Dual-Stage Attention

Dehan Shen, Changhao Chen

TL;DR

X-IONet is an IMU-only, cross-platform inertial odometry framework that combines a rule-based expert selector, a dual-stage attention displacement predictor, and EKF-based state estimation. The predictor models long-range temporal and inter-axis dependencies and outputs displacement together with its uncertainty, which the EKF fuses for robust state estimation. Compared to strong baselines, X-IONet reduces ATE by 14.3% and RTE by 11.4% on pedestrian data, and by 52.8% and 41.3% on quadruped robot data, with evaluation on RoNIN (pedestrian) and a self-collected Go2 quadruped dataset. Ablations support the contribution of each component, including the Huber–Gaussian loss used for uncertainty modeling. The results indicate cross-platform generalization and robustness to aggressive and irregular motion, suggesting practical applicability to IMU-only navigation for both humans and legged robots, with potential for broader multimodal integration.

Abstract

Learning-based inertial odometry has achieved remarkable progress in pedestrian navigation. However, extending these methods to quadruped robots remains challenging due to their distinct and highly dynamic motion patterns. Models that perform well on pedestrian data often experience severe degradation when deployed on legged platforms. To tackle this challenge, we introduce X-IONet, a cross-platform inertial odometry framework that operates solely using a single Inertial Measurement Unit (IMU). X-IONet incorporates a rule-based expert selection module to classify motion platforms and route IMU sequences to platform-specific expert networks. The displacement prediction network features a dual-stage attention architecture that jointly models long-range temporal dependencies and inter-axis correlations, enabling accurate motion representation. It outputs both displacement and associated uncertainty, which are further fused through an Extended Kalman Filter (EKF) for robust state estimation. Extensive experiments on public pedestrian datasets and a self-collected quadruped robot dataset demonstrate that X-IONet achieves state-of-the-art performance, reducing Absolute Trajectory Error (ATE) by 14.3% and Relative Trajectory Error (RTE) by 11.4% on pedestrian data, and by 52.8% and 41.3% on quadruped robot data. These results highlight the effectiveness of X-IONet in advancing accurate and robust inertial navigation across both human and legged robot platforms.
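To make the reported metrics concrete, the following is a minimal sketch of how ATE, RTE, and a relative reduction could be computed. It assumes ATE is the RMSE of position error over a whole trajectory and RTE is the mean RMSE over fixed-length windows after aligning each window's start point, as is common in the inertial odometry literature. The paper's exact evaluation protocol is not given in this summary, and the function names and the 2D tuple representation are hypothetical.

```python
import math

def ate(est, gt):
    """RMSE of position error between two equally long 2D trajectories."""
    if len(est) != len(gt) or not est:
        raise ValueError("trajectories must be non-empty and equally long")
    sq = [(ex - gx) ** 2 + (ey - gy) ** 2
          for (ex, ey), (gx, gy) in zip(est, gt)]
    return math.sqrt(sum(sq) / len(sq))

def rte(est, gt, window):
    """Mean windowed RMSE (assumed definition, not taken from the paper).

    Each window of `window` samples is shifted so that the estimate starts
    at the ground-truth start point, which removes drift accumulated earlier.
    """
    if len(est) != len(gt):
        raise ValueError("trajectories must be equally long")
    errors = []
    for i in range(0, len(est) - window + 1, window):
        dx = gt[i][0] - est[i][0]
        dy = gt[i][1] - est[i][1]
        shifted = [(x + dx, y + dy) for x, y in est[i:i + window]]
        errors.append(ate(shifted, gt[i:i + window]))
    if not errors:
        raise ValueError("trajectory shorter than one window")
    return sum(errors) / len(errors)

def relative_reduction(baseline, ours):
    """Percentage by which `ours` improves on `baseline` (lower is better)."""
    return 100.0 * (baseline - ours) / baseline
```

Under these definitions, a statement such as "reducing ATE by 14.3%" corresponds to `relative_reduction(baseline_ate, xionet_ate)` returning 14.3, where both arguments are errors measured on the same test sequences.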

Paper Structure

This paper contains 20 sections, 22 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Cross-Platform Inertial Odometry Network for pedestrians and quadruped robots. The temporal and frequency-domain visualizations reveal distinct differences between pedestrian and quadruped inertial signals.
  • Figure 2: Overall architecture of the proposed X-IONet framework. The raw inertial data are rotated using the attitude estimated by the EKF and then fed into a rule-based expert selection network that identifies the motion platform. The data are then routed to the corresponding displacement prediction network, which regresses displacement and uncertainty. The predictions are refined by the EKF to achieve accurate and robust cross-platform inertial odometry.
  • Figure 3: The quadruped robot used in the experiments.
  • Figure 4: Trajectory comparisons for a subset of the experimental results. The top three plots compare different methods on the Go2_Easy_01, Go2_Medium_02, and Go2_Hard_02 sequences, while the bottom three plots correspond to the RoNIN_a037_1, RoNIN_a038_2, and RoNIN_a044_2 sequences. In the figures, darker and thicker trajectories indicate a closer match to the ground-truth motion paths. Each plot magnifies selected regions to better illustrate the differences among the methods.
  • Figure 5: The trajectory of the quadruped robot predicted using the pedestrian model.
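
The final stage in the Figure 2 caption, where predicted displacement and uncertainty are refined through the EKF, can be illustrated with a deliberately simplified sketch. It shows a scalar Kalman measurement update applied to one displacement axis, and assumes the filter's propagated displacement and the network's predicted displacement are each summarized by a mean and a variance. This is a stand-in for intuition only, not the paper's EKF (whose state and measurement model are not described in this summary), and the function name `fuse_displacement` is hypothetical.

```python
def fuse_displacement(d_prop, var_prop, d_net, var_net):
    """Scalar Kalman update for one displacement axis (illustrative only).

    d_prop, var_prop: displacement and variance propagated by the filter.
    d_net, var_net:   displacement and variance predicted by the network.
    Returns the fused displacement and its variance.
    """
    if var_prop < 0 or var_net <= 0:
        raise ValueError("variances must be non-negative (var_net > 0)")
    # The gain weights the network's measurement by relative confidence:
    # a large var_net (uncertain network) pushes the gain toward zero.
    gain = var_prop / (var_prop + var_net)
    fused = d_prop + gain * (d_net - d_prop)
    fused_var = (1.0 - gain) * var_prop
    return fused, fused_var
```

With equal variances the update averages the two estimates: `fuse_displacement(1.0, 1.0, 3.0, 1.0)` yields a fused displacement of 2.0 with variance 0.5. When the network reports high uncertainty, for example under aggressive motion, the fused value stays close to the propagated one, which is the reason for regressing uncertainty alongside displacement.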