IONext: Unlocking the Next Era of Inertial Odometry

Shanshan Zhang; Qi Zhang; Siyue Wang; Tianshui Wen; Liqin Wu; Ziheng Zhou; Xuemin Hong; Ao Peng; Lingxiang Zheng; Yu Yang

TL;DR

IONext targets drift and poor generalization in inertial odometry by combining the inductive bias of CNNs with Transformer-inspired adaptability in its Adaptive Dynamic Encoder (ADE). ADE comprises ADM and AGU, which jointly capture contextual motion and fine-grained local variations and enable input-adaptive, multi-scale feature fusion. The work also introduces the Absolute Length Error (ALE) metric, which uses length-based normalization, and reports state-of-the-art results on six public datasets; on RNIN, it reduces the average ATE by 10% and the average RTE by 12% relative to iMOT. The resulting CNN backbone with Transformer-inspired design is intended for robust, efficient, infrastructure-free localization across diverse environments.

Abstract

Researchers have increasingly adopted Transformer-based models for inertial odometry. While Transformers excel at modeling long-range dependencies, their limited sensitivity to local, fine-grained motion variations and lack of inherent inductive biases often hinder localization accuracy and generalization. Recent studies have shown that incorporating large-kernel convolutions and Transformer-inspired architectural designs into CNNs can effectively expand the receptive field, thereby improving global motion perception. Motivated by these insights, we propose a novel CNN-based module called the Dual-wing Adaptive Dynamic Mixer (DADM), which adaptively captures both global motion patterns and local, fine-grained motion features from dynamic inputs. This module dynamically generates selective weights based on the input, enabling efficient multi-scale feature aggregation. To further improve temporal modeling, we introduce the Spatio-Temporal Gating Unit (STGU), which selectively extracts representative and task-relevant motion features in the temporal domain. This unit addresses the limitations of temporal modeling observed in existing CNN approaches. Built upon DADM and STGU, we present a new CNN-based inertial odometry backbone, named Next Era of Inertial Odometry (IONext). Extensive experiments on six public datasets demonstrate that IONext consistently outperforms state-of-the-art (SOTA) Transformer- and CNN-based methods. For instance, on the RNIN dataset, IONext reduces the average ATE by 10% and the average RTE by 12% compared to the representative model iMOT.
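To make the idea of input-dependent selective weights more concrete, the following is a minimal NumPy sketch of one generic way to fuse a small-kernel (local) branch and a large-kernel (global) branch with softmax weights computed from the input. It is not the authors' DADM: the function names (`depthwise_conv1d`, `adaptive_two_branch_mix`), the gate matrix `w_gate`, the kernel sizes, and the 6-channel IMU layout are all assumptions made for illustration, and the abstract does not specify how the actual weights are generated.

```python
import numpy as np


def depthwise_conv1d(x, kernel):
    """Same-padded depthwise 1D cross-correlation.

    x: (channels, time) signal; kernel: (channels, k) with odd k.
    Each channel is filtered only with its own kernel row.
    """
    channels, k = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.empty(x.shape, dtype=float)
    for c in range(channels):
        out[c] = np.correlate(xp[c], kernel[c], mode="valid")
    return out


def softmax(z, axis=0):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def adaptive_two_branch_mix(x, k_small, k_large, w_gate):
    """Fuse a local and a global branch with input-dependent weights.

    w_gate: hypothetical gate parameters of shape (2, channels, channels).
    Returns the fused features and the per-channel branch weights.
    """
    local_feat = depthwise_conv1d(x, k_small)    # fine-grained variations
    global_feat = depthwise_conv1d(x, k_large)   # wide temporal context
    # Summarise the current input by average pooling over time.
    descriptor = (local_feat + global_feat).mean(axis=1)   # (channels,)
    logits = w_gate @ descriptor                           # (2, channels)
    weights = softmax(logits, axis=0)   # branch weights sum to 1 per channel
    fused = weights[0][:, None] * local_feat + weights[1][:, None] * global_feat
    return fused, weights


# Illustrative usage on random data standing in for a 6-axis IMU window.
rng = np.random.default_rng(0)
imu_window = rng.standard_normal((6, 200))   # assumed: 3 accel + 3 gyro axes
k_small = rng.standard_normal((6, 3))        # assumed small kernel size
k_large = rng.standard_normal((6, 31))       # assumed large kernel size
w_gate = 0.1 * rng.standard_normal((2, 6, 6))
fused, weights = adaptive_two_branch_mix(imu_window, k_small, k_large, w_gate)
print(fused.shape, weights.shape)
```

Because the weights are recomputed from each input window, the balance between the local and global branches can change from one motion segment to the next, which is the behavior the abstract attributes to dynamically generated selective weights. In a trained model the kernels and gate parameters would be learned rather than drawn at random.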
