Future-Aware Interaction Network For Motion Forecasting

Shijie Li; Xun Xu; Si Yong Yeo; Xulei Yang

TL;DR

FINet addresses the multimodal motion forecasting problem by integrating potential future trajectories into scene encoding, enabling joint optimization of historical and future states. It combines a Lightweight Scene Encoder, Future-Aware Mamba with an Adaptive Reorder Strategy, and a Temporal Enhanced Decoder to produce diverse, temporally coherent trajectories with linear-scaling efficiency via the Mamba State Space Model. The method achieves state-of-the-art or competitive results on Argoverse 1 and 2, with substantial reductions in latency, memory, and FLOPs compared to transformer-based baselines. This work advances practical, real-time motion forecasting for autonomous driving by improving accuracy and efficiency while capturing diverse plausible futures.

Abstract

Motion forecasting is a crucial component of autonomous driving systems, enabling the generation of accurate and smooth future trajectories to ensure safe navigation to the destination. In previous methods, potential future trajectories are often absent from the scene encoding stage, which may lead to suboptimal outcomes. Additionally, prior approaches typically employ transformer architectures for spatiotemporal modeling of trajectories and map information, and their cost scales quadratically with sequence length. In this work, we propose an interaction-based method, named Future-Aware Interaction Network, that introduces potential future trajectories into scene encoding for a comprehensive traffic representation. Furthermore, a State Space Model (SSM), specifically Mamba, is introduced for both spatial and temporal modeling. To adapt Mamba for spatial interaction modeling, we propose an adaptive reordering strategy that transforms unordered data into a structured sequence. Mamba is also employed to refine the generated future trajectories temporally, yielding more consistent predictions. These enhancements improve model efficiency as well as the accuracy and diversity of predictions. We conduct comprehensive experiments on the widely used Argoverse 1 and Argoverse 2 datasets, demonstrating that the proposed method outperforms previous approaches while being more efficient. The code will be released upon acceptance.
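To make the efficiency argument concrete, the following is a minimal sketch of a linear state space recurrence. It is not the authors' model and not the Mamba implementation: the function names and the scalar parameters `a`, `b`, `c` are hypothetical, and the parameters are fixed here, whereas Mamba makes them depend on the input. The sketch only illustrates that a recurrent scan visits each sequence element once, while full self-attention scores every pair of elements.

```python
from typing import List, Sequence


def ssm_scan(inputs: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Illustrative only: a, b, c are fixed scalars (an assumption for this
    sketch), so the cost is one update per element, i.e. linear in length.
    """
    h = 0.0  # hidden state carried from one step to the next
    outputs: List[float] = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


def pairwise_interaction_count(n: int) -> int:
    """Number of query-key pairs that full self-attention scores for n tokens."""
    return n * n


if __name__ == "__main__":
    # Impulse input: the response decays geometrically with factor a.
    print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0))
    # Doubling the sequence length doubles the scan steps but quadruples the pairs.
    for n in (64, 128):
        print(n, "scan steps:", n, "attention pairs:", pairwise_interaction_count(n))
```

Because the recurrence processes elements in a fixed order, an unordered set of scene elements has to be arranged into a sequence before it can be scanned, which is the role the abstract assigns to the adaptive reordering strategy.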
