IA-LSTM: Interaction-Aware LSTM for Pedestrian Trajectory Prediction
Yuehai Chen
TL;DR
The paper tackles pedestrian trajectory prediction in crowded environments by introducing IA-LSTM, which integrates a correntropy-based Interaction Module to quantify the relative importance of human–human interactions and to establish per-pedestrian personal space. Each pedestrian is modeled with a shared LSTM that ingests both an individual embedding and an interaction embedding derived from an Interaction Tensor, where correntropy weights emphasize nearby, socially relevant influences via $CE_{ij} = \exp(-||\boldsymbol{p}_i^t - \boldsymbol{p}_j^t||^2 /(2\sigma^2))$ and $\boldsymbol{H}_i^t = \sum_j CE_{ij} \boldsymbol{h}_j^{t-1}$. The model outputs a bivariate Gaussian for the next position using parameters $(\mu_i^{t+1}, \sigma_i^{t+1}, \rho_i^{t+1}) = W_o \, \boldsymbol{h}_i^t$, and is trained via negative log-likelihood. Empirical results on ETH/Hotel/ZARA/UCY show IA-LSTM consistently outperforms Social LSTM and Social Attention, with notable gains in crowded scenes and real-time inference speed (~0.049 s per step on a 3090 GPU). The work highlights correntropy as a robust, data-driven mechanism to capture personal space and varying interaction importance, enabling more accurate and socially plausible trajectory predictions in complex crowds.
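The two formulas in the summary above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`correntropy_weight`, `pooled_hidden`, `bivariate_gaussian_nll`) are hypothetical, and skipping the pedestrian's own hidden state in the sum (j = i) is an assumption, since the summary does not say whether the self term is included. The loss is the standard negative log-likelihood of a bivariate Gaussian, which is assumed here to match the training objective described.

```python
import math


def correntropy_weight(p_i, p_j, sigma):
    """CE_ij = exp(-||p_i - p_j||^2 / (2 * sigma^2)) for 2-D positions."""
    sq_dist = (p_i[0] - p_j[0]) ** 2 + (p_i[1] - p_j[1]) ** 2
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def pooled_hidden(i, positions, prev_hidden, sigma):
    """H_i^t = sum_j CE_ij * h_j^{t-1}.

    Assumption: the pedestrian's own state (j == i) is skipped.
    """
    dim = len(prev_hidden[0])
    pooled = [0.0] * dim
    for j, (p_j, h_j) in enumerate(zip(positions, prev_hidden)):
        if j == i:
            continue
        w = correntropy_weight(positions[i], p_j, sigma)
        for k in range(dim):
            pooled[k] += w * h_j[k]
    return pooled


def bivariate_gaussian_nll(target, mu, std, rho):
    """Negative log-likelihood of a 2-D point under a bivariate Gaussian.

    Requires std[0] > 0, std[1] > 0 and |rho| < 1.
    """
    dx = (target[0] - mu[0]) / std[0]
    dy = (target[1] - mu[1]) / std[1]
    one_minus_rho2 = 1.0 - rho ** 2
    z = dx * dx + dy * dy - 2.0 * rho * dx * dy
    log_norm = math.log(
        2.0 * math.pi * std[0] * std[1] * math.sqrt(one_minus_rho2)
    )
    return log_norm + z / (2.0 * one_minus_rho2)


# Toy scene: three pedestrians with 2-D hidden states (illustrative values).
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
prev_hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(pooled_hidden(0, positions, prev_hidden, sigma=1.0))
print(bivariate_gaussian_nll((0.5, 0.5), (0.0, 0.0), (1.0, 1.0), 0.0))
```

Because the weight decays with squared distance, the neighbor at distance 1 contributes far more to the pooled state than the one at distance 2, and a smaller `sigma` shrinks the effective personal space further.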
Abstract
Predicting pedestrian trajectories in crowded scenes is indispensable for self-driving vehicles and autonomous mobile robots, because estimating the future locations of nearby pedestrians supports policy decisions that avoid collisions. The problem is challenging because humans have different walking motions, and the interactions between humans and objects in the environment, especially among humans themselves, are complex. Previous work focused on how to model human-human interactions but neglected the relative importance of those interactions. To address this issue, a novel mechanism based on correntropy is introduced. The proposed mechanism not only measures the relative importance of human-human interactions but also builds a personal space for each pedestrian. An interaction module that includes this data-driven mechanism is further proposed. Within the module, the mechanism extracts feature representations of dynamic human-human interactions in the scene and calculates corresponding weights that represent the importance of different interactions. To share such social messages among pedestrians, an interaction-aware architecture based on the long short-term memory network is designed for trajectory prediction. Experiments on two public datasets demonstrate that our model outperforms several recent, well-performing methods.
