Vehicle Behavior Prediction by Episodic-Memory Implanted NDT

Peining Shen, Jianwu Fang, Hongkai Yu, Jianru Xue

TL;DR

The paper tackles interpretable prediction of target vehicle behaviors in autonomous driving by introducing eMem-NDT, a Neural Decision Tree that implants an Episodic Memory Bank into leaf nodes. A base vBeh-Pre model (using a Transformer for vehicle history and EvolveGCN for interactions) provides strong predictive power, while eMem-NDT replaces the softmax with a memory-guided decision process that traverses memory prototypes via Memory Prototype Matching (MPM) and Leaf Link Aggregation (LLA). The Episodic Memory Bank stores representative past scene interactions, filtered by a Leaf Node Memory Filter to reduce redundancy, and is aligned with input features through learned projections. Experiments on BLVD and LOKI demonstrate improved precision and F1 over baselines, with notable gains in few-shot scenarios and transparent root-to-leaf inference paths that enhance interpretability. The work offers a practical path toward trustworthy, explainable driving decisions in safety-critical systems, and provides code for reproducibility.
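
The two-stage inference described above can be illustrated with a small sketch. It is not the authors' implementation: the toy tree, the 2-D features, the use of cosine similarity, the mean over child scores, and the softmax over sibling links are all assumptions chosen for illustration, in the style of soft inference in neural-backed decision trees. The function names `match`, `aggregate`, and `predict` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy tree: inner nodes hold "children"; leaves hold a behavior
# "label" and a few memory "prototypes" (made-up 2-D feature vectors).
TREE = {
    "children": [
        {"children": [
            {"label": "turn_left", "prototypes": [[1.0, 0.0], [0.9, 0.1]]},
            {"label": "turn_right", "prototypes": [[0.0, 1.0]]},
        ]},
        {"label": "stop", "prototypes": [[-1.0, -1.0]]},
    ]
}

def match(node, feature):
    """Bottom-up pass (assumed form of MPM): a leaf is scored by its best
    matching prototype, an inner node by the mean score of its children.
    Scores are cached on the nodes under the key "score"."""
    if "prototypes" in node:
        node["score"] = max(cosine(feature, p) for p in node["prototypes"])
    else:
        child_scores = [match(child, feature) for child in node["children"]]
        node["score"] = sum(child_scores) / len(child_scores)
    return node["score"]

def aggregate(node, prob=1.0, path=(), out=None):
    """Top-down pass (assumed form of LLA): softmax over sibling scores gives
    link probabilities; a leaf's probability is the product along its path."""
    out = {} if out is None else out
    if "prototypes" in node:
        out[node["label"]] = (prob, path)
        return out
    links = softmax([child["score"] for child in node["children"]])
    for index, (child, link) in enumerate(zip(node["children"], links)):
        aggregate(child, prob * link, path + (index,), out)
    return out

def predict(tree, feature):
    """Return {label: (probability, root-to-leaf path of child indices)}."""
    match(tree, feature)
    return aggregate(tree)

if __name__ == "__main__":
    for label, (prob, path) in predict(TREE, [1.0, 0.05]).items():
        print(label, round(prob, 3), path)
```

Because every prediction carries its root-to-leaf path, the decision can be traced link by link, which is the kind of transparency the TL;DR attributes to eMem-NDT.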

Abstract

In autonomous driving, predicting the behavior (turning left, stopping, etc.) of target vehicles is crucial for the self-driving vehicle to make safe decisions and avoid accidents. Existing deep learning-based methods predict these behaviors accurately, but their black-box nature makes them hard to trust in practical use. In this work, we explore the interpretability of behavior prediction of target vehicles with an Episodic Memory implanted Neural Decision Tree (abbrev. eMem-NDT). The structure of eMem-NDT is constructed by hierarchically clustering the text embeddings of vehicle behavior descriptions. eMem-NDT serves as the neural-backed part of a pre-trained deep learning model: it replaces the model's soft-max layer and groups and aligns the memory prototypes of historical vehicle behavior features from the training data on a neural decision tree. Each leaf node of eMem-NDT is modeled by a neural network that aligns the behavior memory prototypes. With eMem-NDT, we infer each instance in vehicle behavior prediction by bottom-up Memory Prototype Matching (MPM) (searching for the appropriate leaf node and its links to the root node) and top-down Leaf Link Aggregation (LLA) (obtaining the probability of future vehicle behaviors for the given instance). We validate eMem-NDT on the BLVD and LOKI datasets, and the results show that our model outperforms other methods while offering clear explainability. The code is available at https://github.com/JWFangit/eMem-NDT.
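
The idea of growing the tree by hierarchically clustering text embeddings of behavior descriptions can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the 2-D vectors in `EMBEDDINGS` are made-up stand-ins for real text embeddings, and the centroid-distance merge rule is one common agglomerative choice; the abstract does not specify the linkage. `build_tree` is a hypothetical helper.

```python
import math
from itertools import combinations

# Hypothetical 2-D stand-ins for text embeddings of behavior descriptions.
EMBEDDINGS = {
    "turn left": [1.0, 0.2],
    "turn right": [0.9, -0.2],
    "stop": [-1.0, 0.1],
    "decelerate": [-0.8, 0.3],
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def build_tree(embeddings):
    """Agglomerative clustering with centroid distance (an assumed linkage).

    Each cluster is a pair (subtree, member_vectors). Leaves start as their
    own clusters; the two clusters with the closest centroids are merged
    repeatedly until a single root remains. The result is a nested tuple.
    """
    clusters = [(label, [vector]) for label, vector in embeddings.items()]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: math.dist(
                centroid(clusters[pair[0]][1]), centroid(clusters[pair[1]][1])
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        # Drop the two merged clusters and append their parent.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

if __name__ == "__main__":
    print(build_tree(EMBEDDINGS))
```

With these toy vectors, semantically close behaviors end up as siblings, so the inner nodes of the resulting tree correspond to coarser behavior groups.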

Paper Structure (16 sections, 9 equations, 6 figures, 3 tables, 1 algorithm)

Figures (6)

  • Figure 1: eMem-NDT aims to recall the episodic memory of certain kinds of vehicle behaviors and thereby make predictions more trustworthy.
  • Figure 2: The feature embedding of vehicle states and of the dynamic interaction graphs, produced by the vehicle state transformer (Stateformer; Vaswani et al., 2017) and EvolveGCN, respectively.
  • Figure 3: The construction process of eMem-NDT, which contains five steps. Typically, the number of leaf nodes is initialized to the number of vehicle behavior types. Hierarchical clustering groups the text embeddings of the leaf nodes and grows the tree. We then implant the episodic memories of vehicles into the corresponding leaf nodes, and the leaf nodes are further optimized by measuring the feature embeddings of input instances (raw vehicle states and interactions) against the stored episodic memories.
  • Figure 4: The instance distribution of (a) BLVD dataset and (b) LOKI dataset. LOKI has a more severe imbalance issue than BLVD.
  • Figure 5: The diversity of memory prototype utilization ($\eta=0.7$).
  • ...and 1 more figure