IMM-MOT: A Novel 3D Multi-object Tracking Framework with Interacting Multiple Model Filter

Xiaohong Liu, Xulong Zhao, Gang Liu, Zili Wu, Tao Wang, Lei Meng, Yuhan Wang

TL;DR

IMM-MOT tackles the challenge of varying motion patterns in 3D MOT by integrating an Interacting Multiple Model (IMM) filter, which blends several motion models (constant velocity, CV; constant acceleration, CA; constant turn rate and velocity, CTRV; constant turn rate and acceleration, CTRA) through adaptive model probabilities to better predict target trajectories. It complements prediction with Damping Window (DW) trajectory management, which controls the creation and termination of tracks, and with Distance-Based Score Enhancement (DBSE), which improves the separation between true and false detections. On NuScenes Val with CenterPoint detections, IMM-MOT achieves an AMOTA of 73.8%, surpassing most single-modal methods, and ablations show that the IMM, DW, and DBSE components each contribute gains. The result is an open-source 3D MOT framework that adapts to complex motions and reduces missed detections while controlling false positives, with potential impact on autonomous driving perception pipelines.

Abstract

3D Multi-Object Tracking (MOT) provides the trajectories of surrounding objects, assisting robots or vehicles in smarter path planning and obstacle avoidance. Existing 3D MOT methods based on the Tracking-by-Detection framework typically use a single motion model to track an object throughout its entire tracking process. However, objects may change their motion patterns due to variations in the surrounding environment. In this paper, we introduce the Interacting Multiple Model filter in IMM-MOT, which accurately fits the complex motion patterns of individual objects, overcoming the limitation of single-model tracking in existing approaches. In addition, we incorporate a Damping Window mechanism into the trajectory lifecycle management, leveraging the continuous association status of trajectories to control their creation and termination, reducing the occurrence of overlooked low-confidence true targets. Furthermore, we propose the Distance-Based Score Enhancement module, which enhances the differentiation between false positives and true positives by adjusting detection scores, thereby improving the effectiveness of the Score Filter. On the NuScenes Val dataset, IMM-MOT outperforms most other single-modal models using 3D point clouds, achieving an AMOTA of 73.8%. Our project is available at https://github.com/Ap01lo/IMM-MOT.

Paper Structure

This paper contains 12 sections, 12 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of a car turning left and its sequential scenarios: (a) Deceleration; (b) Turning; (c) Acceleration.
  • Figure 2: The framework of IMM-MOT. (1) In the preprocessing stage, Distance-Based Score Enhancement (DBSE), Score Filter (SF), and Non-Maximum Suppression (NMS) are applied sequentially to the detections, yielding $D'$, which is fed into the association module. (2) The Interacting Multiple Model (IMM) tracker predicts the next-frame target positions of the effective trajectories in the trajectory library, yielding $\hat{X}_t$. Each model predicts from the previous frame’s $X_{t-1}$, and the per-model predictions $\hat{X}_t^i$ are weighted by the previous frame’s model probabilities $\mu_{t-1}^i$ to obtain $\hat{X}_t$, which is passed into the association module to obtain the association results. The observations $Z_t$ from successful associations are used to update the model probabilities: residuals $S_t^i$ are calculated first, followed by the likelihoods $\Lambda_t^i$ for each model; combining these with the previous frame’s $\mu_{t-1}$ and the Markov matrix $M$ gives the updated model probabilities $\mu_t$. (3) After association, the trajectory initialization process determines whether unassociated detections form new tracks. The Damping Window (DW) scores $S_{DW}$ of all trajectories are computed and compared with the active and tentative thresholds $\theta_{active}$ and $\theta_{tentative}$, respectively, to determine the trajectory states. Active and tentative trajectories are retained as effective trajectories for tracking in subsequent frames.
  • Figure 3: The trend of Damping Window scores for five different conditions.
  • Figure 4: (a) The effect of the parameters in (\ref{eq12}) on the function value. (b) The effect of the parameters in (\ref{eq13}) on the function value. (c) The effect of (\ref{eq12}) on the target score. (d) The effect of (\ref{eq13}) on the target score.
  • Figure 5: The comparison of Interacting Multiple Model and other models on different targets.
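
The model-probability bookkeeping described in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration of the textbook IMM probability update, not the authors' implementation: the function names, the row-stochastic convention for the Markov matrix (M[i][j] is the probability of switching from model i to model j), and the fallback for vanishing likelihoods are all assumptions made here. The per-model Kalman prediction and the likelihood computation from the residuals are left out; the likelihoods are taken as inputs.

```python
from typing import List, Sequence


def combine_predictions(preds: Sequence[Sequence[float]],
                        mu_prev: Sequence[float]) -> List[float]:
    """Fuse per-model predicted states into one state.

    preds[i] is the state predicted by model i; mu_prev[i] is that model's
    probability from the previous frame, as in the Figure 2 caption.
    """
    dim = len(preds[0])
    return [sum(mu * p[k] for mu, p in zip(mu_prev, preds)) for k in range(dim)]


def update_model_probabilities(mu_prev: Sequence[float],
                               M: Sequence[Sequence[float]],
                               likelihoods: Sequence[float]) -> List[float]:
    """Textbook IMM update: mu_t[j] is proportional to
    likelihoods[j] * sum_i M[i][j] * mu_prev[i].

    Assumption: M is row-stochastic (each row sums to 1).
    """
    n = len(mu_prev)
    # Propagate the previous probabilities through the Markov transition matrix.
    prior = [sum(M[i][j] * mu_prev[i] for i in range(n)) for j in range(n)]
    # Reweight by how well each model explained the associated observation.
    unnorm = [likelihoods[j] * prior[j] for j in range(n)]
    total = sum(unnorm)
    if total <= 0.0:
        # Assumed fallback: keep the propagated prior if all likelihoods vanish.
        return prior
    return [u / total for u in unnorm]
```

With two models, equal previous probabilities, and a symmetric transition matrix, the updated probabilities simply follow the normalized likelihoods, so the model that better explains the observation gains weight in the next frame's fused prediction.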