IMM-MOT: A Novel 3D Multi-object Tracking Framework with Interacting Multiple Model Filter
Xiaohong Liu, Xulong Zhao, Gang Liu, Zili Wu, Tao Wang, Lei Meng, Yuhan Wang
TL;DR
IMM-MOT addresses changing motion patterns in 3D MOT by integrating an Interacting Multiple Model (IMM) filter, which blends several motion models (constant velocity, CV; constant acceleration, CA; constant turn rate and velocity, CTRV; constant turn rate and acceleration, CTRA) through adaptive model probabilities to better predict target trajectories. It adds Damping Window (DW) trajectory management, which uses each trajectory's continuous association status to control track creation and termination, and a Distance-Based Score Enhancement (DBSE) module that adjusts detection scores to better separate true detections from false ones. On the NuScenes Val set with CenterPoint detections, IMM-MOT achieves an AMOTA of 73.8%, outperforming most other single-modal methods that use 3D point clouds, and ablations show that the IMM, DW, and DBSE components each contribute gains. The framework is open source and aims to reduce missed detections while controlling false positives, with potential impact on autonomous driving perception pipelines.
Abstract
3D Multi-Object Tracking (MOT) provides the trajectories of surrounding objects, assisting robots or vehicles in smarter path planning and obstacle avoidance. Existing 3D MOT methods based on the Tracking-by-Detection framework typically use a single motion model to track an object throughout its entire tracking process. However, objects may change their motion patterns due to variations in the surrounding environment. In this paper, we introduce the Interacting Multiple Model filter in IMM-MOT, which accurately fits the complex motion patterns of individual objects, overcoming the limitation of single-model tracking in existing approaches. In addition, we incorporate a Damping Window mechanism into the trajectory lifecycle management, leveraging the continuous association status of trajectories to control their creation and termination, reducing the occurrence of overlooked low-confidence true targets. Furthermore, we propose the Distance-Based Score Enhancement module, which enhances the differentiation between false positives and true positives by adjusting detection scores, thereby improving the effectiveness of the Score Filter. On the NuScenes Val dataset, IMM-MOT outperforms most other single-modal models using 3D point clouds, achieving an AMOTA of 73.8%. Our project is available at https://github.com/Ap01lo/IMM-MOT.
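To make the idea of blending motion models through adaptive model probabilities concrete, the following is a minimal sketch of the textbook IMM probability bookkeeping. It is not taken from the IMM-MOT code base. The function names, the transition matrix values, and the likelihood values are illustrative assumptions, and the per-model Kalman-style filters that would produce the likelihoods and state estimates are omitted. The sketch also assumes every model's state has already been mapped to a shared state vector.

```python
from typing import List, Sequence, Tuple


def imm_mixing_weights(
    mu: Sequence[float], trans: Sequence[Sequence[float]]
) -> Tuple[List[float], List[List[float]]]:
    """Standard IMM interaction step (hypothetical helper, not the paper's API).

    mu[i]       : probability that model i was active at the previous step
    trans[i][j] : assumed Markov probability of switching from model i to j
    Returns (c, w): c[j] is the predicted probability of model j, and w[i][j]
    is the weight of model i's previous estimate in model j's mixed start state.
    Assumes every c[j] is strictly positive.
    """
    n = len(mu)
    c = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    w = [[trans[i][j] * mu[i] / c[j] for j in range(n)] for i in range(n)]
    return c, w


def imm_update_probabilities(
    c: Sequence[float], likelihoods: Sequence[float]
) -> List[float]:
    """Bayes update: reweight predicted model probabilities by how well each
    model's filter explained the new measurement, then normalise."""
    unnorm = [lik * cj for lik, cj in zip(likelihoods, c)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def imm_combine(
    mu: Sequence[float], states: Sequence[Sequence[float]]
) -> List[float]:
    """Output estimate: probability-weighted average of per-model states."""
    dim = len(states[0])
    return [sum(mu[j] * states[j][k] for j in range(len(mu))) for k in range(dim)]


# Illustrative run with four models in the order CV, CA, CTRV, CTRA.
mu0 = [0.25, 0.25, 0.25, 0.25]
# Assumed transition matrix: models tend to persist (0.85 on the diagonal).
trans = [[0.85 if i == j else 0.05 for j in range(4)] for i in range(4)]
c, w = imm_mixing_weights(mu0, trans)
# Made-up likelihoods in which the CTRV filter fits the measurement best.
mu1 = imm_update_probabilities(c, [0.1, 0.1, 0.7, 0.1])
```

In this run the model probabilities shift toward CTRV, so the combined state would lean on the turning model for the next prediction. This is the behaviour the abstract contrasts with tracking an object under a single fixed motion model.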
