Multi-Target Tracking with Transferable Convolutional Neural Networks
Damian Owerko, Charilaos I. Kanatsoulis, Jennifer Bondarchuk, Donald J. Bucci, Alejandro Ribeiro
TL;DR
The paper addresses scalable multi-target tracking (MTT) by representing target states and sensor measurements as 2D intensity images and recasting tracking as image-to-image prediction. It introduces a fully convolutional encoder–decoder CNN that is trained on small tracking windows and transferred to large areas without re-training, supported by a theoretical bound on the generalization error. Empirically, the method yields a 29% improvement in OSPA when scaling from 1 km^2 to 25 km^2 and outperforms random finite set filters (GLMB and LMB) at all scales, while maintaining favorable computational properties. The approach exploits the spatial structure of the problem and may be applicable to other domains.
Abstract
Multi-target tracking (MTT) is a classical signal processing task, where the goal is to estimate the states of an unknown number of moving targets from noisy sensor measurements. In this paper, we revisit MTT from a deep learning perspective and propose a convolutional neural network (CNN) architecture to tackle it. We represent the target states and sensor measurements as images and recast the problem as an image-to-image prediction task. We then train a fully convolutional model on small tracking areas and transfer it to much larger areas with numerous targets and sensors. This transfer learning approach enables MTT at a large scale and is theoretically supported by our novel analysis, which bounds the generalization error. In practice, the proposed transferable CNN architecture outperforms random finite set filters on the MTT task with 10 targets and transfers without re-training to a larger MTT task with 250 targets with a 29% performance improvement.
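To make the image representation concrete, the following is a minimal sketch in pure Python, not the authors' implementation. It rasterizes 2D point positions into an intensity image as a sum of Gaussian bumps, then applies a fixed 3x3 filter to show that a convolution with the same weights runs on windows of any size. The function names, the 10 m cell size, the 20 m Gaussian width, and the box filter standing in for a learned layer are all assumptions chosen for illustration; this section does not specify how the paper builds its images or its network.

```python
import math


def intensity_image(positions, width_m, height_m, cell_m=10.0, sigma_m=20.0):
    """Rasterize 2D points (metres) into a grid as a sum of Gaussian bumps.

    Each point contributes a normalized 2D Gaussian density, so the image
    integrates to roughly the number of points inside the window.
    cell_m and sigma_m are illustrative assumptions, not values from the paper.
    """
    rows = int(round(height_m / cell_m))
    cols = int(round(width_m / cell_m))
    norm = 1.0 / (2.0 * math.pi * sigma_m ** 2)
    image = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        cy = (r + 0.5) * cell_m  # cell-centre y coordinate
        for c in range(cols):
            cx = (c + 0.5) * cell_m  # cell-centre x coordinate
            for px, py in positions:
                d2 = (cx - px) ** 2 + (cy - py) ** 2
                image[r][c] += norm * math.exp(-d2 / (2.0 * sigma_m ** 2))
    return image


def expected_count(image, cell_m=10.0):
    """Approximate the integral of the intensity image (a Riemann sum)."""
    return sum(sum(row) for row in image) * cell_m ** 2


def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation whose output matches the input size.

    The kernel is independent of the image size, which is the property that
    lets a fully convolutional model run on larger windows unchanged.
    """
    rows, cols = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + k][dc + k] * image[rr][cc]
            out[r][c] = acc
    return out


# Hypothetical stand-in for one learned convolutional layer: a 3x3 box filter.
BOX_KERNEL = [[1.0 / 9.0] * 3 for _ in range(3)]

# A small 1 km x 1 km window with one target, and a larger 2 km x 2 km window
# with three targets, both processed by the same kernel.
small = intensity_image([(500.0, 500.0)], 1000.0, 1000.0)
large = intensity_image(
    [(400.0, 600.0), (1000.0, 1000.0), (1500.0, 700.0)], 2000.0, 2000.0
)
small_out = conv2d_same(small, BOX_KERNEL)
large_out = conv2d_same(large, BOX_KERNEL)
```

In this sketch the window size affects only the image dimensions, never the kernel, so nothing about the filter has to change when the tracking area grows. Summing an intensity image and multiplying by the cell area gives an approximate count of the points it contains, which is one reason an intensity representation suits problems with an unknown number of targets.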
