ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates
Gokul Puthumanaillam, Manav Vora, Melkior Ornik
TL;DR
This work tackles trajectory tracking under partial observability with limited active localization updates by formulating a Budgeted POMDP and proposing ComTraQ-MPC, a hybrid system that couples a meta-trained DQN for adaptive localization scheduling with Model Predictive Control for precise tracking. The two modules interact bidirectionally: DQN decisions determine when to obtain true state information, influencing MPC’s belief and control, while MPC performance provides learning signals to refine the DQN policy. The approach is validated in both simulation and real-world robotic experiments, showing improved tracking accuracy and operational efficiency over baselines such as MPC with passive/naive localization and vanilla DQN. The meta-training across diverse trajectories and budgets enables generalization to unseen tasks, making ComTraQ-MPC a practical solution for resource-constrained autonomous navigation in complex environments. Overall, the framework delivers a generalizable and approximately optimal strategy for balancing localization budget and trajectory fidelity in partially observable domains, with potential extensions to multi-agent scenarios.
Abstract
Optimal decision-making for trajectory tracking in partially observable, stochastic environments is a significant challenge when the number of active localization updates (the process by which the agent obtains its true state information from its sensors) is limited. Traditional methods often struggle to balance resource conservation, accurate state estimation, and precise tracking, resulting in suboptimal performance. This problem is particularly pronounced in environments with large action spaces, where the need for frequent, accurate state data is paramount, yet the capacity for active localization updates is restricted by external limitations. This paper introduces ComTraQ-MPC, a novel framework that combines Deep Q-Networks (DQN) and Model Predictive Control (MPC) to optimize trajectory tracking with constrained active localization updates. The meta-trained DQN provides adaptive scheduling of active localization updates, while the MPC leverages the available state information to improve tracking. The central contribution of this work is the reciprocal interaction between the two: the DQN's update decisions inform the MPC's control strategy, and the MPC's outcomes refine the DQN's learning, creating a cohesive, adaptive system. Empirical evaluations in simulated and real-world settings demonstrate that ComTraQ-MPC significantly enhances operational efficiency and accuracy, providing a generalizable and approximately optimal solution for trajectory tracking in complex partially observable environments.
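To make the budgeted-localization setting concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a 1-D agent, a fixed uncertainty threshold standing in for the meta-trained DQN scheduler, and a one-step corrective controller standing in for MPC. All names (`run_episode`, `threshold`, `noise_std`) are hypothetical and chosen only for illustration.

```python
import random


def run_episode(reference, budget, threshold=0.5, noise_std=0.1, seed=0):
    """Toy 1-D tracking loop with a limited number of true-state updates.

    Returns (mean absolute tracking error, number of updates used).
    """
    rng = random.Random(seed)
    true_pos = 0.0      # actual state, hidden from the controller
    belief = 0.0        # controller's estimate of the state
    uncertainty = 0.0   # crude proxy that grows without localization
    updates_used = 0
    errors = []

    for target in reference:
        # Scheduler (assumed stand-in for the DQN): spend one update
        # when uncertainty is high and budget remains.
        if uncertainty > threshold and updates_used < budget:
            belief = true_pos
            uncertainty = 0.0
            updates_used += 1

        # Controller (assumed stand-in for MPC): move from the
        # believed position to the target in one step.
        control = target - belief

        # Stochastic dynamics: the true state drifts, the belief does not.
        true_pos += control + rng.gauss(0.0, noise_std)
        belief += control
        uncertainty += noise_std

        errors.append(abs(true_pos - target))

    return sum(errors) / len(errors), updates_used


if __name__ == "__main__":
    reference = [0.1 * t for t in range(50)]  # simple ramp trajectory
    for budget in (0, 3, 10):
        err, used = run_episode(reference, budget)
        print(f"budget={budget:2d} used={used:2d} mean_abs_error={err:.3f}")
```

In this sketch the scheduler's decision changes the belief that the controller acts on, which mirrors one direction of the interaction described above. The other direction (tracking outcomes serving as a learning signal for the scheduler) is not modeled here, since the threshold rule does not learn.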
