UniLCD: Unified Local-Cloud Decision-Making via Reinforcement Learning
Kathakoli Sengupta, Zhongkai Shangguan, Sandesh Bharadwaj, Sanjay Arora, Eshed Ohn-Bar, Renato Mancuso
TL;DR
UniLCD targets real-time, vision-based mobile systems by learning a flexible local-cloud routing policy that balances energy, latency, and safety. The approach first trains a local and a cloud navigation policy via imitation learning, then optimizes a residual routing policy with PPO under a multiplicative multi-objective reward that includes a dedicated collision penalty. A shared feature extractor keeps on-device computation efficient, while embedding-based communication with the cloud reduces energy use and delay. On a crowded navigation task in CARLA, UniLCD shows substantial gains in ecological navigation performance (ENS up to ≈86%) and overall efficiency, outperforming state-of-the-art baselines by over 35%. The result is a practical framework for sustainable, safe, real-time cloud-edge collaboration in dynamic, safety-critical robotic systems.
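To make the idea of a multiplicative multi-objective reward with a collision penalty concrete, here is a minimal Python sketch. It is not the paper's reward: the summary above does not give the exact terms, so the names `StepInfo`, `multiplicative_reward`, and `collision_penalty`, the normalization of every term to [0, 1], and the specific product form are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step signals for the routing policy (all fields are hypothetical)."""
    progress: float   # task progress term, assumed normalized to [0, 1]
    energy: float     # energy cost, assumed normalized to [0, 1]
    latency: float    # latency cost, assumed normalized to [0, 1]
    collided: bool    # True if the agent collided during this step


def multiplicative_reward(step: StepInfo, collision_penalty: float = 1.0) -> float:
    """Illustrative reward: product of objective terms, overridden on collision.

    Assumption: a collision replaces the product with a fixed negative penalty.
    The actual UniLCD reward may combine its terms differently.
    """
    if step.collided:
        return -collision_penalty
    # Because the terms multiply, a poor score on any one objective
    # (e.g. very high latency) pulls the whole reward toward zero.
    return step.progress * (1.0 - step.energy) * (1.0 - step.latency)


if __name__ == "__main__":
    cheap = StepInfo(progress=0.8, energy=0.1, latency=0.1, collided=False)
    costly = StepInfo(progress=0.8, energy=0.9, latency=0.1, collided=False)
    crash = StepInfo(progress=0.8, energy=0.1, latency=0.1, collided=True)
    print(multiplicative_reward(cheap))
    print(multiplicative_reward(costly))
    print(multiplicative_reward(crash))
```

One property worth noting about a product, as opposed to a weighted sum, is that the policy cannot offset a very poor score on one objective with a high score on another, which is one plausible reason to prefer it when several constraints must hold together.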
Abstract
Embodied vision-based real-world systems, such as mobile robots, require a careful balance between energy consumption, compute latency, and safety constraints to optimize operation across dynamic tasks and contexts. As local computation tends to be restricted, offloading the computation, i.e., to a remote server, can save local resources while providing access to high-quality predictions from powerful and large models. However, the resulting communication and latency overhead has limited the usability of cloud models in dynamic, safety-critical, real-time settings. To effectively address this trade-off, we introduce UniLCD, a novel hybrid inference framework for enabling flexible local-cloud collaboration. By efficiently optimizing a flexible routing module via reinforcement learning and a suitable multi-task objective, UniLCD is specifically designed to support the multiple constraints of safety-critical end-to-end mobile systems. We validate the proposed approach using a challenging, crowded navigation task requiring frequent and timely switching between local and cloud operations. UniLCD improves overall performance and efficiency by over 35% compared to state-of-the-art baselines based on various split computing and early exit strategies.
