Opportunistic Collaborative Planning with Large Vision Model Guided Control and Joint Query-Service Optimization

Jiayi Chen, Shuai Wang, Guoliang Li, Wei Xu, Guangxu Zhu, Derrick Wing Kwan Ng, Chengzhong Xu

TL;DR

This work tackles autonomous navigation in open environments with unseen objects by proposing Opportunistic Collaborative Planning (OCP), which integrates efficient local model predictive control (MPC) with cloud-driven large vision model (LVM) perception. The framework introduces LVM-MPC, a closed-loop perception-to-control system guided by cloud outputs, and CTO, which jointly optimizes when to query the cloud (ODCT) and when to offer cloud service (CFS). The approach is implemented in CARLA with ROS and demonstrates reduced finish times, shorter trajectories, and near-100% success across diverse scenarios, while significantly reducing unnecessary cloud usage. The results highlight the practical value of cloud-edge collaboration for robust, resource-efficient autonomous navigation in open-world settings without sacrificing safety.

Abstract

Navigating autonomous vehicles in open scenarios is challenging because unseen objects are difficult to handle. Existing solutions either rely on small models that struggle with generalization or large models that are resource-intensive. While collaboration between the two offers a promising solution, the key challenge is deciding when and how to engage the large model. To address this issue, this paper proposes opportunistic collaborative planning (OCP), which seamlessly integrates efficient local models with powerful cloud models through two key innovations. First, we propose large vision model guided model predictive control (LVM-MPC), which leverages the cloud for LVM perception and decision making. The cloud output serves as global guidance for a local MPC, thereby forming a closed-loop perception-to-control system. Second, to determine the best timing for large model query and service, we propose collaboration timing optimization (CTO), including object detection confidence thresholding (ODCT) and cloud forward simulation (CFS), to decide when to seek cloud assistance and when to offer cloud service. Extensive experiments show that the proposed OCP outperforms existing methods in terms of both navigation time and success rate.
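
The two timing decisions in CTO can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract states only that ODCT decides when to seek cloud assistance and CFS decides when to offer cloud service. Everything else here is an assumption made for illustration, including the names `Detection`, `odct_query`, and `cfs_serve`, the threshold value, the "any low-confidence detection" rule, and the finish-time comparison standing in for forward simulation. Reading $\alpha_t$ as the query decision and $\beta_t$ as the service decision is likewise an assumption, based on the symbols in the Figure 3 caption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Detection:
    """One output of the local detector (hypothetical structure)."""
    label: str
    confidence: float  # detector score in [0, 1]


def odct_query(detections: Sequence[Detection], threshold: float = 0.5) -> int:
    """Return alpha_t: 1 to request cloud help, 0 to stay local.

    Assumed rule: any detection scoring below the threshold triggers a query.
    """
    return int(any(d.confidence < threshold for d in detections))


def cfs_serve(alpha_t: int, local_finish_s: float, cloud_finish_s: float) -> int:
    """Return beta_t: 1 if the cloud offers service, 0 otherwise.

    Assumed rule: the cloud serves a request only when its simulated
    finish time is shorter than the local-only estimate.
    """
    if alpha_t == 0:
        return 0  # nothing was requested, so nothing is served
    return int(cloud_finish_s < local_finish_s)


# A low-confidence object triggers a query, but the cloud declines
# because the simulated cloud-guided run would finish later.
frame = [Detection("car", 0.92), Detection("unknown", 0.31)]
alpha = odct_query(frame)
beta = cfs_serve(alpha, local_finish_s=10.0, cloud_finish_s=12.0)
print(alpha, beta)
```

Separating the two functions mirrors the split described in the abstract: the vehicle decides whether to ask, and the cloud decides whether answering is worthwhile, so a request can be raised and still be declined.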

Paper Structure

This paper contains 17 sections, 15 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: System architecture of the proposed OCP, which consists of LVM-MPC and CTO blocks.
  • Figure 2: LVM cloud perception based on SAM.
  • Figure 3: The case of $\alpha_t=1$ but $\beta_t=0$ for CFS.
  • Figure 4: Experimental results of ODCT.
  • Figure 5: Comparison of local and cloud perception models.
  • ...and 5 more figures