RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models

Zihao Zheng, Hangyu Cao, Jiayu Chen, Sicheng Tian, Chenyue Li, Maoliang Li, Xinhao Sun, Guojie Luo, Xiang Chen

Abstract

Vision-Language-Action (VLA) models are mainstream in embodied intelligence but incur high inference costs. Edge-Cloud Collaborative (ECC) deployment is an effective remedy: it relieves the computing pressure on edge devices so that real-time requirements can be met. However, existing ECC frameworks are suboptimal for VLA models because of two challenges: (1) diverse model structures make it hard to identify the optimal ECC segmentation point; (2) even once the optimal segmentation point is determined, changes in network bandwidth can cause performance drift. To address these issues, we propose RoboECC, a novel ECC deployment framework for various VLA models. Specifically, we propose a model-hardware co-aware segmentation strategy that finds the optimal segmentation point for various VLA models. Moreover, we propose a network-aware deployment adjustment approach that adapts to network fluctuations to maintain optimal performance. Experiments demonstrate that RoboECC achieves a speedup of up to 3.28x with only 2.55x~2.62x overhead.
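
To make the two challenges concrete, the sketch below shows one simple way a segmentation point could be chosen and why it may shift with bandwidth. It is an illustrative assumption, not the RoboECC algorithm. The function name `best_split`, the additive latency model (edge compute + transfer + cloud compute), and all numbers are hypothetical.

```python
from typing import Sequence, Tuple


def best_split(
    edge_ms: Sequence[float],
    cloud_ms: Sequence[float],
    boundary_bytes: Sequence[float],
    bandwidth_mbps: float,
) -> Tuple[int, float]:
    """Pick the split index k that minimizes estimated end-to-end latency.

    Hypothetical model: layers [0, k) run on the edge, layers [k, n) run in
    the cloud, and boundary_bytes[k] bytes cross the network at the split.
    boundary_bytes has n + 1 entries; set boundary_bytes[n] = 0 to model
    edge-only execution with nothing transmitted.
    """
    n = len(edge_ms)
    if len(cloud_ms) != n or len(boundary_bytes) != n + 1:
        raise ValueError("expected n edge/cloud timings and n + 1 boundary sizes")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")

    # Convert megabits per second to bytes per millisecond.
    bytes_per_ms = bandwidth_mbps * 1e6 / 8 / 1e3

    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        latency = (
            sum(edge_ms[:k])                    # edge compute
            + boundary_bytes[k] / bytes_per_ms  # network transfer
            + sum(cloud_ms[k:])                 # cloud compute
        )
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Made-up per-layer profile: three layers, slow edge device, fast cloud.
edge = [10.0, 10.0, 40.0]
cloud = [1.0, 1.0, 4.0]
sizes = [1_000_000, 100_000, 100_000, 0]

fast_k, fast_latency = best_split(edge, cloud, sizes, bandwidth_mbps=80.0)
slow_k, slow_latency = best_split(edge, cloud, sizes, bandwidth_mbps=0.8)
```

With this made-up profile, the estimated best split moves from an early layer on the fast link to edge-only execution on the slow link. Re-running such a search when the measured bandwidth changes is one plausible reading of "network-aware adjustment"; the paper's own approach may differ.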

Paper Structure (35 sections, 3 equations, 7 figures, 4 tables, 1 algorithm)

Figures (7)

  • Figure 1: (a) VLA Inference on Edge Devices (b) Challenges of VLA ECC Deployment (c) Overview of the Proposed RoboECC Framework
  • Figure 2: Latency of Model Segmentation under Various Structures
  • Figure 3: Performance Drift under Network Bandwidth Fluctuation
  • Figure 4: Components in RoboECC: (a) Model-Hardware Co-Aware Segmentation Strategy (b) Network-Aware Deployment Adjustment Approach
  • Figure 5: An Example of RoboECC Deployment in Real-World Scenarios
  • ...and 2 more figures