Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

Zhekai Duan; Yuan Zhang; Shikai Geng; Gaowen Liu; Joschka Boedecker; Chris Xiaoxuan Lu

TL;DR

This work tackles the latency bottleneck of Embodied Chain-of-Thought (ECoT) reasoning in vision-language-action (VLA) policies by introducing Fast ECoT, which caches high-level reasoning across timesteps, generates modular reasoning steps in parallel, and uses an asynchronous scheduler to decouple reasoning from action decoding. The method is model-agnostic, requires no training or architectural changes, and integrates into existing VLA pipelines. Results in LIBERO simulation and on real-world robot tasks show substantial latency reductions while maintaining or improving task success and reasoning fidelity, with the asynchronous variants offering the best speed-accuracy trade-offs. Overall, Fast ECoT makes ECoT-driven policies more viable for real-time deployment by balancing interpretability, efficiency, and performance.

Abstract

Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repetitive nature of ECoT to (1) cache and reuse high-level reasoning across timesteps and (2) parallelise the generation of modular reasoning steps. Additionally, we introduce an asynchronous scheduler that decouples reasoning from action decoding, further boosting responsiveness. Fast ECoT requires no model changes or additional training and integrates easily into existing VLA pipelines. Experiments in both simulation (LIBERO) and real-world robot tasks show up to a 7.5% reduction in latency with comparable or improved task success rate and reasoning faithfulness, bringing ECoT policies closer to practical real-time deployment.
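
The reuse idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the step names, the split into high-level and low-level steps, the fixed refresh interval, and the `fake_model` stand-in are all assumptions made for illustration, and the asynchronous scheduler is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical step names; which steps count as "high-level" is an assumption.
HIGH_LEVEL: List[str] = ["task", "plan"]
LOW_LEVEL: List[str] = ["subtask", "move", "gripper"]


@dataclass
class ThoughtCache:
    """Reuses high-level reasoning across timesteps and regenerates the rest."""
    generate: Callable[[str, int], str]  # stand-in for a model call: (step, t) -> text
    refresh_every: int = 5               # assumed fixed refresh interval
    cached: Dict[str, str] = field(default_factory=dict)
    calls: int = 0                       # number of generation calls issued so far

    def _generate_parallel(self, names: List[str], t: int) -> Dict[str, str]:
        # Independent reasoning steps are generated concurrently.
        with ThreadPoolExecutor(max_workers=len(names)) as pool:
            texts = list(pool.map(lambda name: self.generate(name, t), names))
        self.calls += len(names)
        return dict(zip(names, texts))

    def reason(self, t: int) -> Dict[str, str]:
        # Refresh high-level thoughts only when the cache is empty or stale.
        if not self.cached or t % self.refresh_every == 0:
            self.cached = self._generate_parallel(HIGH_LEVEL, t)
        thoughts = dict(self.cached)
        # Low-level thoughts are regenerated at every timestep.
        thoughts.update(self._generate_parallel(LOW_LEVEL, t))
        return thoughts


def fake_model(name: str, t: int) -> str:
    """Toy generator that tags each thought with the timestep that produced it."""
    return f"{name}@{t}"
```

With `ThoughtCache(generate=fake_model, refresh_every=3)` run over six timesteps, this sketch issues 22 generation calls instead of the 30 needed when every step is regenerated at every timestep. The high-level thoughts are produced only at timesteps 0 and 3.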
