VOCALoco: Viability-Optimized Cost-aware Adaptive Locomotion
Stanley Wu, Mohamad H. Danesh, Simon Li, Hanna Yurchyk, Amin Abyaneh, Anas El Houssaini, David Meger, Hsiu-Chin Lin
TL;DR
VOCALoco addresses the safety and interpretability gaps of end-to-end DRL legged locomotion with a modular, perception-driven framework that predicts the viability and Cost of Transport (CoT) of each policy in a set of pre-trained policies. A high-level decision module filters out unsafe options and selects the most energy-efficient viable policy based on local heightfield observations; all predictors are trained in simulation. Compared with a conventional end-to-end DRL policy, the approach improves robustness and safety during stair ascent and descent, and it is validated through zero-shot real-world deployment on ANYmal-D. The work is a step toward flexible, perception-guided skill switching for quadruped robots in unstructured terrain.
Abstract
Recent advancements in legged robot locomotion have facilitated traversal over increasingly complex terrains. Despite this progress, many existing approaches rely on end-to-end deep reinforcement learning (DRL), which poses limitations in terms of safety and interpretability, especially when generalizing to novel terrains. To overcome these challenges, we introduce VOCALoco, a modular skill-selection framework that dynamically adapts locomotion strategies based on perceptual input. Given a set of pre-trained locomotion policies, VOCALoco evaluates their viability and energy consumption by predicting both the safety of execution and the anticipated cost of transport over a fixed planning horizon. This joint assessment enables the selection of policies that are both safe and energy-efficient, given the observed local terrain. We evaluate our approach on staircase locomotion tasks, demonstrating its performance in both simulated and real-world scenarios using a quadrupedal robot. Empirical results show that VOCALoco achieves improved robustness and safety during stair ascent and descent compared to a conventional end-to-end DRL policy.
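The selection rule described above (discard policies predicted to be unsafe, then pick the lowest predicted cost of transport among those that remain) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `PolicyEstimate` and `select_policy`, the scalar viability score, the 0.5 threshold, the example policy names and numbers, and the choice to return `None` when no policy is viable are all assumptions made for illustration. In the actual framework, the viability and CoT values would come from learned predictors operating on local heightfield observations.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PolicyEstimate:
    """Predicted outcome of running one pre-trained policy (hypothetical container)."""
    name: str
    viability: float  # assumed: predicted probability of safe execution over the horizon
    cot: float        # assumed: predicted cost of transport over the same horizon


def select_policy(
    estimates: Sequence[PolicyEstimate],
    viability_threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Optional[PolicyEstimate]:
    """Filter out policies predicted to be unsafe, then pick the lowest predicted CoT."""
    viable = [e for e in estimates if e.viability >= viability_threshold]
    if not viable:
        # Assumed fallback: signal that no policy is considered safe.
        return None
    return min(viable, key=lambda e: e.cot)


# Illustrative values only; these are not results from the paper.
estimates = [
    PolicyEstimate("flat_walk", viability=0.2, cot=0.4),    # cheap but predicted unsafe
    PolicyEstimate("stair_climb", viability=0.9, cot=0.9),  # safe and moderately efficient
    PolicyEstimate("cautious", viability=0.8, cot=1.2),     # safe but less efficient
]
chosen = select_policy(estimates)
print(chosen.name if chosen else "no viable policy")
```

Filtering before minimizing keeps the two criteria separate: a low predicted CoT can never compensate for a policy that is predicted to fail on the observed terrain.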
