Bridging Research and Practice in Simulation-based Testing of Industrial Robot Navigation Systems
Sajad Khatiri, Francisco Eli Vina Barrientos, Maximilian Wulf, Paolo Tonella, Sebastiano Panichella
TL;DR
Robotic navigation in dynamic industrial environments is hard to validate with traditional testing. The authors extend Surrealist (originally for UAVs) and Aerialist to the ANYmal quadruped, integrating them into a dockerized, end-to-end simulation workflow that automatically generates challenging obstacle configurations and benchmarks navigation algorithms. Through a three-month pilot and a six-month industrial deployment at ANYbotics, the approach uncovers critical failures and enables objective algorithm benchmarking, and a formal survey of engineers confirms that it strengthens the verification pipeline. The results demonstrate a practical bridge from research to practice, with evidence of improved development efficiency, robust failure detection, and broader applicability to other robotic domains.
Abstract
Ensuring robust robotic navigation in dynamic environments is a key challenge, as traditional testing methods often struggle to cover the full spectrum of operational requirements. This paper presents the industrial adoption of Surrealist, a simulation-based test generation framework originally developed for UAVs and now applied to the ANYmal quadrupedal robot for industrial inspection. Our method uses a search-based algorithm to automatically generate challenging obstacle avoidance scenarios, uncovering failures often missed by manual testing. In a pilot phase, the generated test suites revealed critical weaknesses in one experimental algorithm (40.3% success rate) and served as an effective benchmark to demonstrate the superior robustness of another (71.2% success rate). The framework was then integrated into the ANYbotics workflow for a six-month industrial evaluation, where it was used to test five proprietary algorithms. A formal survey confirmed its value, showing that it enhances the development process, uncovers critical failures, provides objective benchmarks, and strengthens the overall verification pipeline.
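To make the idea of search-based scenario generation concrete, the following is a minimal illustrative sketch, not Surrealist's actual implementation. It assumes a single circular obstacle, a straight-line nominal path, and a toy fitness function (`min_clearance`) standing in for a full simulation run; the names `Obstacle`, `min_clearance`, `mutate`, and `search` are all hypothetical. A simple hill climber mutates the obstacle and keeps any candidate that reduces clearance, that is, any candidate that makes the scenario more challenging.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Obstacle:
    """Hypothetical circular obstacle (all values in metres)."""
    x: float
    y: float
    radius: float


def min_clearance(obstacle, start=(0.0, 0.0), goal=(10.0, 0.0), steps=50):
    """Toy stand-in for a simulation run (assumption, not the real fitness).

    Returns the smallest distance between the obstacle surface and a
    straight-line path from start to goal. Lower means more challenging.
    """
    best = math.inf
    for i in range(steps + 1):
        t = i / steps
        px = start[0] + t * (goal[0] - start[0])
        py = start[1] + t * (goal[1] - start[1])
        dist = math.hypot(px - obstacle.x, py - obstacle.y) - obstacle.radius
        best = min(best, dist)
    return best


def mutate(obstacle, rng, step=0.5):
    """Randomly perturb position and size, clamping the radius to [0.2, 2.0]."""
    radius = obstacle.radius + rng.uniform(-0.1, 0.1)
    return Obstacle(
        x=obstacle.x + rng.uniform(-step, step),
        y=obstacle.y + rng.uniform(-step, step),
        radius=min(2.0, max(0.2, radius)),
    )


def search(seed_obstacle, budget=200, seed=0):
    """Hill climbing: keep a mutated candidate only if it lowers the fitness."""
    rng = random.Random(seed)
    best = seed_obstacle
    best_fit = min_clearance(best)
    for _ in range(budget):
        candidate = mutate(best, rng)
        fit = min_clearance(candidate)
        if fit < best_fit:
            best, best_fit = candidate, fit
    return best, best_fit


if __name__ == "__main__":
    start_obstacle = Obstacle(x=5.0, y=4.0, radius=0.5)
    found, fitness = search(start_obstacle)
    print(f"seed clearance:  {min_clearance(start_obstacle):.2f} m")
    print(f"found clearance: {fitness:.2f} m at {found}")
```

In a real pipeline the fitness would come from executing the navigation stack in simulation rather than from a geometric shortcut, and each evaluation would be far more expensive, which is why the evaluation budget is made explicit here.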
