Natural Selection via Foundation Models for Soft Robot Evolution
Changhe Chen, Xiaohao Xu, Xiangdong Wang, Xiaonan Huang
TL;DR
This work targets the challenge of designing soft robots by introducing RoboCrafter-QA, a multimodal benchmark built on EvoGym to test embodied design reasoning in LLMs. Initial evaluations reveal that state-of-the-art models struggle with fine-grained design distinctions, motivating a finetuning pipeline using LoRA on an efficient open-source LLM that achieves state-of-the-art performance for both design selection and direct morphology generation. The authors validate the approach with a physical modular soft robot, showing strong sim-to-real correlation and practical design transfer from simulation to hardware. Overall, the paper demonstrates that specialized, data-driven instruction tuning can unlock LLMs as effective co-designers for real-world soft-robot morphologies, and releases a complete framework for future embodied design research.
Abstract
Designing soft robots is a complex and iterative process that demands cross-disciplinary expertise in materials science, mechanics, and control, often relying on intuition and extensive experimentation. While foundation models, especially Large Language Models (LLMs), have demonstrated impressive reasoning abilities, their capacity to conduct embodied design remains largely unexplored. This paper introduces RoboCrafter-QA, a novel benchmark to evaluate whether LLMs can learn representations of soft robot designs that effectively bridge the gap between high-level task descriptions and low-level morphological and material choices. RoboCrafter-QA leverages the EvoGym simulator to generate a diverse set of soft robot design challenges, spanning robotic locomotion, manipulation, and balancing tasks. Our experiments with state-of-the-art (SOTA) multimodal LLMs reveal that while these models exhibit promising capabilities in learning design representations, they struggle with fine-grained distinctions between designs with subtle performance differences. To overcome these limitations, we finetune an efficient, open-source LLM that achieves SOTA performance on our benchmark, demonstrating superior capabilities in both design selection and direct generation of high-performing robot morphologies. Furthermore, we construct a physical replica of the modular soft robot and demonstrate a strong sim-to-real correlation, validating that superior benchmark performance has the potential to translate to effective real-world design selection. Our full system will be open-sourced to foster further research in this direction.
