Evolution 6.0: Evolving Robotic Capabilities Through Generative Design

Muhammad Haris Khan; Artyom Myshlyaev; Artem Lykov; Miguel Altamirano Cabrera; Dzmitry Tsetserukou

TL;DR

Evolution 6.0 addresses autonomous tool design for robots operating in open-ended environments by combining Vision-Language Models (VLMs), Vision-Language Action (VLA) models, and Text-to-3D generative models in two coordinating modules: Tool Generation and Action Generation. In evaluation, tool generation succeeds 90% of the time, and action generation reaches 83.5% in physical and visual generalization and 70% in motion generalization but only 37% in semantic generalization, indicating strong potential for real-world adaptability with further refinement. By enabling robots to perceive, plan, and fabricate task-specific tools on the fly, the framework advances self-sufficient, flexible robotics for challenging settings such as planetary exploration or unstructured industrial spaces.

Abstract

We propose a new concept, Evolution 6.0, which represents the evolution of robotics driven by Generative AI. When a robot lacks the necessary tools to accomplish a task requested by a human, it autonomously designs the required instruments and learns how to use them to achieve the goal. Evolution 6.0 is an autonomous robotic system powered by Vision-Language Models (VLMs), Vision-Language Action (VLA) models, and Text-to-3D generative models for tool design and task execution. The system comprises two key modules: the Tool Generation Module, which fabricates task-specific tools from visual and textual data, and the Action Generation Module, which converts natural language instructions into robotic actions. It integrates QwenVLM for environmental understanding, OpenVLA for task execution, and Llama-Mesh for 3D tool generation. Evaluation results demonstrate a 90% success rate for tool generation with a 10-second inference time, and action generation achieving 83.5% in physical and visual generalization, 70% in motion generalization, and 37% in semantic generalization. Future improvements will focus on bimanual manipulation, expanded task capabilities, and enhanced environmental interpretation to improve real-world adaptability.
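
The two-module structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (`Tool`, `Evolution6Pipeline`, `describe_tool`, `text_to_mesh`, `plan_actions`, `demo`) are hypothetical, the three models are replaced by stub callables, and the way data is passed between the modules is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tool:
    """A generated tool: a text description plus a mesh serialized as text."""
    description: str
    mesh: str


@dataclass
class Evolution6Pipeline:
    # Hypothetical stand-ins for the three model families named in the abstract.
    describe_tool: Callable[[bytes, str], str]        # VLM: image + task -> tool description
    text_to_mesh: Callable[[str], str]                # Text-to-3D: description -> mesh text
    plan_actions: Callable[[bytes, str], List[str]]   # VLA: image + instruction -> actions

    def generate_tool(self, image: bytes, task: str) -> Tool:
        """Tool Generation Module: visual and textual data in, tool design out."""
        description = self.describe_tool(image, task)
        return Tool(description=description, mesh=self.text_to_mesh(description))

    def generate_actions(self, image: bytes, instruction: str) -> List[str]:
        """Action Generation Module: natural-language instruction in, actions out."""
        return self.plan_actions(image, instruction)

    def run(self, image: bytes, instruction: str) -> Tuple[Tool, List[str]]:
        """Assumed ordering: design the tool first, then plan the actions."""
        tool = self.generate_tool(image, instruction)
        actions = self.generate_actions(image, instruction)
        return tool, actions


def demo() -> Tuple[Tool, List[str]]:
    """Run the pipeline with trivial stubs in place of real models."""
    pipeline = Evolution6Pipeline(
        describe_tool=lambda image, task: f"tool for: {task}",
        text_to_mesh=lambda description: f"# mesh placeholder for '{description}'",
        plan_actions=lambda image, instruction: ["reach", "grasp", "use tool"],
    )
    return pipeline.run(b"", "scoop the sand")
```

Calling `demo()` returns a `Tool` and a list of action strings produced entirely by the stubs. In a real system each callable would wrap a model such as those the abstract names (QwenVLM, Llama-Mesh, OpenVLA), whose actual interfaces are not shown here.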
