El Agente Estructural: An Artificially Intelligent Molecular Editor
Changhyeok Choi, Yunheng Zou, Marcel Müller, Han Hao, Yeonghun Kang, Juan B. Pérez-Sánchez, Ignacio Gustin, Hanyong Xu, Mohammad Ghazi Vakili, Chris Crebolder, Alán Aspuru-Guzik, Varinia Bernales
TL;DR
El Agente Estructural introduces a multimodal, natural-language–driven molecular editor that operates directly on 3D coordinates to enable precise geometry modifications while preserving core structures. It achieves this through an atomic-index–centric framework and a modular toolbox (structural analysis, geometric operations, editing, and generation) built on open-source chemistry libraries, with validation across diverse case studies including site-selective functionalization, fragment binding, and stereochemical organometallic construction. The work demonstrates robust integration of vision-language reasoning with domain-specific tools, enabling image-guided and mechanism-driven geometry generation beyond traditional SMILES or database-based approaches. A staged roadmap outlines integration with autonomous quantum chemistry platforms, data-driven retrieval, interactive UIs, and advanced organometallic and solid-state capabilities to accelerate discovery in chemistry and catalysis.
Abstract
We present El Agente Estructural, a multimodal, natural-language-driven geometry-generation and manipulation agent for autonomous chemistry and molecular modelling. Unlike molecular generation or editing via generative models, Estructural mimics how human experts directly manipulate molecular systems in three dimensions by integrating a comprehensive set of domain-informed tools and vision-language models. This design enables precise control over atomic or functional group replacements, atomic connectivity, and stereochemistry without the need to rebuild extensive core molecular frameworks. Through a series of representative case studies, we demonstrate that Estructural enables chemically meaningful geometry manipulation across a wide range of real-world scenarios. These include site-selective functionalization, ligand binding, ligand exchange, stereochemically controlled structure construction, isomer interconversion, fragment-level structural analysis, image-guided generation of structures from schematic reaction mechanisms, and mechanism-driven geometry generation and modification. These examples illustrate how multimodal reasoning, when combined with specialized geometry-aware tools, supports interactive and context-aware molecular modelling beyond structure generation. Looking forward, integrating Estructural into El Agente Quntur, an autonomous multi-agent quantum chemistry platform, will extend that platform with sophisticated tools for generating and editing three-dimensional structures.
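The abstract describes edits addressed by atom index: one atom is replaced while the rest of the framework keeps its coordinates. A minimal sketch of that idea follows. It is not Estructural's implementation. The function `substitute_atom`, the `COVALENT_RADII` table (approximate single-bond covalent radii in angstroms), and the idealized methane geometry are all assumptions introduced here for illustration.

```python
import math

# Approximate covalent radii in angstroms; illustrative values only,
# used to estimate a plausible bond length for the new atom.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "F": 0.57, "Cl": 1.02}


def substitute_atom(atoms, target, anchor, new_element):
    """Return a copy of `atoms` with atom `target` replaced by `new_element`.

    atoms: list of (element, (x, y, z)) tuples, addressed by 0-based index.
    target: index of the atom to replace.
    anchor: index of the atom that `target` is bonded to.
    The new atom is placed along the original anchor->target direction at a
    distance equal to the sum of the two covalent radii. Every other atom
    keeps its original coordinates.
    """
    _, (tx, ty, tz) = atoms[target]
    anchor_element, (ax, ay, az) = atoms[anchor]

    # Bond vector from the anchor to the atom being replaced.
    dx, dy, dz = tx - ax, ty - ay, tz - az
    old_length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if old_length == 0.0:
        raise ValueError("target and anchor atoms coincide")

    # Rescale the bond vector to the estimated new bond length.
    new_length = COVALENT_RADII[anchor_element] + COVALENT_RADII[new_element]
    scale = new_length / old_length
    new_position = (ax + dx * scale, ay + dy * scale, az + dz * scale)

    edited = list(atoms)  # shallow copy; the input list is not modified
    edited[target] = (new_element, new_position)
    return edited


# Idealized methane: carbon at the origin, C-H = 1.09 angstroms (assumed geometry).
D = 1.09 / math.sqrt(3)
methane = [
    ("C", (0.0, 0.0, 0.0)),
    ("H", (D, D, D)),
    ("H", (-D, -D, D)),
    ("H", (-D, D, -D)),
    ("H", (D, -D, -D)),
]

# Replace the hydrogen at index 1 with fluorine, anchored on carbon (index 0).
fluoromethane = substitute_atom(methane, target=1, anchor=0, new_element="F")
```

Only index 1 changes in `fluoromethane`. The carbon and the three remaining hydrogens are carried over unchanged, which is the "preserve the core" behaviour the abstract emphasizes. A real editor would also need to handle polyatomic substituents, steric clashes, and subsequent geometry relaxation, none of which this sketch attempts.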
