Generative artificial intelligence for computational chemistry: a roadmap to predicting emergent phenomena
Pratyush Tiwary, Lukas Herron, Richard John, Suemin Lee, Disha Sanwal, Ruiyu Wang
TL;DR
This Perspective addresses the challenge of predicting emergent chemical phenomena with Generative AI. It surveys foundational concepts in both computational chemistry and AI, and outlines a spectrum of AI methods (autoencoders, GANs, RL, flow models, and LLMs) as applied to molecular modeling. It highlights representative applications in ab initio quantum chemistry, ML-based force fields, and biomolecular structure prediction (protein and RNA), while critically examining limitations such as data scarcity, training stability, and the difficulty of capturing emergent behavior. The authors argue that integrating core chemical principles, especially statistical mechanics and environmental context, is essential for turning AI into a reliable predictive tool for chemistry. They propose design principles and hybrid approaches (e.g., AF2RAVE, Thermodynamic Maps) that bridge AI with physics, with the aim of predicting function and emergent phenomena from chemical identity under realistic conditions. The outlook emphasizes cautious, physics-grounded progress and the need for rigorous validation if AI is to accelerate discovery and deepen understanding of complex chemical systems.
Abstract
The recent surge in Generative Artificial Intelligence (AI) has introduced exciting possibilities for computational chemistry. Generative AI methods have made significant progress in sampling molecular structures across chemical species, developing force fields, and speeding up simulations. This Perspective offers a structured overview, beginning with the fundamental theoretical concepts in both Generative AI and computational chemistry. It then covers widely used Generative AI methods, including autoencoders, generative adversarial networks, reinforcement learning, flow models, and language models, and highlights selected applications in diverse areas, including force field development and protein/RNA structure prediction. A key focus is the set of challenges these methods must overcome before they become truly predictive, particularly in predicting emergent chemical phenomena. We believe that the ultimate goal of a simulation method or theory is to predict phenomena not seen before, and that Generative AI should be held to the same standard before it is deemed useful for chemistry. We suggest that, to overcome these challenges, future AI models need to integrate core chemical principles, especially from statistical mechanics.
