An Automated Framework for Analyzing Structural Evolution in On-the-fly Non-adiabatic Molecular Dynamics Using Autoencoder and Multiple Molecular Descriptors
Hangxu Liu, Yifei Zhu, Zhenggang Lan
TL;DR
The paper tackles the challenge of automatically identifying key reaction coordinates in nonadiabatic molecular dynamics. It introduces an automated framework that combines nonlinear dimensionality reduction via an Autoencoder with clustering and information entropy analysis. The framework processes on-the-fly trajectory surface hopping (TSH) data using six molecular descriptors (Cartesian, RIC, IDM, MBTR, SOAP, AEV), selects the effective descriptors, and identifies distinct reaction channels through DBSCAN clustering in the latent space. Information entropy then links the reduced coordinates to each channel, revealing the active molecular motions that drive each photochemical pathway. Validated on keto isocytosine and the methaniminium cation, the approach yields interpretable, channel-specific coordinates that agree with established mechanistic insights, offering a generalizable tool for automated analysis of excited-state dynamics.
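The clustering step can be illustrated with a minimal sketch. It is a plain-Python version of the standard DBSCAN algorithm applied to a handful of made-up 2D points that stand in for Autoencoder latent coordinates. The point values, `eps`, and `min_samples` are assumptions chosen for illustration and are not taken from the paper, whose implementation and settings may differ.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Textbook DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be re-labelled as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


# Hypothetical latent-space points: two dense groups and one outlier.
latent = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
channel_labels = dbscan(latent, eps=0.3, min_samples=3)
print(channel_labels)
```

In this toy setting each dense group would play the role of one reaction channel, and the isolated point is flagged as noise rather than forced into a cluster, which is the usual reason for choosing a density-based method over one that needs the number of clusters in advance.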
Abstract
A major challenge in nonadiabatic molecular dynamics is to automatically and objectively identify the key reaction coordinates that drive molecules toward distinct excited-state decay channels. Traditional manual analyses are inefficient and rely heavily on expert intuition, creating a bottleneck for interpreting complex photochemical processes. To overcome this, we introduce a fully automated machine-learning framework that extracts these coordinates directly from on-the-fly trajectory surface hopping data. By combining an Autoencoder for nonlinear dimensionality reduction with clustering and information entropy analysis, our method autonomously maps reaction channels and pinpoints their governing structural motions. When applied to keto isocytosine and the methaniminium cation, the framework identified the reaction channels involved and their corresponding active coordinates with high efficiency and accuracy. This work establishes an effective paradigm for gaining mechanistic insight into excited-state dynamics, transforming raw trajectory data into clear, interpretable reaction mechanisms.
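One way to picture how an information entropy analysis can connect a structural coordinate to a reaction channel is sketched below. The abstract does not give the entropy expression used in the paper, so this example substitutes a generic information gain (the Shannon entropy of the channel labels minus their conditional entropy given a binned coordinate). The function names, the binning scheme, and the toy torsion and bond values are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def bin_values(values, n_bins):
    """Map continuous values onto equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant coordinate
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(values, channels, n_bins=4):
    """H(channel) - H(channel | binned coordinate), in bits.

    A large value means the coordinate separates the channels well.
    This is an illustrative criterion, not the paper's definition.
    """
    bins = bin_values(values, n_bins)
    n = len(channels)
    conditional = 0.0
    for b in set(bins):
        subset = [c for bb, c in zip(bins, channels) if bb == b]
        conditional += len(subset) / n * shannon_entropy(subset)
    return shannon_entropy(channels) - conditional


# Hypothetical data: channel label per geometry and two candidate coordinates.
channels = [0, 0, 0, 0, 1, 1, 1, 1]
torsion = [5.0, 8.0, 10.0, 12.0, 80.0, 85.0, 88.0, 90.0]  # differs by channel
bond = [1.30, 1.45, 1.31, 1.44, 1.30, 1.45, 1.31, 1.44]   # same spread in both

gain_torsion = information_gain(torsion, channels)
gain_bond = information_gain(bond, channels)
print(gain_torsion, gain_bond)
```

In this toy data the torsion takes distinct ranges in the two channels while the bond length is distributed identically in both, so ranking coordinates by this score would single out the torsion as the channel-specific motion.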
