On The Finetuning of MLIPs Through the Lens of Iterated Maps With BPTT
Evan Dramko, Yizhi Zhu, Aleksandar Krivokapic, Geoffroy Hautier, Thomas Reps, Christopher Jermaine, Anastasios Kyrillidis
TL;DR
The paper tackles the computational bottleneck of structural relaxations by fine-tuning pretrained MLIPs within a fully differentiable, trajectory-level framework. It unrolls relaxation trajectories and trains via backpropagation through time (BPTT) to optimize the final relaxed structure rather than per-step force accuracy. Results show a nearly 50% reduction in final-structure error across silicon defect and pure-crystal datasets. The method is robust to relaxation hyperparameters, which the experiments attribute to BPTT primarily adjusting the MLIP rather than the descent dynamics. The work links iterated maps and fixed-point concepts to relaxation, and proposes a practical workflow: start with a pretrained MLIP, fine-tune with structure supervision, then apply BPTT trajectory-level training to achieve domain-specific performance with less data.
Abstract
Performing structural relaxations is vital to the creation of advanced materials. Traditional approaches built on physics-derived first-principles calculations are computationally expensive, motivating the creation of machine-learning interatomic potentials (MLIPs). Conventional approaches to training MLIPs for structural relaxations involve training models to faithfully reproduce first-principles computed forces. We propose a fine-tuning method for pretrained MLIPs in which we create a fully differentiable end-to-end simulation loop that optimizes the predicted final structures directly. Trajectories are unrolled and gradients are tracked through the entire relaxation via backpropagation through time (BPTT). We show that this method achieves substantial performance gains when applied to pretrained models, leading to a nearly $50\%$ reduction in test error across the sample datasets. Interestingly, we show the process is robust to substantial variation in the relaxation setup, achieving negligibly different results across varied hyperparameter and procedural modifications. Experimental results indicate this is due to a ``preference'' of BPTT to modify the MLIP rather than the other trainable parameters. Of particular interest to practitioners is that this approach lowers the data requirements for producing an effective domain-specific MLIP, addressing a common bottleneck in practical deployment.
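To make the idea of unrolling a relaxation and backpropagating through it concrete, the following is a minimal sketch under strong simplifying assumptions, not the authors' implementation. A one-dimensional harmonic force with two learnable parameters stands in for the MLIP, plain gradient descent stands in for the relaxation algorithm, and the loss is the squared error of the final position only. The names `force`, `relax`, `bptt_grads`, and `finetune`, and all numerical values, are hypothetical and chosen for illustration.

```python
def force(x, k, theta):
    """Toy stand-in for an MLIP: harmonic force with learnable stiffness k
    and learnable minimum theta (assumption, not the paper's model)."""
    return -k * (x - theta)


def relax(x0, k, theta, alpha, steps):
    """Unroll the iterated map x_{t+1} = x_t + alpha * F(x_t) and keep
    every intermediate position so gradients can flow back through them."""
    traj = [x0]
    for _ in range(steps):
        traj.append(traj[-1] + alpha * force(traj[-1], k, theta))
    return traj


def bptt_grads(traj, target, k, theta, alpha):
    """Backpropagate the final-structure loss (x_T - target)^2 through
    every relaxation step, from the last step to the first."""
    g_x = 2.0 * (traj[-1] - target)  # dL/dx_T
    g_k = 0.0
    g_theta = 0.0
    for x_t in reversed(traj[:-1]):
        # One step is x_{t+1} = x_t - alpha * k * (x_t - theta).
        g_k += g_x * (-alpha * (x_t - theta))  # via d x_{t+1} / d k
        g_theta += g_x * (alpha * k)           # via d x_{t+1} / d theta
        g_x *= 1.0 - alpha * k                 # via d x_{t+1} / d x_t
    return g_k, g_theta


def finetune(x0, target, k=1.0, theta=0.0, alpha=0.1, steps=20,
             lr=0.05, epochs=200):
    """Gradient descent on the model parameters using trajectory-level
    gradients; the step size alpha of the relaxation is held fixed."""
    for _ in range(epochs):
        traj = relax(x0, k, theta, alpha, steps)
        g_k, g_theta = bptt_grads(traj, target, k, theta, alpha)
        k -= lr * g_k
        theta -= lr * g_theta
    return k, theta


# Usage: compare the final relaxed position before and after fine-tuning.
x0, target = 2.0, 0.5
before = relax(x0, 1.0, 0.0, 0.1, 20)[-1]
k_fit, theta_fit = finetune(x0, target)
after = relax(x0, k_fit, theta_fit, 0.1, 20)[-1]
print("final-position error before:", abs(before - target))
print("final-position error after: ", abs(after - target))
```

The supervision signal here touches only the end of the trajectory, in contrast to force matching, which would penalize `force` at each sampled position. In a realistic setting the hand-written backward pass would be replaced by an automatic differentiation framework, and the positions would be full atomic structures rather than a scalar.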
