On The Finetuning of MLIPs Through the Lens of Iterated Maps With BPTT

Evan Dramko, Yizhi Zhu, Aleksandar Krivokapic, Geoffroy Hautier, Thomas Reps, Christopher Jermaine, Anastasios Kyrillidis

TL;DR

The paper tackles the computational bottleneck of structural relaxations by fine-tuning pretrained MLIPs within a fully differentiable, trajectory-level framework. It unrolls relaxation trajectories and trains via backpropagation through time (BPTT) to optimize the final relaxed structure rather than per-step force accuracy. Results show a reduction of nearly 50% in final-structure test error across silicon defect and pure-crystal datasets. The method is robust to relaxation hyperparameters, and BPTT primarily adjusts the MLIP rather than the descent dynamics. The work links iterated maps and fixed-point concepts to relaxation, and proposes a practical workflow: start with a pretrained MLIP, fine-tune with structure supervision, then apply BPTT trajectory-level training to achieve domain-specific performance with less data.
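
One way to make the iterated-map view concrete is sketched below. The notation is an assumption for illustration and is not taken from the paper: $x_t$ is the structure after $t$ relaxation steps, $F_\theta$ is the force predicted by the MLIP with parameters $\theta$, $\alpha$ is a step size, and $x^{\mathrm{ref}}$ is the reference relaxed structure. A plain gradient-descent update is assumed; the relaxation scheme actually used may differ.

```latex
% Relaxation as an iterated map (assumed gradient-descent form)
x_{t+1} = \Phi_\theta(x_t) = x_t + \alpha\, F_\theta(x_t), \qquad t = 0, \dots, T-1

% A relaxed structure is a fixed point of the map
F_\theta(x^\ast) = 0 \;\Longrightarrow\; \Phi_\theta(x^\ast) = x^\ast

% Trajectory-level loss on the final structure only
\mathcal{L}(\theta) = \bigl\lVert x_T(\theta) - x^{\mathrm{ref}} \bigr\rVert^2

% BPTT: adjoint recurrence run backward through the unrolled trajectory
\lambda_T = \nabla_{x_T}\mathcal{L}, \qquad
\lambda_t = \Bigl(\tfrac{\partial \Phi_\theta}{\partial x}(x_t)\Bigr)^{\!\top} \lambda_{t+1}, \qquad
\frac{d\mathcal{L}}{d\theta} = \sum_{t=0}^{T-1} \lambda_{t+1}^{\top}\, \frac{\partial \Phi_\theta}{\partial \theta}(x_t)
```

Under this reading, force-matching training supervises $F_\theta$ at individual configurations, whereas the trajectory-level loss supervises only where the iteration ends up.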

Abstract

Vital to the creation of advanced materials is performing structural relaxations. Traditional approaches built on physics-derived first-principles calculations are computationally expensive, motivating the creation of machine-learning interatomic potentials (MLIPs). Traditional approaches to training MLIPs for structural relaxations involve training models to faithfully reproduce first-principles computed forces. We propose a fine-tuning method to be used on a pretrained MLIP in which we create a fully differentiable end-to-end simulation loop that optimizes the predicted final structures directly. Trajectories are unrolled and gradients are tracked through the entire relaxation. We show that this method achieves substantial performance gains when applied to pretrained models, leading to a nearly $50\%$ reduction in test error across the sample datasets. Interestingly, we show the process is robust to substantial variation in the relaxation setup, achieving negligibly different results across varied hyperparameter and procedural modifications. Experimental results indicate this is due to a "preference" of BPTT to modify the MLIP rather than the other trainable parameters. Of particular interest to practitioners is that this approach lowers the data requirements for producing an effective domain-specific MLIP, addressing a common bottleneck in practical deployment.
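
The idea of unrolling a relaxation and tracking gradients through it can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration, not the authors' implementation: the "MLIP" is a one-dimensional harmonic force with a single learnable parameter `theta` (the position of the energy minimum), the relaxation is plain gradient descent with step size `alpha`, and the names `force`, `relax`, `final_loss`, `bptt_grads`, and `fine_tune` are hypothetical. The backward pass is written by hand so that the flow of gradients through every step is visible.

```python
K = 1.0  # fixed stiffness of the toy harmonic potential (assumption)


def force(x, theta):
    """Toy stand-in for an MLIP force: pulls x toward the learnable minimum theta."""
    return -K * (x - theta)


def relax(x0, theta, alpha, steps):
    """Unroll x_{t+1} = x_t + alpha * force(x_t) and keep every iterate."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + alpha * force(xs[-1], theta))
    return xs


def final_loss(x0, theta, alpha, steps, target):
    """Squared error between the final relaxed position and the reference."""
    return (relax(x0, theta, alpha, steps)[-1] - target) ** 2


def bptt_grads(xs, theta, alpha, target):
    """Backpropagate the final-structure loss through the stored trajectory."""
    g_x = 2.0 * (xs[-1] - target)  # dL/dx_T
    g_theta, g_alpha = 0.0, 0.0
    for t in reversed(range(len(xs) - 1)):
        g_alpha += g_x * force(xs[t], theta)  # direct dependence of x_{t+1} on alpha
        g_theta += g_x * alpha * K            # direct dependence of x_{t+1} on theta
        g_x *= 1.0 - alpha * K                # chain back to dL/dx_t
    return g_theta, g_alpha


def fine_tune(x0, theta, alpha, steps, target, lr=0.1, epochs=200):
    """Gradient descent on theta only; the step size alpha stays fixed here."""
    for _ in range(epochs):
        xs = relax(x0, theta, alpha, steps)
        g_theta, _ = bptt_grads(xs, theta, alpha, target)
        theta -= lr * g_theta
    return theta


if __name__ == "__main__":
    theta = fine_tune(x0=2.0, theta=0.0, alpha=0.1, steps=20, target=0.5)
    print(theta, final_loss(2.0, theta, 0.1, 20, 0.5))
```

The sketch returns a gradient for `alpha` as well, so the step size could be made trainable alongside the model parameter; `fine_tune` updates only `theta` to keep the example short. In a realistic setting an automatic-differentiation framework would replace the hand-written backward pass.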

Paper Structure

This paper contains 24 sections, 10 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Effect of training schemes on Si defect relaxations
  • Figure 2: Experimental results comparing learned and fixed scalar step sizes as applied to Si defects.
  • Figure 3: Si defect training loss sees negligible improvement, but testing loss is substantially improved
  • Figure 4: Effect of adjusting learning setup on Si defect samples
  • Figure 5: BPTT tuning improves pure crystal results by roughly $50\%$.
  • ...and 1 more figure