LTAU-FF: Loss Trajectory Analysis for Uncertainty in Atomistic Force Fields
Joshua A. Vita, Amit Samanta, Fei Zhou, Vincenzo Lordi
TL;DR
The paper tackles the high computational cost and overconfident error estimates of ensemble-based uncertainty quantification in deep learning atomistic force fields. It introduces LTAU, which combines per-sample error distributions logged during training with a nearest-neighbor search in the model latent space to estimate the full error PDF at any test point. When instantiated as LTAU-FF on NequIP, the method delivers well-calibrated confidence intervals and predicted errors that correlate strongly with the true errors near the training domain, while running 2–3 orders of magnitude faster than ensembles. It enables practical tasks such as out-of-domain detection, training data re-weighting, and predicting failure during simulations (e.g., OC20 IS2RS). Performance is robust on in-domain (ID) data, with clear limitations on out-of-domain (OOD) data that stem from the method's reliance on latent-space distance. The approach offers a broadly applicable, low-overhead UQ alternative for regression tasks in materials science and beyond, with room for further refinement in OOD handling and distance-based calibration.
Abstract
Model ensembles are effective tools for estimating prediction uncertainty in deep learning atomistic force fields. However, their widespread adoption is hindered by high computational costs and overconfident error estimates. In this work, we address these challenges by leveraging distributions of per-sample errors obtained during training and employing a distance-based similarity search in the model latent space. Our method, which we call LTAU, efficiently estimates the full probability distribution function (PDF) of errors for any test point using the logged training errors, achieving speeds that are 2–3 orders of magnitude faster than typical ensemble methods and allowing it to be used for tasks where training or evaluating multiple models would be infeasible. We apply LTAU to estimating parametric uncertainty in atomistic force fields (LTAU-FF), demonstrating that its improved ensemble diversity produces well-calibrated confidence intervals and predicts errors that correlate strongly with the true errors for data near the training domain. Furthermore, we show that the errors predicted by LTAU-FF can be used in practical applications for detecting out-of-domain data, tuning model performance, and predicting failure during simulations. We believe that LTAU will be a valuable tool for uncertainty quantification (UQ) in atomistic force fields and is a promising method that should be further explored in other domains of machine learning.
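The core idea described above (logged per-sample training errors combined with a nearest-neighbor search in latent space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `estimate_error_pdf`, the array names, the brute-force Euclidean search, the choice of `k`, and the histogram-based density are all assumptions made for illustration, and the toy data is synthetic.

```python
import numpy as np


def estimate_error_pdf(train_latents, train_error_logs, test_latent,
                       k=3, bins=10, error_range=(0.0, 1.0)):
    """Estimate an error histogram for one test point (illustrative only).

    train_latents    : (n_train, d) latent descriptors of training samples
    train_error_logs : (n_train, n_epochs) per-sample errors logged in training
    test_latent      : (d,) latent descriptor of the test point
    """
    # Brute-force Euclidean distance to every training point (assumed metric).
    dists = np.linalg.norm(train_latents - test_latent, axis=1)
    # Indices of the k closest training samples.
    neighbor_idx = np.argsort(dists)[:k]
    # Pool the logged training errors of those neighbors.
    pooled = train_error_logs[neighbor_idx].ravel()
    # Normalized histogram as a crude stand-in for the error PDF.
    density, edges = np.histogram(pooled, bins=bins,
                                  range=error_range, density=True)
    return density, edges, pooled.mean()


# Synthetic toy data: two well-separated clusters in a 2D "latent space",
# one with low logged errors and one with high logged errors.
rng = np.random.default_rng(0)
latents_low = rng.normal(loc=0.0, scale=0.1, size=(5, 2))
latents_high = rng.normal(loc=10.0, scale=0.1, size=(5, 2))
train_latents = np.vstack([latents_low, latents_high])
train_error_logs = np.vstack([
    rng.uniform(0.0, 0.2, size=(5, 4)),   # 4 logged epochs per sample
    rng.uniform(0.6, 1.0, size=(5, 4)),
])

density_low, edges, mean_low = estimate_error_pdf(
    train_latents, train_error_logs, np.array([0.0, 0.0]))
density_high, _, mean_high = estimate_error_pdf(
    train_latents, train_error_logs, np.array([10.0, 10.0]))
print(mean_low, mean_high)
```

In this sketch, a test point near the low-error cluster inherits a low expected error and a point near the high-error cluster inherits a high one. Because only a single trained model and a neighbor lookup are needed at test time, no additional models have to be trained or evaluated, which is the source of the speedup over ensembles claimed in the abstract. A practical version would likely replace the brute-force search with an indexed or approximate nearest-neighbor search.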
