
Algorithmic differentiation for plane-wave DFT: materials design, error control and learning model parameters

Niklas Frederik Schmitz, Bruno Ploumhans, Michael F. Herbst

TL;DR

The paper introduces AD-DFPT, a framework that combines forward-mode algorithmic differentiation (AD) with density-functional perturbation theory (DFPT) to compute end-to-end derivatives through plane-wave DFT workflows. Implemented in the DFTK package, it differentiates DFT outputs with respect to a wide range of input parameters, from geometry and pseudopotentials to exchange-correlation (XC) functional parameters, and uses a manually defined DFPT solve for the response of the self-consistent field (SCF) iteration. The authors demonstrate several applications: elastic tensors with minimal manual effort, inverse materials design, XC functional learning, pseudopotential optimization, propagation of XC uncertainty, and plane-wave basis error estimation. Open challenges include extending the approach to reverse-mode AD and handling more complex functionals, spin polarization, and symmetry perturbations, which the authors leave to future work.

Abstract

We present a differentiation framework for plane-wave density-functional theory (DFT) that combines the strengths of forward-mode algorithmic differentiation (AD) and density-functional perturbation theory (DFPT). In the resulting AD-DFPT framework derivatives of any DFT output quantity with respect to any input parameter (e.g. geometry, density functional or pseudopotential) can be computed accurately without deriving gradient expressions by hand. We implement AD-DFPT into the Density-Functional ToolKit (DFTK) and show its broad applicability. Amongst others we consider the inverse design of a semiconductor band gap, the learning of exchange-correlation functional parameters, or the propagation of DFT parameter uncertainties to relaxed structures. These examples demonstrate a number of promising research avenues opened by gradient-driven workflows in first-principles materials modeling.
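The combination of forward-mode AD with a hand-written response solve can be illustrated on a toy problem. The sketch below is not DFTK code and makes several assumptions: a scalar fixed-point map `g(x, theta) = theta * cos(x)` stands in for the SCF iteration, the names `Dual`, `solve_fixed_point_dual`, `observable` and `workflow` are hypothetical, and the linear-response step is the scalar analogue of a DFPT solve. The postprocessing step is differentiated generically by the dual-number arithmetic, while the fixed-point solve receives a custom derivative rule.

```python
import math


class Dual:
    """Value plus tangent (derivative) component for forward-mode AD."""

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule for the tangent component.
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def g(x, theta):
    # Toy stand-in for the SCF map (assumption): x = theta * cos(x).
    return theta * math.cos(x)


def solve_fixed_point(theta, tol=1e-13, maxiter=10_000):
    """Plain fixed-point iteration x <- g(x, theta)."""
    x = 0.0
    for _ in range(maxiter):
        x_new = g(x, theta)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def solve_fixed_point_dual(theta):
    """Custom forward rule: converge the primal problem, then solve the
    linear response equation (1 - dg/dx) dx = dg/dtheta * dtheta."""
    x = solve_fixed_point(theta.val)
    dg_dx = -theta.val * math.sin(x)
    dg_dtheta = math.cos(x)
    dx = dg_dtheta * theta.tan / (1.0 - dg_dx)
    return Dual(x, dx)


def observable(x):
    # Postprocessing step; works on floats and on Dual numbers alike.
    return x * x + 3.0 * x


def workflow(theta):
    """End-to-end map theta -> observable(x*(theta)) on Dual numbers."""
    return observable(solve_fixed_point_dual(theta))
```

Calling `workflow(Dual(0.5, 1.0))` seeds the input tangent with 1 and returns a `Dual` whose `tan` field holds the end-to-end derivative. The iteration itself is never differentiated step by step; only the converged solution enters the response equation.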

Paper Structure

This paper contains 21 sections, 15 equations, 8 figures.

Figures (8)

  • Figure 1: Systematic DFT derivatives. Examples of physical quantities (rows) differentiated with respect to input parameters (columns), illustrating the combinatorial range of quantity-parameter derivatives readily accessible with our AD-DFPT framework. Quantities are displayed for a silicon unit cell. Densities and non-zero forces are shown along a $z=0$ plane and the structure was slightly distorted. The parameter-induced changes have been scaled to improve visibility.
  • Figure 2: End-to-end derivatives in our AD-DFPT framework. We embed plane-wave DFT into a general-purpose AD system, which across the entire simulation workflow $A$ (top row) computes the end-to-end derivative $\frac{\partial A}{\partial \theta}$ (bottom row). Based on forward-mode AD, the full derivative is accumulated starting from the input $\frac{\partial \theta}{\partial\theta}=1$ and following each primitive computational step in order. Here, blue arrows indicate dependencies on intermediate quantities. The AD system automatically obtains the Hamiltonian perturbation $\frac{\partial H}{\partial \theta}$ entering DFPT, as well as the contributions of the postprocessing. For the SCF algorithm \ref{eq:fixedpoint} we manually define its derivative as the matching DFPT algorithm \ref{eq:Dyson}, see details in the main text.
  • Figure 3: Elasticity. Relative error in the clamped-ion elastic tensor ($\|C - C_\text{ref}\|_F / \|C_\text{ref}\|_F$) for the indicated solids as a function of SCF tolerance. The dashed curves correspond to finite-difference values obtained on top of stresses with step sizes $h$ as indicated in the legend. AD-DFPT (solid curve) denotes a direct computation of second-order energy derivatives within our AD framework. All relative errors are computed with respect to the AD-DFPT result at SCF tolerance $10^{-12}$, see Supplementary Table S1 for the numerical values. AD-DFPT agrees well with finite differences at the tightest SCF tolerances and is the most precise at looser tolerances, whereas finite-difference results deteriorate notably at looser SCF tolerances and are sensitive to the step size $h$.
  • Figure 4: Inverse materials design. (a) Minimal code example tuning the band gap of bulk GaAs with respect to volumetric strain. (b) Band structure of GaAs with the band gap before (left) and after (right) minimizing $L_\text{bandgap}$. Energies are shown relative to the middle of the band gap. The optimizer internally invokes automatic differentiation to compute the required gradient, without requiring user intervention.
  • Figure 5: Learning the exchange-correlation functional. Experimental lattice constants are targeted for solids in the Sol58LC dataset [lundgaard_mbeef-vdw_2016]. (a) The training loss landscape in two parameters $\mu,\kappa$ of the PBE functional [Perdew1996] is visualized by an exhaustive grid search of the root mean squared relative error (RMSRE), along with several variants from the literature [Perdew2008pbesol, delCampo2012pbemol, xu2004xpbe, Constantin2011apbe, Sarmiento2015pbefe]. The efficient trajectory of the AD-DFPT-enabled optimization is shown in black. (b) Relative lattice constant errors for solids in the test set. The train set of Si, Al, V, NaCl is indicated in gray. Fine-tuning improves agreement on average across the dataset, though some metals (e.g. Li, Na, and even V) show overcompensation.
  • ...and 3 more figures
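
The two error measures named in the captions of Figures 3 and 5 can be written out explicitly. The following is a minimal sketch, assuming tensors are given as nested lists and that RMSRE is taken in its usual sense, the square root of the mean squared relative deviation. The function names are hypothetical and are not taken from the paper or from DFTK.

```python
import math


def frobenius_norm(m):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in m for v in row))


def relative_frobenius_error(c, c_ref):
    """||C - C_ref||_F / ||C_ref||_F, the measure used in Figure 3."""
    diff = [[a - b for a, b in zip(row, row_ref)]
            for row, row_ref in zip(c, c_ref)]
    return frobenius_norm(diff) / frobenius_norm(c_ref)


def rmsre(predicted, reference):
    """Root mean squared relative error, the loss shown in Figure 5."""
    rel = [(p - r) / r for p, r in zip(predicted, reference)]
    return math.sqrt(sum(e * e for e in rel) / len(rel))
```

Both measures are dimensionless, so elastic tensors and lattice constants of different solids can be compared on a common scale.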