Trainable Adaptive Activation Function Structure (TAAFS) Enhances Neural Network Force Field Performance with Only Dozens of Additional Parameters

Enji Li

TL;DR

This work addresses the challenge of improving neural network force fields (NNFFs) without proportionally inflating parameter counts. It introduces the Trainable Adaptive Activation Function Structure (TAAFS), which learns activation curves from data using a chosen set of basis functions (e.g., B-spline, Fourier, Chebyshev) and trains the associated coefficients alongside the standard network parameters. Across DP, ANI-2, PAINN, and ChgNet, TAAFS yields meaningful accuracy gains (often exceeding $10\%$) with only tens to hundreds of extra parameters, and molecular dynamics simulations with DeepMD corroborate generalization. The approach shows that data-driven adaptivity in activations is a practical route to better NNFF performance, and it encourages future expansion of the available mathematical bases for task-specific optimization.
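To make the idea of a trainable activation curve concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a Chebyshev basis applied to `tanh(x)`, initialises the curve to `tanh`, and fits the coefficients by plain gradient descent to a stand-in target curve (SiLU). The class name `ChebyshevActivation`, the degree, the learning rate, and the target are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


class ChebyshevActivation:
    """Hypothetical trainable activation f(x) = sum_k c_k * T_k(tanh(x))."""

    def __init__(self, degree=5):
        # degree + 1 trainable coefficients per activation.
        self.coeffs = np.zeros(degree + 1)
        self.coeffs[1] = 1.0  # T_1(u) = u, so the initial curve is tanh(x)

    def basis(self, x):
        # tanh maps inputs into [-1, 1], the natural domain of Chebyshev polynomials.
        return cheb.chebvander(np.tanh(x), len(self.coeffs) - 1)

    def __call__(self, x):
        return self.basis(x) @ self.coeffs


def mse(act, x, y):
    """Mean squared error between the activation curve and a target curve."""
    return float(np.mean((act(x) - y) ** 2))


x = np.linspace(-3.0, 3.0, 200)
target = x / (1.0 + np.exp(-x))  # SiLU, used here only as a stand-in target

act = ChebyshevActivation(degree=5)
initial_loss = mse(act, x, target)
for _ in range(500):
    residual = act(x) - target
    # Gradient of the MSE with respect to the coefficients.
    grad = 2.0 * act.basis(x).T @ residual / x.size
    act.coeffs -= 0.1 * grad
final_loss = mse(act, x, target)

print(act.coeffs.size, initial_loss, final_loss)
```

In a real force-field model the coefficients would receive gradients from the energy and force loss through the framework's automatic differentiation rather than from a curve-fitting loss; the sketch only illustrates that the shape of the activation is controlled by a handful of numbers.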

Abstract

At the heart of neural network force fields (NNFFs) is the architecture of neural networks, where the capacity to model complex interactions is typically enhanced through widening or deepening multilayer perceptrons (MLPs) or by increasing layers of graph neural networks (GNNs). These enhancements, while improving the model's performance, often come at the cost of a substantial increase in the number of parameters. By applying the Trainable Adaptive Activation Function Structure (TAAFS), we introduce a method that selects distinct mathematical formulations for non-linear activations, thereby increasing the precision of NNFFs with an insignificant addition to the parameter count. In this study, we integrate TAAFS into a variety of neural network models, resulting in observed accuracy improvements, and further validate these enhancements through molecular dynamics (MD) simulations using DeepMD.
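The parameter trade-off described above can be illustrated with simple arithmetic. The sketch below counts weights and biases for a fully connected network before and after doubling its hidden width, and compares that increase with the cost of one trainable activation per hidden layer. The layer widths and the 16 coefficients per activation are hypothetical values chosen for illustration, not figures from the paper.

```python
def mlp_param_count(widths):
    """Weights plus biases of a fully connected MLP with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


# Hypothetical fitting network: 64 inputs, three hidden layers, one output.
base = mlp_param_count([64, 240, 240, 240, 1])
wide = mlp_param_count([64, 480, 480, 480, 1])

# Hypothetical adaptive activations: one per hidden layer, 16 coefficients each.
taafs_extra = 3 * 16

print(base)         # 131521
print(wide - base)  # 361920 extra parameters from doubling the width
print(taafs_extra)  # 48 extra parameters from trainable activations
```

Under these assumptions, doubling the width adds several hundred thousand parameters, while the trainable activations add a few dozen, which is the scale the title refers to.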

Paper Structure

This paper contains 16 sections, 11 equations, 9 figures, 11 tables.

Figures (9)

  • Figure 1: Activation Function Curves
  • Figure 2: Deep Potential model and ANI2
  • Figure 3: Convergence Curve of ANI2
  • Figure 4: The solid line represents the activation function fitted using the B-spline function, while the dashed line shows the approximation curve of a cubic polynomial.
  • Figure 5: This depicts the evolution of the activation function for the first layer during fitting. There is a significant change in the second epoch, followed by smaller subsequent variations, which aligns with the convergence trend of the RMSE.
  • ...and 4 more figures