Multi-objective optimization and quantum hybridization of equivariant deep learning interatomic potentials on organic and inorganic compounds

G. Laskaris, D. Morozov, D. Tarpanov, A. Seth, J. Procelewska, G. Sai Gautam, A. Sagingalieva, R. Brasher, A. Melnikov

TL;DR

This work compares Allegro and its variants on QM9, rMD17-aspirin, rMD17-benzene, and the authors' proprietary dataset. It reports a list of variants that surpass Allegro in accuracy, along with results that demonstrate the trade-off with inference time.

Abstract

Allegro is a machine learning interatomic potential (MLIP) model designed to predict atomic properties in molecules using E(3)-equivariant neural networks. When training this model, there tends to be a trade-off between accuracy and inference time. For this reason, we apply multi-objective hyperparameter optimization to these two objectives. Additionally, we experiment with modified architectures by creating variants of Allegro, some by adding strictly classical multi-layer perceptron (MLP) layers and some by adding quantum-classical hybrid layers. We compare the results on QM9, rMD17-aspirin, rMD17-benzene, and our own proprietary dataset consisting of copper and lithium atoms. As a result, we obtain a list of variants that surpass Allegro in accuracy, as well as results that demonstrate the trade-off with inference time.
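
The trade-off between accuracy and inference time described above is usually summarized by a Pareto front: the set of configurations that no other configuration beats on both objectives at once. The following is a minimal sketch of that filtering step only, not the SAMO-COBRA optimizer used in the paper. The function name `pareto_front` and the sample numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (force MAE, inference time), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are minimized.

    A point q dominates p if q is no worse than p in both objectives
    and strictly better in at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (force MAE in eV/Å, inference time in s) pairs, not paper data.
candidates = [(0.030, 0.10), (0.020, 0.25), (0.025, 0.30),
              (0.050, 0.05), (0.030, 0.12)]
front = pareto_front(candidates)
```

Here `(0.025, 0.30)` is dropped because `(0.020, 0.25)` is both more accurate and faster, and `(0.030, 0.12)` is dropped because `(0.030, 0.10)` is equally accurate and faster. The quadratic scan is adequate for the small candidate sets produced by a hyperparameter search.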

Paper Structure (18 sections, 7 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Quantum depth-infused layer
  • Figure 2: Pareto fronts generated by our implementation of SAMO-COBRA during multi-objective optimization of the Allegro model and its variants, minimizing both the force mean absolute error on the validation set (eV/Å) and the inference time (s) for each dataset tested.
  • Figure 3: (a) Circuit's ZX graph with original parameters. (b) Circuit's ZX graph after simplification.
  • Figure 4: (a) Histogram of the Fisher eigenspectrum. Around 78% of the parameters have near-zero eigenvalues, but this fraction does not exceed the 95% trainability threshold. The remaining parameters have a large impact and contribute to the circuit's trainability. (b) Average Fisher information matrix (FIM). The diagonal elements show that most parameters have high gradient values, reflecting their impact on the circuit, while the off-diagonal elements are negligible, indicating that the parameters are independent. There is also no evident single-parameter dominance, which indicates high trainability. (c) Average FIM rank. Three layers are shown, although only one is used in the original circuit. The rank clearly reaches a plateau, which means the model is in the overparameterized regime; no extra layers are needed, as the circuit already has a sufficient number of parameters. (d) Real and imaginary parts of the Fourier coefficients.
  • Figure 5: Training curves of all Allegro variants on the QM9 dataset using the best hyperparameters found by our hyperparameter optimization. Among the variants, we include the Allegro Anders et al. training curve, which corresponds to the QM9 Allegro hyperparameters reported in allegro_paper. To smooth the resulting curves, we use the time-weighted exponential moving average (EMA) smoothing technique time_EMA with a smoothing factor of 0.75.
  • ...and 1 more figures
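
The Figure 5 caption mentions time-weighted EMA smoothing with a smoothing factor of 0.75. The exact formula of the cited time_EMA technique is not given in this listing, so the sketch below is an assumption: it uses one common formulation for irregularly spaced samples, in which the weight on the previous smoothed value decays as `alpha ** dt`, where `dt` is the time elapsed since the previous sample. With unit spacing it reduces to the ordinary EMA. The function name `time_weighted_ema` is hypothetical.

```python
from typing import List, Sequence


def time_weighted_ema(times: Sequence[float],
                      values: Sequence[float],
                      alpha: float = 0.75) -> List[float]:
    """Smooth `values` sampled at (possibly irregular) `times`.

    Assumed formulation (not confirmed by the source):
        w_i = alpha ** (t_i - t_{i-1})
        s_i = w_i * s_{i-1} + (1 - w_i) * x_i
    A larger gap between samples gives the old smoothed value less weight.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    smoothed: List[float] = []
    for i, (t, x) in enumerate(zip(times, values)):
        if i == 0:
            # The first sample has no history, so it is used as-is.
            smoothed.append(float(x))
            continue
        dt = t - times[i - 1]
        if dt < 0:
            raise ValueError("times must be non-decreasing")
        w = alpha ** dt
        smoothed.append(w * smoothed[-1] + (1.0 - w) * x)
    return smoothed


# Unit spacing: behaves like a standard EMA with factor 0.75.
uniform = time_weighted_ema([0, 1], [0.0, 1.0], alpha=0.75)
# A gap of two time units discounts the previous value by 0.75 ** 2.
gapped = time_weighted_ema([0, 2], [0.0, 1.0], alpha=0.75)
```

In this formulation, a training curve logged at uneven wall-clock intervals is smoothed consistently regardless of how densely any stretch was sampled.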

Theorems & Definitions (2)

  • Definition 1: Equivariance
  • Definition 2: $k$-layered multilayer perceptron butterfly_paper
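
The paper's own wording of Definition 1 is not reproduced in this listing. For orientation, the standard definition of equivariance is as follows; the symbols $f$, $G$, $D_X$, and $D_Y$ are notation assumed here and may differ from the paper's. A map $f : X \to Y$ is equivariant with respect to a group $G$, acting on $X$ and $Y$ through representations $D_X$ and $D_Y$, if

$$
f\bigl(D_X(g)\,x\bigr) = D_Y(g)\,f(x) \qquad \text{for all } g \in G,\; x \in X .
$$

For an E(3)-equivariant interatomic potential, this means that rotating or translating the input atomic positions leaves the predicted energy unchanged (the special case $D_Y(g) = 1$, i.e. invariance), while the predicted force vectors rotate together with the structure.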