Constructing artificial life and materials scientists with accelerated AI using Deep AndersoNN

Saleem Abdul Fattah Ahmed Al Dajani, David Keyes

TL;DR

Deep AndersoNN achieves up to an order-of-magnitude speed-up in training and inference, paving the way for saving up to 90% of the compute required for AI, reducing its carbon footprint by up to 60 gigatons per year by 2030, and scaling beyond the memory limits of explicit neural networks in life and materials science and other fields.

Abstract

Deep AndersoNN accelerates AI by exploiting the continuum limit in which the number of explicit layers in a neural network approaches infinity, so that the network can be treated as a single implicit layer, known as a deep equilibrium model. Solving for deep equilibrium model parameters reduces to a nonlinear fixed-point iteration problem, enabling the use of vector-to-vector iterative solvers and windowing techniques, such as Anderson extrapolation, to accelerate convergence to the fixed-point deep equilibrium. Here we show that Deep AndersoNN achieves up to an order-of-magnitude speed-up in training and inference. The method is demonstrated on density functional theory results for industrial applications by constructing artificial life and materials 'scientists' capable of classifying drugs as strongly or weakly polar, metal-organic frameworks by pore size, and crystalline materials as metals, semiconductors, or insulators, using graph images of node-neighbor representations transformed from atom-bond networks. Results exhibit accuracy of up to 98% and showcase synergy between Deep AndersoNN and the machine learning capabilities of modern computing architectures, such as GPUs, for accelerated computational life and materials science through rapid identification of structure-property relationships. This paves the way for saving up to 90% of the compute required for AI, reducing its carbon footprint by up to 60 gigatons per year by 2030, and scaling beyond the memory limits of explicit neural networks in life and materials science and other fields.
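
The fixed-point view described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal NumPy example, under stated assumptions, that compares plain forward iteration with windowed Anderson extrapolation on a toy implicit layer z = tanh(Wz + b). The names `forward_iteration`, `anderson_fixed_point`, `layer`, `W`, and `b` are hypothetical, the weights are random, and the window size `m` and mixing parameter `beta` are assumed to play the roles of the window size and iterate-extrapolate ratio mentioned in the figure captions.

```python
import numpy as np

def forward_iteration(f, x0, tol=1e-8, max_iter=500):
    """Plain fixed-point (Picard) iteration: x <- f(x)."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, max_iter + 1):
        fx = f(x)
        if np.linalg.norm(fx - x) < tol:
            return x, k  # k = number of evaluations of f
        x = fx
    return x, max_iter

def anderson_fixed_point(f, x0, m=5, beta=1.0, tol=1e-8, max_iter=500):
    """Anderson extrapolation with a sliding window of up to m differences.

    Uses the unconstrained least-squares form: gamma minimizes
    ||r - DR @ gamma||, and the next iterate is
    x + beta * r - (DX + beta * DR) @ gamma.
    """
    x = np.asarray(x0, dtype=float)
    r = f(x) - x                      # residual of the current iterate
    dX, dR = [], []                   # histories of iterate/residual differences
    for k in range(1, max_iter + 1):
        if np.linalg.norm(r) < tol:
            return x, k  # k = number of evaluations of f
        if dR:
            DX = np.stack(dX, axis=1)
            DR = np.stack(dR, axis=1)
            gamma = np.linalg.lstsq(DR, r, rcond=None)[0]
            x_new = x + beta * r - (DX + beta * DR) @ gamma
        else:
            x_new = x + beta * r      # first step: (damped) forward iteration
        r_new = f(x_new) - x_new
        dX = (dX + [x_new - x])[-m:]  # keep only the last m differences
        dR = (dR + [r_new - r])[-m:]
        x, r = x_new, r_new
    return x, max_iter

# Toy implicit layer (assumption: random symmetric weights, not from the paper).
rng = np.random.default_rng(0)
d = 50
A = rng.standard_normal((d, d))
W = (A + A.T) / 2
W *= 0.9 / np.linalg.norm(W, 2)       # spectral norm 0.9, so the map is a contraction
b = 0.1 * rng.standard_normal(d)

def layer(z):
    return np.tanh(W @ z + b)

z_fwd, k_fwd = forward_iteration(layer, np.zeros(d))
z_and, k_and = anderson_fixed_point(layer, np.zeros(d), m=5, beta=1.0)
print(f"forward iteration: {k_fwd} evaluations, Anderson (m=5): {k_and} evaluations")
```

Both solvers stop on the same residual criterion, so the printed evaluation counts are directly comparable. In a deep equilibrium model the same solver would be applied to the layer's hidden state, with `m` and `beta` tuned per task.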

Paper Structure

This paper contains 5 sections, 5 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Molecular structures of a representative SARS-CoV drug compound (top left), a representative hypothetical MOF (middle left), and a representative experimentally validated MOF (bottom left). Graphical representations of the compounds (center). Node-neighbor representations (right). The node-neighbor representations are the inputs for machine learning of physical properties computed with density functional theory.
  • Figure 2: Accelerating convergence to the fixed-point deep equilibrium: (top right) inference (single forward pass), (top left) representative behavior of speedup tuning for higher accuracy with window size, m, and iterate-extrapolate ratio, $\beta$, and (bottom) training until the residual tolerance is achieved.
  • Figure 3: Achieving accuracy near unity with data augmentation (top left, QMOF pore size classification, and right, OQMD QMOF band gap classification) versus trapping in a local minimum with standard forward iteration (lower left, QMugs COVID drug polarity classification), compared to the CIFAR10 benchmark (lower right).
  • Figure 4: Representative confusion matrix deep equilibrium convergence results for testing with (top left) standard forward iteration and (top right) extrapolation for classifying compounds, as well as (lower left) classifying COVID drug dipoles and (lower right) MOF pore sizes.
  • Figure 5: Algorithmic trade-off between training speedup and testing accuracy for four use cases with Anderson acceleration, plotting the third row of Tab. \ref{tab:results-speedup} against the fourth row of Tab. \ref{tab:results-accuracy}.