Training a Foundation Model for Materials on a Budget

Teddy Koker, Mit Kotak, Tess Smidt

TL;DR

This work introduces Nequix, a compact E(3)-equivariant potential that pairs a simplified NequIP design with modern training practices, including equivariant root-mean-square layer normalization and the Muon optimizer, to retain accuracy while substantially reducing compute requirements.

Abstract

Foundation models for materials modeling are advancing quickly, but their training remains expensive, often placing state-of-the-art methods out of reach for many research groups. We introduce Nequix, a compact E(3)-equivariant potential that pairs a simplified NequIP design with modern training practices, including equivariant root-mean-square layer normalization and the Muon optimizer, to retain accuracy while substantially reducing compute requirements. Nequix has 700K parameters and was trained in 100 A100 GPU-hours. On the Matbench-Discovery and MDR Phonon benchmarks, Nequix ranks third overall at a training cost 20 times lower than that of most other methods, and its inference is two orders of magnitude faster than that of the current top-ranked model. We release the model weights and a fully reproducible codebase at https://github.com/atomicarchitects/nequix.

Paper Structure

This paper contains 16 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: (a) Nequix architecture, a simplified version of NequIP (Batzner et al., 2022), with a species-independent residual connection and layer normalization. (b) Combined performance scores of compliant models on the Matbench-Discovery benchmark (unique prototypes subset), collected on 2025-08-17. (c) Published training times of current compliant models, where available.
  • Figure 2: Validation metrics during training of a smaller Nequix configuration with Adam and Muon, using learning rates in $\{0.03,0.01,0.003,0.001\}$, with and without RMSNorm. This configuration uses the same hyperparameters as the final model, except that the hidden irreps are 128x0e + 64x1o. The dotted horizontal line marks the best validation performance reached during training with Adam.
  • Figure 3: Inference speed of various models in steps per day.
  • Figure A.1: Validation curves for Nequix training on MPtrj.
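
The Figure 2 caption gives the hidden features of the smaller configuration as the irreps string 128x0e + 64x1o. As a reading aid, the sketch below computes the feature dimension that string implies. It assumes the e3nn-style convention in which each term is written as multiplicity, then "x", then degree l, then parity (e or o), and an irrep of degree l has 2l + 1 components. The helper name `irreps_dim` is hypothetical and is not taken from the Nequix codebase.

```python
import re


def irreps_dim(irreps: str) -> int:
    """Total feature dimension of an e3nn-style irreps string (hypothetical helper)."""
    total = 0
    for term in irreps.split("+"):
        match = re.fullmatch(r"\s*(\d+)x(\d+)([eo])\s*", term)
        if match is None:
            raise ValueError(f"cannot parse irreps term: {term!r}")
        multiplicity = int(match.group(1))
        degree = int(match.group(2))
        # An irrep of degree l has 2l + 1 components; parity (e/o) does not change its size.
        total += multiplicity * (2 * degree + 1)
    return total


# 128 scalars (l = 0) plus 64 vectors (l = 1): 128 * 1 + 64 * 3 = 320
print(irreps_dim("128x0e + 64x1o"))
```

Under this convention, the smaller configuration carries 320 numbers per node in its hidden layers: 128 scalar channels and 64 three-component vector channels.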