Broken neural scaling laws in materials science
Max Großmann, Malte Grunert, Erich Runge
TL;DR
This work analyzes how neural scaling laws (NSLs) apply to a data-limited materials science task: predicting the high-dimensional dielectric response of metals by learning the interband dielectric function $\overline{\varepsilon}_{\mathrm{inter}}(\omega)$ and the Drude frequency $\overline{\omega}_{\mathrm{D}}$ from crystal structure. Using a high-throughput dataset of more than $2\times 10^5$ intermetallics and two invariant graph neural networks, OptiMetal2B and OptiMetal3B, the authors quantify 1D and 2D NSLs as functions of dataset size $D$ and parameter count $N$. Data scaling is broken, with a crossover at $D_c$, while parameter scaling saturates beyond $N\sim 5\times 10^{6}$. A Kaplan-type 2D NSL map shows that data efficiency improves with higher body order, yet the models remain in the data-limited regime, underscoring a data bottleneck in materials ML. The results motivate data-efficient architectures and strategies such as transfer learning or multi-fidelity data to reduce data-generation costs in materials discovery.
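To make the phrase "Kaplan-type 2D NSL" concrete, the following is a minimal sketch of the functional form introduced by Kaplan et al. for language models, $L(N, D) = \left[(N_c/N)^{\alpha_N/\alpha_D} + D_c/D\right]^{\alpha_D}$. It is not the authors' fitting code, and the exact parameterization used in the paper may differ. The function name and all numerical constants are hypothetical, chosen only for illustration. Note that `d_c` here is the scale constant of the Kaplan form, which need not coincide with the crossover $D_c$ mentioned above.

```python
import numpy as np


def kaplan_loss(n, d, n_c, d_c, alpha_n, alpha_d):
    """Kaplan-type 2D scaling form L(N, D).

    n, d             : parameter count and dataset size (scalars or arrays)
    n_c, d_c         : scale constants of the Kaplan form (hypothetical values below)
    alpha_n, alpha_d : scaling exponents for parameters and data
    """
    n = np.asarray(n, dtype=float)
    d = np.asarray(d, dtype=float)
    return ((n_c / n) ** (alpha_n / alpha_d) + d_c / d) ** alpha_d


# Illustrative constants only; these are NOT fitted values from the paper.
consts = dict(n_c=1e6, d_c=1e5, alpha_n=0.3, alpha_d=0.4)

# Data-limited corner: very large model, small dataset.
# The loss reduces to the pure data power law (d_c / d) ** alpha_d.
loss_data_limited = kaplan_loss(1e12, 1e3, **consts)

# Model-limited corner: small model, effectively unlimited data.
# The loss reduces to the pure parameter power law (n_c / n) ** alpha_n.
loss_model_limited = kaplan_loss(1e3, 1e30, **consts)

# A 2D map over a log-spaced grid of (N, D), e.g. for a contour plot.
n_grid, d_grid = np.meshgrid(np.logspace(4, 8, 5), np.logspace(2, 6, 5), indexing="ij")
loss_map = kaplan_loss(n_grid, d_grid, **consts)
```

In such a map, a data-limited regime appears as a region where the loss barely changes with $N$ at fixed $D$, which is the qualitative behavior summarized above.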
Abstract
In materials science, data are scarce and expensive to generate, whether computationally or experimentally. Therefore, it is crucial to identify how model performance scales with dataset size and model capacity to distinguish between data- and model-limited regimes. Neural scaling laws provide a framework for quantifying this behavior and guide the design of materials datasets and machine learning architectures. Here, we investigate neural scaling laws for a paradigmatic materials science task: predicting the dielectric function of metals, a high-dimensional response that governs how solids interact with light. Using over 200,000 dielectric functions from high-throughput ab initio calculations, we study two multi-objective graph neural networks trained to predict the frequency-dependent complex interband dielectric function and the Drude frequency. We observe broken neural scaling laws with respect to dataset size, whereas scaling with the number of model parameters follows a simple power law that rapidly saturates.
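The difference between a simple power law and a broken one can be illustrated with a short sketch. It assumes the smoothly broken power-law form with a single break proposed by Caballero et al., $L(D) = a + b\,D^{-c_0}\left[1 + (D/D_c)^{1/f}\right]^{-c_1 f}$, which may differ from the parameterization actually fitted in this work. All names and constants below are hypothetical, and the data are synthetic.

```python
import numpy as np


def broken_power_law(d, a, b, c0, c1, d_c, f):
    """Smoothly broken power law with one break at dataset size d_c.

    For d << d_c the log-log slope is -c0; for d >> d_c it is -(c0 + c1).
    The parameter f controls how sharp the crossover is (smaller = sharper).
    """
    d = np.asarray(d, dtype=float)
    return a + b * d ** (-c0) * (1.0 + (d / d_c) ** (1.0 / f)) ** (-c1 * f)


def loglog_slope(d, loss):
    """Least-squares slope of log(loss) versus log(d)."""
    slope, _intercept = np.polyfit(np.log(d), np.log(loss), 1)
    return slope


# Illustrative constants only; these are NOT fitted values from the paper.
# a = 0 means no irreducible loss floor, so both regimes are pure power laws.
params = dict(a=0.0, b=1.0, c0=0.2, c1=0.3, d_c=1e4, f=0.1)

# Sample dataset sizes well below and well above the crossover.
d_small = np.logspace(1, 2, 20)
d_large = np.logspace(6, 7, 20)

slope_small = loglog_slope(d_small, broken_power_law(d_small, **params))
slope_large = loglog_slope(d_large, broken_power_law(d_large, **params))
```

Fitting a single exponent across both ranges would average the two slopes and misestimate how much additional data is worth on either side of the crossover, which is why the distinction between simple and broken scaling matters for dataset design.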
