Platonic representation of foundation machine learning interatomic potentials
Zhenzhu Li, Aron Walsh
TL;DR
The paper shows that foundation MLIPs trained on overlapping chemical spaces develop a shared latent geometry despite architectural differences. It introduces the Platonic representation, which projects atomic embeddings into a common space using $K$ anchors via $\mathbf{z}_i = [\cos(\mathbf{e}_i,\mathbf{a}_1), \dots, \cos(\mathbf{e}_i,\mathbf{a}_K)]^\top$, with anchor sets generated by DIRECT sampling. Across seven diverse MLIPs, embeddings align into a coherent chemical geometry, enabling cross-model optimal transport and embedding arithmetic, including zero-shot model stitching, while also exposing representational biases and potential diagnostic signals for symmetry breaking. The framework offers a practical pathway toward interoperable, interpretable foundation potentials for materials science, and highlights the value of considering representational compatibility alongside predictive performance in model design.
Abstract
Foundation machine learning interatomic potentials (MLIPs) are trained on overlapping chemical spaces, yet their latent representations remain model-specific. Here, we show that independently developed MLIPs exhibit a statistically consistent geometric organisation of atomic environments, which we term the Platonic representation. By projecting embeddings relative to a set of atomic anchors, we unify the latent spaces of seven MLIPs (spanning equivariant, non-equivariant, conservative, and non-conservative architectures) into a common metric space that preserves chemical periodicity and structural invariants. This unified framework enables direct cross-model optimal transport, interpretable embedding arithmetic, and the detection of representational biases. Furthermore, we demonstrate that geometric distortions in this space can indicate physical prediction failures, including symmetry breaking and incorrect phonon dispersions. Our results show that the latent spaces of diverse MLIPs exhibit a consistent statistical geometry shaped by shared physical and chemical constraints, suggesting that the Platonic representation offers a practical route toward interoperable, comparable, and interpretable foundation models for materials science.
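The anchor-based projection $\mathbf{z}_i = [\cos(\mathbf{e}_i,\mathbf{a}_1), \dots, \cos(\mathbf{e}_i,\mathbf{a}_K)]^\top$ can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `anchor_projection`, the synthetic embeddings, and the random choice of anchors are all assumptions made for illustration; in particular, random selection is a stand-in for the DIRECT sampling used in the paper. The toy "second model" is simply a rotated copy of the first, which real MLIPs are not; it serves only to show that cosine similarities to shared anchors are unchanged by an orthogonal change of basis.

```python
import numpy as np

def anchor_projection(embeddings, anchors, eps=1e-12):
    """Map each embedding e_i to z_i = [cos(e_i, a_1), ..., cos(e_i, a_K)].

    embeddings: (N, D) array of atomic embeddings from one model.
    anchors:    (K, D) array of anchor embeddings from the same model.
    Returns an (N, K) array of cosine similarities.
    """
    e = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + eps)
    a = anchors / (np.linalg.norm(anchors, axis=1, keepdims=True) + eps)
    return e @ a.T

rng = np.random.default_rng(0)

# Synthetic stand-in for model A: 200 atomic environments, 16-dim embeddings.
emb_a = rng.normal(size=(200, 16))

# Hypothetical model B: the same environments seen through a rotated latent
# basis (an idealised assumption; real models differ by more than a rotation).
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
emb_b = emb_a @ q

# Assumption: anchors are picked at random here, NOT by DIRECT sampling.
# The same atomic environments serve as anchors in both models.
anchor_idx = rng.choice(200, size=8, replace=False)

z_a = anchor_projection(emb_a, emb_a[anchor_idx])
z_b = anchor_projection(emb_b, emb_b[anchor_idx])

# Largest elementwise gap between the two projected representations.
max_diff = float(np.abs(z_a - z_b).max())
print(z_a.shape, max_diff)
```

Because both models are projected onto the same $K$ anchor environments, the outputs share a $K$-dimensional space regardless of each model's native embedding dimension, which is what makes cross-model comparison possible in principle.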
