PyAPX: Python toolkit for atomic configuration pattern exploration
Akira Kusaba, Tetsuji Kuboyama, Karol Kawka, Pawel Kempisty, Yoshihiro Kangawa
TL;DR
Even when composition and lattice are fixed, the arrangement of atoms over crystallographic sites can strongly influence material properties. PyAPX introduces the NA and NAmod encodings for representing configurations and uses Bayesian optimization (via PHYSBO) to steer DFT energy evaluations toward stable configurations. In the h-BCN demonstration, NAmod outperforms both the one-hot and NA encodings, while PCA-NAmod reduces dimensionality without loss of performance and enables identification of multiple symmetry-equivalent stable patterns. The toolkit is intended to be broadly applicable to crystalline materials and to advance materials discovery through efficient configuration-space exploration.
Abstract
In materials discovery, the integration of first-principles calculations with machine learning techniques has been actively studied for two key tasks: crystal structure prediction, which searches for stable structures given a chemical composition, and elemental substitution, which explores chemical compositions that yield desirable properties in a given crystal structure. However, even when both the crystal structure and the chemical composition are fixed, material properties can still vary with the arrangement of atoms (the configuration) over the crystallographic sites. To support detailed material design, we present PyAPX, a Python toolkit that performs Bayesian searches for stable atomic configurations. A distinctive feature of this initial release is the introduction of encoding methods suited to configuration search, whose performance we evaluate on the h-BCN system. The evaluation confirms that these encodings yield superior convergence compared to the commonly used one-hot encoding. PyAPX is broadly applicable to crystalline materials and is expected to further advance materials discovery.
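To make the one-hot baseline mentioned in the abstract concrete, the following is a minimal sketch of how a configuration space at fixed composition might be enumerated and encoded. It is not PyAPX code: the six-site cell, the 2:2:2 B/C/N composition, and the helper names `enumerate_configurations`, `one_hot`, and `pca_reduce` are assumptions made for illustration. The NA and NAmod encodings are not defined in this summary, so the PCA step is applied to the one-hot matrix purely to show the mechanics of dimensionality reduction.

```python
from itertools import permutations

import numpy as np

SPECIES = ("B", "C", "N")  # species order used for the one-hot columns


def enumerate_configurations(counts):
    """Return all distinct site assignments for a fixed composition.

    `counts` maps species to the number of atoms, e.g. {"B": 2, "C": 2, "N": 2}.
    Brute-force enumeration; only practical for a small number of sites.
    """
    atoms = [s for s in SPECIES for _ in range(counts.get(s, 0))]
    return sorted(set(permutations(atoms)))


def one_hot(config):
    """Encode a configuration as a flattened (n_sites x n_species) indicator matrix."""
    mat = np.zeros((len(config), len(SPECIES)))
    for site, species in enumerate(config):
        mat[site, SPECIES.index(species)] = 1.0
    return mat.ravel()


def pca_reduce(X, n_components):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # center each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


# Hypothetical six-site cell with two atoms of each species.
configs = enumerate_configurations({"B": 2, "C": 2, "N": 2})
X = np.array([one_hot(c) for c in configs])  # candidate feature matrix
Z = pca_reduce(X, n_components=4)            # reduced descriptors
```

A matrix such as `X` or `Z` would serve as the candidate set for a Bayesian search, with each row's energy supplied by a DFT calculation. Because one-hot encoding labels sites individually, symmetry-equivalent configurations generally receive different vectors.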
