A Scalable Diagonalization Framework for Tensor-Product Bitstring Selected Configuration Interaction
Enhua Xu, William Dawson, Himadri Pathak, Takahito Nakajima
TL;DR
We present a fully distributed diagonalization framework tailored for extremely large selected determinant spaces. It addresses a major scalability bottleneck of modern SCI methods, namely the memory cost of replicating the CI vector across processes, and establishes TBSCI as a scalable SCI methodology.
Abstract
Selected configuration interaction (SCI) methods are effective for treating strongly correlated electronic systems, yet their scalability has long been limited by implementations that replicate the configuration interaction (CI) vector across processes, leading to severe memory bottlenecks. Here, we present a fully distributed diagonalization framework tailored for extremely large selected determinant spaces, directly addressing this major scalability bottleneck of modern SCI methods. The method, referred to as tensor-product bitstring SCI (TBSCI), is grounded in a tensor-product bitstring (TPB) representation, in which determinants are organized through a TPB structure constructed from selected alpha- and beta-bitstrings. An efficient TBSCI eigensolver is developed based on a novel bitstring-based Hamiltonian evaluation algorithm together with a suite of MPI communication strategies designed to improve parallel efficiency. Large-scale full configuration interaction (FCI) benchmarks, employed as communication-intensive stress tests, demonstrate that the implemented TBSCI eigensolver continues to reduce the wall time for distributed diagonalization of 2.6 trillion determinants as the node count increases, up to 54,000 nodes (more than 2.5 million cores) on the supercomputer Fugaku. Beyond scalability, we investigate the structural compactness of the TPB representation and show that selecting alpha- and beta-bitstrings according to their collective weights in a reference SCI wavefunction yields TPB-based wavefunctions that approach the FCI limit while using only a small fraction of the determinants. These results establish TBSCI as a scalable SCI methodology and provide evidence for the intrinsic compactness of the TPB representation.
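To make the bitstring selection step concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes that the "collective weight" of an alpha- or beta-bitstring is the sum of squared CI coefficients over all reference determinants containing that bitstring, and that the TPB space is the Cartesian product of the selected alpha- and beta-bitstrings. The function names (`collective_weights`, `select_top`, `tpb_space`) and the toy reference wavefunction are hypothetical.

```python
from collections import defaultdict
from itertools import product


def collective_weights(reference):
    """Accumulate per-bitstring weights from a reference wavefunction.

    `reference` maps (alpha, beta) integer bitstring pairs to CI coefficients.
    Assumption: the collective weight of a bitstring is the sum of c**2 over
    all reference determinants that contain it.
    """
    w_alpha = defaultdict(float)
    w_beta = defaultdict(float)
    for (a, b), c in reference.items():
        w_alpha[a] += c * c
        w_beta[b] += c * c
    return dict(w_alpha), dict(w_beta)


def select_top(weights, k):
    """Return the k bitstrings with the largest weight (ties broken by value)."""
    return sorted(weights, key=lambda s: (-weights[s], s))[:k]


def tpb_space(alphas, betas):
    """Form the determinant space as all (alpha, beta) pairs."""
    return list(product(alphas, betas))


# Toy reference wavefunction (hypothetical, unnormalized coefficients).
reference = {
    (0b0011, 0b0011): 0.90,
    (0b0101, 0b0011): 0.30,
    (0b0011, 0b0101): 0.30,
    (0b0110, 0b1001): 0.05,
}

w_alpha, w_beta = collective_weights(reference)
alphas = select_top(w_alpha, 2)
betas = select_top(w_beta, 2)
space = tpb_space(alphas, betas)
```

In this toy case the resulting space contains the pair `(0b0101, 0b0101)`, which is absent from the reference wavefunction: a tensor-product space includes every combination of the selected bitstrings, not only the determinants that were selected individually.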
