A Hardware-Native Realisation of Semi-Empirical Electronic Structure Theory on Field-Programmable Gate Arrays
Xincheng Miao, Roland Mitrić
TL;DR
This work demonstrates a hardware-native FPGA realisation of semi-empirical electronic-structure methods, implementing Extended Hückel Theory (EHT) and non-self-consistent DFTB (DFTB0) on an Artix-7 device using a streaming dataflow. By fusing Hamiltonian construction, matrix assembly, and diagonalisation into on-device pipelines, the approach achieves deterministic, host-free execution and higher throughput for specific kernels, most notably a more than fourfold speedup over a server-class CPU in stand-alone DFTB0 Hamiltonian generation. Full EHT/DFTB0 workflows remain diagonalisation-bound, but the results highlight the energy-efficiency advantages of FPGA streaming for regular, arithmetic-heavy kernels and identify concrete paths to further improvement through hardware eigensolvers and heterogeneous CPU-FPGA execution. Together, the results provide an architectural proof of concept for scalable, energy-efficient, hardware-native semi-empirical toolchains and indicate routes toward FPGA-based acceleration of broader electronic-structure methods.
Abstract
High-throughput quantum-chemical calculations underpin modern molecular modelling, materials discovery, and machine-learning workflows, yet even semi-empirical methods become restrictive when many molecules must be evaluated. Here we report the first hardware-native realisation of semi-empirical electronic structure theory on a field-programmable gate array (FPGA), implementing Extended Hückel Theory (EHT) and non-self-consistent Density Functional Tight Binding (DFTB0) as a proof of principle. Our design performs Hamiltonian construction and diagonalisation on the FPGA through a streaming dataflow, enabling deterministic execution without host intervention. On a mid-range Artix-7 FPGA, the DFTB0 Hamiltonian generator delivers a throughput more than four times that of a contemporary server-class CPU. Improvements in eigensolver design and memory capacity, together with extensions to nuclear gradients and excited states, could further expand capability. Combined with the inherent energy efficiency of FPGA dataflow, this work opens a pathway towards sustainable, hardware-native acceleration of electronic-structure simulation and direct hardware implementations of a broad class of methods.
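
For readers unfamiliar with the two stages named above (Hamiltonian construction followed by diagonalisation), the following is a minimal software sketch of a textbook EHT calculation. It is not the FPGA design described in this work. It assumes the standard Wolfsberg-Helmholz approximation with the conventional constant K = 1.75 and solves the generalised eigenproblem HC = SCE by Löwdin orthogonalisation; the function names `eht_hamiltonian` and `solve_generalised`, the constant, and the toy two-orbital input values are all illustrative assumptions and do not come from the source.

```python
import numpy as np

K_WH = 1.75  # conventional Wolfsberg-Helmholz constant (assumption, not from the source)


def eht_hamiltonian(S, onsite):
    """Stage 1: build H from the overlap matrix S and on-site energies.

    Off-diagonal: H_ij = K * S_ij * (H_ii + H_jj) / 2 (Wolfsberg-Helmholz).
    Diagonal:     H_ii = on-site orbital energy.
    """
    onsite = np.asarray(onsite, dtype=float)
    H = 0.5 * K_WH * S * (onsite[:, None] + onsite[None, :])
    np.fill_diagonal(H, onsite)
    return H


def solve_generalised(H, S):
    """Stage 2: solve H C = S C E via Loewdin (symmetric) orthogonalisation."""
    s, U = np.linalg.eigh(S)                 # S = U diag(s) U^T, s > 0
    X = U @ np.diag(s ** -0.5) @ U.T         # X = S^(-1/2)
    E, C_prime = np.linalg.eigh(X @ H @ X)   # ordinary symmetric eigenproblem
    return E, X @ C_prime                    # back-transform the eigenvectors


# Toy two-orbital system; the numbers are purely illustrative.
S = np.array([[1.0, 0.6],
              [0.6, 1.0]])
onsite = [-13.6, -13.6]

H = eht_hamiltonian(S, onsite)
E, C = solve_generalised(H, S)
print(E)
```

In this toy case the two levels can be checked by hand as (H11 + H12)/(1 + S12) and (H11 - H12)/(1 - S12). The first stage is element-wise and regular, whereas the second involves iterative dense linear algebra, which is consistent with the statement that full workflows remain diagonalisation-bound.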
