Validation of Semi-Empirical xTB Methods for High-Throughput Screening of TADF Emitters: A 747-Molecule Benchmark Study
Jean-Pierre Tchapet Njafa, Elvira Vanelle Kameni Tcheuffa, Aissatou Maghame, Serge Guy Nana Engo
TL;DR
This work addresses the need for scalable screening of TADF emitters by validating two semi-empirical xTB methods (sTDA-xTB and sTD-DFT-xTB) on a 747-molecule benchmark. By coupling GFN2-xTB ground-state geometries with rapid excited-state calculations and implicit solvent effects, the authors cut computational cost by more than $99\%$ relative to conventional TD-DFT while preserving reliable relative rankings: the two methods agree with each other with a Pearson $r \approx 0.82$ for $\Delta E_{\text{ST}}$, and the MAE against 312 experimental $\Delta E_{\text{ST}}$ values is about 0.17 eV. The study extracts robust design principles, confirming the superior performance of D-A-D architectures and an optimal D-A torsional window of $50^{\circ}$–$90^{\circ}$, and reveals a low-dimensional design space in which the first three principal components capture about $90\%$ of the variance. These findings establish a validated, data-driven HTS framework that accelerates TADF emitter discovery and provides practical guidelines for computational materials science in OLED design.
Abstract
Thermally activated delayed fluorescence (TADF) emitters are essential for next-generation, high-efficiency organic light-emitting diodes (OLEDs), yet their rational design is hampered by the high computational cost of accurate excited-state predictions. Here, we present a comprehensive benchmark study validating semi-empirical extended tight-binding (xTB) methods, specifically sTDA-xTB and sTD-DFT-xTB, for the high-throughput screening (HTS) of TADF materials. Using an unprecedentedly large dataset of 747 experimentally characterized emitters, our framework demonstrates a computational cost reduction of over $99\%$ compared to conventional TD-DFT, while maintaining strong internal consistency between methods (Pearson $r \approx 0.82$ for $\Delta E_{\text{ST}}$), validating their utility for relative molecular ranking. Validation against 312 experimental $\Delta E_{\text{ST}}$ values reveals a mean absolute error of approximately 0.17 eV, a discrepancy attributed to the vertical approximation inherent to the HTS protocol, underscoring the methods' role in screening rather than quantitative prediction. Through large-scale data analysis, we statistically validate key design principles, confirming the superior performance of Donor-Acceptor-Donor (D-A-D) architectures and identifying an optimal D-A torsional angle range of $50^{\circ}$–$90^{\circ}$ for efficient TADF. Principal Component Analysis reveals that the complex property space is fundamentally low-dimensional, with three components capturing nearly $90\%$ of the variance. This work establishes these semi-empirical methods as powerful, cost-effective tools for accelerating TADF discovery and provides a robust set of data-driven design rules and methodological guidelines for the computational materials science community.
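The three statistics quoted in the abstract (Pearson $r$ between methods, mean absolute error against experiment, and the fraction of variance captured by the leading principal components) can be illustrated with a minimal Python sketch. The function names and the synthetic arrays below are hypothetical placeholders chosen for illustration; they are not the authors' code or data, and standardizing the descriptors before PCA is an assumption rather than something the abstract states.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def mean_absolute_error(predicted, reference):
    """Mean absolute error, in the units of the inputs (eV here)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))


def explained_variance_ratio(features):
    """Fraction of total variance carried by each principal component.

    Columns are standardized first (an assumption of this sketch), then the
    singular values of the standardized matrix give the component variances.
    """
    X = np.asarray(features, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    singular_values = np.linalg.svd(Xs, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


if __name__ == "__main__":
    # Synthetic stand-ins for per-molecule Delta E_ST values (eV); NOT paper data.
    rng = np.random.default_rng(0)
    gap_stda = rng.uniform(0.0, 0.6, size=50)             # hypothetical sTDA-xTB gaps
    gap_stddft = gap_stda + rng.normal(0.0, 0.1, size=50)  # hypothetical sTD-DFT-xTB gaps
    gap_expt = gap_stda + rng.normal(0.0, 0.2, size=50)    # hypothetical experimental gaps

    print("Pearson r between methods:", pearson_r(gap_stda, gap_stddft))
    print("MAE vs experiment (eV):", mean_absolute_error(gap_stda, gap_expt))

    # Synthetic descriptor table: 50 molecules x 6 hypothetical properties.
    descriptors = rng.normal(size=(50, 6))
    ratios = explained_variance_ratio(descriptors)
    print("Variance captured by first three PCs:", ratios[:3].sum())
```

With real data, the first two functions would be applied to the computed and measured $\Delta E_{\text{ST}}$ columns, and the cumulative sum of the explained-variance ratios would show how many components are needed to reach a given share of the variance.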
