Accelerated Prediction of Temperature-Dependent Lattice Thermal Conductivity via Ensembled Machine Learning Models
Piyush Paliwal, Aftab Alam
TL;DR
The work develops an ensemble learning surrogate, led by the Extra Trees Regressor, to predict temperature-dependent lattice thermal conductivity $κ_L$ with DFT-level fidelity across 100–1000 K. Leveraging 53 physics-informed crystal/compositional descriptors plus temperature, trained on a large DFT-derived dataset, the model achieves $R^2$ near 1 and low RMSE, and generalizes to unseen compounds. SHAP analysis confirms that physically meaningful features govern the predictions, while high-throughput screening identifies ultralow and ultrahigh $κ_L$ candidates in both half-Heuslers and ICSD-derived structures, with selective validation against experiments and DFT. The approach offers orders-of-magnitude speedups over ab initio methods, enabling rapid exploration of broad materials spaces for thermoelectric and thermal management applications.
Abstract
Lattice thermal conductivity ($κ_L$) is a key physical property governing heat transport in solids, with direct relevance to thermoelectrics, thermal barrier coatings, and heat management applications. Experimental determination of $κ_L$ is challenging, and its theoretical calculation via ab initio methods, particularly density functional theory (DFT), is computationally intensive, often an order of magnitude more demanding than electronic transport calculations. In this work, we present a machine learning (ML) approach to predict $κ_L$ with DFT-level accuracy over a wide temperature range (100–1000 K). Among various models trained on DFT-calculated data obtained from the literature, the Extra Trees Regressor (ETR) yielded the best performance on log-scaled $κ_L$, achieving an average $R^2$ of 0.9994 and a root mean square error (RMSE) of 0.0466 $W\,m^{-1}\,K^{-1}$. The ETR model also generalized well to twelve randomly chosen, previously unseen compounds spanning low and high $κ_L$ and diverse space group symmetries, reaching an $R^2$ of 0.961 against DFT benchmarks. Notably, the model predicts $κ_L$ well for both low- and high-symmetry compounds, enabling efficient high-throughput screening. We demonstrate this capability by screening for ultralow and ultrahigh $κ_L$ candidates among 960 half-Heusler compounds and 60,000 ICSD compounds from the AFLOW database. These results demonstrate the reliability of the model for screening potential thermoelectric materials. Finally, we test the model on systems for which experimental $κ_L$ is available, showing that it can identify materials whose experimental $κ_L$ is desirable for thermoelectric applications.
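As an illustration of how $R^2$ and RMSE can be evaluated on log-scaled $κ_L$, the following is a minimal sketch using only the Python standard library. The function name `log_scaled_metrics`, the choice of base-10 logarithm, and the sample values are assumptions made for illustration; they are not taken from the paper's dataset or code.

```python
import math


def log_scaled_metrics(kappa_true, kappa_pred):
    """Return (R^2, RMSE) computed on log10-transformed kappa_L values.

    Assumption: a base-10 logarithm is used; the abstract only says "log-scaled".
    Inputs are positive kappa_L values in W m^-1 K^-1.
    """
    if len(kappa_true) != len(kappa_pred) or len(kappa_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Transform both reference and predicted values to log space.
    y = [math.log10(k) for k in kappa_true]
    y_hat = [math.log10(k) for k in kappa_pred]

    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total sum of squares

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(y))
    return r2, rmse


# Hypothetical reference (DFT-like) and predicted values, for illustration only.
kappa_ref = [0.5, 2.0, 10.0, 150.0]
kappa_ml = [0.55, 1.9, 10.5, 140.0]

r2, rmse = log_scaled_metrics(kappa_ref, kappa_ml)
print(f"R^2 (log scale) = {r2:.4f}, RMSE (log scale) = {rmse:.4f}")
```

Evaluating in log space weights relative errors similarly across compounds whose $κ_L$ differs by orders of magnitude, which is presumably why a log-scaled target suits a dataset spanning ultralow to ultrahigh $κ_L$.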
