Accelerated Prediction of Temperature-Dependent Lattice Thermal Conductivity via Ensembled Machine Learning Models

Piyush Paliwal, Aftab Alam

TL;DR

The work develops an ensemble learning surrogate, led by the Extra Trees Regressor, to predict temperature-dependent lattice thermal conductivity $\kappa_L$ with DFT-level fidelity across 100–1000 K. Leveraging 53 physics-informed crystal/compositional descriptors plus temperature, trained on a large DFT-derived dataset, the model achieves $R^2$ near 1 and low RMSE, and generalizes to unseen compounds. SHAP analysis confirms that physically meaningful features govern the predictions, while high-throughput screening identifies ultralow and ultrahigh $\kappa_L$ candidates in both half-Heuslers and ICSD-derived structures, with selective validation against experiments and DFT. The approach offers orders-of-magnitude speedups over ab initio methods, enabling rapid exploration of broad materials spaces for thermoelectric and thermal management applications.

Abstract

Lattice thermal conductivity ($\kappa_L$) is a key physical property governing heat transport in solids, with direct relevance to thermoelectrics, thermal barrier coatings, and heat management applications. However, experimental determination of $\kappa_L$ is challenging, and its theoretical calculation via ab initio methods, particularly density functional theory (DFT), is computationally intensive, often an order of magnitude more demanding than electronic transport calculations. In this work, we present a machine learning (ML) approach to predict $\kappa_L$ with DFT-level accuracy over a wide temperature range (100–1000 K). Among various models trained on DFT-calculated data obtained from the literature, the Extra Trees Regressor (ETR) yielded the best performance on log-scaled $\kappa_L$, achieving an average $R^2$ of 0.9994 and a root mean square error (RMSE) of 0.0466 $W\,m^{-1}\,K^{-1}$. The ETR model also generalized well to twelve previously unseen (randomly chosen) low- and high-$\kappa_L$ compounds with diverse space group symmetries, reaching an $R^2$ of 0.961 against DFT benchmarks. Notably, the model excels in predicting $\kappa_L$ for both low- and high-symmetry compounds, enabling efficient high-throughput screening. We demonstrate this capability by screening for ultralow- and ultrahigh-$\kappa_L$ candidates among 960 half-Heusler compounds and 60,000 ICSD compounds from the AFLOW database. These results show the reliability of the model for screening potential thermoelectric materials. Finally, we test the model's predictions on systems with experimentally measured $\kappa_L$, which shows its ability to identify materials with desirable experimental $\kappa_L$ for thermoelectric applications.
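
The accuracy figures above are quoted on log-scaled $\kappa_L$. As a minimal sketch of how such scores could be computed, the Python snippet below evaluates $R^2$ and RMSE after a logarithmic transform. The function name `log_metrics`, the base-10 logarithm, and the toy values are assumptions for illustration; the abstract does not specify the log base or the evaluation code.

```python
import math


def log_metrics(kappa_true, kappa_pred):
    """Return (R^2, RMSE) computed on log10 of kappa_L values.

    Assumption: base-10 logarithm; inputs are positive kappa_L values
    in W m^-1 K^-1 (hypothetical helper, not the authors' code).
    """
    if len(kappa_true) != len(kappa_pred) or not kappa_true:
        raise ValueError("inputs must be non-empty and of equal length")

    # Move both reference and predicted values to log space.
    y = [math.log10(k) for k in kappa_true]
    y_hat = [math.log10(k) for k in kappa_pred]

    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all reference values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(y))
    return r2, rmse


# Toy reference and predicted values (not taken from the paper).
r2_demo, rmse_demo = log_metrics([0.8, 4.2, 12.0, 150.0], [0.9, 4.0, 11.0, 160.0])
```

Under the base-10 assumption, an RMSE of 1 in this space would correspond to a typical factor-of-ten error in $\kappa_L$, which is why errors on the log scale are read differently from errors on the raw values.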

Paper Structure

This paper contains 11 sections, 4 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Distribution of the 150 compounds in the dataset, categorized by (a) crystal structure, highlighting the prevalence of different crystal symmetries such as cubic, orthorhombic, and tetragonal; (b) space group, reflecting the diversity in crystallographic symmetry; (c) lattice thermal conductivity ($\kappa_L$) type, classified into four categories based on $\kappa_L$ values at 300 K (Ultra-high: $\kappa_L > 15$ $W\,m^{-1}\,K^{-1}$; High: $5 < \kappa_L \leq 15$ $W\,m^{-1}\,K^{-1}$; Low: $1 < \kappa_L \leq 5$ $W\,m^{-1}\,K^{-1}$; Ultra-low: $0 < \kappa_L \leq 1$ $W\,m^{-1}\,K^{-1}$); and (d) compound type, showing the frequency of various compositional types (e.g., AB, A$_3$BC$_4$).
  • Figure 2: Distribution of 4127 entries in the dataset: (a) $\kappa_L$ across all temperatures (inset: zoomed view of $\kappa_L \leq 100$ $W\,m^{-1}\,K^{-1}$), showing a highly skewed distribution; (b) log($\kappa_L$), revealing an approximately normal distribution suitable for machine learning modeling.
  • Figure 3: Feature generation workflow for a test compound, ScCuS$_2$, using the MagPie library [Ward2016]. Further details of the elemental properties and the associated statistical descriptors used to generate these features are provided in Table S3 of the Supplementary Information.
  • Figure 4: Pearson correlation map of the 53 features used for model training. Abbreviations of all features are provided in Table S6 of the Supplementary Information. The color bar represents the Pearson correlation coefficient (PCC).
  • Figure 5: Comparison of (a) root mean square error (RMSE) and (b) mean absolute error (MAE) for different ML models used in predicting the logarithm of $\kappa_L$.
  • ...and 6 more figures
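
The four $\kappa_L$ classes listed in the Figure 1 caption can be written as a simple threshold rule. The snippet below is an illustrative sketch only: the function name `kappa_class` and the label strings are hypothetical, and the thresholds are taken directly from the caption (values at 300 K, in $W\,m^{-1}\,K^{-1}$).

```python
def kappa_class(kappa_300k):
    """Map kappa_L at 300 K (W m^-1 K^-1) to the Figure 1(c) category.

    Hypothetical helper; boundaries follow the caption:
    ultra-low (0, 1], low (1, 5], high (5, 15], ultra-high > 15.
    """
    if kappa_300k <= 0:
        raise ValueError("kappa_L must be positive")
    if kappa_300k <= 1:
        return "ultra-low"
    if kappa_300k <= 5:
        return "low"
    if kappa_300k <= 15:
        return "high"
    return "ultra-high"


# Toy values (not from the paper's dataset), one per category.
labels = [kappa_class(k) for k in (0.4, 3.0, 9.5, 120.0)]
```

Each upper bound is inclusive, matching the $\leq$ signs in the caption, so a compound with exactly 5 $W\,m^{-1}\,K^{-1}$ at 300 K falls in the low class rather than the high one.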