Hierarchy-Boosted Funnel Learning for Identifying Semiconductors with Ultralow Lattice Thermal Conductivity

Mengfan Wu, Shenshen Yan, Jie Ren

TL;DR

This work introduces HiBoFL, a hierarchy-boosted funnel learning framework that integrates unsupervised clustering with low-cost high-throughput screening to efficiently identify semiconductors with ultralow lattice thermal conductivity $\kappa_\mathrm{L}$. By building a local, labeled database and training an interpretable CatBoost classifier, the approach achieves high predictive performance (ROC AUC ≈ 0.94) and reveals mechanistic descriptors, including a novel L^min factor linked to structural anharmonicity. The pipeline identifies Cs₂GeSe₃ and Cs₂SnSe₃ as ultralow-$\kappa_\mathrm{L}$ candidates and demonstrates how interpretable ML can bridge data-driven predictions with first-principles understanding of phonon transport. Beyond thermoelectrics, HiBoFL offers a general strategy to accelerate discovery of functional materials in large chemical spaces with limited labeled data.

Abstract

Data-driven machine learning (ML) has demonstrated tremendous potential in material property predictions. However, the scarcity of materials data with costly property labels in the vast chemical space presents a significant challenge for ML in efficiently predicting properties and uncovering structure-property relationships. Here, we propose a novel hierarchy-boosted funnel learning (HiBoFL) framework, which is successfully applied to identify semiconductors with ultralow lattice thermal conductivity ($\kappa_\mathrm{L}$). By training on only a few hundred materials targeted by unsupervised learning from a pool of hundreds of thousands, we achieve efficient and interpretable supervised predictions of ultralow $\kappa_\mathrm{L}$, thereby circumventing large-scale brute-force ab initio calculations without clear objectives. As a result, we provide a list of candidates with ultralow $\kappa_\mathrm{L}$ for potential thermoelectric applications and discover a new factor that significantly influences structural anharmonicity. This HiBoFL framework offers a novel practical pathway for accelerating the discovery of functional materials.
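
The TL;DR quotes a ROC AUC of about 0.94 for the supervised classifier. To illustrate what that metric measures, here is a minimal pure-Python sketch that labels materials as "ultralow" and scores a set of predicted probabilities. Using 2 W/mK as the class boundary mirrors the threshold mentioned in the Figure 4 caption, but that choice is an assumption here. The function names `label_ultralow` and `roc_auc` and all numbers are hypothetical toy values, not the authors' code or data.

```python
def label_ultralow(kappa_values, threshold=2.0):
    """Return 1 where kappa <= threshold (assumed 'ultralow' class), else 0."""
    return [1 if k <= threshold else 0 for k in kappa_values]


def roc_auc(labels, scores):
    """ROC AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the paper.
kappa = [0.4, 1.1, 1.9, 2.5, 6.0, 12.0]         # thermal conductivity, W/mK
scores = [0.95, 0.80, 0.40, 0.55, 0.10, 0.05]   # classifier P(ultralow)

labels = label_ultralow(kappa)
auc = roc_auc(labels, scores)
print(labels, round(auc, 3))
```

In this toy set one ultralow material (score 0.40) ranks below one non-ultralow material (score 0.55), so 8 of the 9 positive-negative pairs are ordered correctly. A value near 0.94 would mean that roughly 94% of such pairs are ranked correctly.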

Paper Structure

This paper contains 13 sections, 19 equations, and 6 figures.

Figures (6)

  • Figure 1: (a) Schematic of the HiBoFL framework, including data preparation, unsupervised learning, data annotation and supervised learning. (b) Workflow of applying the HiBoFL framework to efficiently identify semiconductors with ultralow $\kappa_\mathrm{L}$.
  • Figure 2: (a) Flowchart of preliminary high-throughput screening from the Materials Project (MP) database. (b) Optimization of the number of clusters k for the k-means algorithm based on the elbow method and silhouette coefficient (inset).
  • Figure 3: Unsupervised learning of the materials in the first-level dataset. t-SNE visualization of the seven clusters generated by the k-means algorithm; each point represents a compound and is colored by its cluster. Eight representative materials in C1 and C2 with low experimentally measured $\kappa_\mathrm{L}$ are marked in the enlarged region.
  • Figure 4: (a–d) Statistical analysis of HTC results in the second-level dataset. (a) Pie chart of $\kappa_\mathrm{PET}$ separated by a threshold of 2 W/(m·K), and distribution of formula types with counts exceeding 20. (b–d) Distribution of the materials with $\kappa_\mathrm{PET}$ no greater than 2 W/(m·K): (b) Heat map of element counts over the periodic table. (c) Box plot for different crystal systems. (d) Density scatter plot of shear modulus versus bulk modulus. (e) Calculated $\kappa_\mathrm{L}$ as a function of temperature along different axes in Cs2SnSe3 and Cs2GeSe3, obtained by solving the phonon BTE.
  • Figure 5: (a–g) First-principles analysis of the phonon thermal transport properties. Crystal structures and the projected 2D ELF diagram of (a) Cs2SnSe3 and (b) Cs2GeSe3. Phonon dispersion (left panels), atom-projected PDOS (middle panels) and spectral $\kappa_\mathrm{L}(\omega)$ (right panels) of (c) Cs2SnSe3 and (d) Cs2GeSe3. (e) Group velocity along the b-axis and (f) phonon lifetime as a function of frequency at 300 K for Cs2SnSe3 and Cs2GeSe3. (g) COHP and ICOHP projected on Sn–Se and Ge–Se bonds.
  • ...and 1 more figure
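
The Figure 2 caption mentions choosing the number of k-means clusters with the elbow method and the silhouette coefficient. The sketch below illustrates that selection step on hypothetical 2D toy points, using only the Python standard library. The deterministic farthest-point initialization, the helper names, and the data are assumptions made for reproducibility; the paper's actual descriptors and settings are not shown here, and in practice a library such as scikit-learn (`KMeans`, `silhouette_score`) would normally be used.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at points[0], then
    repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, inertia)."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: three tight blobs of five points each.
offsets = [(0, 0), (0.5, 0.5), (-0.5, 0.5), (0.5, -0.5), (-0.5, -0.5)]
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx, dy in offsets
]

inertia, scores = {}, {}
for k in range(2, 7):
    labels, inertia[k] = kmeans(points, k)
    scores[k] = silhouette(points, labels)

best_k = max(scores, key=scores.get)
print("best k by silhouette:", best_k)
```

For the elbow method one would look at where `inertia[k]` stops dropping sharply as `k` grows, while the silhouette criterion picks the `k` with the highest mean score. On this toy set both point to the three blobs that were built into the data.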