DeFecT-FF: Accelerated Modeling of Defects in Cd-Zn-Te-Se-S Compounds Combining High-Throughput DFT and Machine Learning Force Fields

Md Habibur Rahman, Arun Mannodi-Kanakkithodi

TL;DR

The paper addresses the computational bottleneck of mapping defect landscapes in Cd/Zn–Te/Se/S alloys, which are essential for CdTe solar cells. It introduces DeFecT-FF, which combines high-throughput DFT (GGA-PBE and hybrid HSE06+SOC) with crystal-graph ML force fields (M3GNet/ALIGNN-based MLFFs) trained to hybrid-functional accuracy, guided by active learning and ShakeNBreak sampling. The authors deliver the largest unified HSE06 defect dataset across Cd/Zn–Te/Se/S compositions and present a workflow and nanoHUB tool that enable defect enumeration, MLFF optimization, and the construction of defect formation energy diagrams as functions of the Fermi level $E_F$ and chemical potentials, greatly reducing the need for expensive DFT relaxations. These advances yield near-DFT accuracy with substantial speedups (e.g., multi-hour DFT relaxations reduced to minutes) and enable rapid, charge-aware defect surveys to guide alloying and doping strategies for CdSeTe solar cells, with the ultimate aim of closing the voltage deficit in this photovoltaic platform.

Abstract

We developed DeFecT-FF, a framework for predicting the energies and ground-state configurations of native point defects, extrinsic dopants, impurities, and defect complexes in zincblende-phase Cd/Zn-Te/Se/S compounds relevant to CdTe-based solar cells. The framework combines high-throughput DFT data with crystal graph-based machine learning force fields (MLFFs) trained to reproduce DFT energies and forces. Alloying at Cd or Te sites offers a route to tune the electronic and defect properties of CdTe absorbers for improved solar efficiency. Given the vast number of possible defect types, charge states, and symmetry-breaking configurations, traditional DFT approaches are computationally prohibitive. Our dataset includes GGA-PBE and HSE06-optimized structures for bulk, alloyed, interface, and grain-boundary systems. Using active learning, we expanded the dataset and trained MLFFs to accurately predict energies across charge states. The model enabled rapid screening and discovery of new low-energy defect configurations, validated through HSE06 calculations with spin-orbit coupling. The DeFecT-FF framework is publicly available as a nanoHUB tool, allowing users to upload crystallographic files, automatically generate defects, and compute defect formation energies versus Fermi level and chemical potentials, greatly reducing the need for expensive DFT simulations.
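Defect formation energy diagrams of the kind described above are conventionally built from the supercell expression $E_f(D^q) = E(D^q) - E(\mathrm{bulk}) - \sum_i n_i \mu_i + q\,(E_\mathrm{VBM} + E_F) + E_\mathrm{corr}$, where $n_i$ counts atoms of species $i$ added ($n_i > 0$) or removed ($n_i < 0$). The following Python sketch evaluates this expression and picks the most stable charge state at a given Fermi level. It assumes the paper follows this conventional form; the class, function names, and all numbers are hypothetical placeholders, not the DeFecT-FF tool's API or data from the paper.

```python
from dataclasses import dataclass


@dataclass
class ChargedDefect:
    """One defect in one charge state (energies in eV, placeholder values)."""
    charge: int
    e_defect: float      # total energy of the defective supercell
    e_correction: float  # finite-size charge correction


def formation_energy(defect, e_bulk, delta_n, mu, e_vbm, e_fermi):
    """E_f = E(D^q) - E(bulk) - sum_i n_i*mu_i + q*(E_VBM + E_F) + E_corr."""
    exchange = sum(n * mu[species] for species, n in delta_n.items())
    return (defect.e_defect - e_bulk - exchange
            + defect.charge * (e_vbm + e_fermi) + defect.e_correction)


def lowest_charge_state(defects, e_bulk, delta_n, mu, e_vbm, e_fermi):
    """Return (charge, E_f) of the most stable charge state at this Fermi level."""
    energies = {
        d.charge: formation_energy(d, e_bulk, delta_n, mu, e_vbm, e_fermi)
        for d in defects
    }
    best = min(energies, key=energies.get)
    return best, energies[best]


# Hypothetical cation vacancy: one Cd atom removed, two charge states considered.
vacancy_states = [ChargedDefect(0, -97.0, 0.0), ChargedDefect(-2, -92.0, 0.2)]
common = dict(e_bulk=-100.0, delta_n={"Cd": -1}, mu={"Cd": -1.0}, e_vbm=2.0)

at_vbm = lowest_charge_state(vacancy_states, e_fermi=0.0, **common)
mid_gap = lowest_charge_state(vacancy_states, e_fermi=1.0, **common)
print(at_vbm, mid_gap)
```

Sweeping `e_fermi` from 0 to the band gap and plotting the returned energy traces the lower envelope that appears in a formation energy diagram; changing `mu` switches between, for example, Cd-rich and Te-rich conditions.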

Paper Structure

This paper contains 9 sections, 8 equations, 28 figures, 7 tables.

Figures (28)

  • Figure 1: Statistics of the HSE06 dataset: (a) Number of bulk configurations for the CdSe$_{x}$Te$_{1-x}$, CdS$_{x}$Se$_{1-x}$, Cd$_{x}$Zn$_{1-x}$S, Cd$_{x}$Zn$_{1-x}$Se, Cd$_{x}$Zn$_{1-x}$Te, Cd$_{0.5}$Zn$_{0.5}$S$_{x}$Se$_{1-x}$, Cd$_{0.5}$Zn$_{0.5}$Se$_{x}$Te$_{1-x}$, ZnS$_{x}$Se$_{1-x}$, and ZnSe$_{x}$Te$_{1-x}$ compositions. (b) Distribution of defect configurations across the Cd–chalcogen and Zn–chalcogen binaries and ternaries. (c) Violin plots of crystal formation energies (meV/atom) for the entire dataset across five charge states ($+2$ to $-2$). (d,e) Defect formation energy diagrams for CdTe under Cd-rich and Te-rich conditions, computed with the HSE06 functional, highlighting the relative stability of key native (V$_{Cd}$, V$_{Te}$) and extrinsic (As$_{Te}$, Cl$_{Te}$) defects.
  • Figure 2: (a–c) Parity plots comparing crystal formation energies from DFT and MLFF predictions for three representative charge states: (a) q = +1, (b) q = 0 (neutral), and (c) q = –1. The MLFF accurately reproduces the DFT energies with small errors, as indicated by the RMSE values shown in each panel. (d–f) Defect formation energies computed using MLFF predictions for a subset of the defect configurations shown in panels (a–c), compared against values from full DFT. The MLFF defect energies were obtained by adding DFT reference energies and applying charge corrections to the MLFF-predicted total energies.
  • Figure 3: Workflow for accelerated defect predictions using the DeFecT-FF framework. An initial defect structure (example: As$_i$ in a mixed "2Te--2Se" local environment) is constructed and passed through the ShakeNBreak snb_1 symmetry-breaking procedure to generate a diverse set of competing defect geometries. These distorted configurations are rapidly relaxed using the rigorously optimized machine-learned force field to identify the lowest-energy structure prior to high-fidelity DFT calculations. The optimized geometry is then used for a static HSE06+SOC calculation, yielding accurate defect formation energy diagrams.
  • Figure 4: Benchmarking DeFecT-FF for a selected defect complex in CdSe$_{0.12}$Te$_{0.88}$. (a) Visualization of the As$_\mathrm{Se}$ + Cl$_\mathrm{Se}$ complex in the CdSe$_{0.12}$Te$_{0.88}$ alloy supercell. (b) Total energy relaxation profiles for different charge states, comparing the converged DFT energies with the DeFecT-FF–relaxed energies. (c) Defect formation energies under Cd-rich conditions for charge states $+2$ to $-2$, showing close agreement between DFT and DeFecT-FF predictions. (d) Computational cost, measured in core-hours, highlighting the significant reduction in compute time achieved when using DeFecT-FF instead of full DFT relaxations.
  • Figure 5: Defect formation energy diagrams for As$_{X}$, Cl$_{X}$, and As$_{X}$+Cl$_{X}$ defects in (a) CdTe and (b) CdSe$_{0.25}$Te$_{0.75}$, under Cd-rich conditions; X = Te or Se. (c) Defect charge transition levels for As$_{X}$, Cl$_{X}$, and As$_{X}$+Cl$_{X}$, computed for different CdSe$_x$Te$_{1-x}$ compositions ($x = 0.0$, 0.06, 0.12, 0.25). Blue lines indicate the As$_{X}$ (0/–1) acceptor level, red lines show the Cl$_{X}$ (+1/0) donor level, and purple lines show the As$_{X}$+Cl$_{X}$ (0/–1) level. For each compound, the VBM is placed at E$_F$ = 0 eV and the CBM is placed at the value of the computed band gap. All results are from HSE06+SOC calculations performed after DeFecT-FF optimization.
  • ...and 23 more figures
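
The charge transition levels reported in captions such as Figure 5(c) are, in the conventional definition, the Fermi levels at which two charge states have equal formation energy: $\varepsilon(q_1/q_2) = \left[E_f(q_1; E_F{=}0) - E_f(q_2; E_F{=}0)\right] / (q_2 - q_1)$. The Python sketch below applies that definition and walks the lower envelope of the formation energy lines to list only the thermodynamically stable transitions inside the gap. The function names and all energies are hypothetical placeholders chosen for illustration; they are not values from the paper or part of the DeFecT-FF tool.

```python
def transition_level(q1, ef0_q1, q2, ef0_q2):
    """Fermi level (eV above the VBM) where charge states q1 and q2 are degenerate."""
    if q1 == q2:
        raise ValueError("charge states must differ")
    return (ef0_q1 - ef0_q2) / (q2 - q1)


def stable_transition_levels(ef0_by_charge, band_gap):
    """List (q_from, q_to, level) along the lower envelope for 0 <= E_F <= band_gap.

    ef0_by_charge maps each charge state to its formation energy at E_F = 0.
    """
    # Stable state at the VBM; on a tie the more negative charge wins for E_F > 0.
    current = min(ef0_by_charge, key=lambda q: (ef0_by_charge[q], q))
    levels = []
    while True:
        # Only lines with a smaller slope (lower charge) can take over as E_F rises.
        crossings = [
            (transition_level(current, ef0_by_charge[current], q, ef0_by_charge[q]), q)
            for q in ef0_by_charge if q < current
        ]
        if not crossings:
            break
        e_cross, nxt = min(crossings)  # first line to cross the current one
        if e_cross > band_gap:
            break
        levels.append((current, nxt, e_cross))
        current = nxt
    return levels


# Placeholder formation energies (eV) at E_F = 0 for four charge states.
ef0 = {+1: 1.8, 0: 2.0, -1: 2.9, -2: 3.2}
levels = stable_transition_levels(ef0, band_gap=1.5)
for q_from, q_to, e in levels:
    print(f"({q_from:+d}/{q_to:+d}) at {e:.2f} eV above the VBM")
```

In this placeholder data the $-1$ state never becomes the ground state, so the walk reports a direct (0/−2) transition; a pairwise scan over adjacent charges would have reported levels that do not lie on the envelope.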