Computational discovery of bifunctional organic semiconductors for energy and biosensing

Patrick Sorrel Mvoto Kongo, Steve Cabrel Teguia Kouam, Jean-Pierre Tchapet Njafa, Serge Guy Nana Engo

Abstract

The discovery of synthetically accessible organic semiconductors with exceptional performance remains a critical bottleneck in materials science. These materials offer compelling advantages for photovoltaics and biosensors (structural modularity, mechanical flexibility, and cost-effective solution processing), but identifying candidates that balance high efficiency with practical synthesis is difficult. To address this, we developed a high-throughput screening approach using 17 458 molecules from the PubChemQC B3LYP/6-31G*//PM6 dataset. Our strategy employs a composite metric, PCE$_{\text{SAScore}}$ = PCE - SAScore, which balances power conversion efficiency (PCE) predictions from the Scharber model against synthetic accessibility scores (SAScore). This approach identified seven multifunctional candidates that combine high predicted photovoltaic performance (PCE up to 36.1 %) with strong protein-binding affinity for biosensing applications. Notably, molecule 4550 emerged as the optimal candidate, exhibiting a ligand efficiency of 0.340 kcal/mol/heavy atom with 100 % target promiscuity. Our computational framework integrates machine learning, density functional theory, and molecular docking to bridge the gap between theoretical performance and experimental feasibility. These findings establish a systematic pathway for discovering synthetically accessible organic semiconductors that can address both energy conversion and molecular recognition.
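
The composite metric can be illustrated with a minimal sketch. It assumes each molecule already carries a Scharber-model PCE (in percent) and an SAScore. The `Candidate` class, the function names, and the example values are hypothetical and not taken from the paper. The thresholds (SAScore < 4.0 and a composite score > 0) are the ones quoted in the Figure 1 caption. The ligand efficiency helper uses the common definition (negative binding free energy divided by heavy-atom count), which the units in the abstract suggest but do not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one screened molecule."""
    mol_id: int
    pce: float      # predicted PCE in percent
    sascore: float  # synthetic accessibility score (lower = easier to make)


def pce_sascore(c: Candidate) -> float:
    """Composite metric from the abstract: PCE minus SAScore."""
    return c.pce - c.sascore


def select_candidates(cands, sa_max=4.0):
    """Keep molecules with SAScore < sa_max and a positive composite score,
    ranked from best to worst composite score."""
    kept = [c for c in cands if c.sascore < sa_max and pce_sascore(c) > 0.0]
    return sorted(kept, key=pce_sascore, reverse=True)


def ligand_efficiency(binding_energy_kcal_mol: float, heavy_atoms: int) -> float:
    """Common definition (assumed here): -dG / number of heavy atoms."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return -binding_energy_kcal_mol / heavy_atoms


# Made-up example values, for illustration only.
pool = [
    Candidate(1, pce=12.0, sascore=2.5),  # kept, composite score 9.5
    Candidate(2, pce=30.0, sascore=4.6),  # rejected: SAScore not below 4.0
    Candidate(3, pce=1.5, sascore=3.0),   # rejected: composite score <= 0
    Candidate(4, pce=20.0, sascore=3.2),  # kept, composite score 16.8
]
ranked = select_candidates(pool)
print([c.mol_id for c in ranked])  # -> [4, 1]
```

Because PCE is a percentage and SAScore is a unitless score, the subtraction acts as a simple penalty rather than a physically meaningful quantity. A weighted difference would be an equally plausible design if the two scales needed rebalancing.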

Paper Structure (84 sections, 6 equations, 39 figures, 12 tables)

Figures (39)

  • Figure 1: High-throughput computational screening workflow applied to 17 458 molecules from the PubChemQC database. The systematic approach progresses through four sequential filtering stages: (1) frontier orbital alignment with PCBM/PCDTBT reference materials, retaining 17 334 donor candidates and 16 acceptor molecules; (2) Scharber model PCE estimation under standard AM1.5G illumination conditions; (3) synthetic accessibility filtering using SAScore criteria ($< 4.0$) combined with composite PCE$_{\text{SAScore}} > 0$ thresholds, yielding 7 synthetically viable candidates; (4) comprehensive multifunctional assessment integrating photophysical characterization, protein-binding evaluation, and machine learning-based property profiling.
  • Figure 2: Probability density distributions of frontier orbital energies. HOMO energies (blue) center around -6 eV, LUMO energies (orange) center near -1 eV, and the HOMO-LUMO gap (green) centers at approximately 5 eV. Vertical dashed lines show the peak density of each distribution.
  • Figure 3: Photovoltaic performance relationships between short-circuit current density and open-circuit voltage. Left panel shows PCBM systems with moderate PCE values reaching 8 %, characterized by current densities up to 400 A m$^{-2}$ but generally lower voltages. Right panel displays PCDTBT systems with highly selective performance: most molecules show negligible efficiency, but exceptional outliers achieve PCE values exceeding 30 % through high open-circuit voltages ($> 1.0$ V) combined with moderate current densities.
  • Figure S1: Correlations between open-circuit voltage ($V_\mathrm{OC}$) and frontier orbital energies. (Left) $V_\mathrm{OC}$ vs. HOMO for PCBM pairings, following the linear relationship $V_\mathrm{OC} = \mathrm{HOMO} - 4.0$. (Right) $V_\mathrm{OC}$ vs. LUMO for PCDTBT pairings, following $V_\mathrm{OC} = 5.2 - \mathrm{LUMO}$. The clear linear trends confirm that frontier orbital alignment is the primary determinant of open-circuit voltage in these systems, with deeper HOMO and higher LUMO levels yielding higher $V_\mathrm{OC}$.
  • Figure S2: K-means clustering analysis of $V_\mathrm{OC}$ values for molecules paired with the PCBM acceptor, with $k = 5$ selected by validation scores (Panel A). Panel D shows the distribution of $V_\mathrm{OC}$ across the five identified clusters (0-4), revealing distinct voltage tiers. The t-SNE visualization (Panel C) confirms well-separated clusters corresponding to these voltage performance groups.
  • ...and 34 more figures
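
The linear $V_\mathrm{OC}$ relations quoted in the caption on frontier orbital energies can be written out as a short sketch. It assumes the orbital energies enter as magnitudes in eV, since a signed HOMO near -6 eV would otherwise give a negative voltage. This sign convention, the function names, and the example energies are assumptions for illustration and are not taken from the paper.

```python
def voc_with_pcbm(homo_ev: float) -> float:
    """V_OC for a donor paired with PCBM, using the caption relation
    V_OC = HOMO - 4.0 with HOMO taken as a magnitude (assumption)."""
    return abs(homo_ev) - 4.0


def voc_with_pcdtbt(lumo_ev: float) -> float:
    """V_OC for an acceptor paired with PCDTBT, using the caption relation
    V_OC = 5.2 - LUMO with LUMO taken as a magnitude (assumption)."""
    return 5.2 - abs(lumo_ev)


# Hypothetical orbital energies in eV, for illustration only.
for homo in (-5.0, -5.5, -6.0):
    print(f"HOMO = {homo:.1f} eV -> V_OC(PCBM)   = {voc_with_pcbm(homo):.2f} V")
for lumo in (-4.0, -3.5, -3.0):
    print(f"LUMO = {lumo:.1f} eV -> V_OC(PCDTBT) = {voc_with_pcdtbt(lumo):.2f} V")
```

Under this convention a deeper HOMO raises the voltage in PCBM pairings and a higher (less negative) LUMO raises it in PCDTBT pairings, which matches the trend stated in the caption.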