Excitonic Landscapes in Monolayer Lateral Heterostructures Revealed by Unsupervised Machine Learning

Maninder Kaur, Nicolas T. Sandino, Jason P. Terry, Mahdi Ghafariasl, Yohannes Abate

TL;DR

Hyperspectral PL datasets from graded TMDC alloys and MoS2–WS2 lateral heterostructures are information-rich but hard to interpret. A scalable unsupervised framework combining PCA, t-SNE, DBSCAN, and multi-peak Gaussian fitting maps spectra to spatial domains, revealing excitonic landscapes tied to composition, strain, and defects. The method identifies three emission species (band-edge excitons, alloy-disorder emission, and defect-bound states) and, in heterostructures, six distinct domains corresponding to MoS2 cores, WS2 regions, and interfaces hosting interfacial states. This automated, label-free analysis provides nanoscale insight for the design of 2D optoelectronic devices and is readily applicable to other hyperspectral materials datasets.

Abstract

Two-dimensional (2D) in-plane heterostructures, including compositionally graded alloys and lateral heterostructures with defined interfaces, display rich optoelectronic properties and offer versatile platforms to explore one-dimensional interface physics and many-body interaction effects. Graded \(\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2\) alloys show smooth spatial variations in composition and strain that continuously tune excitonic emission, while \(\mathrm{MoS}_2\)--\(\mathrm{WS}_2\) lateral heterostructures contain atomically sharp interfaces supporting one-dimensional excitonic phenomena. These single-layer systems combine tunable optical and electronic properties with potential for stable, high-performance optoelectronic devices. Hyperspectral and nano-resolved photoluminescence (PL) imaging enable spatial mapping of optical features along with local variations in composition, strain, and defects, but manual interpretation of such large datasets is slow and subjective. Here, we introduce a fast and scalable unsupervised machine-learning (ML) framework to extract quantitative and interpretable information from hyperspectral PL datasets of graded \(\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2\) alloys and \(\mathrm{MoS}_2\)--\(\mathrm{WS}_2\) heterostructures. Combining principal-component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and density-based spatial clustering (DBSCAN), we uncover spectrally distinct domains associated with composition, strain, and defect variations. Decomposition of representative spectra reveals multiple emission species, including band-edge excitons and defect-related transitions, demonstrating that ML-driven analysis provides a robust and automated route to interpret the rich optical properties of 2D materials.
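The PCA, t-SNE, and DBSCAN chain named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the function names, the synthetic two-domain cube, and every hyperparameter (number of components, perplexity, `eps`, `min_samples`) are assumptions chosen for illustration, as is the choice to run DBSCAN on the t-SNE embedding rather than on the PCA scores.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def make_synthetic_cube(ny=12, nx=12, n_wl=64, seed=0):
    """Hypothetical test data: two spatial domains with different PL peaks."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(580.0, 720.0, n_wl)  # wavelength axis in nm (assumed)
    cube = np.empty((ny, nx, n_wl))
    for ix in range(nx):
        center = 620.0 if ix < nx // 2 else 660.0  # left vs. right domain
        peak = np.exp(-0.5 * ((wl - center) / 10.0) ** 2)
        cube[:, ix, :] = peak + 0.02 * rng.standard_normal((ny, n_wl))
    return cube


def cluster_hyperspectral_cube(cube, n_components=3, perplexity=30.0,
                               eps=2.0, min_samples=10, seed=0):
    """Cluster the pixel spectra of a (ny, nx, n_wavelengths) PL cube."""
    ny, nx, n_wl = cube.shape
    spectra = cube.reshape(ny * nx, n_wl).astype(float)  # one row per pixel

    # PCA: compress each spectrum to a few scores along the main
    # directions of spectral variation.
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(spectra)

    # t-SNE: nonlinear 2D embedding of the PCA scores for visualization.
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    embedding = tsne.fit_transform(scores)

    # DBSCAN: density-based clustering; label -1 marks noise pixels.
    # Running it on the embedding is an assumption of this sketch.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embedding)

    # Reshape labels to the image grid so clusters can be mapped in real space.
    return labels.reshape(ny, nx), embedding, pca.explained_variance_ratio_
```

Calling `cluster_hyperspectral_cube(make_synthetic_cube(), perplexity=20.0)` returns a label map with the spatial shape of the cube, the 2D embedding, and the explained-variance ratios. The number of clusters found depends strongly on `eps` and `min_samples`, so these would need tuning for any real dataset.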

Paper Structure

This paper contains 4 sections, 7 figures.

Figures (7)

  • Figure 1: Workflow for unsupervised ML analysis of hyperspectral PL data from alloy and heterostructure monolayers. (a) For alloy samples, the hyperspectral cube is reduced using PCA, followed by t-SNE visualization and DBSCAN clustering; selected pixels are then subjected to Gaussian fitting. (b) For heterostructure samples, the same dimensionality-reduction and clustering workflow is applied.
  • Figure 2: Hyperspectral analysis of a $\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2$ monolayer. (a) Total PL intensity map obtained directly from experimental data (without machine learning). (b) PCA explained variance plot showing the fraction of total variance captured by each component. (c) Eigen-spectra ($\mathit{PC1}$–$\mathit{PC3}$) corresponding to the main modes of spectral variation. (d–f) Spatial projections of $\mathit{PC1}$–$\mathit{PC3}$ across the flake.
  • Figure 3: Unsupervised clustering and visualization of hyperspectral PL spectra from a graded $\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2$ monolayer. (a) t-SNE projection of the PCA-reduced spectra colored by DBSCAN cluster labels. (b) Real-space mapping of the same clusters across the flake. (c) Representative Gaussian fits to a selected pixel spectrum (from the 51 marked in panel b), showing one-, two-, and three-component fitting models for comparison.
  • Figure 4: Spatial mapping of dominant emission parameters from three-component Gaussian fits in a graded $\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2$ monolayer. (a) Map of $A_{2}$ from the second Gaussian component. (b) Map of $\lambda_{3}$ from the third Gaussian component. Both maps are overlaid on the spectral cluster map from Fig. 3(b); the 51 marked pixels (black crosses) are color-coded according to the fitted amplitude and wavelength values. The full model yields six fitted parameters ($A_{1}$, $A_{2}$, $A_{3}$, and $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$); only the two dominant quantities ($A_{2}$ and $\lambda_{3}$) are shown for brevity.
  • Figure 5: Hyperspectral analysis of the $\mathrm{MoS_{2}}$–$\mathrm{WS_{2}}$ heterostructure PL data. (a) Total PL intensity map obtained directly from experimental data (without machine learning). (b) PCA explained variance plot showing the fraction of total variance captured by each component. (c) Eigen-spectra ($\mathit{PC1}$–$\mathit{PC3}$) corresponding to the main modes of spectral variation. (d–f) Spatial projections of $\mathit{PC1}$–$\mathit{PC3}$ across the flake.
  • ...and 2 more figures
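
The one-, two-, and three-component Gaussian fits compared in Figure 3(c) can be sketched with SciPy's `curve_fit`. This is a hedged illustration, not the authors' fitting code: the function names, initial guesses, and synthetic spectrum are hypothetical, the widths are fitted as free parameters (the captions report only amplitudes $A_i$ and center wavelengths $\lambda_i$), and the Bayesian information criterion (BIC) is an assumed way to compare models that the source does not specify.

```python
import numpy as np
from scipy.optimize import curve_fit


def multi_gaussian(wl, *params):
    """Sum of Gaussians; params = (A1, lam1, sigma1, A2, lam2, sigma2, ...)."""
    wl = np.asarray(wl, dtype=float)
    total = np.zeros_like(wl)
    for amp, center, sigma in zip(params[0::3], params[1::3], params[2::3]):
        total += amp * np.exp(-0.5 * ((wl - center) / sigma) ** 2)
    return total


def fit_n_gaussians(wl, spectrum, centers_guess, sigma_guess=10.0):
    """Fit len(centers_guess) Gaussians; return parameters, RSS, and BIC."""
    # Initial guess: every component starts at the spectrum maximum
    # with a common width (both are assumptions of this sketch).
    p0 = []
    for center in centers_guess:
        p0 += [float(spectrum.max()), float(center), float(sigma_guess)]
    popt, _ = curve_fit(multi_gaussian, wl, spectrum, p0=p0, maxfev=20000)

    residual = spectrum - multi_gaussian(wl, *popt)
    rss = float(np.sum(residual ** 2))
    n, k = wl.size, len(popt)
    # BIC penalizes extra parameters; lower values indicate a better model.
    bic = float(n * np.log(rss / n) + k * np.log(n))
    return popt, rss, bic


def make_synthetic_spectrum(seed=0):
    """Hypothetical three-peak spectrum with weak noise (wavelengths in nm)."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(580.0, 740.0, 300)
    true_params = (1.0, 620.0, 8.0, 0.6, 650.0, 12.0, 0.3, 690.0, 15.0)
    spectrum = multi_gaussian(wl, *true_params)
    spectrum = spectrum + 0.01 * rng.standard_normal(wl.size)
    return wl, spectrum
```

For example, after `wl, s = make_synthetic_spectrum()`, comparing `fit_n_gaussians(wl, s, [620.0])` with `fit_n_gaussians(wl, s, [620.0, 650.0, 690.0])` shows how adding components changes the residual and the BIC. Applied to real pixel spectra, the fit is sensitive to the initial center guesses, so bounds or guesses derived from the cluster-averaged spectra may be needed.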