Excitonic Landscapes in Monolayer Lateral Heterostructures Revealed by Unsupervised Machine Learning
Maninder Kaur, Nicolas T. Sandino, Jason P. Terry, Mahdi Ghafariasl, Yohannes Abate
TL;DR
Hyperspectral PL datasets from graded TMDC alloys and MoS2–WS2 lateral heterostructures are rich but difficult to interpret. A scalable unsupervised framework combining PCA, t-SNE, DBSCAN, and multi-peak Gaussian fitting maps spectra to spatial domains, revealing excitonic landscapes tied to composition, strain, and defects. The method identifies three emission species (band-edge excitons, alloy-disorder-related states, and defect-bound states) and, in heterostructures, six distinct domains corresponding to MoS2 cores, WS2 regions, and interfaces hosting interfacial states. This automated, label-free analysis yields nanoscale information relevant to the design of 2D optoelectronic devices and is readily applicable to other hyperspectral materials datasets.
Abstract
Two-dimensional (2D) in-plane heterostructures, including compositionally graded alloys and lateral heterostructures with defined interfaces, display rich optoelectronic properties and offer versatile platforms to explore one-dimensional interface physics and many-body interaction effects. Graded \(\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2\) alloys show smooth spatial variations in composition and strain that continuously tune excitonic emission, while \(\mathrm{MoS}_2\)--\(\mathrm{WS}_2\) lateral heterostructures contain atomically sharp interfaces supporting one-dimensional excitonic phenomena. These single-layer systems combine tunable optical and electronic properties with potential for stable, high-performance optoelectronic devices. Hyperspectral and nano-resolved photoluminescence (PL) imaging enables spatial mapping of optical features along with local variations in composition, strain, and defects, but manual interpretation of such large datasets is slow and subjective. Here, we introduce a fast and scalable unsupervised machine-learning (ML) framework to extract quantitative and interpretable information from hyperspectral PL datasets of graded \(\mathrm{Mo}_x\mathrm{W}_{1-x}\mathrm{S}_2\) alloys and \(\mathrm{MoS}_2\)--\(\mathrm{WS}_2\) heterostructures. Combining principal-component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and density-based spatial clustering of applications with noise (DBSCAN), we uncover spectrally distinct domains associated with composition, strain, and defect variations. Decomposition of representative spectra reveals multiple emission species, including band-edge excitons and defect-related transitions, demonstrating that ML-driven analysis provides a robust and automated route to interpret the rich optical properties of 2D materials.
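To make the workflow concrete, the following is a minimal Python sketch of a PCA, t-SNE, and DBSCAN pipeline followed by a two-Gaussian decomposition of cluster-averaged spectra. It is not the authors' implementation. The synthetic data cube, the peak energies and widths, the function names (`make_synthetic_cube`, `cluster_spectra`, `fit_two_peaks`), the standardization of the embedding, and all parameter values (number of principal components, perplexity, `eps`, `min_samples`) are assumptions chosen for illustration, using scikit-learn and SciPy.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def gaussian_sum(energy, *params):
    """Sum of Gaussians; params is a flat sequence of (amplitude, center, width) triples."""
    total = np.zeros_like(energy, dtype=float)
    for amp, center, width in zip(params[0::3], params[1::3], params[2::3]):
        total += amp * np.exp(-0.5 * ((energy - center) / width) ** 2)
    return total


def make_synthetic_cube(ny=12, nx=16, n_channels=120, seed=0):
    """Toy PL cube (illustrative values only): every pixel has a main peak plus a
    weaker low-energy shoulder; the main peak is at 1.85 eV in the left half of
    the map and at 1.98 eV in the right half."""
    rng = np.random.default_rng(seed)
    energy = np.linspace(1.70, 2.10, n_channels)  # photon energy axis in eV
    cube = np.empty((ny, nx, n_channels))
    for ix in range(nx):
        main = 1.85 if ix < nx // 2 else 1.98
        cube[:, ix, :] = gaussian_sum(energy, 1.0, main, 0.025, 0.3, main - 0.08, 0.035)
    cube += 0.01 * rng.standard_normal(cube.shape)  # additive detector-like noise
    return energy, cube


def cluster_spectra(cube, n_pcs=5, perplexity=30.0, eps=0.3, min_samples=5, seed=0):
    """Reduce spectra with PCA, embed with t-SNE, and cluster the embedding with DBSCAN."""
    ny, nx, n_channels = cube.shape
    spectra = cube.reshape(ny * nx, n_channels)
    # Normalize each spectrum to unit peak so clustering follows shape, not brightness.
    peak = np.abs(spectra).max(axis=1, keepdims=True)
    spectra = spectra / np.where(peak > 0, peak, 1.0)
    scores = PCA(n_components=n_pcs, random_state=seed).fit_transform(spectra)
    embedding = TSNE(n_components=2, perplexity=perplexity, init="pca",
                     random_state=seed).fit_transform(scores)
    # Assumption: rescale the embedding so eps is expressed in unit-variance units.
    embedding = StandardScaler().fit_transform(embedding)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embedding)
    return labels.reshape(ny, nx), embedding  # label -1 marks DBSCAN noise points


def fit_two_peaks(energy, spectrum, shoulder_offset=0.08):
    """Fit two Gaussians; the second is initialized shoulder_offset eV below the maximum."""
    e_max = energy[np.argmax(spectrum)]
    p0 = [spectrum.max(), e_max, 0.03, 0.3 * spectrum.max(), e_max - shoulder_offset, 0.03]
    popt, _ = curve_fit(gaussian_sum, energy, spectrum, p0=p0)
    return popt.reshape(2, 3)  # rows are (amplitude, center, width)


if __name__ == "__main__":
    energy, cube = make_synthetic_cube()
    label_map, _ = cluster_spectra(cube)
    for label in np.unique(label_map[label_map >= 0]):
        mean_spectrum = cube[label_map == label].mean(axis=0)
        peaks = fit_two_peaks(energy, mean_spectrum)
        print(f"cluster {label}: peak centers (eV) = {np.round(peaks[:, 1], 3)}")
```

In this sketch the label map returned by `cluster_spectra` has the same shape as the scanned area, so it can be displayed directly as a domain image. On real data, the number of Gaussian components, their initial guesses, and the DBSCAN parameters would need to be tuned to the dataset. Because t-SNE does not preserve global distances, cluster assignments should be checked against the underlying spectra.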
