Automation of Quantum Dot Measurement Analysis via Explainable Machine Learning

Daniel Schug; Tyler J. Kovach; M. A. Wolfe; Jared Benson; Sanghyeok Park; J. P. Dodson; J. Corrigan; M. A. Eriksson; Justyna P. Zwolak

TL;DR

This work investigates explainable machine learning for automated quantum dot tuning by analyzing triangle plots, a key image-based diagnostic. It compares two image-vectorization strategies (a Gabor filterbank and a synthetic triangle model) within explainable boosting machines to yield accurate yet interpretable predictions of good versus bad triangle plots. A hybrid approach that combines a single informative Gabor feature with synthetic features achieves near-top accuracy while greatly enhancing interpretability, linking model cues directly to physical gate behaviors. The findings support integrating interpretable ML into real-time QD autotuning and generalize to related Si/Ge and other gate-defined quantum dot platforms. The study advances transparent, data-driven guidance for operating points in quantum devices.

Abstract

The rapid development of quantum dot (QD) devices for quantum computing has necessitated more efficient and automated methods for device characterization and tuning. This work demonstrates the feasibility and advantages of applying explainable machine learning techniques to the analysis of quantum dot measurements, paving the way for further advances in automated and transparent QD device tuning. Many of the measurements acquired during the tuning process come in the form of images that need to be properly analyzed to guide the subsequent tuning steps. By design, features present in such images capture certain behaviors or states of the measured QD devices. When considered carefully, such features can aid the control and calibration of QD devices. An important example of such images is the so-called $\textit{triangle plot}$, which visually represents current flow and reveals characteristics important for QD device calibration. While image-based classification tools, such as convolutional neural networks (CNNs), can be used to verify whether a given measurement is $\textit{good}$ and thus warrants the initiation of the next phase of tuning, they do not provide any insights into how the device should be adjusted in the case of $\textit{bad}$ images. This is because CNNs sacrifice prediction and model intelligibility for high accuracy. To ameliorate this trade-off, a recent study introduced an image vectorization approach that relies on the Gabor wavelet transform (Schug $\textit{et al.}$ 2024 $\textit{Proc. XAI4Sci: Explainable Machine Learning for Sciences Workshop (AAAI 2024) (Vancouver, Canada)}$ pp 1-6). Here we propose an alternative vectorization method that involves mathematical modeling of synthetic triangles to mimic the experimental data. Using explainable boosting machines, we show that this new method offers superior explainability of model prediction without sacrificing accuracy.
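To make the idea of Gabor-based image vectorization more concrete, the following is a minimal sketch in Python, not the implementation used in the cited study. It builds a small bank of oriented Gabor kernels and summarizes each filter response by its mean absolute value, giving one named scalar feature per filter. The kernel sizes, wavelengths, orientations, the mean-absolute aggregation, and the feature names (`G_<wavelength>_<angle>`) are all assumptions made for illustration and do not reproduce the indexing of the paper's $\mathcal{G}$ features.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta_deg, sigma):
    """Real 2D Gabor filter: a cosine carrier under an isotropic Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    x_rot = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the carrier
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return kernel - kernel.mean()  # zero mean: flat regions give no response

def filter_response(image, kernel):
    """Circular convolution via the FFT; the output has the shape of the image."""
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def vectorize(image, wavelengths=(4, 8, 16), thetas=(0, 45, 90, 135)):
    """Map an image to a dict of scalar features, one per (wavelength, angle)."""
    features = {}
    for lam in wavelengths:
        for th in thetas:
            k = gabor_kernel(size=2 * lam + 1, wavelength=lam,
                             theta_deg=th, sigma=lam / 2.0)
            response = filter_response(image, k)
            features[f"G_{lam}_{th}"] = float(np.mean(np.abs(response)))
    return features

# Illustrative input: vertical stripes with a period of 8 pixels.
stripes = np.cos(2.0 * np.pi * np.arange(64) / 8.0)[None, :] * np.ones((64, 1))
stripe_features = vectorize(stripes)
```

On this toy input, the filter whose carrier runs across the stripes (`G_8_0`) responds far more strongly than the one rotated by 90 degrees (`G_8_90`). Orientation-selective features of this kind are what allow a tabular model to refer to specific edges of a triangle plot.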

Paper Structure (11 sections, 8 equations, 5 figures, 1 table)

Introduction
Background and methods
Quantum dot tuning problem and triangle plots
Triangle plots dataset
Gabor filterbank approach
Synthetic triangle plots
Synthetic data modeling approach
Results
Prediction interpretability
Conclusion and outlook
Misclassified data: A post-hoc analysis

Figures (5)

Figure 1: (a) False-colored SEM micrograph of a typical Si/Si$_{x}$Ge$_{1-x}$ heterostructure device. The red highlighted gates are the screening gates that are swept for the triangle plot. This quad-QD device has three distinct current channels between the screening gates marked with white arrows. One of the channels (in the lower half of the device) contains four QDs in series while the other two channels (in the upper half of the device) are single QD channels for charge-based single-electron-transistor readout. (b) An example of a good triangle plot with high visibility in the triangle region above the background and (c) a corresponding synthetic triangle plot. (d) An example of a bad triangle plot with little to no current in the triangle region and (e) a corresponding synthetic triangle plot.
Figure 2: (a) A sample experimentally acquired scan, the same as in figure 1(b). (b) The Gabor filters for the scan shown in (a), weighted by their importance as indicated by the EBM model. (c) Synthetic fit to the scan shown in (a). Here, $\mathcal{H}_B$, $\mathcal{V}_B$, and $\mathcal{D}_B$ are the horizontal, vertical, and diagonal optimal boundaries, respectively, and $\theta$ indicates the orientation of the diagonal boundary.
Figure 3: The feature importance plot for (a) the Gabor filterbank vectorization approach, (b) the synthetic data modeling vectorization approach, and (c) the hybrid vectorization approach. The mean absolute score refers to the extent to which a feature contributes to all predictions. Feature curve for (d) the oriented Gabor feature $\mathcal{G}_{16,16}^{45}$, (e) the fit fitness feature $\mathcal{F}$ of the synthetic triangle fit, and (f) the diagonal boundary component $\mathcal{D}_{B}$ of the synthetic triangle fit for the synthetic data modeling approach shown in orange and for the hybrid approach shown in blue. The $x$-axis indicates the value attained by the feature [normalized to $(0,1)$ in (d) and (e)], and the score indicates the log odds towards good (positive) and bad (negative) classes for each respective feature. The histograms in (g-i) depict the relative class distribution as a function of the normalized feature value, with cyan representing good triangles and pink representing bad triangles.
Figure 4: (a) A sample experimentally acquired bad scan. (e, i) Two sample experimentally acquired good scans. (b, f, j) The fitted synthetic triangle plots for scans (a), (e), and (i), respectively. (c, g, k) Gabor-filter-based local interpretation of the scans shown in (a), (e), and (i), respectively. (d, h, l) The feature importance for the hybrid vectorization method for scans (a), (e), and (i), respectively. The intercept coefficient $\alpha=4$ is removed from plots (d), (h), and (l) for clarity.
Figure A1: Sample ambiguous triangle plots. (a)-(c) Good-class triangle plots classified by the model as bad. (d)-(e) Bad-class triangle plots classified by the model as good.
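The captions of figures 3 and 4 describe per-feature scores as log odds towards the good (positive) or bad (negative) class, with an intercept $\alpha=4$. As a minimal sketch of how such additive scores combine into a single prediction, the Python snippet below sums the intercept and the per-feature contributions and applies the logistic function, which is the standard link for an additive model in binary classification. The feature names and score values are hypothetical and chosen only for illustration; they are not taken from the paper, and a fitted EBM may also include pairwise interaction terms that would be added to the same sum.

```python
import math

def ebm_probability(intercept, contributions):
    """Add per-feature log-odds scores to the intercept, then apply the logistic function."""
    logit = intercept + sum(contributions.values())
    probability_good = 1.0 / (1.0 + math.exp(-logit))
    return logit, probability_good

# Hypothetical per-feature scores for one scan (illustrative numbers only).
scores = {"G_45": -1.5, "fitness": -2.0, "D_B": -1.0}

# The intercept value 4 follows the figure 4 caption; everything else is assumed.
logit, p_good = ebm_probability(intercept=4.0, contributions=scores)
label = "good" if p_good >= 0.5 else "bad"
```

In this made-up example the negative contributions outweigh the intercept, so the total log odds are negative and the scan is labeled bad. Because the sum is taken term by term, each feature's share of the decision can be read off directly, which is what the local feature-importance panels visualize.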
