Model-free Distortion Canceling and Control of Quantum Devices
Ahmed F. Fouad, Akram Youssry, Ahmed El-Rafei, Sherif Hammad
TL;DR
This work addresses the control of closed quantum systems when the control signals are subject to unknown classical distortions and a detailed system model is unavailable. It introduces a model-free deep reinforcement learning (DRL) approach that uses the REINFORCE algorithm to steer the system's state probability distribution toward chosen target distributions. A novel multi-NN controller architecture scales to multiple targets and accommodates both MDP and POMDP settings with continuous or discrete actions. In numerical simulations of a voltage-controlled photonic waveguide array, the method achieves fidelity exceeding 99% within 10 ms and effectively cancels distortions, outperforming conventional constant-step control. The framework offers robust, closed-loop quantum control without a priori system identification, with potential applicability to a wide range of quantum devices and to open-system scenarios.
Abstract
Quantum devices need precise control to achieve their full capability. In this work, we address the problem of controlling closed quantum systems, tackling two main issues. First, in practice the control signals are usually subject to unknown classical distortions that can arise from the device fabrication, the material properties, and/or the instruments generating those signals. Second, in most cases modeling the system is very difficult or not even viable, owing to uncertainties in the relations between some variables and the inaccessibility of some measurements inside the system. We introduce a general model-free control approach based on deep reinforcement learning (DRL) that can work for any closed quantum system. We train a deep neural network (NN) using the REINFORCE policy gradient algorithm to control the state probability distribution of a closed quantum system as it evolves and to drive it to different target distributions. We present a novel controller architecture that comprises multiple NNs. This architecture accommodates as many different target state distributions as desired without increasing the complexity of the NN or its training process. The DRL algorithm we use works whether the control problem can be modeled as a Markov decision process (MDP) or as a partially observable MDP (POMDP). Our method is valid whether the control signals are discrete- or continuous-valued. We verified our method through numerical simulations based on a photonic waveguide array chip. We trained a controller to generate sequences of different target output distributions of the chip with fidelity higher than 99%, and the controller showed superior performance in canceling the classical signal distortions.
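To make the idea of model-free control through an unknown distortion concrete, the following is a minimal sketch and not the setup used in this work. It assumes a hypothetical two-waveguide coupler, a hypothetical gain-and-offset distortion of the commanded voltage, single-step episodes, and a tabular softmax policy over discrete voltage levels in place of a deep NN. The reward is assumed to be the classical fidelity (squared Bhattacharyya coefficient) between the measured and target distributions. The learner only ever sees rewards, never the Hamiltonian or the distortion.

```python
import numpy as np

def fidelity(p, q):
    """Classical fidelity (squared Bhattacharyya coefficient); an assumed reward."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)) ** 2)

def output_distribution(v_cmd, coupling=1.0, t=1.0):
    """Hypothetical toy device: two coupled waveguides, light injected into the first.

    The commanded voltage is distorted before it reaches the device; the
    controller has no access to this function's internals.
    """
    v = 0.9 * v_cmd + 0.3                            # assumed unknown distortion
    H = np.array([[v, coupling], [coupling, -v]])    # real symmetric Hamiltonian
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T  # closed-system evolution
    psi = U @ np.array([1.0, 0.0])
    return np.abs(psi) ** 2

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train(target, levels, n_updates=300, batch=32, lr=2.0, seed=0):
    """REINFORCE with a batch-mean baseline on a softmax policy over voltage levels."""
    rng = np.random.default_rng(seed)
    k = len(levels)
    logits = np.zeros(k)
    for _ in range(n_updates):
        pi = softmax(logits)
        actions = rng.choice(k, size=batch, p=pi)
        rewards = np.array(
            [fidelity(target, output_distribution(levels[a])) for a in actions]
        )
        advantage = rewards - rewards.mean()
        # Gradient of log pi(a) with respect to the logits is onehot(a) - pi.
        grad_log_pi = np.eye(k)[actions] - pi
        logits += lr * (advantage[:, None] * grad_log_pi).mean(axis=0)
    return softmax(logits)

target = np.array([0.5, 0.5])
levels = np.linspace(-2.0, 2.0, 21)   # commanded voltages (dimensionless units)
policy = train(target, levels)
best = levels[np.argmax(policy)]
print("preferred command:", best)
print("fidelity:", fidelity(target, output_distribution(best)))
```

Because the policy is updated from sampled rewards alone, the preferred command absorbs the distortion implicitly: the learner settles on whichever commanded voltage produces the desired output, without ever estimating the gain or offset.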
