Learning unified control of internal spin squeezing in atomic qudits for magnetometry

C. Z. Cao, J. Z. Han, M. Xiong, M. Deng, L. Wang, X. Lv, M. Xue

Abstract

Generating and preserving metrologically useful quantum states is a central challenge in quantum-enhanced atomic magnetometry. In multilevel atoms operated in the low-field regime, the nonlinear Zeeman (NLZ) effect is both a resource and a limitation. It nonlinearly redistributes internal spin fluctuations to generate spin-squeezed states within a single atomic qudit, yet under fixed readout it distorts the measurement-relevant quadrature and limits the accessible metrological gain. This challenge is compounded by the time dependence of both the squeezing axis and the effective nonlinear action. Here we show that physics-informed reinforcement learning can transform NLZ dynamics from a source of readout degradation into a sustained metrological resource. Using only experimentally accessible low-order spin moments, a trained agent identifies, in the $f=21/2$ manifold of $^{161}\mathrm{Dy}$, a unified control policy that rapidly prepares strongly squeezed internal states and stabilizes more than $4\,\mathrm{dB}$ of fixed-axis spin squeezing under always-on NLZ evolution. Including state-preparation overhead, the learned protocol yields a single-atom magnetic sensitivity of $13.9\,\mathrm{pT}/\sqrt{\mathrm{Hz}}$, corresponding to an advantage of approximately $3\,\mathrm{dB}$ beyond the standard quantum limit. Our results establish learning-based control as a practical route for converting unavoidable intrinsic nonlinear dynamics in multilevel quantum sensors into operational metrological advantage.

Paper Structure

This paper contains 12 sections, 31 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Physics-informed reinforcement-learning framework for control of internal spin dynamics in an atomic qudit. (a) We consider a single ${}^{161}\mathrm{Dy}$ atom with nuclear spin $i=5/2$ and electronic angular momentum $j=8$, focusing on the hyperfine manifold with $f=21/2$. In a magnetic field, the multilevel Zeeman structure acquires a quadratic component, giving rise to an always-on nonlinear Zeeman (NLZ) interaction $\propto \hat{f}_z^2$. (b) Control problem. Starting from an initial coherent spin state, the qudit evolves under repeated cycles of free NLZ evolution and agent-selected transverse rotations $R_\mu(\beta)$ with $\mu\in\{x,y\}$. At each step, the agent receives an observation $o_k\in\mathcal{O}$ consisting of experimentally accessible low-order spin moments, chooses an action $a_k\in\mathcal{A}$ from a discrete set of realizable rotations, and is trained to maximize the cumulative reward $\mathcal{G}$ defined from physically motivated squeezing metrics. This defines a unified control problem under always-on nonlinear dynamics.
  • Figure 2: Learned pulse protocol from PIRL. (a) Control pulse sequence selected by the PIRL agent. Colored bars denote discrete transverse rotations applied at each control step. (b) Time evolution of the Wineland squeezing parameter $\xi^2(t)$ (red solid) and the fixed-axis squeezing parameter $\xi_y^2(t)$ (green solid) as functions of the dimensionless time $\chi t$. Gray solid and dotted curves denote the QZE (OAT) evolution and the effective TACT benchmark, respectively. The vertical gray line near $\chi t\simeq 0.15$ marks the reward-defined switching time $t_r$. (c) Spin Wigner distributions at representative times $t_1$--$t_5$ indicated in panel (b), covering the squeezing-generation and stabilization stages.
  • Figure 3: Fidelity-based evidence for the toggling-frame stabilization mechanism. Fidelity evolution for several reference states. Solid and dashed curves denote evolution under $\hat{f}_y^2$ and $\hat{f}_z^2$, respectively. The reference states are the coherent spin state $\lvert f,m_x{=}f\rangle$, the PIRL state at $t_3$, and the OAT and TACT states at their respective minima of $\xi^2$. For the OAT and TACT states, an additional $R_x$ rotation is applied so that $\xi_y^2=\xi^2$.
  • Figure 4: Metrological analysis of the PIRL strategy. (a) Phase sensitivity relative to the SQL, expressed in dB, as a function of the dimensionless interrogation time $\chi t$. Encoding under the unified RL control starting from the stabilized state at time $t_4$ in Fig. 2(c) (red solid), QZE evolution from the TACT-optimal squeezed state (blue solid), and the same initial state under the $R_x$-pulse stabilization protocol of Ref. yang2025quantum (green solid). The black dashed line denotes the SQL. (b) Readout distributions $P_m(\phi)$ for representative probe states and control protocols. Panels 1 and 2 show the distributions before ($\phi=0$, top) and after ($\phi=0.2$, bottom) phase encoding for the initial states used in the blue and red curves of panel (a), respectively. Panels 3 and 4 show the corresponding distributions at $\chi t\approx 0.09$ for the green and red curves in panel (a), respectively. (c) Single-atom magnetic-field sensitivity relative to the SQL, expressed in dB, as a function of the total protocol time $T_{\mathrm{tot}}$. The shaded region indicates the preparation time $T_{\mathrm p}$, and the inset shows the absolute sensitivity $\delta B(T_{\mathrm{tot}})$ in $\mathrm{nT}/\sqrt{\mathrm{Hz}}$.
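
The control cycle described in the Figure 1 caption (free NLZ evolution $\propto \hat{f}_z^2$ followed by a transverse rotation $R_\mu(\beta)$) and the two squeezing parameters plotted in Figure 2 can be illustrated with a minimal numerical sketch. It is not the authors' implementation: the function names, the time grid, and the normalizations $\xi^2 = 2f\,\min_\perp \mathrm{Var}(\hat{f}_\perp)/|\langle\hat{\mathbf{f}}\rangle|^2$ and $\xi_y^2 = 2f\,\mathrm{Var}(\hat{f}_y)/|\langle\hat{\mathbf{f}}\rangle|^2$ are assumptions chosen for illustration, and the paper's precise definitions may differ. No learned pulses are applied here ($\beta = 0$), so the scan shows only the uncontrolled $\hat{f}_z^2$ evolution.

```python
import numpy as np

F = 21 / 2  # spin of the 161Dy hyperfine manifold named in the text


def spin_operators(f):
    """Spin matrices in the basis m = f, f-1, ..., -f (hbar = 1)."""
    m = np.arange(f, -f - 1, -1)
    fp = np.zeros((m.size, m.size), dtype=complex)  # raising operator f_+
    for i in range(1, m.size):
        fp[i - 1, i] = np.sqrt(f * (f + 1) - m[i] * (m[i] + 1))
    fx = (fp + fp.conj().T) / 2
    fy = (fp - fp.conj().T) / 2j
    fz = np.diag(m).astype(complex)
    return fx, fy, fz, m


def expect(op, psi):
    return float(np.real(np.vdot(psi, op @ psi)))


def rotation(op, beta):
    """exp(-i * beta * op) for a Hermitian op, via its eigendecomposition."""
    w, v = np.linalg.eigh(op)
    return (v * np.exp(-1j * beta * w)) @ v.conj().T


def control_cycle(psi, chi_dt, axis_op, beta, m):
    """One cycle: free evolution under chi * f_z^2, then a rotation by beta."""
    psi = np.exp(-1j * chi_dt * m**2) * psi  # f_z^2 is diagonal in this basis
    return rotation(axis_op, beta) @ psi


def squeezing(psi, ops, f):
    """Return (xi2, xi2_y) under the assumed normalizations in the lead-in."""
    mean = np.array([expect(o, psi) for o in ops])
    length = np.linalg.norm(mean)
    n = mean / length
    helper = np.array([1.0, 0, 0]) if abs(n[0]) < 0.9 else np.array([0, 0, 1.0])
    e1 = np.cross(n, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    # Spin components along two axes perpendicular to the mean spin.
    perp = [sum(c * o for c, o in zip(e, ops)) for e in (e1, e2)]
    cov = np.empty((2, 2))
    for a in range(2):
        for b in range(2):
            sym = (perp[a] @ perp[b] + perp[b] @ perp[a]) / 2
            cov[a, b] = expect(sym, psi) - expect(perp[a], psi) * expect(perp[b], psi)
    var_min = np.linalg.eigvalsh(cov)[0]  # smallest perpendicular variance
    fy = ops[1]
    var_y = expect(fy @ fy, psi) - expect(fy, psi) ** 2
    return 2 * f * var_min / length**2, 2 * f * var_y / length**2


fx, fy, fz, m = spin_operators(F)
ops = (fx, fy, fz)
css_x = np.linalg.eigh(fx)[1][:, -1]  # coherent spin state |f, m_x = f>

times = np.linspace(0.0, 0.3, 301)  # dimensionless time chi * t (assumed grid)
xi2 = np.empty_like(times)
xi2_y = np.empty_like(times)
for k, t in enumerate(times):
    psi = control_cycle(css_x, t, fx, 0.0, m)  # beta = 0: no control pulse
    xi2[k], xi2_y[k] = squeezing(psi, ops, F)

k_min = int(np.argmin(xi2))
print(f"min xi^2 = {10 * np.log10(xi2[k_min]):.2f} dB at chi*t = {times[k_min]:.3f}")
print(f"xi_y^2 at that time = {10 * np.log10(xi2_y[k_min]):.2f} dB")
```

In this uncontrolled scan the optimal-axis parameter drops below the coherent-state value while the fixed-$y$ parameter does not, which is the readout mismatch the abstract describes. A controller in the spirit of Figure 1 would call `control_cycle` repeatedly with a nonzero `beta` about `fx` or `fy` chosen at each step.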