Exploring the flavor structure of quarks and leptons with reinforcement learning
Satsuki Nishimura, Coh Miyao, Hajime Otsuka
TL;DR
This work tackles the flavor puzzle by applying a value-based reinforcement learning approach to Froggatt--Nielsen models with a $U(1)$ flavor symmetry. Using a Deep Q-network, the agent searches over 19-dimensional $U(1)$ charge configurations to reproduce quark and lepton masses and mixings, treating the flavon-induced parameter $\epsilon = v_\phi/M$ as the driver of Yukawa hierarchies. The results show the agent identifies 21 realistic quark-charge patterns and consistently favors normal ordering for neutrino masses, with predicted $m_{\beta\beta}$ in the meV range and nonzero Majorana phases arising from flavon dynamics. This demonstrates that reinforcement learning can be a powerful, model-agnostic tool to explore flavor-model spaces and motivate extensions to SMEFT and flavon CP phenomena.
Abstract
We propose a method to explore the flavor structure of quarks and leptons with reinforcement learning. As a concrete example, we apply a basic value-based algorithm to models with $U(1)$ flavor symmetry. By training neural networks on the $U(1)$ charges of quarks and leptons, the agent finds 21 models that are consistent with the experimentally measured masses and mixing angles of quarks and leptons. In particular, the intrinsic value of the normal ordering tends to be larger than that of the inverted ordering, and the normal ordering fits the current experimental data well, in contrast to the inverted ordering. The autonomous behavior of the agent predicts a specific value of the effective mass for neutrinoless double beta decay and a sizable leptonic CP violation induced by the angular component of the flavon field. Our findings indicate that reinforcement learning can be a new method for understanding the flavor structure.
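To make the role of the $U(1)$ charges concrete, the following is a minimal sketch of how a charge assignment can fix a hierarchical Yukawa texture in a Froggatt--Nielsen-type setup. It is an illustration under stated assumptions, not the implementation used in this work: the function name `fn_yukawa`, the simplified exponent rule $n_{ij} = |q_{L,i} + q_{R,j} + q_H|$, the value $\epsilon = 0.2$, the example charges, and the randomly drawn $O(1)$ coefficients are all hypothetical choices.

```python
import numpy as np

def fn_yukawa(q_left, q_right, coeffs, q_higgs=0, eps=0.2):
    """Return (y, n) with y_ij = coeffs_ij * eps**n_ij.

    Assumption: the exponent is n_ij = |q_left[i] + q_right[j] + q_higgs|,
    a common simplification of the Froggatt-Nielsen charge-balance condition.
    """
    q_left = np.asarray(q_left)
    q_right = np.asarray(q_right)
    # Broadcast the two charge vectors into a 3x3 matrix of exponents.
    n = np.abs(q_left[:, None] + q_right[None, :] + q_higgs)
    y = np.asarray(coeffs) * eps ** n
    return y, n

# Hypothetical charges for three generations (illustrative only).
q_left = [3, 2, 0]
q_right = [4, 2, 0]

# O(1) coefficients are needed: with all coefficients equal to 1 the
# matrix eps**(a_i + b_j) is an outer product and therefore has rank one.
rng = np.random.default_rng(0)
coeffs = rng.uniform(0.5, 1.5, size=(3, 3))

y, n = fn_yukawa(q_left, q_right, coeffs)

# Singular values play the role of mass eigenvalues (up to the Higgs VEV);
# numpy returns them sorted in descending order.
masses = np.linalg.svd(y, compute_uv=False)
print(n)
print(masses / masses[0])
```

In a search of the kind described in the abstract, an agent would vary the entries of `q_left` and `q_right` (and the analogous lepton charges) and score each candidate by how closely the resulting masses and mixings match the measured values. The scoring function is not specified here and is left out of the sketch.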
