Reinforcement learning-based statistical search strategy for an axion model from flavor

Satsuki Nishimura; Coh Miyao; Hajime Otsuka

Abstract

We propose a reinforcement learning-based search strategy to explore new physics beyond the Standard Model. Reinforcement learning, a machine learning method, is a powerful approach for finding model parameters that satisfy phenomenological constraints. As a concrete example, we focus on a minimal axion model with a global $U(1)$ flavor symmetry. The learning agents succeed in finding $U(1)$ charge assignments of quarks and leptons that solve the flavor and cosmological puzzles of the Standard Model, and they find more than 150 realistic solutions for the quark sector when renormalization effects are taken into account. For the solutions found by the reinforcement learning-based analysis, we discuss the sensitivity of future experiments to the detection of the axion, which is a Nambu-Goldstone boson of the spontaneously broken $U(1)$. We also examine how fast the reinforcement learning-based search finds the best discrete parameters in comparison with conventional optimization methods. In conclusion, the efficient reinforcement learning-based parameter search enables a statistical analysis of the vast parameter space associated with the axion model from flavor.

Paper Structure (18 sections, 47 equations, 4 figures, 12 tables)

Introduction
Flaxion model and constraints
Model
Constraints from flavor physics
Constraints from cosmology
Dark Matter
Isocurvature Perturbation
Inflation
Reinforcement learning
The environment
Neural Network
Agent
Comparison with traditional methods
Exploring the flaxion physics
Conclusion
...and 3 more sections

Figures (4)

Figure 1: $k=\log_{10} M$ denotes the energy scale, and $n$ is the number of models found in our numerical analysis. The left panel shows the distribution of intrinsic values $V$ over all terminal states, while the right panel shows only the 10 models with the highest intrinsic values at each $k$. Realistic models are most numerous for $k=14,15$, and their intrinsic values tend to be high. Note that the range of the vertical axis differs between the two panels.
Figure 2: $k=\log_{10} M$ denotes the energy scale, and $n$ is the number of models found in our numerical analysis. The models we found are located around $N_{\mathrm{DW}}=30$, and the minimum value is $N_{\mathrm{DW}}=20$ at $k=17$.
Figure 3: The point clouds correspond to $M=10^{14}\,\hbox{GeV}$ (blue), $10^{15}\,\hbox{GeV}$ (red), $10^{16}\,\hbox{GeV}$ (green) and $10^{17}\,\hbox{GeV}$ (orange), in ascending order of $H_{\mathrm{inf}}$. The gray area is the region excluded by the inflation scenario. Thus, all of the terminal states found at $M=10^{14}\,\hbox{GeV}$ are ruled out by cosmological constraints.
Figure 4: Right circles (red) and left triangles (green) are the results calculated using the FN charges of the lepton sector corresponding to Benchmarks 1 and 2, respectively. The gray area indicates the region excluded by existing experiments [PhysRevLett.104.041301, ADMX:2018ogs, ADMX:2019uok, Salemi:2021gck, PhysRevLett.127.261803, Noordhuis:2022ljw, Benabou:2025jcv], as summarized in Ref. [AxionLimits]. The blue area indicates the sensitivity of DMRadio-$\mathrm{m}^3$ proposed in Ref. [DMRadio:2022pkf].
