From Polynomials to Databases: Arithmetic Structures in Galois Theory
Jurgen Mezinaj
TL;DR
This work presents a scalable framework for classifying Galois groups of irreducible degree-7 polynomials over $\mathbb{Q}$ by integrating classical resolvent techniques with neurosymbolic machine learning. It builds a large, reproducible database of over $1.18$ million septics annotated with invariant features and discriminants, and uses targeted resolvents (quadratic, 30-ic, 120-ic, and 35-ic) to distinguish the seven transitive subgroups of $S_7$ classified by Foulkes. A neural-network pipeline augmented with invariant-based features improves detection of rare solvable groups, with notable gains in balanced accuracy when the analysis is restricted to non-$S_7$ polynomials. The work also connects to constructive Galois theory via Hurwitz spaces and Hilbert irreducibility, providing explicit polynomials realizing groups such as $C_7$ and $C_7\rtimes C_3$, and discusses extensions to higher degrees. Overall, the paper demonstrates how hybrid symbolic-numeric methods can address the inverse Galois problem and enable empirical studies of Galois-group distributions under height constraints.
Abstract
We develop a computational framework for classifying Galois groups of irreducible degree-7 polynomials over~$\mathbb{Q}$, combining explicit resolvent methods with machine learning techniques. A database of over one million normalized projective septics is constructed, each annotated with algebraic invariants~$J_0, \dots, J_4$ derived from binary transvections. For each polynomial, we compute resolvent factorizations to determine its Galois group among the seven transitive subgroups of~$S_7$ identified by Foulkes. Using this dataset, we train a neurosymbolic classifier that integrates invariant-theoretic features with supervised learning, yielding improved accuracy in detecting rare solvable groups compared to coefficient-based models. The resulting database provides a reproducible resource for constructive Galois theory and supports empirical investigations into group distribution under height constraints. The methodology extends to higher-degree cases and illustrates the utility of hybrid symbolic-numeric techniques in computational algebra.
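As a small illustration of the simplest resolvent test in this kind of pipeline, the sketch below checks whether the discriminant of an integer septic is a perfect square. For an irreducible polynomial over $\mathbb{Q}$, the Galois group lies in $A_7$ exactly when the discriminant is a nonzero rational square. This is presumably what the quadratic resolvent step amounts to, but the code is not taken from the paper: the function names (`discriminant`, `galois_group_in_alternating`) and the Sylvester-matrix approach are assumptions made for illustration, and irreducibility of the input is assumed rather than checked.

```python
from fractions import Fraction
from math import isqrt


def derivative(coeffs):
    """Derivative of a polynomial given as [a_n, ..., a_1, a_0]."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def sylvester(f, g):
    """Sylvester matrix of f (degree m) and g (degree n), size (m + n)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + f + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (size - n - 1 - i) for i in range(m)]
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def discriminant(coeffs):
    """disc(f) = (-1)^(n(n-1)/2) * Res(f, f') / a_n for integer coefficients."""
    n = len(coeffs) - 1
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    d = sign * det(sylvester(coeffs, derivative(coeffs))) / coeffs[0]
    assert d.denominator == 1  # integer coefficients give an integer discriminant
    return d.numerator


def galois_group_in_alternating(coeffs):
    """True if disc(f) is a nonzero perfect square (f assumed irreducible)."""
    d = discriminant(coeffs)
    return d > 0 and isqrt(d) ** 2 == d


# Trinks' polynomial x^7 - 7x + 3, a classical example with group PSL(2,7).
trinks = [1, 0, 0, 0, 0, 0, -7, 3]
# x^7 - 2, whose discriminant is negative, so its group is not inside A_7.
pure = [1, 0, 0, 0, 0, 0, 0, -2]

print(discriminant(trinks), galois_group_in_alternating(trinks))
print(discriminant(pure), galois_group_in_alternating(pure))
```

A square discriminant only narrows the candidates to the even transitive groups $C_7$, $C_7\rtimes C_3$, $\mathrm{PSL}(2,7)$, and $A_7$; separating these from one another requires the higher-degree resolvents.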
