Machine Learning Approaches to the Shafarevich-Tate Group of Elliptic Curves
Angelica Babei, Barinder S. Banwait, AJ Fong, Xiaoyu Huang, Deependra Singh
TL;DR
This work shows that machine learning models can predict the order of the Shafarevich-Tate group of an elliptic curve over $\mathbb{Q}$, using as features the invariants that appear in the Birch--Swinnerton-Dyer (BSD) formula together with $a_p$ data, both taken from the LMFDB. Binary classifiers for the order of the group reach high accuracy (above 95% in several setups). A LightGBM regression model predicts $|\Sha|$ accurately, generalizes to curves of larger conductor, and is applied to the rank-29 curve of Elkies and Klagsbrun. The study also tests Delaunay's heuristics against empirical data and uses PCA to explore the structure of the BSD features. The released codebase lets researchers working on the BSD conjecture and the arithmetic of elliptic curves replicate and extend the results.
Abstract
We train machine learning models to predict the order of the Shafarevich-Tate group of an elliptic curve over $\mathbb{Q}$. Building on earlier work of He, Lee, and Oliver, we show that a feed-forward neural network classifier trained on subsets of the invariants arising in the Birch--Swinnerton-Dyer conjectural formula yields higher accuracies ($> 0.9$) than any model previously studied. In addition, we develop a regression model that can predict orders of this group not seen during training, and we apply it to the elliptic curve of rank 29 recently discovered by Elkies and Klagsbrun. Finally, we conduct exploratory data analyses and visualizations on our dataset, which is the elliptic curve dataset from the L-functions and Modular Forms Database (LMFDB).
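For orientation, the Birch--Swinnerton-Dyer conjectural formula mentioned above relates the order of the Shafarevich-Tate group to the other invariants of a curve $E$ of rank $r$:

$$\frac{L^{(r)}(E,1)}{r!} = \frac{\Omega_E \cdot \mathrm{Reg}_E \cdot |\Sha(E)| \cdot \prod_p c_p}{|E(\mathbb{Q})_{\mathrm{tors}}|^2}.$$

Solving for $|\Sha(E)|$ gives the quantity the models are asked to predict. The following is a minimal sketch of that rearrangement, not the authors' code. The function name `sha_order_from_bsd` and its parameter names are hypothetical, $\Omega_E$ is assumed to already include the factor for the number of real components, and the numerical inputs are approximate values for the curve with Cremona label 11a1, quoted from memory rather than from the paper.

```python
def sha_order_from_bsd(leading_coeff, real_period, regulator,
                       tamagawa_product, torsion_order):
    """Solve the BSD formula for |Sha|.

    leading_coeff    -- L^{(r)}(E,1) / r!  (equals L(E,1) when the rank is 0)
    real_period      -- Omega_E, assumed to include the real-components factor
    regulator        -- Reg_E (equals 1 when the rank is 0)
    tamagawa_product -- product of the Tamagawa numbers c_p
    torsion_order    -- |E(Q)_tors|
    """
    if min(real_period, regulator, tamagawa_product, torsion_order) <= 0:
        raise ValueError("period, regulator, Tamagawa product and torsion must be positive")
    return leading_coeff * torsion_order**2 / (
        real_period * regulator * tamagawa_product
    )


# Approximate invariants of the rank-0 curve 11a1 (illustrative, assumed values).
estimate = sha_order_from_bsd(
    leading_coeff=0.2538418609,
    real_period=1.2692093043,
    regulator=1.0,
    tamagawa_product=5,
    torsion_order=5,
)
print(round(estimate))
```

Because the formula determines $|\Sha(E)|$ exactly once every other invariant is known, a model is only informative when it sees a proper subset of these quantities, which is why the classifier in the abstract is trained on subsets of the BSD invariants.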
