Machine Learning Approaches to the Shafarevich-Tate Group of Elliptic Curves

Angelica Babei, Barinder S. Banwait, AJ Fong, Xiaoyu Huang, Deependra Singh

TL;DR

This work demonstrates that machine learning can effectively predict the order of the Shafarevich–Tate group for elliptic curves over $\mathbb{Q}$ by leveraging BSD-invariant features and $a_p$-based information from the LMFDB. It achieves high binary-classification accuracy for Sha sizes (notably exceeding 95% in several setups) and presents a regression framework (via LightGBM) that accurately predicts $|\Sha|$ and generalizes to curves with larger conductors, including the rank-29 Elkies–Klagsbrun curve. The study also tests Delaunay’s heuristics against empirical data and uses PCA to explore the structure of BSD features, providing both practical predictive tools and data-driven insights for BSD-related questions. The released codebase enables replication and extension by researchers investigating the BSD conjecture and the arithmetic of elliptic curves.

Abstract

We train machine learning models to predict the order of the Shafarevich-Tate group of an elliptic curve over $\mathbb{Q}$. Building on earlier work of He, Lee, and Oliver, we show that a feed-forward neural network classifier trained on subsets of the invariants arising in the Birch–Swinnerton-Dyer conjectural formula yields higher accuracies ($> 0.9$) than any model previously studied. In addition, we develop a regression model that may be used to predict orders of this group not seen during training, and we apply it to the elliptic curve of rank 29 recently discovered by Elkies and Klagsbrun. Finally, we conduct some exploratory data analyses and visualizations on our dataset. We use the elliptic curve dataset from the L-functions and Modular Forms Database (LMFDB).
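
For reference, the Birch–Swinnerton-Dyer conjectural formula whose invariants serve as features is usually stated as below. The notation is the standard one and is an assumption here, since the paper's own symbols and normalizations may differ: $r$ is the rank of $E(\mathbb{Q})$, $\Omega_E$ the real period, $\operatorname{Reg}(E/\mathbb{Q})$ the regulator, $c_p$ the Tamagawa numbers at the primes $p$ of bad reduction, and $E(\mathbb{Q})_{\mathrm{tors}}$ the torsion subgroup.

$$
\frac{L^{(r)}(E,1)}{r!} \;=\; \frac{|\Sha(E/\mathbb{Q})| \cdot \Omega_E \cdot \operatorname{Reg}(E/\mathbb{Q}) \cdot \prod_{p} c_p}{|E(\mathbb{Q})_{\mathrm{tors}}|^{2}}
$$

Solving for the order of the Shafarevich-Tate group shows how the other invariants determine it, assuming the conjecture holds:

$$
|\Sha(E/\mathbb{Q})| \;=\; \frac{L^{(r)}(E,1)}{r!} \cdot \frac{|E(\mathbb{Q})_{\mathrm{tors}}|^{2}}{\Omega_E \cdot \operatorname{Reg}(E/\mathbb{Q}) \cdot \prod_{p} c_p}
$$

Given every quantity on the right-hand side, $|\Sha(E/\mathbb{Q})|$ is fixed by this identity, which is presumably why the models are trained on subsets of these invariants rather than on all of them.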

Paper Structure

This paper contains 18 sections, 3 equations, 12 figures, 6 tables.

Figures (12)

  • Figure 2.1: Feature Deleted vs Accuracy Across Models for $|\Sha(E/\mathbb{Q})| = 4$ and $| \Sha(E/\mathbb{Q})| = 9$.
  • Figure 2.2: Feature deleted vs accuracy in a feedforward neural network classification task: without and with $a_p$ values.
  • Figure 3.1: Feature Deleted vs Accuracy Across Models for $|\Sha(E/\mathbb{Q})| = 1$ and $|\Sha(E/\mathbb{Q})| = 4$.
  • Figure 4.1: The importance of the 10 most significant features computed using LightGBM. The values represent the information gain contributed by each feature to the model.
  • Figure 4.2: The accuracy within subsets of curves where $\sqrt{|\Sha|} \geq$ a given threshold for both the small conductor and large conductor datasets. The results are comparable between the model that includes all variables in the BSD formula and the model that substitutes $\operatorname{Reg}(E/\mathbb{Q})$ with $r$. The model that excludes both variables performs significantly worse.
  • ...and 7 more figures