Machine learning for modular multiplication

Kristin Lauter; Cathy Yuanchen Li; Krystal Maughan; Rachel Newton; Megha Srivastava

TL;DR

Two machine learning approaches to modular multiplication are investigated: circular regression and a sequence-to-sequence transformer model. The limited success of both gives evidence for the hardness of the modular multiplication tasks on which cryptosystems are based.

Abstract

Motivated by cryptographic applications, we investigate two machine learning approaches to modular multiplication: circular regression and a sequence-to-sequence transformer model. The limited success of both methods in our experiments gives evidence for the hardness of tasks involving modular multiplication upon which cryptosystems are based.
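
To make the circular regression idea concrete, the following minimal sketch builds the error-free data set $\{(a_i, b_i = a_i s \pmod{p})\}$ used in the paper's figures for $p=41$, $s=3$, and evaluates a cosine-based loss at integer candidates. The loss formula and the names `make_dataset` and `circular_loss` are assumptions made for illustration; the paper's own definition of the loss may differ.

```python
import math

def make_dataset(p, s):
    """Error-free pairs (a, a*s mod p) for every residue a in [0, p)."""
    return [(a, (a * s) % p) for a in range(p)]

def circular_loss(x, data, p):
    """Assumed cosine loss: each residual a*x - b is mapped to an angle
    on the circle, so residuals that differ by a multiple of p agree."""
    return sum(1.0 - math.cos(2 * math.pi * (a * x - b) / p) for a, b in data)

p, s = 41, 3
data = make_dataset(p, s)

# Evaluate the loss at every integer candidate for the secret multiplier.
losses = {x: circular_loss(x, data, p) for x in range(p)}
best = min(losses, key=losses.get)
print(best, losses[best])
```

Under this assumed loss, the value at the true multiplier is essentially zero, while every other integer candidate gives a loss of about $p$ because the cosines sum to zero over a full set of residues. The loss at a wrong integer guess therefore carries no information about how close the guess is to $s$, which is one way to see why the task is hard for gradient-based methods.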

Paper Structure (14 sections, 13 equations, 12 figures, 3 tables)

Introduction
Circular regression for modular multiplication
The task
Transforming to a circular regression problem
Analysis of the algorithm
Experiment setup
Empirical results
Transformers for Modular Multiplication
The task
Representation and model
Memorization
Evaluation
Generalization results
Discussion

Figures (12)

Figure 2.1: Circular regression loss for $p=41$, $s=3$, plotted using the data set $\{(a_{i}, b_i = a_{i}s \pmod{p})\mid 0\leq a_{i} < p, a_{i}\in \mathbb{Z}\}$, which does not include errors in $b_i$, and has size $m=p$.
Figure 2.2: Circular regression gradient for $p=41$, $s=3$, data set $\{(a_{i}, b_i = a_{i}s \pmod{p})\mid 0\leq a_{i} < p, a_{i}\in \mathbb{Z}\}$. The red dots mark the gradient values when the predictions are at integer points.
Figure 2.3: Reciprocal of the circular regression gradient for $p=41$, $s=3$, data set $\{(a_{i}, b_i = a_{i}s \pmod{p})\mid 0\leq a_{i} < p, a_{i}\in \mathbb{Z}\}$, when the predictions are at integer points.
Figure 3.1: Training curve for modular multiplication task with $p=251$, $s=3$, and base $\mathcal{B}=7$ shows that optimizing sequence-to-sequence accuracy also helps improve arithmetic accuracy, as both test loss and arithmetic difference between generated outputs $\hat{y}_i$ and true values $y_i$ decrease during training.
Figure 3.2: Train accuracy for $p=83$, $3\leq s \leq83$, and $\mathcal{B}\in\{8, 9, 11\}$, after training for $5000$ epochs with learning rate $0.007$.
...and 7 more figures
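
The transformer experiments refer to a base $\mathcal{B}$ for representing integers (for example $\mathcal{B}=7$ with $p=251$, $s=3$ in Figure 3.1). As a rough sketch, one input/output pair could be encoded as digit sequences as below. The most-significant-digit-first order and the helper names `to_base` and `from_base` are assumptions for illustration, not the paper's confirmed tokenization.

```python
def to_base(n, base):
    """Digits of a non-negative integer n in the given base,
    most significant digit first (an assumed ordering)."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    return digits[::-1]

def from_base(digits, base):
    """Inverse of to_base: rebuild the integer from its digits."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

p, s, base = 251, 3, 7
a = 100
b = (a * s) % p          # 300 mod 251 = 49

src = to_base(a, base)   # input sequence for the model
tgt = to_base(b, base)   # target sequence for the model
print(src, tgt)          # [2, 0, 2] [1, 0, 0]
```

Decoding a generated digit sequence with `from_base` gives an integer that can be compared arithmetically with the true value, in addition to checking exact sequence match.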

Theorems & Definitions (2)

Definition 1: von Mises distribution
Example 2