TAPAS: Datasets for Learning the Learning with Errors Problem

Eshika Saxena, Alberto Alfarano, François Charton, Emily Wenger, Kristin Lauter

TL;DR

The paper presents TAPAS, a collection of five large, preprocessed LWE datasets designed for off-the-shelf AI cryptanalysis research. It details a data-generation pipeline that combines subsampling and lattice-reduction techniques to produce millions of reduced LWE samples across diverse parameter settings, and it establishes baseline performance using SALSA and Cool & Cruel attacks. By providing extensive data, hardware- and software-agnostic preprocessing, and explicit cost metrics, TAPAS aims to accelerate AI-driven exploration of LWE security and pave the way for scaling laws and novel cryptanalytic methods. The work highlights both the potential of AI in cryptanalysis and the practical limits imposed by lattice-reduction quality and computational requirements, offering clear directions for future research and dataset expansion.

Abstract

AI-powered attacks on Learning with Errors (LWE), an important hard math problem in post-quantum cryptography, rival or outperform "classical" attacks on LWE under certain parameter settings. Despite the promise of this approach, a dearth of accessible data limits AI practitioners' ability to study and improve these attacks. Creating LWE data for AI model training is time- and compute-intensive and requires significant domain expertise. To fill this gap and accelerate AI research on LWE attacks, we propose the TAPAS datasets, a Toolkit for Analysis of Post-quantum cryptography using AI Systems. These datasets cover several LWE settings and can be used off-the-shelf by AI practitioners to prototype new approaches to cracking LWE. This work documents TAPAS dataset creation, establishes attack performance baselines, and lays out directions for future work.
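
For readers new to the problem, the following is a minimal sketch of what a batch of LWE samples looks like: pairs $(\mathbf{A}, \mathbf{b})$ with $\mathbf{b} = \mathbf{A}\mathbf{s} + \mathbf{e} \bmod q$, where the attacker's goal is to recover the secret $\mathbf{s}$. It is not the TAPAS generation pipeline, which additionally applies subsampling and lattice reduction. The function names, the toy parameters, the sparse binary secret, and the rounded-Gaussian error are all illustrative assumptions rather than settings taken from the paper.

```python
import numpy as np


def make_lwe_samples(n=16, m=32, q=3329, sigma=3.0, hamming_weight=4, seed=0):
    """Generate m toy LWE samples in dimension n (all defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    # Public matrix A: entries uniform in [0, q).
    A = rng.integers(0, q, size=(m, n), dtype=np.int64)
    # Secret s: sparse binary vector with exactly `hamming_weight` ones (assumed).
    s = np.zeros(n, dtype=np.int64)
    s[rng.choice(n, size=hamming_weight, replace=False)] = 1
    # Error e: Gaussian noise rounded to the nearest integer (assumed).
    e = np.rint(rng.normal(0.0, sigma, size=m)).astype(np.int64)
    # LWE relation: b = A s + e (mod q).
    b = (A @ s + e) % q
    return A, b, s, e


def centered_mod(x, q):
    """Map residues mod q to the centered range [-(q // 2), q - 1 - q // 2]."""
    return ((x + q // 2) % q) - q // 2
```

With the secret in hand, subtracting $\mathbf{A}\mathbf{s}$ from $\mathbf{b}$ and centering modulo $q$ leaves only the small error vector, which is what separates a correct secret guess from an incorrect one.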

Paper Structure

This paper contains 13 sections, 1 equation, 3 figures, 5 tables, 2 algorithms.

Figures (3)

  • Figure 1: Reduction over time for $N=256, \log q=20$ (left) and $N=512, \log q=28$ (right). Threshold $\tau$ denoted by the dashed blue line. Each red line denotes a separate reduction experiment.
  • Figure 2: Cliff shape in our five reduced datasets. Datasets that are more reduced have fewer columns of $\mathbf{A}$ with a normalized standard deviation of 1.0 (shorter cliff).
  • Figure : Interleaved lattice reduction $\mathrm{InterleavedReduction} (\Lambda_i, \alpha, \beta, \gamma, s, \tau)$: