Exploring Prime Number Classification: Achieving High Recall Rate and Rapid Convergence with Sparse Encoding

Serin Lee, S. Kim

TL;DR

The paper tackles prime-number classification by reframing it as a machine-learning problem under imbalanced data. It introduces a highly sparse encoding of integer sequences into a four-dimensional tensor and processes it with a ResNet feature extractor plus a Transformer classifier, trained with weighted cross-entropy and a 5% resampling strategy. The approach achieves over $99\%$ recall for primes and about $79\%$ recall for non-primes, converging within a few thousand iterations, though it suffers from a notable false-positive rate driven mainly by semi-primes. The work demonstrates the feasibility of applying conventional deep learning architectures to prime number analysis and highlights avenues for reducing false positives and gaining deeper number-theoretic insights.
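To make the weighted cross-entropy mentioned above concrete, here is a minimal sketch in plain Python. The function name, the `(p_nonprime, p_prime)` probability layout, and the example weights are assumptions made for illustration; the summary does not state the weights the authors used. The loss is normalized by the sum of the applied weights, which is one common convention and not necessarily the one used in the paper.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted cross-entropy for a two-class problem.

    probs: list of (p_nonprime, p_prime) pairs, each summing to 1.
    labels: list of 0 (non-prime) or 1 (prime).
    class_weights: (w_nonprime, w_prime), hypothetical values.
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(p[y])  # penalize low probability on the true class
        weight_sum += w
    # Normalize by the sum of weights so the scale stays comparable across weightings.
    return total / weight_sum


if __name__ == "__main__":
    # One correctly classified non-prime and one badly misclassified prime.
    probs = [(0.9, 0.1), (0.9, 0.1)]
    labels = [0, 1]
    print(weighted_cross_entropy(probs, labels, (1.0, 1.0)))
    print(weighted_cross_entropy(probs, labels, (1.0, 5.0)))
```

Raising the prime-class weight makes a missed prime cost more than a missed non-prime, which pushes a model toward high prime recall at the expense of more false positives.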

Abstract

This paper presents a novel approach at the intersection of machine learning and number theory, focusing on the classification of prime and non-prime numbers. At the core of our research is the development of a highly sparse encoding method, integrated with conventional neural network architectures. This combination has shown promising results, achieving a recall of over 99\% in identifying prime numbers and 79\% for non-prime numbers from an inherently imbalanced sequential series of integers, while exhibiting rapid model convergence before the completion of a single training epoch. We performed training using $10^6$ integers starting from a specified integer and tested on a different range of $2 \times 10^6$ integers extending from $10^6$ to $3 \times 10^6$, offset by the same starting integer. While constrained by the memory capacity of our resources, which limited our analysis to a span of $3\times10^6$, we believe that our study contributes to the application of machine learning in prime number analysis. This work aims to demonstrate the potential of such applications and hopes to inspire further exploration in diverse fields.
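The evaluation setup described in the abstract can be sketched as follows: label a test range with a sieve, measure the class imbalance, and compute recall separately for primes and non-primes. This is an illustrative sketch only. The reading of the ranges (train on `[start, start + train_size)`, test on the following `test_size` integers) is an interpretation of the abstract, and `toy_predict` is a hypothetical stand-in for the trained model, not the authors' classifier.

```python
import math


def sieve(limit):
    """Return a bytearray with table[n] == 1 iff n is prime, for 0 <= n <= limit (limit >= 1)."""
    table = bytearray([1]) * (limit + 1)
    table[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if table[p]:
            multiples = range(p * p, limit + 1, p)
            table[p * p :: p] = bytes(len(multiples))  # clear all multiples of p
    return table


def per_class_recall(labels, preds):
    """Return (prime recall, non-prime recall) for 0/1 labels and predictions."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, preds):
        if y == 1:
            if p == 1:
                tp += 1
            else:
                fn += 1
        elif p == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def toy_predict(n):
    """Hypothetical classifier: call n prime if it has no factor 2, 3, or 5."""
    return int(n % 2 != 0 and n % 3 != 0 and n % 5 != 0)


def evaluate(start, train_size, test_size):
    """Score toy_predict on the test_size integers that follow the training range."""
    lo = start + train_size
    hi = lo + test_size
    table = sieve(hi)
    numbers = range(lo, hi)
    labels = [table[n] for n in numbers]
    preds = [toy_predict(n) for n in numbers]
    prime_fraction = sum(labels) / len(labels)
    return prime_fraction, per_class_recall(labels, preds)


if __name__ == "__main__":
    # Smaller sizes than the paper's 10^6 and 2 x 10^6, to keep the demo fast.
    frac, (r_prime, r_nonprime) = evaluate(start=0, train_size=10**4, test_size=2 * 10**4)
    print(f"prime fraction: {frac:.4f}")
    print(f"prime recall: {r_prime:.4f}, non-prime recall: {r_nonprime:.4f}")
```

The toy rule never misses a prime above 5 but flags every composite whose factors are all larger than 5, so it shows how a classifier can pair very high prime recall with a substantial false-positive rate on an imbalanced range.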

Paper Structure

This paper contains 15 sections, 6 equations, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: Integrated ResNet and Transformer architecture employed for prime number classification
  • Figure 2: Probability density of prime counts in subranges
  • Figure 3: Variations in classification recall: Effects of sequence length and prime class weights
  • Figure 4: Transition of testing results over iterations
  • Figure 5: Comparison of 5% resampling and conventional training strategies
  • ...and 2 more figures