Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective

Nicolò Bonacorsi, Matteo Bordoni

TL;DR

Conditioning the block-length distribution on $m\bmod 8$ markedly improves the generator's distributional fit, indicating that low-order modular structure is a key driver of heterogeneity in $\tau(n)$.

Abstract

We study the Collatz total stopping time $\tau(n)$ over $n\le 10^7$ from a probabilistic machine learning viewpoint. Empirically, $\tau(n)$ is a skewed and heavily overdispersed count with pronounced arithmetic heterogeneity. We develop two complementary models. First, a Bayesian hierarchical Negative Binomial regression (NB2-GLM) predicts $\tau(n)$ from simple covariates ($\log n$ and residue class $n \bmod 8$), quantifying uncertainty via posterior and posterior predictive distributions. Second, we propose a mechanistic generative approximation based on the odd-block decomposition: for odd $m$, write $3m+1=2^{K(m)}m'$ with $m'$ odd and $K(m)=v_2(3m+1)\ge 1$; randomizing these block lengths yields a stochastic approximation calibrated via a Dirichlet-multinomial update. On held-out data, the NB2-GLM achieves substantially higher predictive likelihood than the odd-block generators. Conditioning the block-length distribution on $m\bmod 8$ markedly improves the generator's distributional fit, indicating that low-order modular structure is a key driver of heterogeneity in $\tau(n)$.
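The odd-block decomposition in the abstract can be illustrated with a minimal Python sketch. It assumes that $\tau(n)$ counts every step of the standard map ($n\mapsto n/2$ for even $n$, $n\mapsto 3n+1$ for odd $n$) until reaching 1; the paper may use a different convention. The function names `v2`, `total_stopping_time`, and `odd_blocks` are hypothetical and chosen for illustration only. Under this assumption each odd block contributes $K(m)+1$ steps, so $\tau(n)=v_2(n)+\sum_i (K_i+1)$.

```python
def v2(x: int) -> int:
    """2-adic valuation of a positive integer x (number of trailing zero bits)."""
    return (x & -x).bit_length() - 1


def total_stopping_time(n: int) -> int:
    """Count steps of the standard Collatz map until n reaches 1.

    Assumption: every halving and every 3n+1 step counts as one step.
    """
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def odd_blocks(n: int) -> list[int]:
    """Block lengths K(m) = v2(3m+1) along the odd part of the orbit of n."""
    m = n >> v2(n)          # strip leading factors of 2 to reach an odd m
    ks = []
    while m != 1:
        k = v2(3 * m + 1)   # 3m+1 = 2^k * m' with m' odd
        ks.append(k)
        m = (3 * m + 1) >> k
    return ks


if __name__ == "__main__":
    n = 27
    ks = odd_blocks(n)
    # Each odd block costs one 3m+1 step plus K halvings.
    reconstructed = v2(n) + sum(k + 1 for k in ks)
    print(total_stopping_time(n), reconstructed)  # both 111
```

Replacing the deterministic values in `ks` with random draws from a fitted block-length distribution is the idea behind the stochastic generator described above; the sketch only shows the deterministic bookkeeping.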

Paper Structure

This paper contains 31 sections, 23 equations, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: Empirical distribution of $\tau(n)$ for $1\le n\le N$ (integer-aligned bins, width 2) with a KDE overlay computed on a large subsample to reduce noise. This motivates an overdispersed count likelihood.
  • Figure 2: Scatter of $\tau(n)$ vs. $n$ (log-$x$). The mean increases slowly and is approximately linear as a function of $\log n$, while the spread grows with $n$; banding suggests modular structure, motivating $\log n$ and $n\bmod 8$ as covariates.
  • Figure 3: Posterior predictive check for the hierarchical NB2-GLM (Model M3). The PPC matches the bulk well and mildly overestimates extreme right-tail mass.
  • Figure 4: Empirical block-length distribution $\hat{p}_k$ for $K=v_2(3m+1)$ (odd $m\le N$) vs. geometric reference $2^{-k}$ on log-$y$. This evaluates the "geometric $K$" heuristic.
  • Figure 5: Dirichlet posterior for $(p_k)$ (log scale) with uncertainty bars, compared to the geometric reference $2^{-k}$.
  • ...and 3 more figures
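The block-length calibration referred to in Figures 4 and 5 can be sketched as a conjugate Dirichlet-multinomial update. The following is an illustrative sketch, not the paper's implementation: the symmetric prior `alpha=1.0`, the truncation `k_max` (the last bin pools all $K\ge k_{\max}$), the small `n_max`, and the function names are all assumptions made here. The optional `residue` argument restricts the count to odd $m$ with a given value of $m \bmod 8$, mirroring the conditioning discussed in the TL;DR.

```python
from typing import Optional


def v2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    return (x & -x).bit_length() - 1


def block_length_counts(n_max: int, k_max: int,
                        residue: Optional[int] = None) -> list[int]:
    """Count K = v2(3m+1) over odd m <= n_max.

    counts[k-1] holds the count for K = k; the last bin pools K >= k_max.
    If residue is given, only odd m with m % 8 == residue are counted.
    """
    counts = [0] * k_max
    for m in range(1, n_max + 1, 2):
        if residue is not None and m % 8 != residue:
            continue
        k = min(v2(3 * m + 1), k_max)
        counts[k - 1] += 1
    return counts


def dirichlet_posterior(counts: list[int], alpha: float = 1.0):
    """Posterior mean and standard deviation under a symmetric Dirichlet(alpha) prior."""
    post = [alpha + c for c in counts]           # conjugate update: alpha_k + n_k
    a0 = sum(post)
    mean = [a / a0 for a in post]
    sd = [(a * (a0 - a) / (a0 * a0 * (a0 + 1))) ** 0.5 for a in post]
    return mean, sd


if __name__ == "__main__":
    counts = block_length_counts(n_max=100_000, k_max=8)
    mean, sd = dirichlet_posterior(counts)
    for k, (p, s) in enumerate(zip(mean, sd), start=1):
        # Compare with the geometric reference 2^{-k} (the pooled last bin will differ).
        print(f"k={k}  posterior mean={p:.5f}  sd={s:.5f}  geometric={2.0 ** -k:.5f}")
```

Since $m \bmod 8$ fixes $3m+1 \bmod 8$, the residue pins down the short block lengths exactly: $m\equiv 3,7 \pmod 8$ gives $K=1$, $m\equiv 1$ gives $K=2$, and $m\equiv 5$ gives $K\ge 3$. Calling `block_length_counts(..., residue=r)` for each odd residue `r` makes this visible in the counts.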