Top-b: Entropic Regulation of Relative Probability Bands in Autoregressive Language Processes

Deepon Halder; Raj Dabre

Abstract

Probabilistic language generators are theoretically modeled as discrete stochastic processes, yet standard decoding strategies (Top-k, Top-p) impose static truncation rules that fail to accommodate the dynamic information density of natural language. This misalignment often forces a suboptimal trade-off: static bounds are either too restrictive for high-entropy creative generation or too permissive for low-entropy logical reasoning. In this work, we formalize the generation process as a trajectory through a relative probability manifold. We introduce Top-b (Adaptive Relative Band Sampling), a decoding strategy that regulates the candidate set via a dynamic bandwidth coefficient coupled strictly to the instantaneous Shannon entropy of the model's distribution. We provide a theoretical framework demonstrating that Top-b acts as a variance-minimizing operator on the tail distribution. Empirical validation on GPQA and GSM8K benchmarks indicates that Top-b significantly reduces generation entropy and inter-decoding variance while maintaining competitive reasoning accuracy, effectively approximating a self-regulating control system for autoregressive generation.

Paper Structure (25 sections, 2 theorems, 9 equations, 5 figures, 3 tables)

Introduction
Related Work
Static and Distribution-Agnostic Truncation
Entropy-Aware and Information-Theoretic Sampling
Relative Thresholding and Min-p
Theoretical Stability in Reasoning
Formalism of Language Processes
The Discrete Language Process
Quantifying Uncertainty and The Mode
Locally Adaptive Relative Sampling (Top-b)
Defining the Dynamic Support
Adaptive Bandwidth via Entropy
Theoretical Behavior
Experiments
Experimental Settings
...and 10 more sections

Key Result

Proposition 1

The Top-b mechanism exhibits the following asymptotic behaviors:

Figures (5)

Figure 1: Structural comparison of static cumulative truncation (Top-$p$) versus entropy-regulated relative thresholding (Top-$b$). (a) In low-entropy reasoning regimes, Top-$p$ admits a long tail of low-probability distractor tokens, increasing the risk of logical incoherence. (b) In high-entropy creative regimes, Top-$p$'s static cumulative threshold arbitrarily truncates viable tokens, artificially restricting diversity. In contrast, Top-$b$ establishes a dynamic probability band anchored to the distribution mode ($p_{\max}$). (c) Under low entropy, the Top-$b$ bandwidth strictly contracts to prune the distractor tail and enforce deterministic reasoning. (d) Under high entropy, the bandwidth expands to safely retain linguistic diversity. This illustrates how Top-$b$ continuously adapts its sampling support to the local information density of the language process.
Figure 2: Accuracy variance across random seeds on GPQA. Top-b exhibits the lowest variance, indicating higher deterministic stability compared to Top-p and stochastic sampling methods.
Figure 3: Entropy trajectory over generation steps for Top-b and Top-p sampling. Top-b induces a monotonic reduction in entropy, while Top-p maintains higher entropy in later stages due to broader candidate sets.
Figure 4: Interaction between bandwidth parameter $b$ and temperature $T$ on GPQA performance. Mid-range $b$ values regularize high-temperature sampling, improving accuracy without inducing mode collapse.
Figure 5: Schematic of entropy-induced branching. Top-b acts as a pruning operator, collapsing diffuse branches into a single high-likelihood continuation.
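
The behavior described in the Figure 1 caption (a band anchored to $p_{\max}$ that contracts under low entropy and expands under high entropy) can be illustrated with a minimal Python sketch. The exact coupling between entropy and bandwidth is not stated in this summary, so the rule used here is an assumption chosen only to reproduce that qualitative behavior: a token is kept when its probability is at least $p_{\max}(1 - b \cdot H / \log V)$, where $H$ is the Shannon entropy and $V$ the vocabulary size. The function names `shannon_entropy` and `top_b_support` and the default `b=0.9` are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top_b_support(probs, b=0.9):
    """Return indices kept by an entropy-scaled relative band.

    Assumed rule (illustrative, not taken from the paper): keep token i when
    probs[i] >= p_max * (1 - b * H / log(V)).
    """
    vocab_size = len(probs)
    p_max = max(probs)
    # Normalized entropy in [0, 1]; defined as 0 for a one-token vocabulary.
    h_norm = shannon_entropy(probs) / math.log(vocab_size) if vocab_size > 1 else 0.0
    # Low entropy pushes the threshold toward p_max (narrow band);
    # high entropy lowers it toward p_max * (1 - b) (wide band).
    threshold = p_max * (1.0 - b * h_norm)
    return [i for i, p in enumerate(probs) if p >= threshold]


peaked = [0.97, 0.01, 0.01, 0.01]  # low-entropy, "reasoning-like" step
flat = [0.25, 0.25, 0.25, 0.25]    # high-entropy, "creative-like" step

print(top_b_support(peaked))
print(top_b_support(flat))
```

Under this assumed rule, the peaked distribution keeps only the mode while the flat distribution keeps all four tokens. A real decoder would then renormalize the kept probabilities before sampling.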

Theorems & Definitions (3)

Definition 1: Top-b Support Set
Proposition 1: Entropy-Scaled Constraints
Lemma 1
