Auxiliary-Hyperparameter-Free Sampling: Entropy Equilibrium for Text Generation

Xiaodong Cai, Hai Lin, Shaoxiong Zhan, Weiqi Luo, Hong-Gee Kim, Hongyan Hao, Yu Yang, Hai-Tao Zheng

TL;DR

The paper tackles the sensitivity and deployment burden of hyperparameter tuning in token sampling for large language models. It introduces Entropy Equilibrium Sampling (EES), an auxiliary-hyperparameter-free method that selects a dynamic candidate set by enforcing an entropy-mass equilibrium, and provides theoretical guarantees of existence and uniqueness for the equilibrium threshold. Through extensive experiments across multiple models and tasks, the authors demonstrate that EES achieves competitive accuracy and coherence while maintaining robust performance across temperature settings and eliminating hyperparameter dependence. The work offers a practical sampling alternative with favorable deployment properties and analyzes its impact on diversity, creative writing benchmarks, and human/LLM evaluations.

Abstract

Token sampling strategies critically influence text generation quality in large language models (LLMs). However, existing methods introduce additional hyperparameters, requiring extensive tuning and complicating deployment. We present Entropy Equilibrium Sampling (EES), an auxiliary-hyperparameter-free approach inspired by information theory that dynamically adjusts candidate sets by balancing normalized entropy with probability mass. We evaluate EES on both reasoning and generation tasks across a range of model architectures. Our results show that EES consistently performs well across temperature settings, delivering competitive accuracy and coherence while maintaining diversity. By eliminating the need for hyperparameter tuning, EES greatly simplifies deployment while improving performance. Code is available at https://github.com/shuanncai/EES

Paper Structure

This paper contains 43 sections, 1 theorem, 21 equations, 4 figures, 13 tables, 1 algorithm.

Key Result

Theorem 4.1

For any probability distribution $\{p_i\}_{i=1}^n$ sorted in descending order ($p_1 \geq p_2 \geq \cdots \geq p_n > 0$), there exists a unique $k^* \in \{1, 2, \ldots, n\}$ such that the algorithm converges.
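The theorem as reproduced here does not state the equilibrium condition itself, so the following Python sketch is only an illustration of how an entropy-mass balance could select a cutoff $k$ over a descending-sorted distribution. The rule used (take the smallest $k$ whose cumulative top-$k$ probability mass reaches the normalized entropy of the full distribution) and the function names `normalized_entropy` and `equilibrium_cutoff` are assumptions made for this example; they are not taken from the paper or its repository.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of `probs` divided by log(n), so the value lies in [0, 1].

    Assumes `probs` sums to 1. A single-outcome distribution has entropy 0.
    """
    n = len(probs)
    if n <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)


def equilibrium_cutoff(probs):
    """Return the smallest k whose top-k cumulative mass >= normalized entropy.

    ASSUMPTION: this balance rule is a hypothetical stand-in for the paper's
    equilibrium condition, chosen only to illustrate the idea. Because the
    cumulative mass is non-decreasing in k and reaches 1 at k = n, such a k
    always exists, and taking the smallest one makes it unique.
    """
    ranked = sorted(probs, reverse=True)   # p_1 >= p_2 >= ... >= p_n
    target = normalized_entropy(ranked)    # high for flat, low for peaked
    mass = 0.0
    for k, p in enumerate(ranked, start=1):
        mass += p
        if mass >= target:
            return k
    return len(ranked)                     # guards against float round-off


if __name__ == "__main__":
    for dist in ([0.97, 0.01, 0.01, 0.01],
                 [0.5, 0.3, 0.1, 0.1],
                 [0.25, 0.25, 0.25, 0.25]):
        print(dist, "-> k =", equilibrium_cutoff(dist))
```

Under this assumed rule, a sharply peaked distribution keeps very few candidates, while a flat one keeps nearly all of them, and no threshold such as a top-p value has to be supplied by the user.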

Figures (4)

  • Figure 1: Hyperparameter sensitivity across temperatures. EES achieves consistent optimal performance without tuning, while top-p requires temperature-specific hyperparameter adjustment.
  • Figure 2: Mechanism of EES
  • Figure 3: Accuracy-diversity performance across sampling methods and temperatures on two QA datasets using Llama3.1-8B.
  • Figure 4: Accuracy of different sampling methods under various temperature and hyperparameter combinations on StrategyQA using Llama3.1-8B. Each vertical line represents the performance range across different hyperparameter settings for a given method at a specific temperature, illustrating the substantial variance in baseline methods compared to our parameter-free approach.

Theorems & Definitions (2)

  • Theorem 4.1
  • proof