Towards Better Code Generation: Adaptive Decoding with Uncertainty Guidance

Kaifeng He, Mingwei Liu, Chong Wang, Zike Li, Yanlin Wang, Xin Peng, Zibin Zheng

TL;DR

AdaDec is a decoding framework that adds a selective pause-then-rerank mechanism to the decoding pipeline: at high-uncertainty steps, it dynamically triggers a lookahead-based evaluation to re-score candidate tokens. The results suggest that uncertainty-aware adaptive strategies are key to making LLM-driven code generation both robust and practical.

Abstract

The success of code synthesis using large language models (LLMs) depends heavily on navigating critical decision points during the decoding process. Standard uniform strategies, such as greedy decoding, often fall short because they fail to distinguish between deterministic steps and those characterized by high logical ambiguity. Our empirical analysis identifies a recurring failure mode: "logic drift" caused by the model's inability to correctly rank viable candidates during high-uncertainty intervals, even when the ground-truth token is available. To resolve this, we present AdaDec, a framework that introduces a selective pause-then-rerank mechanism into the decoding pipeline. Unlike static methods, AdaDec utilizes learned, model-specific entropy thresholds to identify when the model is "confused" and dynamically triggers a lookahead-based evaluation to re-score candidate tokens. Across benchmarks including HumanEval+, MBPP+, and DevEval, AdaDec achieves significant performance breakthroughs, boosting Pass@1 accuracy by up to 20.9% absolute over greedy decoding. The framework not only surpasses traditional Beam Search and specialized methods like AdapT in terms of reliability but also maintains high inference efficiency by intervening only at the most consequential steps. These results suggest that uncertainty-aware adaptive strategies are key to making LLM-driven code generation both robust and practical.
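The pause-then-rerank idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy next-token table `TOY`, the helper names (`toy_model`, `shannon_entropy`, `lookahead_score`, `adaptive_decode`), the fixed threshold of 1.0 nats, and the scoring rule (summed log-probability of a candidate plus a short greedy continuation) are all assumptions made for illustration. In AdaDec the threshold is learned per model, and the paper's exact re-scoring rule may differ.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: a "model" maps a token prefix to a next-token distribution.
NextTokenDist = Callable[[Sequence[str]], Dict[str, float]]

# Toy distribution table keyed on the last token (an assumption for illustration only).
TOY: Dict[str, Dict[str, float]] = {
    "return": {"a": 0.40, "b": 0.35, "c": 0.25},  # ambiguous step (high entropy)
    "a": {"x": 0.5, "y": 0.5},
    "b": {"+": 0.95, "-": 0.05},
    "+": {"1": 0.9, "2": 0.1},
    "c": {"<eos>": 1.0}, "x": {"<eos>": 1.0}, "y": {"<eos>": 1.0},
    "-": {"<eos>": 1.0}, "1": {"<eos>": 1.0}, "2": {"<eos>": 1.0},
}

def toy_model(prefix: Sequence[str]) -> Dict[str, float]:
    """Return the next-token distribution for the last token, or {} if unknown."""
    return TOY.get(prefix[-1], {}) if prefix else {}

def shannon_entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def lookahead_score(model: NextTokenDist, prefix: Sequence[str], candidate: str,
                    depth: int, eos: str = "<eos>") -> float:
    """Log-prob of `candidate` plus the log-probs of up to `depth` greedy follow-up tokens."""
    seq = list(prefix)
    score = math.log(model(seq)[candidate])
    seq.append(candidate)
    for _ in range(depth):
        dist = model(seq)
        if not dist:
            break
        tok, p = max(dist.items(), key=lambda kv: kv[1])
        score += math.log(p)
        if tok == eos:
            break
        seq.append(tok)
    return score

def adaptive_decode(model: NextTokenDist, prompt: Sequence[str], threshold: float,
                    top_k: int = 3, depth: int = 2, max_steps: int = 16,
                    eos: str = "<eos>") -> Tuple[List[str], int]:
    """Greedy decoding that pauses to rerank the top-k tokens when entropy exceeds `threshold`."""
    seq, pauses = list(prompt), 0
    for _ in range(max_steps):
        dist = model(seq)
        if not dist:
            break
        if shannon_entropy(dist) > threshold:
            # High-uncertainty step: pause and re-score the top-k candidates by lookahead.
            pauses += 1
            candidates = sorted(dist, key=dist.get, reverse=True)[:top_k]
            tok = max(candidates, key=lambda c: lookahead_score(model, seq, c, depth, eos))
        else:
            # Low-uncertainty step: take the most probable token as usual.
            tok = max(dist, key=dist.get)
        if tok == eos:
            break
        seq.append(tok)
    return seq[len(prompt):], pauses

adaptive_tokens, adaptive_pauses = adaptive_decode(toy_model, ["return"], threshold=1.0)
greedy_tokens, greedy_pauses = adaptive_decode(toy_model, ["return"], threshold=float("inf"))
```

In this toy setting, plain greedy decoding (an infinite threshold) commits to `a` at the ambiguous first step, whereas the entropy-gated variant pauses once, finds that `b` leads to a more confident continuation, and decodes `b + 1`. All later steps fall below the threshold, so no further lookahead cost is paid.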

Paper Structure

This paper contains 38 sections, 6 equations, 7 figures, 7 tables.

Figures (7)

  • Figure 1: Illustration of irreversible semantic errors in code due to high-uncertainty points.
  • Figure 2: Entropy comparison between drift points and non-drift decoding steps.
  • Figure 3: Change in the average rank of ground-truth tokens above and below a given entropy threshold, as the threshold increases.
  • Figure 4: Change in the percentage of decoding steps exceeding a given entropy threshold as the threshold increases.
  • Figure 5: Approach overview of AdaDec.
  • ...and 2 more figures