Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation

Jin Li, Zhebo Wang, Tianliang Lu, Mohan Li, Wenpeng Xing, Meng Han

TL;DR

Spectral Logit Sculpting (SLS) introduces an inference-time, parameter-free approach to sharpen LLM output distributions by maintaining a sliding buffer of top-$K$ logits, performing data-centered SVD to extract dominant spectral directions, and adaptively rescaling logits based on entropy and logit gaps. Activation of the spectral update is conditional on high uncertainty, reducing unnecessary computation while preserving contextual coherence. Empirical results across mathematics, coding, and science benchmarks show SLS outperforms entropy-minimization baselines and self-consistency methods on both base and math-tuned Qwen models, with notable gains in Math500, AMC, LeetCode, and UGPhysics. This method offers a practically scalable decoding enhancement with minimal overhead and no parameter updates, making it well-suited for large-scale deployment; future work will explore theoretical foundations and multimodal extensions.

Abstract

Entropy-based inference methods have gained traction for improving the reliability of Large Language Models (LLMs). However, many existing approaches, such as entropy minimization techniques, suffer from high computational overhead and fail to leverage historical token context effectively. To address these limitations, we propose Spectral Logit Sculpting (SLS), a lightweight inference-time optimization method that dynamically modulates token distributions using spectral and entropic properties of recent logits. SLS maintains a sliding buffer of top-$K$ logits, performs on-the-fly Singular Value Decomposition (SVD) to identify dominant spectral directions, and adaptively rescales logits based on both entropy and logit gap statistics, activating only when uncertainty is high. Without updating any model parameters, SLS effectively sharpens the output distribution while preserving contextual consistency. Experimental results on multiple public benchmarks demonstrate that SLS consistently outperforms existing baseline methods, achieving superior accuracy in mathematical, coding, and scientific reasoning tasks.
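
The two uncertainty signals named in the abstract, entropy and logit gap over the top-$K$ logits, can be illustrated with a small NumPy sketch. This is not the authors' code: the function name `topk_entropy_and_gap`, the default `k=20`, and the definition of the gap as the difference between the largest and second-largest logit are assumptions made for illustration only.

```python
import numpy as np


def topk_entropy_and_gap(logits, k=20):
    """Return (entropy, gap) computed over the k largest logits.

    Assumed definitions (not taken from the paper):
      entropy: Shannon entropy of softmax over the top-k logits
      gap:     largest logit minus second-largest logit
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Top-k logits in descending order.
    top = np.sort(logits)[::-1][:k]
    # Numerically stable softmax restricted to the top-k entries.
    shifted = top - top[0]
    probs = np.exp(shifted)
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    gap = top[0] - top[1]
    return float(entropy), float(gap)


# A flat distribution has high entropy and zero gap; a peaked one
# has low entropy and a large gap.
flat_entropy, flat_gap = topk_entropy_and_gap([1.0, 1.0, 1.0, 1.0], k=4)
peaked_entropy, peaked_gap = topk_entropy_and_gap([10.0, 0.0, 0.0, 0.0], k=4)
```

A gate of the kind the abstract describes would compare the entropy against a threshold and skip the spectral step when the model is already confident.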

Paper Structure

This paper contains 12 sections, 7 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Overview of the proposed SLS framework. The input sequence is processed by the language model to produce token logits $\mathbf{Z}_t$. Top-$K$ logits $\mathbf{z}_t$ are extracted for entropy computation $H_t$. If $H_t$ exceeds the threshold $H_{\mathrm{thres}}$, SVD is applied to the sliding logit buffer to obtain principal components. The current logits are then adaptively rescaled via spectral projection before sampling. Finally, the restructured logits are used to update the buffer.
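
The control flow described in the Figure 1 caption can be sketched in a few lines of NumPy. This is a rough illustration under stated assumptions, not the paper's implementation: the class name `SpectralSculptorSketch`, the parameters `window`, `rank`, and `alpha`, and the specific update rule (adding a fixed multiple of the projection onto the leading right singular vectors) are all hypothetical. The caption says the rescaling is adaptive; a constant `alpha` stands in for that here because the exact rule is not given in this section.

```python
from collections import deque

import numpy as np


class SpectralSculptorSketch:
    """Hypothetical sketch of the gate -> SVD -> rescale -> buffer loop."""

    def __init__(self, k=8, window=16, rank=2, h_thres=1.0, alpha=0.5):
        self.k = k                # number of top logits kept per step
        self.rank = rank          # number of spectral directions used
        self.h_thres = h_thres    # entropy threshold (H_thres in Figure 1)
        self.alpha = alpha        # assumed fixed rescaling strength
        self.buffer = deque(maxlen=window)  # sliding buffer of top-k logits

    def step(self, logits):
        logits = np.asarray(logits, dtype=np.float64).copy()
        idx = np.argsort(logits)[::-1][: self.k]  # top-k positions
        z = logits[idx]

        # Entropy of the softmax over the top-k logits.
        p = np.exp(z - z.max())
        p /= p.sum()
        entropy = float(-np.sum(p * np.log(p + 1e-12)))

        out = z
        # Only act when uncertainty is high and the buffer has history.
        if entropy > self.h_thres and len(self.buffer) >= 2:
            history = np.stack(self.buffer)            # shape (n, k)
            mean = history.mean(axis=0)
            # SVD of the mean-centered buffer.
            _, _, vt = np.linalg.svd(history - mean, full_matrices=False)
            basis = vt[: self.rank]                    # shape (<=rank, k)
            # Project the centered logits onto the dominant directions.
            projection = (z - mean) @ basis.T @ basis
            out = z + self.alpha * projection          # assumed update rule

        self.buffer.append(out)   # restructured logits update the buffer
        logits[idx] = out
        return logits, entropy
```

In a decoding loop, `step` would be called once per generated token on the model's logit vector, and the returned logits would be passed to the sampler. Note that the buffer here stores logits by rank position rather than by token identity, which is one of several possible readings of the caption.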