A Survey on LLM Inference-Time Self-Improvement
Xiangjue Dong, Maria Teleki, James Caverlee
TL;DR
The paper addresses how to improve LLM performance at inference time without parameter updates. It offers a taxonomy and survey of inference-time self-improvement (ITSI) methods in three categories (Independent, Context-Aware, and Model-Aided), covering constrained and contrastive decoding, retrieval, speculative decoding, and tool usage. Key contributions include compiling recent studies, clarifying trade-offs between methods, and outlining challenges and future directions. The work is practically relevant for cost-effective, scalable improvement of LLMs with frozen parameters, and it highlights ethical and interpretability considerations.
Abstract
Techniques that enhance inference through increased computation at test time have recently gained attention. In this survey, we investigate the current state of LLM Inference-Time Self-Improvement from three perspectives: Independent Self-Improvement, which focuses on enhancements via decoding or sampling methods; Context-Aware Self-Improvement, which leverages additional context or a datastore; and Model-Aided Self-Improvement, which achieves improvement through model collaboration. We provide a comprehensive review of recent relevant studies, contribute an in-depth taxonomy, and discuss challenges and limitations, offering insights for future research.
