MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
Jiaxi Li, Yucheng Shi, Jin Lu, Ninghao Liu
TL;DR
This work addresses the challenge of efficiently exploring the space of reasoning paths in large language models. It introduces Mutual Information Tree Search (MITS), which uses pointwise mutual information (PMI) to score intermediate reasoning steps, enabling stepwise evaluation without costly look-ahead rollouts. MITS couples PMI-based beam search with entropy-driven dynamic sampling and a weighted voting scheme that fuses PMI with prediction consensus, achieving superior accuracy while maintaining computational efficiency across diverse reasoning benchmarks. The results demonstrate that information-theoretic guidance can substantially improve LLM reasoning performance and robustness, offering a principled framework for test-time reasoning.
Abstract
Tree search has become a representative framework for test-time reasoning with large language models (LLMs), exemplified by methods such as Tree-of-Thought and Monte Carlo Tree Search that explore multiple reasoning paths. However, it remains difficult to provide instant and reliable quantitative assessments of intermediate reasoning step quality, and extensive path exploration is computationally costly. To address this, we propose Mutual Information Tree Search (MITS), a novel framework that guides reasoning with information-theoretic principles. MITS introduces an effective scoring function based on pointwise mutual information (PMI), which enables step-wise evaluation of reasoning paths and search tree expansion via beam search without expensive look-ahead simulations. This yields superior reasoning performance while maintaining computational efficiency. The framework is complemented by an entropy-based dynamic sampling strategy that adaptively allocates computational resources to uncertain reasoning steps, where exploration is most beneficial. For final prediction, MITS employs a weighted voting scheme that combines PMI scores with prediction consensus. Through comprehensive experiments on diverse reasoning benchmarks, MITS consistently surpasses baseline methods, establishing a principled and efficient framework for LLM reasoning.
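The three ingredients named in the abstract (PMI scoring, entropy-based sampling, and weighted voting) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything beyond the standard definition PMI(x; y) = log p(y | x) - log p(y) is an assumption made for illustration: the function names, the linear entropy-to-sample-count rule, the `low`/`high` bounds, and the softmax-style vote weights are all hypothetical and should not be read as the authors' implementation.

```python
import math
from collections import defaultdict


def pmi(logp_step_given_question: float, logp_step: float) -> float:
    """Standard PMI in log space: log p(s | q) - log p(s).

    Assumption: s is a reasoning step and q is the question; both
    log-probabilities would come from the LLM.
    """
    return logp_step_given_question - logp_step


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def num_samples(probs, low=2, high=6):
    """Hypothetical rule: draw more candidates when entropy is high.

    Entropy is normalized by its maximum, log(len(probs)), and mapped
    linearly onto the range [low, high].
    """
    if len(probs) < 2:
        return low
    normalized = entropy(probs) / math.log(len(probs))
    return low + round(normalized * (high - low))


def weighted_vote(candidates):
    """Pick the answer with the largest total weight.

    candidates: list of (answer, pmi_score) pairs, one per reasoning path.
    Hypothetical weighting: each path contributes exp(score - max_score),
    so agreement among paths and high PMI both raise an answer's total.
    """
    max_score = max(score for _, score in candidates)
    totals = defaultdict(float)
    for answer, score in candidates:
        totals[answer] += math.exp(score - max_score)
    return max(totals, key=totals.get)
```

Under these assumptions, a step that becomes much more likely once the question is known receives a high PMI, a near-uniform next-step distribution triggers more sampling than a peaked one, and an answer reached by several moderately scored paths can outvote a single slightly better-scored path.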
