Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

Mert Esencan; Tarun Advaith Kumar; Ata Akbari Asanjan; P. Aaron Lott; Masoud Mohseni; Can Unlu; Davide Venturelli; Alan Ho

TL;DR

The paper addresses the limited autonomous reasoning of large language models (LLMs) and the resulting need for automated, scalable prompting strategies. It introduces Combinatorial Reasoning (CR), a fully automated pipeline that samples candidate reasons from an LLM, encodes their relations into a Quadratic Unconstrained Binary Optimization (QUBO) problem, and uses Ising-machine solvers or related probabilistic optimizers to select a subset of reasons for a Chain-of-Thought (CoT) style final prompt. The authors formalize the QUBO construction (with $H=-(\tilde{L}+Q)$ and binary encodings $z_i= \sum_{w=0}^{W-1} 2^w x_{iw}$), demonstrate a sampling-and-optimization workflow, and validate CR on the BIG-Bench Hard (BBH) reasoning suite, reporting better average performance than the zero-shot and USP baselines and results competitive with the reported human performance on some tasks. They also discuss hardware-accelerated solvers (e.g., Digital Annealer) and potential integrations with theorem provers and retrieval-augmented generation, presenting CR as a promising route to automated, scalable enhancement of AI reasoning in real-world knowledge tasks.
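As a rough illustration of the selection step summarized above, the sketch below minimizes a toy QUBO over four candidate reasons by exhaustive search and decodes a binary-encoded integer as $z=\sum_w 2^w x_w$. The coefficients, the function names, and the brute-force solver are all assumptions made for illustration; they are not the paper's construction of $\tilde{L}$ and $Q$, and the paper uses Ising-machine style solvers rather than enumeration.

```python
from itertools import product


def qubo_energy(x, linear, quadratic):
    """Energy of a binary assignment x: sum_i l_i x_i + sum_{i<j} q_ij x_i x_j."""
    energy = sum(linear[i] * x[i] for i in range(len(x)))
    energy += sum(c * x[i] * x[j] for (i, j), c in quadratic.items())
    return energy


def brute_force_select(linear, quadratic):
    """Return the lowest-energy assignment by enumerating all 2^n candidates.

    Only feasible for tiny n; a stand-in for an annealer or Ising machine.
    """
    n = len(linear)
    return min(product((0, 1), repeat=n),
               key=lambda x: qubo_energy(x, linear, quadratic))


def decode_integer(bits):
    """Decode z = sum_w 2^w * x_w, least significant bit first."""
    return sum((1 << w) * b for w, b in enumerate(bits))


# Hypothetical coefficients: a negative linear term rewards picking a reason,
# a positive quadratic term penalizes picking two reasons together.
linear = [-3.0, -2.0, -1.0, 0.5]
quadratic = {(0, 1): -1.0, (0, 2): 4.0, (1, 3): 0.5}

selected = brute_force_select(linear, quadratic)
print(selected, qubo_energy(selected, linear, quadratic))
print(decode_integer([1, 0, 1]))
```

In this toy instance the first two reasons are kept: they reinforce each other, while the third conflicts with the first and the fourth carries a positive cost. Reasons whose bit is 1 would then be appended to the final prompt.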

Abstract

Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks has been subpar, and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of prompting technique can alter an LLM's performance on a multitude of tasks, including reasoning, the best-performing techniques require human-made prompts written with knowledge of the tasks at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully automated prompting method, where reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons to construct a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.

Paper Structure (39 sections, 9 equations, 3 figures, 4 tables)

Introduction
Preliminaries and Prior Work
Large Language Models
Reasoning in Large Language Models
Chain of Thought
Self-Consistency
Universal Self-Adaptive Prompting (USP)
Combinatorial Optimization and Ising Machines
Simulated Annealing and Parallel Tempering
Combinatorial Reasoning
Sampling of Reasons
QUBO Mapping
Combinatorial Optimization Solver
Final Prompt Creation
Experimental Results
...and 24 more sections

Figures (3)

Figure 1: Workflow for Combinatorial Reasoning. The initial prompt is processed by the LLM $N$ times and the answers are filtered through a semantic matching procedure to produce answers with distinct reasons. The ensemble is mapped into a QUBO problem solved by an Ising machine. The final solution determines a set of reasons to be added to the prompt for a final LLM call that determines the final answer.
Figure 2: The performance of combinatorial reasoning (CR) against other methods. Human and USP results are taken from the BBH and USP publications, respectively (Suzgun et al., 2022; Wan et al., 2023). USP is evaluated on a different but comparable LLM, PaLM 2-M. Table \ref{tab:main} presents the cumulative results across BBH for these various tasks. Tasks marked with $\Lambda$ are algorithmic tasks while the others are NLP tasks.
Figure 3: Baseline analysis for Quadratic CR (same as main text) with Linear CR and Random Reasons. Overall performance across the ten datasets was Quadratic CR: $65.2\%$, Linear CR: $68.2\%$, Random: $57.4\%$. 0-shot and 0-shot CoT results are included for reference. The individual tasks are ordered according to the performance of 0-shot CoT.
