Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning

Lexiang Tang; Weihao Gao; Bingchen Zhao; Lu Ma; Qiao Jin; Bang Yang; Yuexian Zou

TL;DR

This work proposes Thinking by Subtraction, a confidence-driven contrastive decoding approach that improves reasoning reliability by intervening only at low-confidence tokens. It significantly improves accuracy across mathematical reasoning benchmarks while substantially reducing output length.

Abstract

Recent work on test-time scaling for large language model (LLM) reasoning typically assumes that allocating more inference-time computation uniformly improves correctness. However, prior studies show that reasoning uncertainty is highly localized: a small subset of low-confidence tokens disproportionately contributes to reasoning errors and unnecessary output expansion. Motivated by this observation, we propose Thinking by Subtraction, a confidence-driven contrastive decoding approach that improves reasoning reliability through targeted token-level intervention. Our method, Confidence-Driven Contrastive Decoding (CCD), detects low-confidence tokens during decoding and intervenes selectively at these positions. It constructs a contrastive reference by replacing high-confidence tokens with minimal placeholders, and refines predictions by subtracting this reference distribution at low-confidence locations. Experiments show that CCD significantly improves accuracy across mathematical reasoning benchmarks while substantially reducing output length, with minimal KV-cache overhead. As a training-free method, CCD enhances reasoning reliability through targeted low-confidence intervention without computational redundancy. Our code will be made available at: https://github.com/bolo-web/CCD.
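
The decoding idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the confidence measure (maximum softmax probability), the threshold `tau`, the contrast weight `alpha`, the placeholder string, and the `(1 + alpha) * logits - alpha * ref_logits` form are all assumptions chosen for illustration, and the reference logits are passed in directly rather than produced by a second forward pass over the masked context.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mask_high_confidence(tokens, confidences, tau, placeholder="<pad>"):
    """Build a contrastive context by replacing high-confidence tokens
    with a placeholder (the placeholder string is hypothetical)."""
    return [placeholder if c >= tau else t for t, c in zip(tokens, confidences)]

def ccd_step(logits, ref_logits, tau=0.5, alpha=1.0):
    """Refine one decoding step.

    Confidence is assumed to be the maximum softmax probability.
    If it is at least tau, the logits pass through unchanged.
    Otherwise the reference logits are subtracted (assumed form).
    Returns (logits_to_sample_from, intervened_flag).
    """
    confidence = max(softmax(logits))
    if confidence >= tau:
        return list(logits), False
    refined = [(1 + alpha) * l - alpha * r for l, r in zip(logits, ref_logits)]
    return refined, True

# Usage: a confident step is left alone, an uncertain one is refined.
confident, changed_a = ccd_step([5.0, 0.0, 0.0], [0.0, 0.0, 0.0])
uncertain, changed_b = ccd_step([1.0, 0.9, 0.0], [0.0, 0.9, 0.0])
masked = mask_high_confidence(["a", "b", "c"], [0.9, 0.2, 0.7], tau=0.5)
```

In this sketch the subtraction widens the gap between the two near-tied candidates at the uncertain step, while the confident step costs nothing beyond the confidence check, which mirrors the selective intervention the abstract describes.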

Paper Structure (39 sections, 15 equations, 10 figures, 4 tables, 1 algorithm)

Introduction
Methodology
Overview
Preliminaries: Confidence-Based Uncertainty and Contrastive Decoding
Confidence-Driven Contrastive Decoding
Contrastive Distribution Construction.
Confidence-Driven Logit Subtraction.
Theoretical Analysis
From Confidence Correction to Contrastive Distribution Construction.
High-Confidence Tokens as Semantic Anchors.
Masking Anchors Increases Uncertainty.
Dual KV Cache Maintenance
Algorithms
Empirical and Statistical Evaluation
Experimental Setup
...and 24 more sections

Figures (10)

Figure 1: Trajectory-level relationship between token confidence and final answer correctness, across 2,880 reasoning trajectories generated by Qwen3-8B on AIME24.
Figure 2: Overview of Confidence-Driven Contrastive Decoding (CCD). The decoding process consists of four key components: (1) online estimation of token-level confidence; (2) confidence-driven token selection that partitions chain-of-thought tokens into low-confidence (LC) and high-confidence (HC) sets; (3) contrastive decoding triggered at LC-CoT tokens to refine uncertain predictions; and (4) dual key-value (KV) cache maintenance to support selective intervention without disrupting standard autoregressive decoding.
Figure 3: Ablation on different replacement intervals.
Figure 4: Ablation on different special tokens.
Figure 6: Token-level confidence improvement at low-confidence decoding positions.
...and 5 more figures
