CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs

Li Li, Ziyi Wang, Yongliang Wu, Jianfei Cai, Xu Yang

TL;DR

CoT Vectors extend the task-vector paradigm to multi-step reasoning by encoding reasoning knowledge as a compact, task-general vector that is injected into an LLM's forward pass. The authors formalize CoT as a shift in attention and propose two acquisition methods: Extracted CoT Vectors (activation-based) and Learnable CoT Vectors (optimized in a teacher-student setup). Learnable CoT Vectors are the more stable of the two and perform competitively with LoRA. Experiments on Qwen2.5-Math-7B and LLaMA-3.1-8B-Instruct across GSM8K, MATH, and MMLU-Pro show that Learnable CoT Vectors deliver strong gains and transfer across models and datasets, while Extracted CoT Vectors reveal a three-stage (shallow, middle, deep) organization of reasoning. Used as a probe, CoT Vectors show how latent space structure, information density, acquisition mechanisms, and pre-training shape multi-step reasoning, offering a low-cost, parameter-efficient way to both enhance and understand LLM reasoning.

Abstract

Chain-of-Thought (CoT) prompting has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs). However, existing implementations, such as in-context learning and fine-tuning, remain costly and inefficient. To improve CoT reasoning at a lower cost, and inspired by the task vector paradigm, we introduce CoT Vectors, compact representations that encode task-general, multi-step reasoning knowledge. Through experiments with Extracted CoT Vectors, we observe pronounced layer-wise instability, manifesting as a U-shaped performance curve that reflects a systematic three-stage reasoning process in LLMs. To address this limitation, we propose Learnable CoT Vectors, optimized under a teacher-student framework to provide more stable and robust guidance. Extensive evaluations across diverse benchmarks and models demonstrate that CoT Vectors not only outperform existing baselines but also achieve performance comparable to parameter-efficient fine-tuning methods, while requiring fewer trainable parameters. Moreover, by treating CoT Vectors as a probe, we uncover how their effectiveness varies due to latent space structure, information density, acquisition mechanisms, and pre-training differences, offering new insights into the functional organization of multi-step reasoning in LLMs. The source code will be released.
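To make the activation-based idea concrete, here is a minimal sketch of how an Extracted CoT Vector could be computed and applied. It assumes the vector is the mean activation gap over a small support set at one layer, and that injection is a plain vector addition; the averaging step, the helper names `extract_cot_vector` and `inject`, and the toy three-dimensional activations are illustrative assumptions, not the authors' implementation.

```python
from statistics import fmean


def extract_cot_vector(acts_with_cot, acts_without_cot):
    """Average the per-example activation gap (with CoT minus without CoT).

    Each argument is a list of layer-k activation vectors, one per support
    example. Averaging over the support set is an assumption of this sketch.
    """
    if len(acts_with_cot) != len(acts_without_cot):
        raise ValueError("need one CoT and one non-CoT activation per example")
    gaps = [
        [a - b for a, b in zip(with_vec, without_vec)]
        for with_vec, without_vec in zip(acts_with_cot, acts_without_cot)
    ]
    # zip(*gaps) groups the gaps by hidden dimension.
    return [fmean(dim_values) for dim_values in zip(*gaps)]


def inject(activation, cot_vector):
    """Add the CoT vector to a layer-k activation (hypothetical helper)."""
    return [h + v for h, v in zip(activation, cot_vector)]


# Toy layer-k activations for two support examples (made-up numbers).
with_cot = [[1.0, 2.0, 3.0], [2.0, 2.0, 5.0]]
without_cot = [[0.5, 1.0, 3.0], [1.5, 1.0, 4.0]]

cot_vector = extract_cot_vector(with_cot, without_cot)
steered = inject([1.0, 1.0, 1.0], cot_vector)
```

In a real model the activations would come from a chosen transformer layer, and the addition would be repeated on every forward pass of auto-regressive generation.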

Paper Structure

This paper contains 34 sections, 10 equations, 7 figures, 15 tables.

Figures (7)

  • Figure 1: Overview of our approach. (a) A standard LLM may struggle to produce a correct reasoning chain for a complex problem. (b) Conventional fine-tuning adapts the model to such tasks by training on a support set, but requires updating model parameters, incurring high computational cost. (c) Our proposed CoT Vector leverages the support set to obtain a compact reasoning representation, which can be injected into the model's forward pass to guide reasoning efficiently.
  • Figure 2: Methods for acquiring and applying CoT Vectors. (a) An Extracted CoT Vector is obtained by recording the activation gap at the $k$-th layer between inputs with and without CoT. (b) A Learnable CoT Vector is inserted into the $k$-th layer activations of a student sequence without CoT, and trained by aligning the student’s final answer-token hidden states with those of a teacher sequence that includes CoT. (c) At test time, the CoT Vector is added to the activations at the $k$-th layer during each forward pass of auto-regressive generation, guiding the reasoning process.
  • Figure 3: (a) Layer-wise performance of two LLMs with both Extracted and Learnable CoT Vectors, averaged over four datasets. (b) Layer-wise information density curves of two LLMs, obtained via PCA on 500 sampled instances across four datasets. Abbreviations: PC = principal component; var. = variance; cum. = cumulative; dims = dimensions. (c–d) t-SNE visualizations of hidden states at shallow, middle, and deep layers of two LLMs on GSM8K (500 samples). Left: sample distributions under non-CoT and baseline (with CoT) inputs. Right: the same baseline with additional insertion of Extracted and Learnable CoT Vectors. The color scheme is consistent across (a, c, d): orange = non-CoT, blue = baseline, green = Extracted CoT Vector, red = Learnable CoT Vector.
  • Figure 4: t-SNE visualization of over-fit and under-fit Learnable CoT Vectors (Layer 30 of LLaMA on GSM8K).
  • Figure 5: Layer-wise performance analysis of two LLMs and four benchmarks.
  • ...and 2 more figures
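
The teacher-student training described in the Figure 2(b) caption can be illustrated with a toy sketch. It rests on strong simplifying assumptions that are not from the paper: the frozen layers above layer $k$ are replaced by a fixed linear map `W`, the alignment objective is a squared error between student and teacher hidden states, and the optimizer is plain gradient descent. All names (`matvec`, `train_cot_vector`) and numbers are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def transpose(W):
    return [list(col) for col in zip(*W)]


def train_cot_vector(W, student_acts, teacher_states, lr=0.1, steps=500):
    """Learn a vector v added to student layer-k activations.

    Minimizes the mean of 0.5 * ||W(h + v) - t||^2 over the support set,
    where W stands in for the frozen upper layers (an assumption), h is a
    student activation without CoT, and t is the teacher's final hidden state.
    Only v is updated; W stays fixed.
    """
    dim = len(student_acts[0])
    n = len(student_acts)
    Wt = transpose(W)
    v = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for h, t in zip(student_acts, teacher_states):
            out = matvec(W, [hi + vi for hi, vi in zip(h, v)])
            residual = [o - ti for o, ti in zip(out, t)]
            g = matvec(Wt, residual)  # gradient of the per-example loss w.r.t. v
            grad = [acc + gi / n for acc, gi in zip(grad, g)]
        v = [vi - lr * gi for vi, gi in zip(v, grad)]
    return v


# Toy setup: teacher states are generated with a known shift, so the learned
# vector should recover that shift.
W = [[1.0, 0.5], [0.0, 1.0]]
true_shift = [0.3, -0.2]
student = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]
teacher = [matvec(W, [h + s for h, s in zip(hs, true_shift)]) for hs in student]

learned = train_cot_vector(W, student, teacher)
```

The sketch keeps the one property the caption emphasizes: the model weights stay frozen and only the injected vector is trained, which is why the number of trainable parameters is so small.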