SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models
Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang
TL;DR
The paper addresses multilingual reasoning gaps in large language models by introducing Structured-of-Thought (SoT), a training-free prompting framework that aligns cross-language reasoning via a multi-step process. SoT comprises Language Thinking Transformation, Structured Knowledge Extraction, Language-Specific Knowledge Injection, and Answer Generation, formalized with $\mathcal{R}$, $\mathcal{K}$, $\mathcal{K}^{L_s}$, and $\mathcal{F}$ to guide reasoning and output in the source language. Empirical results on MGSM, MSVAMP, and XCOPA show SoT outperforms several training-free baselines and remains compatible with CoT and ICL, while incurring modest time and token overhead. These findings suggest SoT provides robust, scalable improvements for multilingual reasoning without requiring language-specific fine-tuning, enabling broader deployment across languages and backbones.
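The four SoT steps named above can be read as one structured prompt. The sketch below is only an illustration under stated assumptions: the instruction wording, the choice to pack all steps into a single prompt, and the names `SOT_STEPS`, `build_sot_prompt`, `answer_with_sot`, and the `llm` callable are hypothetical and are not taken from the authors' released code.

```python
from typing import Callable, List, Tuple

# Step names follow the summary above; the instruction text for each step
# is an assumed paraphrase, not the authors' actual prompt.
SOT_STEPS: List[Tuple[str, str]] = [
    ("Language Thinking Transformation",
     "Restate the question in the language you reason in most reliably."),
    ("Structured Knowledge Extraction",
     "List the entities, quantities, and relations in the question as structured items."),
    ("Language-Specific Knowledge Injection",
     "Note any expressions specific to {lang} that affect the meaning."),
    ("Answer Generation",
     "Reason over the structured items and give the final answer in {lang}."),
]


def build_sot_prompt(question: str, source_language: str) -> str:
    """Assemble one training-free prompt that walks through the four steps."""
    lines = [
        f"Question ({source_language}): {question}",
        "",
        "Follow these steps in order:",
    ]
    for index, (name, instruction) in enumerate(SOT_STEPS, start=1):
        lines.append(f"Step {index} ({name}): {instruction.format(lang=source_language)}")
    return "\n".join(lines)


def answer_with_sot(question: str, source_language: str,
                    llm: Callable[[str], str]) -> str:
    """Send the structured prompt to any text-in, text-out model callable."""
    return llm(build_sot_prompt(question, source_language))
```

Because `llm` is just a function from prompt string to completion string, the same sketch can wrap any backbone, which mirrors the training-free, backbone-agnostic framing of the method.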
Abstract
Recent developments have enabled Large Language Models (LLMs) to engage in complex reasoning tasks through deep thinking. However, this reasoning capacity has not transferred successfully to languages that are not high-resource, owing to resource constraints, so LLMs struggle with multilingual reasoning tasks. To this end, we propose Structured-of-Thought (SoT), a training-free method that improves performance on multilingual reasoning through a multi-step transformation: Language Thinking Transformation and Structured Knowledge Transformation. SoT converts language-specific semantic information into language-agnostic structured representations, enabling models to understand queries in different languages more thoroughly. In addition, SoT guides LLMs toward more focused reasoning, so that the underlying reasoning pathways stay consistent when the expression varies across languages. Experimental results demonstrate that SoT outperforms several strong baselines on multiple multilingual reasoning benchmarks across various LLM backbones. It can also be integrated with other training-free strategies for further improvements. Our code is available at https://github.com/Cherry-qwq/SoT.
