LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning
Shuguang Chen, Guang Lin
TL;DR
This work addresses two obstacles to mathematical reasoning in LLMs, data scarcity and error propagation, by introducing a paraphrase-based data augmentation pipeline and specialized multitask training objectives. The method paraphrases questions with GPT-4 and adds two auxiliary tasks, Rationale Re-Ranking (RR) and Mistake Identification (MI), within a multitask fine-tuning framework. The final objective is $\mathcal{L}_{final}(\theta)= \lambda_{1}\mathcal{L}_{SFT} + \lambda_{2}\mathcal{L}_{RR} + \lambda_{3}\mathcal{L}_{MI}$, where $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ weight the three loss terms. Experiments across four open-source models and four math-oriented datasets show consistent gains, with larger improvements for weaker models and notable benefits when paraphrasing is combined with RR and MI. The results highlight the value of linguistic diversification and structured reasoning guidance for mathematical problem solving, with practical implications for real-world tasks that require reliable mathematical reasoning. Future work may explore hybrid neural-symbolic approaches, such as integrating symbolic computation, to further reduce arithmetic errors and improve reliability in long reasoning chains.
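As a minimal sketch of how the weighted objective above combines its three terms, the snippet below computes $\mathcal{L}_{final}$ from scalar loss values. The function name `combine_losses`, the default weights, and the example numbers are assumptions made for illustration; the source does not state the values of $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$, and in actual training each term would be a differentiable per-batch loss rather than a plain float.

```python
def combine_losses(loss_sft, loss_rr, loss_mi, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum L_final = l1 * L_SFT + l2 * L_RR + l3 * L_MI."""
    # Hypothetical weights: the source does not give concrete lambda values.
    lam_sft, lam_rr, lam_mi = lambdas
    return lam_sft * loss_sft + lam_rr * loss_rr + lam_mi * loss_mi


# Illustrative per-batch loss values (made up for this example, not results).
final_loss = combine_losses(2.0, 1.0, 0.5, lambdas=(1.0, 0.5, 0.2))
print(final_loss)
```

Setting $\lambda_{2} = \lambda_{3} = 0$ in this sketch reduces the objective to the $\mathcal{L}_{SFT}$ term alone, which is one way to see RR and MI as auxiliary terms added on top of it.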
Abstract
Large Language Models (LLMs) have shown remarkable performance in various natural language processing tasks but struggle with mathematical reasoning, where solving complex problems requires both linguistic understanding and mathematical skill. Existing approaches to this challenge often rely on ensemble methods and suffer from data scarcity in target domains. In this work, we present a novel method to enhance LLMs' capabilities in mathematical reasoning tasks. Motivated by the need to bridge this gap, our approach incorporates a question paraphrase strategy, which diversifies the linguistic forms of mathematical questions to improve generalization. Additionally, specialized training objectives guide the model's learning process, with a focus on its understanding of mathematical concepts and reasoning processes. We conduct experiments on four datasets using different LLMs and demonstrate the effectiveness of our approach in improving LLMs' performance on mathematical reasoning tasks. Our findings underscore the significance of our methodology for the advancement of large language models and its potential implications for real-world applications that require mathematical reasoning abilities.
