PersonaMath: Boosting Mathematical Reasoning via Persona-Driven Data Augmentation
Jing Luo, Longze Chen, Run Luo, Liang Zhu, Chang Ao, Jiaming Li, Yukun Chen, Xin Cheng, Wen Yang, Jiayuan Su, Ahmadreza Argha, Hamid Alinejad-Rokny, Chengming Li, Shiwen Ni, Min Yang
TL;DR
The paper tackles the gap between open-source and closed-source models in mathematical reasoning by introducing PersonaMathQA, a dataset built from MATH and GSM8K through persona-driven data augmentation. It employs a two-stage pipeline: Stage 1 uses a closed-source LLM to generate detailed CoT solutions and rewrites questions across 11 ISCO-08 occupation-based personas to diversify the data; Stage 2 applies reflection to incorrectly answered items to regenerate corrected CoT solutions, placing more emphasis on hard problems. Fine-tuning open-source models on PersonaMathQA yields state-of-the-art results on MATH and GSM8K (e.g., PersonaMath-7B reaches 61.2% and 87.8%, respectively), even though the dataset is smaller than some baselines. The approach demonstrates data efficiency, introduces occupation-based persona classification for data diversity, and publicly releases the dataset, models, and code.
Abstract
While closed-source Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities, open-source models still face challenges with such tasks. To bridge this gap, we propose a data augmentation approach and introduce PersonaMathQA, a dataset derived from MATH and GSM8K, on which we train the PersonaMath models. Our approach consists of two stages: the first stage focuses on learning from Persona Diversification, and the second stage emphasizes learning from Reflection. In the first stage, we regenerate detailed chain-of-thought (CoT) solutions as instructions using a closed-source LLM and introduce a persona-driven data augmentation technique. This technique classifies personas based on occupations, significantly enhancing the dataset's diversity and quality. In the second stage, we incorporate reflection to fully leverage more challenging and valuable questions. Evaluation of our PersonaMath models on MATH and GSM8K shows that the PersonaMath-7B model (based on Qwen2.5-7B) achieves an accuracy of 61.2% on MATH and 87.8% on GSM8K, surpassing all baseline methods and achieving state-of-the-art performance. Notably, our dataset contains only 128.9K data points (merely 32.6% of MetaMathQA and 49.5% of MathInstruct), yet our model outperforms these baselines, demonstrating the high quality and diversity of our dataset, which enables more efficient model training. We open-source the PersonaMathQA dataset, PersonaMath models, and our code for public use.
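The two stages described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the released implementation: the function names, prompt wording, the `llm` callable (any function mapping a prompt to a completion), and the `is_correct` checker are all hypothetical, and the actual prompts and filtering rules may differ.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]
Sample = Dict[str, str]


def stage1_persona_diversification(question: str, personas: List[str], llm: LLM) -> List[Sample]:
    """Build CoT samples for the original question plus one rewritten variant per persona."""
    # Assumed prompt wording; the real rewriting prompt is not given in the abstract.
    variants = [question] + [
        llm(f"Rewrite this problem as posed by a {persona}: {question}")
        for persona in personas
    ]
    return [
        {"question": q, "solution": llm(f"Solve step by step: {q}")}
        for q in variants
    ]


def stage2_reflection(samples: List[Sample], is_correct: Callable[[Sample], bool], llm: LLM) -> List[Sample]:
    """Keep correct samples; for incorrect ones, request a reflection and a corrected solution."""
    result: List[Sample] = []
    for sample in samples:
        if is_correct(sample):
            result.append(sample)
            continue
        # Assumed reflection prompt: show the failed attempt and ask for a new solution.
        corrected = llm(
            "The solution below is wrong. Reflect on the error and solve again.\n"
            f"Question: {sample['question']}\nSolution: {sample['solution']}"
        )
        result.append({"question": sample["question"], "solution": corrected})
    return result
```

In this sketch, `is_correct` would compare the final answer in a generated solution against the reference answer from MATH or GSM8K; how corrected samples are weighted toward harder problems is left out because the abstract does not specify it.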
