CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Joshua Ong Jun Leang; Aryo Pradipta Gema; Shay B. Cohen

TL;DR

CoMAT tackles the challenge of mathematical reasoning in large language models by introducing a two-stage, within-LLM symbolic reasoning framework: Symbolic Conversion to create a formal representation and Reasoning Execution to derive solutions. It eliminates external solvers, improving faithfulness and verifiability while delivering strong gains across diverse benchmarks and languages, including Olympiad-level problems and low-resource contexts. Ablation and Shapley-value analyses show that all four symbolic steps contribute, with the initial conversion step being particularly crucial for accuracy. Overall, CoMAT demonstrates robust, transparent reasoning with improved performance and error traceability across complex mathematical tasks.

Abstract

Mathematical reasoning remains a significant challenge for large language models (LLMs), despite progress in prompting techniques such as Chain-of-Thought (CoT). We present Chain of Mathematically Annotated Thought (CoMAT), which enhances reasoning through two stages: Symbolic Conversion (converting natural language queries into symbolic form) and Reasoning Execution (deriving answers from symbolic representations). CoMAT operates entirely with a single LLM and without external solvers. Across four LLMs, CoMAT outperforms traditional CoT on six out of seven benchmarks, achieving gains of 4.48% on MMLU-Redux (MATH) and 4.58% on GaoKao MCQ. In addition to improved performance, CoMAT ensures faithfulness and verifiability, offering a transparent reasoning process for complex mathematical tasks.
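The two stages named above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the function name `comat_style_answer`, and the choice to issue one model call per stage are all assumptions made for illustration, and `llm` stands for any callable that maps a prompt string to a completion string.

```python
from typing import Callable, Tuple

# Hypothetical prompt templates; the paper's actual prompts are not given here.
SYMBOLIC_CONVERSION_PROMPT = (
    "Rewrite the following problem in symbolic form "
    "(variables, constraints, and the quantity to find). Do not solve it.\n\n"
    "Problem: {question}"
)
REASONING_EXECUTION_PROMPT = (
    "Using only the symbolic representation below, reason step by step "
    "and state the final answer.\n\n"
    "Symbolic representation:\n{symbolic}"
)


def comat_style_answer(question: str, llm: Callable[[str], str]) -> Tuple[str, str]:
    """Run both stages with the same model and no external solver."""
    # Stage 1: Symbolic Conversion (natural language -> symbolic form).
    symbolic = llm(SYMBOLIC_CONVERSION_PROMPT.format(question=question))
    # Stage 2: Reasoning Execution (symbolic form -> answer).
    answer = llm(REASONING_EXECUTION_PROMPT.format(symbolic=symbolic))
    # Returning the intermediate form keeps the reasoning inspectable.
    return symbolic, answer
```

Returning the symbolic form alongside the answer is a design choice in this sketch: it lets a reader check whether an error arose in the conversion or in the reasoning over it.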
