FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning

Xiao Li; Bolin Zhu; Kaiwen Shi; Sichen Liu; Yin Zhu; Yiwei Liu; Gong Cheng

TL;DR

FormulaReasoning introduces a bilingual dataset for formula-based numerical reasoning in which calculations must be grounded in explicit physics formulas, such as $Q_{\text{absorbed}} = m c \Delta T$. Each question is annotated with normalized formulas, parameter names, symbols, units, and explanations, and a consolidated formula database serves as external knowledge. The study benchmarks a broad range of approaches, including large language models (LLMs) with chain-of-thought (CoT) prompts, retrieval-augmented methods, supervised fine-tuning, and Direct Preference Optimization, revealing substantial performance gaps for smaller models and the value of external formula knowledge. The dataset provides baselines and a resource for future work on domain-guided, multi-step reasoning, with public releases on HuggingFace and GitHub. These results highlight the importance of explicit formula knowledge for robust numerical reasoning in real-world tasks.

Abstract

The application of formulas (e.g., physics formulas) is a fundamental human ability in solving numerical reasoning problems. Existing numerical reasoning datasets rarely state the formulas they employ explicitly, as their questions often rely on implicit commonsense mathematical knowledge. To address this gap, we introduce FormulaReasoning, a new dataset specifically designed for formula-based numerical reasoning. It consists of 5,324 questions that require numerical calculations grounded in external physics formulas. We provide normalized, fine-grained annotations in both English and Chinese, including formula structures, parameter names, symbols, numerical values, and units, curated through extensive manual effort with LLM-assisted validation to ensure high quality. Additionally, we offer a consolidated formula database to serve as an external knowledge source. We analyze various reasoning approaches on FormulaReasoning, with emphasis on comparative evaluation of different architectural and methodological frameworks. Our assessment includes retrieval-augmented methods, approaches that decompose reasoning into formula generation, parameter extraction, and numerical calculation, as well as optimization techniques using preference data. We identify key challenges in formula-based numerical reasoning that require further investigation across different reasoning paradigms, highlighting opportunities for methodological advancement.
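The decomposition into formula generation, parameter extraction, and numerical calculation can be illustrated with a minimal sketch. It is not the authors' implementation: the `Parameter` record, the `FORMULA_DB` layout, the `calculate` helper, and the example values are all hypothetical and only mirror the kinds of fields the abstract lists (parameter name, symbol, numerical value, unit), applied to the formula $Q_{absorbed} = m c \Delta T$.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    """One annotated quantity: name, symbol, numerical value, unit (hypothetical schema)."""
    name: str
    symbol: str
    value: float
    unit: str


# Hypothetical formula database: result symbol -> (input symbols, function, result unit).
# Stage 1 (formula generation or retrieval) would select an entry from such a store.
FORMULA_DB = {
    "Q_absorbed": (("m", "c", "delta_T"), lambda m, c, delta_T: m * c * delta_T, "J"),
}


def calculate(target: str, params: list[Parameter]) -> Parameter:
    """Stage 3: plug the extracted parameters into the selected formula."""
    symbols, fn, unit = FORMULA_DB[target]
    by_symbol = {p.symbol: p for p in params}
    args = [by_symbol[s].value for s in symbols]
    return Parameter(name=target, symbol=target, value=fn(*args), unit=unit)


# Stage 2 (parameter extraction) would produce records like these from the question text.
# The values below are made up for illustration and are not taken from the dataset.
params = [
    Parameter("mass of water", "m", 2.0, "kg"),
    Parameter("specific heat capacity of water", "c", 4200.0, "J/(kg·°C)"),
    Parameter("temperature rise", "delta_T", 50.0, "°C"),
]

result = calculate("Q_absorbed", params)
print(f"{result.symbol} = {result.value} {result.unit}")
```

Keeping the formula lookup separate from the arithmetic means an error can be attributed to one stage: a wrong formula, a wrongly extracted value or unit, or a calculation slip.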

Paper Structure (54 sections, 2 equations, 10 figures, 10 tables)

Introduction
Limitations of existing datasets
Our work
Related Work
Numerical Reasoning Datasets
Numerical Reasoning Methods
Dataset Construction
Preprocessing
Formula Normalization
Coarse-grained annotation
Fine-grained annotation
Formula Database Construction
Symbolic rules based merging
Semantics based merging
Manual review and error correction
...and 39 more sections

Figures (10)

Figure 1: An example from FormulaReasoning. Numerical values with units given in the question and obtained from intermediate steps are highlighted in red and purple, respectively. Formulas and their elements are in blue.
Figure 2: Prompt for explanation normalization.
Figure 3: Prompt for parameter extraction.
Figure 4: Prompt for correcting calculation errors.
Figure 5: An example of removed question.
...and 5 more figures
