Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception

Yuncheng Huang, Qianyu He, Jiaqing Liang, Sihang Jiang, Yanghua Xiao, Yunwen Chen

TL;DR

This work addresses a weakness in LLM quantitative reasoning that stems from missing dimensional knowledge. It introduces a dimension-aware framework built around DimUnitKB, a unit-linking system (DimKS), and the DimEval benchmark. The framework formalizes dimensional concepts, builds a bilingual dimensional unit knowledge base, and links textual mentions to dimensional units, which enables dimension-aware reasoning. After finetuning on DimEval and applying quantity-oriented data augmentation for Q-MWP, the approach reaches 50.67% accuracy on Q-Ape210k, compared with 43.55% for GPT-4, and the gains hold for smaller models as well. The results indicate that embedding dimensional understanding into LLMs has practical value, with potential applications in mathematics, science, engineering, and finance, where units and dimensions are central.

Abstract

Quantities are distinct and critical components of texts that characterize the magnitude properties of entities, providing a precise perspective for the understanding of natural language, especially for reasoning tasks. In recent years, there has been a flurry of research on reasoning tasks based on large language models (LLMs), most of which focuses solely on numerical values, neglecting the dimensional concept of quantities with units despite its importance. We argue that the concept of dimension is essential for precisely understanding quantities and of great significance for LLMs to perform quantitative reasoning. However, the lack of dimension knowledge and quantity-related benchmarks has resulted in low performance of LLMs. Hence, we present a framework to enhance the quantitative reasoning ability of language models based on dimension perception. We first construct a dimensional unit knowledge base (DimUnitKB) to address the knowledge gap in this area. We propose a benchmark, DimEval, consisting of seven tasks in three categories to probe and enhance the dimension perception skills of LLMs. To evaluate the effectiveness of our methods, we propose a quantitative reasoning task and conduct experiments. The experimental results show that our dimension perception method reaches 50.67% accuracy on quantitative reasoning tasks, compared with 43.55% for GPT-4.

Paper Structure (53 sections, 6 equations, 7 figures, 9 tables, 2 algorithms)

Figures (7)

  • Figure 1: An example of quantitative reasoning through dimension perception. ChatGPT (October 31, 2023) failed to identify the incorrect units in the question due to a lack of understanding of dimensional concepts, leading to erroneous inferences. Our approach derives accurate quantities through dimensional knowledge. Dimension is strictly defined in the preliminary section of the paper.
  • Figure 2: The framework for enhancing quantitative reasoning skills through dimension perception. (a) Step 1 involves the construction of a dimensional unit knowledge base (DimUnitKB). (b) Step 2 proposes and develops seven pretraining tasks of three categories for dimension perception, with corresponding datasets assembled through semi-automated and bootstrapping retrieval methods. (c) Step 3 improves performance by utilizing quantity-oriented data augmentation for tasks under quantitative reasoning.
  • Figure 3: Popular units sorted by frequency feature in DimUnitKB.
  • Figure 4: Top fourteen quantity kinds and their corresponding top five units, with the numerical values denoting their frequency feature in DimUnitKB.
  • Figure 5: Illustrative examples of DimEval.
  • ...and 2 more figures

Theorems & Definitions (8)

  • Definition 1: Unit Linking
  • Definition 2: Quantity Extraction
  • Definition 3: QuantityKind Match
  • Definition 4: Comparable Analysis
  • Definition 5: Dimension Prediction
  • Definition 6: Dimension Arithmetic Task
  • Definition 7: Magnitude Comparison
  • Definition 8: Unit Conversion
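
The definitions above are listed by name only, so the following is a rough sketch of what four of them (Comparable Analysis, Dimension Arithmetic, Unit Conversion, and Magnitude Comparison) could look like in code. It assumes the standard convention of representing a dimension as a vector of exponents over the seven SI base quantities. The `Unit` class, the tiny `UNITS` table, and all function names are hypothetical illustrations; they are not taken from DimUnitKB or DimEval.

```python
from dataclasses import dataclass
from fractions import Fraction

# Assumed order of base quantities in every dimension vector:
# length, mass, time, electric current, temperature, amount, luminous intensity
BASE = ("L", "M", "T", "I", "Θ", "N", "J")


@dataclass(frozen=True)
class Unit:
    symbol: str
    dim: tuple        # exponents over BASE, e.g. speed is (1, 0, -1, 0, 0, 0, 0)
    factor: Fraction  # multiplier that converts this unit to the coherent SI unit


# Hypothetical miniature unit table (a real knowledge base would be far larger).
UNITS = {
    "m":  Unit("m",  (1, 0, 0, 0, 0, 0, 0), Fraction(1)),
    "km": Unit("km", (1, 0, 0, 0, 0, 0, 0), Fraction(1000)),
    "s":  Unit("s",  (0, 0, 1, 0, 0, 0, 0), Fraction(1)),
    "h":  Unit("h",  (0, 0, 1, 0, 0, 0, 0), Fraction(3600)),
    "kg": Unit("kg", (0, 1, 0, 0, 0, 0, 0), Fraction(1)),
}


def dim_mul(a, b):
    """Dimension arithmetic: multiplying quantities adds their exponents."""
    return tuple(x + y for x, y in zip(a, b))


def dim_div(a, b):
    """Dimension arithmetic: dividing quantities subtracts their exponents."""
    return tuple(x - y for x, y in zip(a, b))


def comparable(u, v):
    """Comparable analysis: two units are comparable iff their dimensions match."""
    return UNITS[u].dim == UNITS[v].dim


def convert(value, src, dst):
    """Unit conversion between comparable units, done exactly with fractions."""
    if not comparable(src, dst):
        raise ValueError(f"cannot convert {src} to {dst}: dimensions differ")
    return Fraction(value) * UNITS[src].factor / UNITS[dst].factor


def compare(v1, u1, v2, u2):
    """Magnitude comparison: returns -1, 0, or 1 after converting to a common unit."""
    a, b = convert(v1, u1, u2), Fraction(v2)
    return (a > b) - (a < b)
```

For instance, `convert(2, "km", "m")` yields 2000, `compare(1, "h", 3000, "s")` reports that one hour is the larger quantity, and `convert(5, "kg", "m")` raises a `ValueError`, which is the kind of dimensional mismatch Figure 1 describes a model failing to notice. Exact fractions are used here so that conversions do not accumulate floating-point error.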