Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks

Yuncheng Huang; Qianyu He; Yipei Xu; Jiaqing Liang; Yanghua Xiao

TL;DR

The paper investigates whether atomic skills learned in isolation generalize to complex reasoning tasks, using math word problems to study arithmetic and unit conversion. It introduces a probing framework and a two-stage hierarchical curriculum learning strategy (skill training followed by applied learning) to induce skill generalization. The findings show that atomic skills do not spontaneously generalize, but that hierarchical curriculum learning promotes generalization and transfers across datasets and domains; training on complex tasks also enhances atomic skills. The work offers actionable training strategies for improving complex reasoning in language models, and notes limitations and future directions: automating data generation and expanding the skill set.

Abstract

Current language models have demonstrated basic reasoning capabilities, but struggle with more complicated reasoning tasks that require a combination of atomic skills, such as math word problems, which require skills like arithmetic and unit conversion. Previous methods either do not improve the inherent atomic skills of models or do not attempt to generalize the atomic skills to complex reasoning tasks. In this paper, we first propose a probing framework to investigate whether atomic skills can spontaneously generalize to complex reasoning tasks. Then, we introduce a hierarchical curriculum learning training strategy to achieve better skill generalization. In our experiments, we find that atomic skills cannot spontaneously generalize to compositional tasks. By leveraging hierarchical curriculum learning, we successfully induce generalization, significantly improving the performance of open-source LMs on complex reasoning tasks. Promisingly, skill generalization proves effective in cross-dataset and cross-domain scenarios. Complex reasoning can also help enhance atomic skills. Our findings offer valuable guidance for designing better training strategies for complex reasoning tasks.
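
To make the notion of "atomic skills" concrete, the following minimal sketch shows how a single math word problem can be decomposed into a unit conversion step and an arithmetic step. It is an illustration only, not the paper's probing framework or training code; the function names, the conversion table, and the example problem are all hypothetical.

```python
from fractions import Fraction

# Hypothetical conversion table (minutes per unit); not taken from the paper.
MINUTES_PER_UNIT = {"minute": Fraction(1), "hour": Fraction(60)}


def convert_time(value, from_unit, to_unit):
    """Atomic skill 1 (unit conversion): convert between time units."""
    return Fraction(value) * MINUTES_PER_UNIT[from_unit] / MINUTES_PER_UNIT[to_unit]


def multiply(a, b):
    """Atomic skill 2 (arithmetic): exact multiplication."""
    return Fraction(a) * Fraction(b)


def distance_travelled(speed_km_per_hour, duration_minutes):
    """Complex task: a word problem that needs both atomic skills in sequence.

    Example problem (made up for illustration): "A car drives at 90 km/h
    for 30 minutes. How many kilometres does it cover?"
    """
    hours = convert_time(duration_minutes, "minute", "hour")  # 30 min -> 1/2 h
    return multiply(speed_km_per_hour, hours)                 # 90 * 1/2


if __name__ == "__main__":
    print(distance_travelled(90, 30))
```

An error in either step yields a wrong final answer, which is why a model's performance on the complex task depends on whether it can actually apply each atomic skill in context.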

Paper Structure (50 sections, 8 figures, 10 tables, 4 algorithms)

This paper contains 50 sections, 8 figures, 10 tables, 4 algorithms.

Introduction
Related Work
Task Generalization
Compositional Generalization
Atomic Skill Learning
Curriculum Learning
Method
Skill Generalization Probing
Task Selection
Probing Skills
Arithmetic Skill.
Unit Conversion Skill.
Skill Training (ST)
How to determine whether skill generalization has been achieved?
Hierarchical Curriculum Learning (HCL)
...and 35 more sections

Figures (8)

Figure 1: An example of LMs' deficiencies on atomic skills when solving complex reasoning tasks. While these atomic skills can be improved through skill training, it remains uncertain whether language models can apply enhanced skills to complex tasks.
Figure 2: Framework of our method. The right part is our probing approach. The left part describes the model training stages in hierarchical curriculum learning.
Figure 3: The distribution of data difficulty across four dimensions. Darker colors mean greater difficulty and larger areas mean more data.
Figure 4: Accuracy (%) of atomic skills on MWP for LLaMA-2. The left panel shows the results on RAW and the right panel shows the results on HARD.
Figure 5: Error analysis on Vanilla model (left) and HCL model (right).
...and 3 more figures
