Critical Thinking for Language Models
Gregor Betz, Christian Voigt, Kyle Richardson
TL;DR
Neural language models struggle with reasoning tasks. The paper responds by proposing a 'critical thinking curriculum' built on a synthetic corpus of deductively valid arguments. It shows that intermediary pre-training on three simple core argument schemes transfers to different and more complex argument types, and that it improves zero-shot performance on the GLUE diagnostics and SNLI. The gains are robust for these NLU benchmarks but do not extend to all reasoning tasks (e.g., ARC, LogiQA), which marks the limits of the current corpus and the need for broader curricula. Overall, the work offers a promising foundation for using synthetic, well-structured argumentative texts to seed reasoning abilities in language models, and it outlines concrete directions for expanding the curriculum.
Abstract
This paper takes a first step towards a critical thinking curriculum for neural auto-regressive language models. We introduce a synthetic corpus of deductively valid arguments, and generate artificial argumentative texts to train and evaluate GPT-2. Significant transfer learning effects can be observed: training a model on three simple core schemes allows it to accurately complete conclusions of different and more complex types of arguments as well. The language models generalize the core argument schemes in a correct way. Moreover, we obtain consistent and promising results for NLU benchmarks. In particular, pre-training on the argument schemes raises zero-shot accuracy on the GLUE diagnostics by up to 15 percentage points. The findings suggest that intermediary pre-training on texts that exemplify basic reasoning abilities (such as those typically covered in critical thinking textbooks) might help language models to acquire a broad range of reasoning skills. The synthetic argumentative texts presented in this paper are a promising starting point for building such a "critical thinking curriculum for language models".
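To make the idea of a synthetic corpus of deductively valid arguments more concrete, the following is a minimal sketch of how an argument scheme could be instantiated as text and turned into a conclusion-completion prompt. It is an illustration under stated assumptions, not the authors' generation pipeline: the scheme shown, the template wording, the vocabulary, and the names `Scheme`, `instantiate`, `render`, and `sample_bindings` are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Scheme:
    """A deductively valid argument form with placeholders (hypothetical format)."""
    name: str
    premises: Tuple[str, ...]
    conclusion: str


# Illustrative scheme only; the paper's actual schemes and templates may differ.
MODUS_PONENS = Scheme(
    name="generalized modus ponens",
    premises=("Every {F} is a {G}.", "{a} is a {F}."),
    conclusion="{a} is a {G}.",
)


def instantiate(scheme: Scheme, bindings: Dict[str, str]) -> Tuple[List[str], str]:
    """Fill the placeholders of a scheme with concrete names and predicates."""
    premises = [p.format(**bindings) for p in scheme.premises]
    conclusion = scheme.conclusion.format(**bindings)
    return premises, conclusion


def render(premises: List[str], conclusion: str, with_conclusion: bool = True) -> str:
    """Render an argument as text; omit the conclusion to get a completion prompt."""
    text = " ".join(premises) + " Therefore,"
    if with_conclusion:
        text += " " + conclusion
    return text


def sample_bindings(rng: random.Random) -> Dict[str, str]:
    """Draw a random name and two distinct predicates (toy vocabulary, assumed)."""
    names = ["Susan", "Omar", "Lena"]
    predicates = ["sprinter", "runner", "swimmer", "cyclist"]
    f, g = rng.sample(predicates, 2)
    return {"a": rng.choice(names), "F": f, "G": g}


if __name__ == "__main__":
    fixed = {"a": "Susan", "F": "sprinter", "G": "runner"}
    premises, conclusion = instantiate(MODUS_PONENS, fixed)
    # Training text: premises followed by the valid conclusion.
    print(render(premises, conclusion))
    # Evaluation prompt: the model is asked to complete the conclusion.
    print(render(premises, conclusion, with_conclusion=False))
```

With the fixed bindings above, the training text reads "Every sprinter is a runner. Susan is a sprinter. Therefore, Susan is a runner." and the evaluation prompt stops after "Therefore,". Keeping the scheme separate from its bindings is what would let a corpus contain many surface-different texts for one argument form, and what would let held-out schemes be used to test transfer.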
