PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales

Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren

TL;DR

The paper aims to make language-model reasoning more transparent and reliable by separating rationalization from reasoning. It introduces PINTO, a two-stage pipeline: a frozen, prompted rationalizing LM generates choice-specific rationales, and a smaller reasoning LM is fine-tuned to use these rationales under a counterfactual regularization loss that discourages spurious shortcuts. Across four commonsense reasoning (CSR) benchmarks, PINTO improves generalization on both in-distribution and out-of-distribution test sets and yields rationales that are more faithful to its predictions, with data-efficiency benefits in low-resource settings. The work suggests that training a reasoning LM to become less confident when its rationale is perturbed can yield robust performance while reducing rationale annotation and computation costs.

Abstract

Neural language models (LMs) have achieved impressive results on various language-based reasoning tasks by utilizing latent knowledge encoded in their own pretrained parameters. To make this reasoning process more explicit, recent works retrieve a rationalizing LM's internal knowledge by training or prompting it to generate free-text rationales, which can be used to guide task predictions made by either the same LM or a separate reasoning LM. However, rationalizing LMs require expensive rationale annotation and/or computation, without any assurance that their generated rationales improve LM task performance or faithfully reflect LM decision-making. In this paper, we propose PINTO, an LM pipeline that rationalizes via prompt-based learning, and learns to faithfully reason over rationales via counterfactual regularization. First, PINTO maps out a suitable reasoning process for the task input by prompting a frozen rationalizing LM to generate a free-text rationale. Second, PINTO's reasoning LM is fine-tuned to solve the task using the generated rationale as context, while regularized to output less confident predictions when the rationale is perturbed. Across four datasets, we show that PINTO significantly improves the generalization ability of the reasoning LM, yielding higher performance on both in-distribution and out-of-distribution test sets. Also, we find that PINTO's rationales are more faithful to its task predictions than those generated by competitive baselines.

Paper Structure (19 sections, 3 equations, 4 figures, 15 tables)

Figures (4)

  • Figure 1: Rationale-Based Language Reasoning. (a) Examples of reasoning tasks that require implicit knowledge beyond task inputs. (b) Comparison of existing paradigms for providing free-text rationales along with predictions.
  • Figure 2: Overview of PINTO. (1) A frozen medium-scale LM is prompted to generate choice-specific rationales. (2) A small-scale LM is fine-tuned to reason over the generated rationales. (3) We introduce counterfactual regularization in addition to standard training loss to ensure the rationales are leveraged properly. During inference, the rationalizing LM is prompted with a new question to generate rationales, which are provided to the reasoning module to make a prediction.
  • Figure 3: Standard Training vs. Counterfactual Training. For counterfactual regularization, we train the reasoning module with noisy labels when the rationale tokens are either masked or replaced.
  • Figure 4: Low-Resource Learning. Performance (accuracy) of different fine-tuned models in low-resource settings on CSQA.
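
The counterfactual training described in Figures 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the "noisy labels" are label-smoothed targets, and the names `perturb_rationale`, `smoothed`, `pinto_style_loss`, the `<mask>` token, and the `rate`, `eps`, and `weight` parameters are all hypothetical. Logits are passed in directly, standing in for the outputs of a reasoning LM.

```python
import math
import random

MASK = "<mask>"  # hypothetical mask token, chosen for illustration


def perturb_rationale(tokens, vocab, rate=0.3, mode="mask", rng=None):
    """Mask or replace a random subset of rationale tokens."""
    rng = rng or random.Random(0)
    out = []
    for tok in tokens:
        if rng.random() < rate:
            out.append(MASK if mode == "mask" else rng.choice(vocab))
        else:
            out.append(tok)
    return out


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def cross_entropy(logits, target_dist):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs) if t > 0)


def one_hot(label, n):
    return [1.0 if i == label else 0.0 for i in range(n)]


def smoothed(label, n, eps):
    """Noisy target (assumed form): mix the gold label with a uniform distribution."""
    return [(1.0 - eps) * g + eps / n for g in one_hot(label, n)]


def pinto_style_loss(logits_clean, logits_perturbed, label, eps=0.9, weight=1.0):
    """Standard loss on the clean rationale plus a counterfactual term
    that rewards less confident predictions on the perturbed rationale."""
    n = len(logits_clean)
    standard = cross_entropy(logits_clean, one_hot(label, n))
    counterfactual = cross_entropy(logits_perturbed, smoothed(label, n, eps))
    return standard + weight * counterfactual
```

In a training loop under these assumptions, each example would be scored twice, once with the generated rationale and once with `perturb_rationale` applied to it, and the two sets of choice logits would be passed to `pinto_style_loss`. With `eps` close to 1 the counterfactual target is nearly uniform, so the term is smallest when the model does not commit to any choice once the rationale has been corrupted.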