ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

Valentina Pyatkin; Jena D. Hwang; Vivek Srikumar; Ximing Lu; Liwei Jiang; Yejin Choi; Chandra Bhagavatula

TL;DR

ClarifyDelphi is a reinforcement-learning framework that generates clarification questions to surface defeasible contexts in social and moral scenarios. It simulates weakening and strengthening answers to a candidate question and rewards questions whose answers make Delphi's judgments diverge, so the questions it asks uncover consequential context. The work contributes a large δ-Clarify dataset, reports gains over baselines in human evaluations of relevance, informativeness, and defeasibility, and provides interactive tools for context-rich moral reasoning. It advances computational approaches to contextual moral judgment and offers resources for cognitive and AI research on defeasible inference.

Abstract

Context is everything, even in commonsense moral reasoning. Changing contexts can flip the moral judgment of an action; "Lying to a friend" is wrong in general, but may be morally acceptable if it is intended to protect their life. We present ClarifyDelphi, an interactive system that learns to ask clarification questions (e.g., why did you lie to your friend?) in order to elicit additional salient contexts of a social or moral situation. We posit that questions whose potential answers lead to diverging moral judgments are the most informative. Thus, we propose a reinforcement learning framework with a defeasibility reward that aims to maximize the divergence between moral judgments of hypothetical answers to a question. Human evaluation demonstrates that our system generates more relevant, informative and defeasible questions compared to competitive baselines. Our work is ultimately inspired by studies in cognitive science that have investigated the flexibility in moral cognition (i.e., the diverse contexts in which moral rules can be bent), and we hope that research in this direction can assist both cognitive and computational investigations of moral judgments.
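The defeasibility reward described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the judgment model returns a probability distribution over three classes (bad, neutral, good) for each hypothetical answer; the function names and the example probabilities below are hypothetical and are not taken from the paper. The divergence used is the Jensen-Shannon divergence, which the paper's outline lists as a component of the reward; the sentence fusion and reward normalization steps are omitted here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2, range [0, 1])."""
    # The mixture m is strictly positive wherever p or q is, so KL(. || m) is always finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def defeasibility_reward(judgment_weakener, judgment_strengthener):
    """Hypothetical reward: how far apart the judgments of two simulated answers are.

    Each argument is an assumed distribution over (bad, neutral, good) for the
    situation combined with a weakening or a strengthening answer.
    """
    return js_divergence(judgment_weakener, judgment_strengthener)


# Illustrative, made-up distributions for "Lying to a friend" plus two answers.
to_protect_their_life = [0.10, 0.10, 0.80]  # answer that weakens the default "wrong" judgment
to_steal_from_them = [0.90, 0.05, 0.05]     # answer that strengthens it
reward = defeasibility_reward(to_protect_their_life, to_steal_from_them)
```

Under this sketch, a question such as "Why did you lie to your friend?" earns a high reward when its simulated answers pull the judgment in opposite directions, and a reward near zero when every answer leaves the judgment unchanged.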

Paper Structure (50 sections, 9 equations, 12 figures, 6 tables, 1 algorithm)

Introduction
Problem Setup
ClarifyDelphi: A Reinforced Clarification Question Generator
Supervised Question Generation
Defeasible Answer Simulation
Reward
Sentence Fusion
Delphi for Feedback
JS-Divergence
Reward normalization
Proximal Policy Optimization (PPO)
$\delta$-Clarify: a Dataset of Clarification Questions
$\delta$-Clarify$_{gold}$:
$\delta$-Clarify$_{silver}$:
Dataset Analysis
...and 35 more sections

Figures (12)

Figure 1: The ClarifyDelphi question generation approach is trained via reinforcement learning. The reward simulates a set of possible (defeasible) answers to the questions and, using Delphi for feedback, optimizes for questions leading to maximally diverging answers.
Figure 2: Interaction between a user and ClarifyDelphi. The user inputs a situation and ClarifyDelphi answers with an initial judgment (obtained from Delphi) and a clarification question, which the user then answers.
Figure 3: Proportional distribution (%) of the most frequent question starts in $\delta$-Clarify$_{gold}$, $\delta$-Clarify$_{silver}$ and the subset of defeasible questions of $\delta$-Clarify$_{silver}$.
Figure 4: Number of questions (out of 500) in the test set that received an informativeness and relevance rating of $>0$.
Figure 5: Performance of the PPO algorithm with different policies: a policy pre-trained on SQuAD and policies pre-trained on different subsets of the $\delta$-Clarify dataset. The scores (higher is better) are averaged every 1000 steps, between 1000 and 6000.
...and 7 more figures
