Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach

Masahiro Kaneko, Graham Neubig, Naoaki Okazaki

TL;DR

This paper introduces a bidirectional human–system discussion framework for solving NLP problems, demonstrated on natural language inference (NLI). It constructs a dataset of human–human discussions on SNLI-derived problems and uses it to build and evaluate prompts that let systems discuss, argue, and refine predictions with humans. Few-shot discussion prompts significantly improve both discussion quality and final accuracy, especially on challenging ANLI data. The paper also shows that pseudo-discussion data can substitute for manually written discussions at lower cost, and that human input can both improve and destabilize model behavior, underscoring the need for safeguards. Overall, the work shows that interactive dialogue between humans and systems can improve performance, reliability, and transparency in NLP tasks, a step toward more collaborative AI systems.

Abstract

Humans work together to solve common problems by having discussions, explaining, and agreeing or disagreeing with each other. Similarly, if a system can have discussions with humans when solving tasks, its performance and reliability can improve. In previous research on explainability, the system could only make predictions and humans could only ask questions about them; there was no mutual exchange of opinions. This research aims to create a dataset and computational framework for systems that discuss and refine their predictions through dialogue. Through experiments, we show that the proposed system can have beneficial discussions with humans, improving accuracy by up to 25 points on the natural language inference task.

Paper Structure (13 sections, 2 figures, 9 tables)