Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others in Conversation Forecasting

Anthony Sicilia; Malihe Alikhani

TL;DR

This work proposes a new suite of tasks that challenge language models (LMs) to model the uncertainty of others in dialogue. The tasks are designed around conversation forecasting, in which an agent forecasts an unobserved outcome of a conversation.

Abstract

Typically, when evaluating Theory of Mind, we consider the beliefs of others to be binary: held or not held. But what if someone is unsure about their own beliefs? How can we quantify this uncertainty? We propose a new suite of tasks, challenging language models (LMs) to model the uncertainty of others in dialogue. We design these tasks around conversation forecasting, wherein an agent forecasts an unobserved outcome to a conversation. Uniquely, we view interlocutors themselves as forecasters, asking an LM to predict the uncertainty of the interlocutors (a probability). We experiment with re-scaling methods, variance reduction strategies, and demographic context for this regression task, conducting experiments on three dialogue corpora (social, negotiation, task-oriented) with eight LMs. While LMs can explain up to 7% of the variance in the uncertainty of others, we highlight the difficulty of the tasks and room for future work, especially in practical applications, like anticipating ``false

Paper Structure (54 sections, 11 equations, 1 figure, 7 tables)

Introduction
Conversation Forecasting
Comparing Forecasts with Ground-Truth
The Missing Building Blocks for ToM
Integrating ToM in Forecasting
Other ToM Works
ToM Criteria
New Uncertainty Quantification Tasks
Human Expressions of Uncertainty
Calibration Strategy: "More Than Chance"
Uncertainty Quantification (UQ) Tasks
1st-Order ToM Uncertainty (1TUQ)
2nd-Order ToM Uncertainty (2TUQ)
False Uncertainty (FUnQ)
Corpora and Basic Prompts
...and 39 more sections

Figures (1)

Figure 1: Recognizing uncertainty in others can influence AI dialogue strategies, ultimately improving task success. Here, an AI assistant recognizes user uncertainty and probes to resolve it, eventually increasing user satisfaction. We formalize tasks to assess language models for their ability to recognize uncertainty in others' beliefs, an aspect of Theory of Mind. We propose new methods to use language models for this regression task, evaluating eight models across three dialogue corpora.
