Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others in Conversation Forecasting

Anthony Sicilia, Malihe Alikhani

TL;DR

This work proposes a new suite of tasks, challenging language models (LMs) to model the uncertainty of others in dialogue, designed around conversation forecasting, wherein an agent forecasts an unobserved outcome to a conversation.

Abstract

Typically, when evaluating Theory of Mind, we consider the beliefs of others to be binary: held or not held. But what if someone is unsure about their own beliefs? How can we quantify this uncertainty? We propose a new suite of tasks, challenging language models (LMs) to model the uncertainty of others in dialogue. We design these tasks around conversation forecasting, wherein an agent forecasts an unobserved outcome to a conversation. Uniquely, we view interlocutors themselves as forecasters, asking an LM to predict the uncertainty of the interlocutors (a probability). For this regression task, we experiment with re-scaling methods, variance reduction strategies, and demographic context, conducting experiments on three dialogue corpora (social, negotiation, task-oriented) with eight LMs. While LMs can explain up to 7% of the variance in the uncertainty of others, we highlight the difficulty of the tasks and room for future work, especially in practical applications, like anticipating ``false

Paper Structure (54 sections, 11 equations, 1 figure, 7 tables)


Figures (1)

  • Figure 1: Recognizing uncertainty in others can influence AI dialogue strategies, ultimately improving task success. Here, an AI assistant recognizes user uncertainty and probes to resolve it, eventually increasing user satisfaction. We formalize tasks to assess language models for their ability to recognize uncertainty in others' beliefs -- an aspect of Theory of Mind. We propose new methods to use language models for this regression task, evaluating eight models across three dialogue corpora.