Second-order Theory of Mind for Human Teachers and Robot Learners

Patrick Callaghan, Reid Simmons, Henny Admoni

TL;DR

This paper addresses misleading learner feedback in human-robot teaching, which inflates the human teacher's cognitive burden. It proposes a Second-order Theory of Mind (ToM-2), realized within an Interactive Partially Observable Markov Decision Process (I-POMDP), that models perceived rationality and the teacher's beliefs about the learner and the learning objective. It introduces a discrete, learnable set of observation functions parameterized by $\beta$ to capture varying degrees of rationality, and Confidence Expressions (CEs) to convey the learner's certainty about features. The evaluation plan combines simulation experiments with a human user study in a turn-based card-rule domain, measuring reductions in the number of rounds needed to learn, in misbelief rates, and in cognitive workload. The work aims to improve teaching efficacy by enabling feedback that accounts for the teacher's beliefs about the learner and the objective, with potential practical impact on human-robot teaching interactions.
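To make the idea of a discrete set of observation functions indexed by $\beta$ more concrete, the sketch below shows one way a belief over rationality levels could be updated from an observed action. It is only an illustration under stated assumptions: the summary does not give the functional form, so a Boltzmann (softmax) likelihood is assumed here, and the names `BETAS`, `action_likelihoods`, and `update_beta_belief`, as well as the numeric values, are hypothetical and not taken from the paper.

```python
import math

# Hypothetical discrete set of rationality levels (assumed, not from the paper).
BETAS = [0.0, 1.0, 5.0]


def action_likelihoods(utilities, beta):
    """Assumed Boltzmann model: P(action) is proportional to exp(beta * utility).

    beta = 0 gives a uniform (fully random) agent; larger beta gives an agent
    that more reliably picks the highest-utility action.
    """
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(beta * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def update_beta_belief(belief, utilities, observed_action):
    """Bayes update of a belief over BETAS after observing one action index."""
    unnormalized = [
        prior * action_likelihoods(utilities, beta)[observed_action]
        for prior, beta in zip(belief, BETAS)
    ]
    norm = sum(unnormalized)
    return [p / norm for p in unnormalized]


# Start from a uniform belief, then observe the agent choose the best action.
belief = [1 / 3, 1 / 3, 1 / 3]
utilities = [1.0, 0.0, 0.0]  # illustrative utilities for three actions
belief = update_beta_belief(belief, utilities, observed_action=0)
```

Observing the highest-utility action shifts probability mass toward larger values of $\beta$, while observing a low-utility action would shift it toward smaller ones. In the full I-POMDP setting this update would be nested inside the learner's model of the teacher, which the sketch does not attempt to reproduce.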

Abstract

Confusing or otherwise unhelpful learner feedback creates or perpetuates erroneous beliefs that the teacher and learner have of each other, thereby increasing the cognitive burden placed upon the human teacher. For example, the robot's feedback might cause the human to misunderstand what the learner knows about the learning objective or how the learner learns. At the same time -- and in addition to the learning objective -- the learner might misunderstand how the teacher perceives the learner's task knowledge and learning processes. To ease the teaching burden, the learner should provide feedback that accounts for these misunderstandings and elicits efficient teaching from the human. This work endows an AI learner with a Second-order Theory of Mind that models perceived rationality as a source for the erroneous beliefs a teacher and learner may have of one another. It also explores how a learner can ease the teaching burden and improve teacher efficacy if it selects feedback which accounts for its model of the teacher's beliefs about the learner and its learning objective.

Paper Structure

This paper contains 5 sections, 1 figure.

Figures (1)

  • Figure 1: A human teaches a robot the $[r]$ule that dictates how cards are categorized (e.g., according to color). Here, the teacher misunderstands whether the robot knows the correct rule $r^*$. The robot's Second-order Theory of Mind enables it to model this misunderstanding and provide feedback using Confidence Expressions (green text).