Influence-Based Reward Modulation for Implicit Communication in Human-Robot Interaction

Haoyang Jiang; Elizabeth A. Croft; Michael G. Burke

TL;DR

This work addresses implicit communication in human-robot interaction (HRI) by proposing a model-free framework that modulates social influence, measured with Transfer Entropy (TE), via reward augmentation in a partially observable Markov decision process (POMDP). The approach computes TE from other agents’ histories and augments the ego-agent’s reward to encourage (or resist) information transfer; the policy is learned through Q-learning with a softmax policy. Across simulations, virtual human-agent tests, and real human-robot experiments, boosting influence improves collaboration and, in competitive settings, alters human performance; resisting influence generally hampers collaborative outcomes. The findings demonstrate a practical, model-free mechanism for shaping implicit communication in HRI, with potential applications in cooperative and adversarial social navigation and broad implications for design and ethics in autonomous systems.
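To make the influence measure concrete, the following is a minimal sketch of a plug-in Transfer Entropy estimate between two discrete action sequences. It assumes a history length of one step and simple frequency counts; the function name `transfer_entropy` and these estimator choices are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits.

    Assumption: history length 1, i.e. we compare how well target[t+1]
    is predicted by (target[t], source[t]) versus target[t] alone.
    """
    if len(source) != len(target):
        raise ValueError("sequences must have equal length")
    n = len(target) - 1
    if n < 1:
        raise ValueError("need at least two time steps")

    triples = Counter()   # counts of (y_next, y_prev, x_prev)
    pairs_yx = Counter()  # counts of (y_prev, x_prev)
    pairs_yy = Counter()  # counts of (y_next, y_prev)
    singles = Counter()   # counts of y_prev

    for t in range(n):
        y_next, y_prev, x_prev = target[t + 1], target[t], source[t]
        triples[(y_next, y_prev, x_prev)] += 1
        pairs_yx[(y_prev, x_prev)] += 1
        pairs_yy[(y_next, y_prev)] += 1
        singles[y_prev] += 1

    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        # p(y_next | y_prev, x_prev): prediction using the source's past
        p_with_source = count / pairs_yx[(y_prev, x_prev)]
        # p(y_next | y_prev): prediction from the target's own past only
        p_without_source = pairs_yy[(y_next, y_prev)] / singles[y_prev]
        te += p_joint * log2(p_with_source / p_without_source)
    return te


# Hypothetical example: the follower repeats the leader's action one step later.
leader = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
follower = [0] + leader[:-1]
te_leader_to_follower = transfer_entropy(leader, follower)
```

In this toy example the leader's previous action fully determines the follower's next action, so the estimate is strictly positive; a constant source sequence yields zero because it carries no predictive information.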

Abstract

Communication is essential for successful interaction. In human-robot interaction, implicit communication holds the potential to enhance robots' understanding of human needs, emotions, and intentions. This paper introduces a method to foster implicit communication in HRI without explicitly modelling human intentions or relying on pre-existing knowledge. Leveraging Transfer Entropy, we modulate influence between agents in social interactions in scenarios involving either collaboration or competition. By integrating influence into agents' rewards within a partially observable Markov decision process, we demonstrate that boosting influence enhances collaboration, while resisting influence diminishes performance. Our findings are validated through simulations and real-world experiments with human participants in social navigation settings.
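The idea of integrating influence into an agent's reward can be sketched with tabular Q-learning and a softmax policy. This is a sketch under stated assumptions: the additive form `task_reward + beta * influence`, the weight `beta`, and the helper names below are hypothetical, since the exact reward formulation is not given in this summary. A positive `beta` would correspond to boosting influence and a negative `beta` to resisting it.

```python
import math
import random


def softmax_policy(q_values, temperature=1.0):
    """Action probabilities proportional to exp(Q / temperature)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature=1.0, rng=random):
    """Draw an action index from the softmax distribution."""
    probs = softmax_policy(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def augmented_reward(task_reward, influence, beta):
    """Assumed additive augmentation: task reward plus weighted influence."""
    return task_reward + beta * influence


def q_update(q_table, obs, action, reward, next_obs, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on observation-indexed values."""
    td_target = reward + gamma * max(q_table[next_obs])
    q_table[obs][action] += alpha * (td_target - q_table[obs][action])
    return q_table[obs][action]


# Hypothetical two-observation, two-action table.
q_table = {0: [0.0, 0.0], 1: [0.0, 0.0]}
action = sample_action(q_table[0])
reward = augmented_reward(task_reward=1.0, influence=0.5, beta=0.2)
updated = q_update(q_table, obs=0, action=1, reward=reward, next_obs=1)
```

Here `influence` would be supplied by an estimator such as a Transfer Entropy computation over recent interaction histories; it is passed in as a plain number to keep the sketch self-contained.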

Paper Structure (23 sections, 23 equations, 13 figures, 5 tables)

Introduction
Related Work
Methodology
Measuring influence
Q-learning
Connections to Legibility
Experiment and Results
Simulation
Human-Agent Experiment
Human-Robot Experiment
Discussion
Limitations and Future Work
Conclusion
Appendix
Simulation details
...and 8 more sections

Figures (13)

Figure 1: Humans rely on implicit communication and information exchange to negotiate complex social settings. This work introduces a reward augmentation framework to emulate this implicit communication behaviour. We show that by optimising robot policies to increase information flow, we can influence human behaviour and improve human-robot collaboration.
Figure 2: A corridor dilemma episode: $P1$ is assigned to pass, and $P2$ is assigned to meet. In the first turn, $P1$ chose right, and $P2$ chose straight. After 5 turns, the result could be either pass (as in the top scenario), making $P1$ the winner, or meet (as in the bottom scenario), making $P2$ the winner.
Figure 3: Human-agent experiment results depicting collaboration and competition performance of humans. The first column shows average success rates, the second column displays violin plots of the final 20 rounds, where the mean values are marked, and the last column visualises their distributions.
Figure 4: Simulation results of Non-TE agents against three agent types align with the experimental results with humans (Figure 3). Rows depict collaboration and competition performance. Columns display average success rates, violin plots of the final 20 rounds with mean values marked, and distributions of the final 20 rounds.
Figure 5: Selected survey results from the human-agent experiment.
...and 8 more figures
