Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain

Yiming Gao; Feiyu Liu; Liang Wang; Zhenjie Lian; Dehua Zheng; Weixuan Wang; Wenjin Yang; Siqin Li; Xianliang Wang; Wenhui Chen; Jing Dai; Qiang Fu; Wei Yang; Lanxiao Huang; Wei Liu

TL;DR

This work tackles the gap between task-focused game-AI optimization and human experience in human-agent collaboration by introducing Reinforcement Learning from Human Gain (RLHG). RLHG partitions human contribution into a primitive baseline and a positive gain, enabling agents to learn enhancements that help humans achieve their goals while preserving task performance. The method formalizes a human-centered objective $J(\boldsymbol{\theta})= V^{\boldsymbol{\pi_\theta},\boldsymbol{\pi_H}}(s) + \alpha \cdot V_H^{\boldsymbol{\pi_\theta},\boldsymbol{\pi_H}}(s)$ and reweights policy updates with human gains via $\widehat{A}_H(s,a)$, using a two-stage training schedule: Stage I estimates the human primitive value $V_\phi$, Stage II optimizes for both task and human enhancement with a human-policy embedding. Experiments in Honor of Kings (MOBA) show RLHG improves objective human-goal achievement and elevates subjective gaming experience for players of varying skill, demonstrating practical impact for assistive AI and cooperative gameplay. The results also highlight a trade-off between task mastery and human experience, which can be managed with an adaptive task-gate mechanism guided by the agent’s original value estimates.
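A minimal sketch of the idea summarized above, under stated assumptions: the page does not spell out how $\widehat{A}_H(s,a)$ is estimated, so the clipping rule below (human return minus the primitive baseline $V_\phi$, floored at zero) and the additive mixing with weight $\alpha$ are illustrative guesses rather than the authors' implementation. The function names `positive_human_gain` and `combined_advantage` are hypothetical.

```python
from typing import List


def positive_human_gain(human_returns: List[float],
                        primitive_values: List[float]) -> List[float]:
    """Assumed form of the positive gain: how far the observed human
    return exceeds the primitive baseline V_phi(s), floored at zero."""
    return [max(g - v, 0.0) for g, v in zip(human_returns, primitive_values)]


def combined_advantage(task_advantages: List[float],
                       human_gains: List[float],
                       alpha: float) -> List[float]:
    """Assumed mixing rule mirroring J = V + alpha * V_H: the task
    advantage plus alpha times the positive human gain."""
    return [a + alpha * h for a, h in zip(task_advantages, human_gains)]


# Illustrative numbers only (not taken from the paper).
human_returns = [1.0, 0.5, 2.0]       # what the human achieved with the agent
primitive_values = [0.75, 1.0, 1.0]   # V_phi: what the human achieves primitively
task_advantages = [0.5, -0.25, 1.0]   # the agent's ordinary task advantage

gains = positive_human_gain(human_returns, primitive_values)
mixed = combined_advantage(task_advantages, gains, alpha=0.5)
print(gains)  # [0.25, 0.0, 1.0]
print(mixed)  # [0.625, -0.25, 1.5]
```

In this sketch the second sample gets no human-related credit because the human did worse than the baseline, so only the task term drives its update.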

Abstract

Existing game AI research mainly focuses on enhancing agents' abilities to win games, but this does not inherently give humans a better experience when collaborating with these agents. For example, agents may dominate the collaboration and exhibit unintended or detrimental behaviors, leading to poor experiences for their human partners. In other words, most game AI agents are modeled in a "self-centered" manner. In this paper, we propose a "human-centered" modeling scheme for collaborative agents that aims to enhance the experience of humans. Specifically, we model the experience of humans as the goals they expect to achieve during the task. We expect that agents should learn to enhance the extent to which humans achieve these goals while maintaining agents' original abilities (e.g., winning games). To achieve this, we propose the Reinforcement Learning from Human Gain (RLHG) approach. The RLHG approach introduces a "baseline", which corresponds to the extent to which humans primitively achieve their goals, and encourages agents to learn behaviors that effectively help humans achieve their goals better. We evaluate the RLHG agent in the popular Multi-player Online Battle Arena (MOBA) game, Honor of Kings, by conducting real-world human-agent tests. Both objective performance and subjective preference results show that the RLHG agent provides participants with a better gaming experience.

Paper Structure (42 sections, 9 equations, 24 figures, 7 tables, 2 algorithms)

Introduction
Background
Game Introduction
Human-Agent Collaboration
Methods
Human-Centered Objective
Effective Human Enhancement
The Algorithm & Practical Implementation
Experiments
Experimental Setup
Human Model-Agent Test
Human-Agent Test
Case Study
Conclusion
Environment Details
...and 27 more sections

Figures (24)

Figure 1: Toy scenario, where an agent and its human partner are on either side of an obstacle. Only the agent is capable of pushing or pulling the obstacle. Their task goal is to obtain the coin. $\Leftarrow$: The agent gets the coin by itself. The task is completed, but the human has no experience. $\Rightarrow$: The agent assists the human to get the coin. The task is completed and the experience of the human is enhanced.
Figure 2: (a) The UI of Honor of Kings. (b) In-game goals, based on our participant survey (see Figure 4(c)). Human players pursue multiple goals for a more enjoyable gaming experience.
Figure 3: The RLHG training framework. (a) Human Primitive Value Estimation stage. The human primitive value network $V_\phi$ is trained in the human-agent team settings with the agent's policy $\pi$ frozen. (b) Human Enhancement Training stage. $V_\phi$ is frozen and added to a downstream network $\Delta_\omega$ to learn to estimate the expected positive human gain. $\beta\%$ human-agent team settings are used to learn human enhancement behaviors, and $(1-\beta)\%$ agent-only team settings are used to maintain the agent's original ability.
Figure 4: Environment Setup. (a) Simulated environment: the human model-agent game tests. (b) Real-world environment: the human-agent game tests. (c) Top 5 goals based on the stats of our participant survey. * denotes the task goal. The participant survey contains 8 initial goals; each participant can vote for up to 5 non-repeating goals and can also add additional goals. 30 participants voluntarily participated in the voting.
Figure 5: The performance of the human model in achieving human goals after teaming up with different agents. (a) The task goal. (b) The top 4 human goals (b.1, b.2, b.3, and b.4). (c) The follow rate metric: the frequency with which an agent follows a human in the entire game. Each agent played 10,000 games. Error bars represent 95% confidence intervals, calculated over games.
...and 19 more figures
