A Comparative Study on Reward Models for UI Adaptation with Reinforcement Learning

Daniel Gaspar-Figueiredo; Silvia Abrahão; Marta Fernández-Diego; Emilio Insfran

TL;DR

The paper addresses the challenge of defining effective reward signals for reinforcement-learning-based UI adaptation. It proposes a rigorous confirmatory protocol to compare reward models derived solely from predictive HCI models (HCI) with those augmented with Human Feedback (HCI&HF), using a balanced within-subject crossover design across three domains. The study plans to evaluate two UX outcomes, engagement and satisfaction, via predictive-HCI metrics, the UES, and the QUIS, analyzed with Linear Mixed Models that account for period, sequence, and subject effects. If successful, the work will provide empirical guidance on when incorporating Human Feedback into reward models improves adaptive UIs, informing design choices for RL-based UI adaptation systems.

Abstract

Adapting the User Interface (UI) of software systems to user requirements and the context of use is challenging. The main difficulty lies in suggesting the right adaptation at the right time in the right place in order to make it valuable for end-users. We believe that recent progress in Machine Learning techniques provides useful ways in which to support adaptation more effectively. In particular, Reinforcement Learning (RL) can be used to personalise interfaces for each context of use in order to improve the user experience (UX). However, determining the reward of each adaptation alternative is a challenge in RL for UI adaptation. Recent research has explored the use of reward models to address this challenge, but there is currently no empirical evidence on this type of model. In this paper, we propose a confirmatory study design that aims to investigate the effectiveness of two different approaches for the generation of reward models in the context of UI adaptation using RL: (1) by employing a reward model derived exclusively from predictive Human-Computer Interaction (HCI) models (HCI), and (2) by employing predictive HCI models augmented by Human Feedback (HCI&HF). The controlled experiment will use an AB/BA crossover design with two treatments: HCI and HCI&HF. We shall determine how the manipulation of these two treatments will affect the UX when interacting with adaptive user interfaces (AUIs). The UX will be measured in terms of user engagement and user satisfaction, which will be operationalized by means of predictive HCI models and the Questionnaire for User Interaction Satisfaction (QUIS), respectively. By comparing the performance of the two reward models in terms of their ability to adapt to user preferences with the purpose of improving the UX, our study contributes to the understanding of how reward modelling can facilitate UI adaptation using RL.
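To make the AB/BA crossover design concrete, the following is a minimal sketch of a balanced sequence assignment. The function name `assign_sequences`, the seed, and the sample of 20 participants are illustrative assumptions and do not come from the paper; only the two treatment labels (HCI and HCI&HF) and the AB/BA structure do.

```python
import random
from collections import Counter

# Treatment labels taken from the abstract; everything else is hypothetical.
AB = ("HCI", "HCI&HF")   # sequence AB: HCI in period 1, HCI&HF in period 2
BA = ("HCI&HF", "HCI")   # sequence BA: the reverse order


def assign_sequences(participant_ids, seed=0):
    """Randomly split participants into two equally sized sequence groups.

    Assumes an even number of participants so the design stays balanced.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("a balanced AB/BA design needs an even number of participants")
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    # First half of the shuffled list gets AB, the second half gets BA.
    return {pid: (AB if i < half else BA) for i, pid in enumerate(ids)}


# Hypothetical sample of 20 participants.
schedule = assign_sequences(range(1, 21))
counts = Counter(schedule.values())
```

Because every participant receives both treatments, each person acts as their own control; the period and sequence labels recorded here are the factors that the planned analysis would need to account for.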

Paper Structure (12 sections, 2 figures, 1 table)

This paper contains 12 sections, 2 figures, 1 table.

Introduction
Background and Related Work
Experimental Design
Research questions and hypotheses
Variables
Independent variables
Dependent variables
Subjects
Design
Analysis plan
Threats to validity
Execution Plan

Figures (2)

Figure 1: An interface is adapted by simulating several possible sequences of adaptations and evaluating them using predictive models in HCI. This approach avoids greedy, disadvantageous adaptations, and may anticipate possible user responses even with limited observation data. Figure adapted from Chaslot:2021
Figure 2: The RL agent employing two strategies to obtain reward predictions: predictive HCI models only (orange) and predictive HCI models with human feedback (green)
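The two strategies in Figure 2 can be illustrated with a small sketch. The source does not state how human feedback is combined with the predictive HCI score, so the convex combination below, the `weight` parameter, and the function names `hci_reward` and `hci_hf_reward` are purely assumptions made for illustration, not the authors' method.

```python
def hci_reward(predicted_score):
    """Strategy 1 (HCI): the reward is the predictive HCI model's score alone."""
    return predicted_score


def hci_hf_reward(predicted_score, human_feedback, weight=0.5):
    """Strategy 2 (HCI&HF): blend the predictive score with a human feedback score.

    The linear blend and the default weight are hypothetical; both inputs
    are assumed to be scaled to the same range (for example, 0 to 1).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * predicted_score + weight * human_feedback


# Hypothetical values: the predictive model scores an adaptation at 0.6,
# while the user rates it 1.0.
reward_hci = hci_reward(0.6)
reward_hci_hf = hci_hf_reward(0.6, 1.0, weight=0.25)
```

With `weight=0`, the second strategy reduces to the first, which gives a simple way to think about the comparison: the study asks whether letting human feedback influence the reward improves the resulting UX.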
