A Comparative Study on Reward Models for UI Adaptation with Reinforcement Learning
Daniel Gaspar-Figueiredo, Silvia Abrahão, Marta Fernández-Diego, Emilio Insfran
TL;DR
The paper addresses the challenge of defining effective reward signals for reinforcement-learning-based UI adaptation. It proposes a rigorous confirmatory protocol to compare reward models derived solely from predictive HCI models (HCI) versus those augmented with Human Feedback (HCI&HF), using a balanced within-subject crossover design across three domains. The study plans to evaluate UX outcomes (engagement and satisfaction) via predictive-HCI metrics, the UES, and the QUIS, analyzed with Linear Mixed Models that account for period, sequence, and subject effects. If successful, the work will provide empirical guidance on when incorporating Human Feedback into reward models improves adaptive UIs, informing design choices for RL-based UI adaptation systems.
Abstract
Adapting the User Interface (UI) of software systems to user requirements and the context of use is challenging. The main difficulty lies in suggesting the right adaptation at the right time in the right place in order to make it valuable for end-users. We believe that recent progress in Machine Learning techniques provides useful ways to support adaptation more effectively. In particular, Reinforcement Learning (RL) can be used to personalise interfaces for each context of use in order to improve the user experience (UX). However, determining the reward of each adaptation alternative is a challenge in RL for UI adaptation. Recent research has explored the use of reward models to address this challenge, but there is currently no empirical evidence on this type of model. In this paper, we propose a confirmatory study design that aims to investigate the effectiveness of two different approaches for the generation of reward models in the context of UI adaptation using RL: (1) employing a reward model derived exclusively from predictive Human-Computer Interaction (HCI) models (HCI), and (2) employing predictive HCI models augmented by Human Feedback (HCI&HF). The controlled experiment will use an AB/BA crossover design with two treatments: HCI and HCI&HF. We shall determine how the manipulation of these two treatments affects the UX when users interact with adaptive user interfaces (AUIs). The UX will be measured in terms of user engagement and user satisfaction, which will be operationalized by means of predictive HCI models and the Questionnaire for User Interaction Satisfaction (QUIS), respectively. By comparing the performance of the two reward models in terms of their ability to adapt to user preferences with the purpose of improving the UX, our study contributes to the understanding of how reward modelling can facilitate UI adaptation using RL.
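To make the AB/BA crossover design concrete, the following is a minimal sketch of how participants could be allocated evenly to the two treatment sequences. It is an illustration under stated assumptions, not the study's actual allocation procedure: the function name `assign_sequences`, the participant identifiers, the fixed seed, and the mapping of A to HCI and B to HCI&HF are all hypothetical.

```python
import random
from typing import Dict, List

# Assumption: treatment A is the HCI reward model, B is HCI&HF.
SEQ_AB = ["HCI", "HCI&HF"]
SEQ_BA = ["HCI&HF", "HCI"]


def assign_sequences(participants: List[str], seed: int = 0) -> Dict[str, List[str]]:
    """Randomly split participants evenly between the AB and BA sequences."""
    if len(participants) % 2 != 0:
        raise ValueError("a balanced AB/BA design needs an even number of participants")
    rng = random.Random(seed)      # seeded so the allocation is reproducible
    shuffled = list(participants)  # copy, so the caller's list is not mutated
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # First half of the shuffled list gets AB, second half gets BA.
    return {
        pid: list(SEQ_AB if index < half else SEQ_BA)
        for index, pid in enumerate(shuffled)
    }


# Hypothetical participant identifiers, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 9)]
plan = assign_sequences(participants)
for pid in participants:
    print(pid, "->", " then ".join(plan[pid]))
```

Each participant experiences both treatments, one per period, so every participant serves as their own control; balancing the two sequences is what allows period and sequence effects to be separated from the treatment effect in the analysis.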
