Measuring User Experience Through Speech Analysis: Insights from HCI Interviews

Yong Ma, Xuedong Zhang, Yuchong Zhang, Morten Fjeld

TL;DR

The paper tackles subjective bias in UX evaluation by introducing speech-based objective metrics derived from acoustic, prosodic, and social features to differentiate positive and neutral user experiences during interactive sessions. Using a small in-car VR study, it combines interview-derived sentiment with performance metrics and applies Librosa/OpenSMILE-based feature extraction, achieving strong discriminative power (e.g., SVM accuracy of $86\%$ and KNN $75\%$). The findings show significant differences in RMS, ZCR, jitter, shimmer, and social features like engagement and activity between satisfaction groups, supporting the viability of speech analysis as a bias-resistant UX assessment tool. The work highlights generalization to other HCI domains, discusses confounds and limitations, and outlines future directions including multimodal data integration and real-time feedback for adaptive interfaces.

Abstract

User satisfaction plays a crucial role in user experience (UX) evaluation. Traditionally, UX measurements are based on subjective scales, such as questionnaires. However, these evaluations may suffer from subjective bias. In this paper, we explore the acoustic and prosodic features of speech to differentiate between positive and neutral UX during interactive sessions. By analyzing speech features such as root-mean-square (RMS), zero-crossing rate (ZCR), jitter, and shimmer, we identified significant differences between the positive and neutral user groups. In addition, social speech features such as activity and engagement also show notable variations between these groups. Our findings underscore the potential of speech analysis as an objective and reliable tool for UX measurement, contributing to more robust and bias-resistant evaluation methodologies. This work offers a novel approach to integrating speech features into UX evaluation and opens avenues for further research in HCI.
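
To make the acoustic features named above concrete, the following is a minimal NumPy sketch of how frame-level RMS and ZCR, plus local jitter and shimmer, are commonly computed. It is not the authors' pipeline (the summary mentions Librosa and OpenSMILE). The frame length, hop length, helper names, and the "local" perturbation definition of jitter and shimmer are assumptions chosen for illustration. The jitter and shimmer helper expects glottal period lengths or peak amplitudes that a separate pitch tracker would have to supply.

```python
import numpy as np


def frame_signal(y, frame_length=2048, hop_length=512):
    """Slice a 1-D signal into overlapping frames (no padding)."""
    y = np.asarray(y, dtype=float)
    if y.size < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (y.size - frame_length) // hop_length
    starts = hop_length * np.arange(n_frames)[:, None]
    offsets = np.arange(frame_length)[None, :]
    return y[starts + offsets]


def rms_per_frame(y, frame_length=2048, hop_length=512):
    """Root-mean-square energy of each frame."""
    frames = frame_signal(y, frame_length, hop_length)
    return np.sqrt(np.mean(frames ** 2, axis=1))


def zcr_per_frame(y, frame_length=2048, hop_length=512):
    """Fraction of adjacent sample pairs in each frame that change sign."""
    frames = frame_signal(y, frame_length, hop_length)
    negative = np.signbit(frames)
    crossings = negative[:, 1:] != negative[:, :-1]
    return crossings.mean(axis=1)


def local_perturbation(values):
    """Mean absolute difference of consecutive values over their mean.

    Applied to glottal period lengths this gives local jitter; applied
    to per-cycle peak amplitudes it gives local shimmer (assumed
    definitions, not confirmed by the source).
    """
    values = np.asarray(values, dtype=float)
    return np.mean(np.abs(np.diff(values))) / np.mean(values)


# Synthetic stand-in for a speech recording: one second of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

print("mean RMS:", rms_per_frame(tone).mean())
print("mean ZCR:", zcr_per_frame(tone).mean())
print("jitter  :", local_perturbation([0.010, 0.012, 0.010]))
```

In a study like the one described, per-frame values would typically be averaged per speaker or per interview segment before the two satisfaction groups are compared.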

Paper Structure

This paper contains 25 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Comparison of Speech Features Between Two User Satisfaction Groups Using T-Test. The figure covers both acoustic features (RMS, ZCR, Jitter, Shimmer) and social speech features (Activity, Engagement).
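
The group comparison in Figure 1 rests on a two-sample t-test per feature. As a rough sketch of that step, the code below computes Welch's t statistic and its degrees of freedom with NumPy. The source does not say which t-test variant was used, so the unequal-variance (Welch) form is an assumption, and the two arrays are made-up placeholder numbers, not values from the paper.

```python
import numpy as np


def welch_t(a, b):
    """Welch's two-sample t statistic and Welch-Satterthwaite df."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each group mean (sample variance / n).
    se_a = a.var(ddof=1) / a.size
    se_b = b.var(ddof=1) / b.size
    t_stat = (a.mean() - b.mean()) / np.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (a.size - 1) + se_b ** 2 / (b.size - 1)
    )
    return t_stat, df


# Hypothetical per-speaker feature means (placeholders, NOT study data).
positive_group = [1.0, 2.0, 3.0, 4.0, 5.0]
neutral_group = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, df = welch_t(positive_group, neutral_group)
print("t =", t_stat, "df =", df)
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic together with a p-value. With six features tested, a multiple-comparison correction would also be worth considering.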