Probabilistic emotion and sentiment modelling of patient-reported experiences

Curtis Murray, Lewis Mitchell, Jonathan Tuke, Mark Mackay

TL;DR

The paper tackles the challenge of extracting meaningful emotional insights from unstructured patient experience narratives by combining metadata network topic modelling with a probabilistic Naive Bayes emotion recommender. It demonstrates that topic-driven features yield superior performance for high-dimensional emotion prediction and binary sentiment classification (notably $F1$ up to 0.921 and strong $nDCG$/$Q$-measure scores) compared with full-text or lexicon baselines. The approach offers interpretability and scalability, providing practical tools (R package persR and Shiny persShiny) to healthcare researchers and practitioners for real-time feedback analysis. The work highlights that patient-caregiver interactions and engagement factors, rather than solely clinical outcomes, predominantly shape emotional experiences, with significant implications for patient-centered care and service improvement.

Abstract

This study introduces a novel methodology for modelling patient emotions from online patient experience narratives. We employ metadata network topic modelling to analyse patient-reported experiences from Care Opinion, revealing key emotional themes linked to patient-caregiver interactions and clinical outcomes. We develop a probabilistic, context-specific emotion recommender system that predicts both multilabel emotions and binary sentiments using a naive Bayes classifier with contextually meaningful topics as predictors. Assessed with the information retrieval metrics nDCG and Q-measure, the emotions predicted under this model outperformed those of baseline models, and our predicted sentiments achieved an F1 score of 0.921, significantly outperforming standard sentiment lexicons. This method offers a transparent, cost-effective way to understand patient feedback, enhancing traditional collection methods and informing individualised patient care. Our findings are accessible via an R package and interactive dashboard, providing valuable tools for healthcare researchers and practitioners.
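
For readers unfamiliar with nDCG, the following is a minimal sketch of how a ranked list of predicted emotions could be scored against the emotions a patient actually tagged. It assumes binary relevance (an emotion is either tagged or not) and the common log2 rank discount; the paper may use a different variant. The function names and the emotion labels are hypothetical and chosen only for illustration.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2 rank discount (ranks start at 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_labels, true_labels):
    """nDCG for one document, assuming binary relevance.

    ranked_labels: predicted emotions, most probable first (hypothetical input).
    true_labels:   set of emotions actually tagged on the document.
    """
    gains = [1.0 if label in true_labels else 0.0 for label in ranked_labels]
    # Ideal ranking: every tagged emotion placed at the top of the list.
    n_relevant = min(len(true_labels), len(ranked_labels))
    ideal_dcg = dcg([1.0] * n_relevant)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical example: two tagged emotions, one of them ranked third.
score = ndcg(["grateful", "anxious", "happy"], {"grateful", "happy"})
print(round(score, 3))
```

A ranking that places all tagged emotions first scores 1.0, and the score falls as tagged emotions slide down the list, which is why the metric suits a recommender that returns many candidate emotions per narrative.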

Paper Structure (43 sections, 40 equations, 10 figures, 5 tables)

Figures (10)

  • Figure 1: Metadata Topic Modelling Overview: Documents are represented in a document-word network where edges between document nodes and word nodes count the number of occurrences of word $v_i$ in document $d_j$ (left). Emotion labels $e_t$ are added to this network to form the emotion-document-word network by connecting document nodes to emotion nodes (with an implicit edge count of one). Below this network, Document $1$ is depicted as the consecutive words $v_1, v_2, v_1, v_4$, tagged with emotions $e_1, e_2$. We also show the bag-of-words representation as the vector of word counts. Both networks undergo hSBM topic modelling. The hSBM performs community detection, which we visualise by colouring nodes to indicate group membership. Topics are the communities of word nodes. We show the mapping from $d_1$, or its bag-of-words representation $\mathbf{w_1}$, to the document-topic representation $\mathbf{t_1} = p(\mathbf{t} | \mathbf{w_1})$, which is taken as the empirical densities of topic use. To find this, we take the denominator as the number of edges out of the document $(4)$, and the numerators as the number of edges from the document to each respective topic $(3, 0, 1)$. The corresponding topic-word densities $p(\mathbf{v} | t_i)$ are again the empirical densities, with the number of edges to each topic $(8, 1, 1)$ as denominators and the number of edges to each word as numerators ($(5,3,0,0)$, $(0,0,1,0)$, and $(0,0,0,1)$ for topics $t_1$, $t_2$, and $t_3$ respectively). In the appendix example, we show the calculation of the posterior distribution $p(E=e | d_1)$.
  • Figure 2: Number of monthly reports to Care Opinion.
  • Figure 3: Number of reports to Care Opinion by Australian states and territories.
  • Figure 4: Log-Log Distribution of Emotion Frequencies Stratified by Sentiment in Patient Feedback.
  • Figure 5: Extremes in sentiment-topic associations in the Care Opinion corpus, illustrating the dichotomy of patient-reported experiences: (a) encapsulates critical negative experiences, while (b) reflects commendations. Both extremes centre around patient experience rather than patient outcomes.
  • ...and 5 more figures
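
The empirical densities described in the Figure 1 caption can be reproduced with a short sketch. The word-to-topic assignment and the corpus-level word counts below are read off the caption's numerators, which is an assumption about the figure rather than something stated outright; in the paper the topics come from hSBM community detection, which is not reimplemented here. All variable and function names are hypothetical.

```python
from collections import Counter

topics = ["t1", "t2", "t3"]
vocab = ["v1", "v2", "v3", "v4"]

# Assumed topic membership, inferred from the caption's topic-word numerators.
word_topic = {"v1": "t1", "v2": "t1", "v3": "t2", "v4": "t3"}

# Document 1 as given in the caption: v1, v2, v1, v4.
doc1 = ["v1", "v2", "v1", "v4"]

# Corpus-wide edge counts per word, matching (5,3,0,0), (0,0,1,0), (0,0,0,1).
corpus_word_counts = {"v1": 5, "v2": 3, "v3": 1, "v4": 1}


def document_topic_density(doc):
    """p(t | w): edges from the document to each topic over all edges out of it."""
    topic_counts = Counter(word_topic[w] for w in doc)
    return [topic_counts[t] / len(doc) for t in topics]


def topic_word_density(topic):
    """p(v | t): edges to each word in the topic over all edges to the topic."""
    total = sum(c for w, c in corpus_word_counts.items() if word_topic[w] == topic)
    return [
        corpus_word_counts[w] / total if word_topic[w] == topic else 0.0
        for w in vocab
    ]


print(document_topic_density(doc1))  # [0.75, 0.0, 0.25], i.e. (3, 0, 1) / 4
print(topic_word_density("t1"))      # [0.625, 0.375, 0.0, 0.0], i.e. (5, 3, 0, 0) / 8
```

The document-topic vector is what a naive Bayes classifier would take as its predictors in place of the full bag of words, so the sketch stops at the features and does not attempt the posterior calculation shown in the paper's appendix.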