Mixed Feelings: Cross-Domain Sentiment Classification of Patient Feedback
Egil Rønningstad, Lilja Charlotte Storset, Petter Mæhlum, Lilja Øvrelid, Erik Velldal
TL;DR
This work addresses cross-domain sentiment classification of Norwegian patient feedback, comparing in-domain (NorPaC) and out-of-domain (NoReC) training data across neural and non-neural models. It evaluates four-class polarity (positive, negative, mixed, neutral) and analyzes the effects of joint multi-domain training, genre differences between the domains, and data scarcity. Neural models, especially NorBERT3 Large, achieve strong in-domain performance. Out-of-domain data can boost performance when in-domain data are limited but may be detrimental when in-domain data are abundant, and joint training offers mixed benefits depending on the target domain. The study provides practical guidance for deploying sentiment analysis on healthcare text and highlights the value and limits of leveraging general-domain sentiment data for specialized domains.
Abstract
Sentiment analysis of patient feedback from the public health domain can aid decision makers in evaluating the provided services. This paper focuses on free-text comments in patient surveys about general practitioners and psychiatric healthcare, annotated with four sentence-level polarity classes (positive, negative, mixed and neutral). It also attempts to alleviate data scarcity by leveraging general-domain sources in the form of reviews. For several different architectures, we compare in-domain and out-of-domain effects, as well as the effects of training joint multi-domain models.
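
As a rough illustration of the comparison described above, the following minimal sketch builds the three kinds of training set (in-domain only, out-of-domain only, and joint) from two labeled collections. The function name `build_training_regimes`, the `in_domain_limit` parameter used to simulate data scarcity, and the toy sentences are all hypothetical and are not taken from the paper or its datasets.

```python
from typing import Dict, List, Optional, Tuple

# The four sentence-level polarity classes named in the abstract.
LABELS = ("positive", "negative", "mixed", "neutral")

Example = Tuple[str, str]  # (sentence, label)


def build_training_regimes(
    in_domain: List[Example],
    out_of_domain: List[Example],
    in_domain_limit: Optional[int] = None,
) -> Dict[str, List[Example]]:
    """Return in-domain, out-of-domain, and joint training sets.

    `in_domain_limit` (hypothetical) truncates the in-domain data to
    mimic a setting where little in-domain data is available.
    """
    for _, label in in_domain + out_of_domain:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")

    if in_domain_limit is None:
        limited = list(in_domain)
    else:
        limited = in_domain[:in_domain_limit]

    return {
        "in_domain": limited,
        "out_of_domain": list(out_of_domain),
        "joint": limited + list(out_of_domain),
    }


# Invented toy data, for illustration only.
patient_feedback = [
    ("The doctor listened to me.", "positive"),
    ("I waited far too long.", "negative"),
    ("Kind staff, but the follow-up was poor.", "mixed"),
    ("I had an appointment in March.", "neutral"),
]
reviews = [
    ("A gripping film.", "positive"),
    ("The plot drags badly.", "negative"),
]

regimes = build_training_regimes(patient_feedback, reviews, in_domain_limit=2)
for name, examples in regimes.items():
    print(name, len(examples))
```

In such a setup, a model trained under each regime would be evaluated on the same held-out target-domain test set, so that differences in score can be attributed to the training data rather than to the evaluation data.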
