Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models
Lynnette Hui Xian Ng, Kokil Jaidka, Kaiyuan Tay, Hansin Ahuja, Niyati Chhaya
TL;DR
The paper tackles the problem of predicting user behavior from conversational text when crowdsourced labels are noisy. It proposes MSWEEM, a metadata-sensitive ensemble that uses annotator meta-features such as Throughput and Worktime to weight auxiliary label encodings before predicting the target variable. MSWEEM outperforms standard ensembles by 14% on held-out Diplomacy data and by about 12% on OffMyChest, and the meta-features significantly improve performance across datasets and annotator cohorts. The work shows the practical value of incorporating annotator behavior signals into NLP workflows and offers guidance for crowdsourcing design and for modeling under label-quality uncertainty. It also examines how different annotator cohorts (e.g., Master-qualified workers) contribute to data quality and model performance, which informs quality control and data collection strategies.
Abstract
Supervised machine-learning models often underperform in predicting user behaviors from conversational text, hindered by poor crowdsourced label quality and low NLP task accuracy. We introduce the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), which integrates annotator meta-features such as fatigue and speeding. First, our results show that MSWEEM outperforms standard ensembles by 14% on held-out data and by 12% on an alternative dataset. Second, we find that incorporating signals of annotator behavior, such as speed and fatigue, significantly boosts model performance. Third, we find that annotators with higher qualifications, such as Master-qualified workers, deliver more consistent and faster annotations. Given the increasing uncertainty over annotation quality, our experiments show that understanding annotator patterns is crucial for enhancing model accuracy in user behavior prediction.
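To make the idea of metadata-weighted encodings concrete, the following is a minimal sketch, not the authors' implementation. It assumes each annotator supplies an auxiliary label encoding together with two meta-features, and it combines the encodings for one item using weights derived from those meta-features. The `Annotation` class, the field definitions, and the weighting rule (worktime divided by throughput, so that slower and more deliberate annotators count more) are all hypothetical choices made for illustration; the paper's actual weighting scheme and downstream ensemble are not specified in this section.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Annotation:
    encoding: Sequence[float]  # auxiliary label encoding from one annotator
    throughput: float          # hypothetical definition: items completed per hour (> 0)
    worktime: float            # hypothetical definition: seconds spent on this item


def metadata_weights(annotations: List[Annotation]) -> List[float]:
    """Assumed heuristic: weight grows with worktime and shrinks with throughput.

    The raw scores are normalised so the weights for one item sum to 1.
    """
    raw = [a.worktime / a.throughput for a in annotations]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_encoding(annotations: List[Annotation]) -> List[float]:
    """Combine per-annotator encodings into one metadata-weighted encoding."""
    weights = metadata_weights(annotations)
    dim = len(annotations[0].encoding)
    combined = [0.0] * dim
    for w, a in zip(weights, annotations):
        for j in range(dim):
            combined[j] += w * a.encoding[j]
    return combined


# Two annotators disagree on an auxiliary label; the slower one is trusted more.
item_annotations = [
    Annotation(encoding=[1.0, 0.0], throughput=10.0, worktime=30.0),
    Annotation(encoding=[0.0, 1.0], throughput=30.0, worktime=30.0),
]
features = weighted_encoding(item_annotations)
```

In a full pipeline, a vector like `features` would be passed to a downstream classifier that predicts the target variable; that stage is omitted here because its design is not described in this section.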
