Application of Prompt Learning Models in Identifying the Collaborative Problem Solving Skills in an Online Task
Mengxiao Zhu, Xin Wang, Xiantao Wang, Zihang Chen, Wei Huang
TL;DR
This paper addresses the bottleneck of manually coding collaborative problem solving (CPS) behaviors by introducing a prompt-based learning framework that uses pre-trained language models to automatically code CPS subskills from chat and manipulation data. Across three experiments, the authors show that manually designed prompts with the T5 model give the best performance on CPS chat coding, and that prompt-based methods consistently outperform the baselines, including fine-tuned large models, especially in low-resource settings. The study reports robust results with large training sets (accuracy up to $0.804$, macro F1 up to $0.743$, kappa up to $0.746$) and strong performance on small datasets (e.g., RoBERTa and BERT prompts requiring as little as $6\%$–$11\%$ of the data). The findings have practical implications for scalable CPS assessment and real-time feedback, both in CPS and in broader computer-supported cooperative work (CSCW) contexts, and point to future work on context-aware classification and generalization to other domains and data streams.
Abstract
Collaborative problem solving (CPS) competence is considered one of the essential 21st-century skills. To facilitate the assessment and learning of CPS competence, researchers have proposed a series of frameworks to conceptualize CPS and have explored ways to make sense of the complex processes involved in collaborative problem solving. However, coding explicit behaviors into the subskills defined by these CPS frameworks remains a challenging task. Traditional studies have relied on manual coding to interpret CPS behavioral data, but manual coding is very time-consuming and cannot support real-time analyses. Scholars have begun to explore approaches for constructing automatic coding models. Nevertheless, the existing models, built using machine learning or deep learning techniques, depend on large amounts of training data and have relatively low accuracy. To address these problems, this paper proposes a pre-trained model based on prompt learning. The model can achieve high performance even with limited training data. Three experiments were conducted, and the results showed that our model not only produced the highest accuracy, macro F1 score, and kappa values on large training sets, but also performed best on small training sets of the CPS behavioral data. The proposed prompt-based pre-trained model contributes to the CPS skills coding task and can also be applied to other computer-supported cooperative work (CSCW) coding tasks to replace manual coding.
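The abstract reports three evaluation measures: accuracy, macro F1, and kappa. As a minimal sketch of how such measures can be computed for a multi-class coding task, the following assumes that kappa means Cohen's kappa between human codes and model predictions, which the abstract does not state explicitly. The function names and the labels `"A"`, `"B"`, `"C"` are hypothetical placeholders, not the CPS subskills or data used in the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items where the predicted code equals the human code."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa (assumed here): agreement corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the two marginal label distributions.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    if p_expected == 1:
        return 1.0  # degenerate case: both coders always use one label
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical placeholder labels, not data from the paper.
human = ["A", "A", "B", "B", "C", "C"]
model = ["A", "A", "B", "C", "C", "C"]

print(accuracy(human, model))
print(macro_f1(human, model))
print(cohen_kappa(human, model))
```

Macro F1 weights every class equally, so it is sensitive to rare subskills that accuracy alone can hide, while kappa discounts the agreement that two coders would reach by chance.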
