LLM-SEM: A Sentiment-Based Student Engagement Metric Using LLMs for E-Learning Platforms

Ali Hamdi, Ahmed Abdelmoneim Mazrou, Mohamed Shaltout

TL;DR

The paper addresses the quantitative assessment of student engagement on e-learning platforms, where traditional surveys scale poorly and metadata-only analyses cannot handle fuzzy sentiment in comments. It introduces LLM-SEM, a data-driven metric that fuses course and lesson video metadata with sentiment predictions from Large Language Models to compute engagement at both course and lesson levels. The per-video score is $E_v = NV_v + NL_v + P_v$ (normalized views, normalized likes, and sentiment polarity for video $v$), and polarity is aggregated to a playlist score as $P_p = \frac{\sum_{v \in V_p} P_v}{N_p}$, the average over the $N_p$ videos in playlist $p$. The methodology is a multi-stage pipeline covering data collection, sentiment analysis, polarity scoring, and feature normalization, with a fine-tuned RoBERTa model delivering the best sentiment results ($Acc = 0.86$, $F1 = 0.84$) among the evaluated models. The study reports that LLM-SEM is effective and scalable across multilingual sentiment contexts, suggesting practical benefits for content optimization and engagement monitoring on large e-learning platforms.
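
The two formulas above can be read as a short computation. The Python sketch below is one possible reading, not the authors' implementation. It assumes that $NV_v$ and $NL_v$ are Min-Max-normalized view and like counts, that $P_v$ is the mean of per-comment polarities, and that $N_p$ is the number of videos in the playlist. The label-to-polarity mapping (`POLARITY`), the function names, and the sample data are all hypothetical.

```python
from statistics import mean

# Hypothetical mapping from a predicted sentiment label to a polarity value.
POLARITY = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def min_max(values):
    """Min-Max normalize numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def video_polarity(labels):
    """P_v: mean polarity of a video's comment labels (0.0 if no comments)."""
    return mean(POLARITY[label] for label in labels) if labels else 0.0


def engagement_scores(videos):
    """Compute E_v = NV_v + NL_v + P_v for every video in one playlist."""
    nv = min_max([v["views"] for v in videos])
    nl = min_max([v["likes"] for v in videos])
    scores = []
    for video, nv_v, nl_v in zip(videos, nv, nl):
        p_v = video_polarity(video["sentiments"])
        scores.append({"id": video["id"], "P_v": p_v, "E_v": nv_v + nl_v + p_v})
    return scores


def playlist_polarity(scores):
    """P_p: sum of P_v over the playlist's videos divided by N_p."""
    return sum(s["P_v"] for s in scores) / len(scores)


# Made-up sample playlist; sentiment labels stand in for LLM predictions.
videos = [
    {"id": "a", "views": 100, "likes": 10, "sentiments": ["positive", "positive"]},
    {"id": "b", "views": 300, "likes": 20, "sentiments": ["positive", "negative"]},
    {"id": "c", "views": 200, "likes": 30, "sentiments": ["neutral"]},
]
scores = engagement_scores(videos)
p_p = playlist_polarity(scores)
```

Normalizing within a single playlist, as done here, is a design choice of the sketch; normalizing across the whole dataset would make scores comparable between playlists instead.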

Abstract

Current methods for analyzing student engagement in e-learning platforms, including automated systems, often struggle with challenges such as handling fuzzy sentiment in text comments and relying on limited metadata. Traditional approaches, such as surveys and questionnaires, also face issues like small sample sizes and scalability. In this paper, we introduce LLM-SEM (Language Model-Based Student Engagement Metric), a novel approach that leverages video metadata and sentiment analysis of student comments to measure engagement. By utilizing recent Large Language Models (LLMs), we generate high-quality sentiment predictions to mitigate text fuzziness and normalize key features such as views and likes. Our holistic method combines comprehensive metadata with sentiment polarity scores to gauge engagement at both the course and lesson levels. Extensive experiments were conducted to evaluate various LLMs, demonstrating the effectiveness of LLM-SEM in providing a scalable and accurate measure of student engagement. We fine-tuned TXLM-RoBERTa using human-annotated sentiment datasets to enhance prediction accuracy, and we also used Llama 3B and Gemma 9B from Ollama.

Paper Structure

This paper contains 16 sections, 6 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: A workflow illustrating the process of calculating the Student Engagement Metric (SEM) using course lessons, metadata, comments, and sentiment analysis. Starting from (a) the course lessons, (b) metadata is extracted and normalized using Min-Max normalization (g) to contribute to the final SEM (h). Simultaneously, (c) comments are analyzed by a Large Language Model (LLM) (d) for sentiment analysis (e), leading to (f) multi-level polarity scoring. The scoring is aggregated with the normalized metadata to compute the final student engagement metric for each course or lesson.
  • Figure 2: Schema of the data collection structure showing the relationships between Playlists, Videos, and Comments. Each playlist contains multiple videos, identified by a foreign key Playlist_ID. Videos are linked to comments through the Video_ID foreign key, and each video has associated metadata such as views, likes, and duration. This relational structure allows efficient organization and processing of data for analysis.
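
The relational structure described for Figure 2 could be sketched with Python's built-in `sqlite3` module as follows. Only the three tables and the `Playlist_ID` and `Video_ID` foreign keys come from the caption; the remaining column names, the types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Column names other than Playlist_ID and Video_ID are hypothetical.
conn.executescript(
    """
    CREATE TABLE Playlists (
        Playlist_ID TEXT PRIMARY KEY,
        Title       TEXT
    );
    CREATE TABLE Videos (
        Video_ID    TEXT PRIMARY KEY,
        Playlist_ID TEXT NOT NULL REFERENCES Playlists(Playlist_ID),
        Views       INTEGER,
        Likes       INTEGER,
        Duration    INTEGER  -- assumed to be in seconds
    );
    CREATE TABLE Comments (
        Comment_ID  INTEGER PRIMARY KEY,
        Video_ID    TEXT NOT NULL REFERENCES Videos(Video_ID),
        Text        TEXT
    );
    """
)

# Made-up sample rows.
with conn:
    conn.execute("INSERT INTO Playlists VALUES (?, ?)", ("p1", "Intro course"))
    conn.executemany(
        "INSERT INTO Videos VALUES (?, ?, ?, ?, ?)",
        [("v1", "p1", 100, 10, 600), ("v2", "p1", 300, 20, 900)],
    )
    conn.executemany(
        "INSERT INTO Comments (Video_ID, Text) VALUES (?, ?)",
        [("v1", "Great lesson"), ("v1", "Too fast"), ("v2", "Thanks")],
    )

# Count videos and comments per playlist by following both foreign keys.
rows = conn.execute(
    """
    SELECT p.Playlist_ID, COUNT(DISTINCT v.Video_ID), COUNT(c.Comment_ID)
    FROM Playlists p
    JOIN Videos v ON v.Playlist_ID = p.Playlist_ID
    LEFT JOIN Comments c ON c.Video_ID = v.Video_ID
    GROUP BY p.Playlist_ID
    """
).fetchall()
```

Keeping comments in their own table means each comment can later carry a predicted sentiment label without duplicating video metadata.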