ClickSight: Interpreting Student Clickstreams to Reveal Insights on Learning Strategies via LLMs

Bahar Radmehr, Ekaterina Shved, Fatma Betül Güreş, Adish Singla, Tanja Käser

TL;DR

ClickSight is an in-context LLM-based pipeline that interprets student clickstreams in terms of predefined learning strategies, addressing the difficulty of making sense of high-dimensional educational interaction data. The method processes raw clickstreams from two learning environments (PharmaSim and Beer's Law Lab) and uses four prompting strategies with an optional self-refinement step; domain experts then assess the generated interpretations against a quality rubric. Results show that LLMs can produce theory-aligned interpretations, with zero-shot prompting often yielding the best overall quality and self-refinement offering mixed benefits depending on the environment. The work demonstrates the potential of LLMs to extract interpretable, theory-driven insights from educational interaction data, enabling scalable, generalizable analysis for instructors and researchers.

Abstract

Clickstream data from digital learning environments offer valuable insights into students' learning behaviors, but are challenging to interpret due to their high dimensionality and granularity. Prior approaches have relied mainly on handcrafted features, expert labeling, clustering, or supervised models, and therefore often lack generalizability and scalability. In this work, we introduce ClickSight, an in-context Large Language Model (LLM)-based pipeline that interprets student clickstreams to reveal their learning strategies. ClickSight takes raw clickstreams and a list of learning strategies as input and generates textual interpretations of students' behaviors during interaction. We evaluate four different prompting strategies and investigate the impact of self-refinement on interpretation quality. Our evaluation spans two open-ended learning environments and uses a rubric-based domain-expert evaluation. Results show that while LLMs can reasonably interpret learning strategies from clickstreams, interpretation quality varies by prompting strategy, and self-refinement offers limited improvement. ClickSight demonstrates the potential of LLMs to generate theory-driven insights from educational interaction data.

Paper Structure

This paper contains 9 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The ClickSight pipeline interprets student clickstreams in three stages: (1) clickstream data from PharmaSim and Beer's Law Lab are structured, and context and learning strategies are gathered; (2) an LLM is prompted using one of four prompting strategies (Zero-shot, Chain-of-Thought, Meta-Prompting, Chain-of-Prompts), with optional self-refinement, to generate interpretations; (3) human experts assess the outputs using a quality rubric.
  • Figure 2: Overall scores of Initial and Self-Refined interpretations generated using Zero-shot and Chain-of-Prompts prompting strategies in PharmaSim and Beer's Law Lab.
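
The first two stages in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The event schema (`t`, `action`, `target`), the prompt wording, the function names, and the `llm` callable are all assumptions made for illustration. Only the zero-shot strategy and a single self-refinement pass are shown.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical event schema: {"t": seconds, "action": str, "target": str}
Event = Mapping[str, object]


def format_clickstream(events: Sequence[Event]) -> str:
    """Render each event as one numbered line of text for the prompt."""
    return "\n".join(
        f"{i}. t={e['t']}s {e['action']} -> {e['target']}"
        for i, e in enumerate(events, start=1)
    )


def build_zero_shot_prompt(
    context: str, strategies: Sequence[str], events: Sequence[Event]
) -> str:
    """Combine environment context, strategy list, and clickstream into one prompt."""
    strategy_list = "\n".join(f"- {s}" for s in strategies)
    return (
        f"Environment: {context}\n\n"
        f"Learning strategies to look for:\n{strategy_list}\n\n"
        f"Student clickstream:\n{format_clickstream(events)}\n\n"
        "Describe which of the listed strategies the student's behavior shows, "
        "citing event numbers as evidence."
    )


def interpret(
    context: str,
    strategies: Sequence[str],
    events: Sequence[Event],
    llm: Callable[[str], str],  # any text-in, text-out model wrapper (assumed)
    refine: bool = False,
) -> str:
    """Generate an interpretation, optionally followed by one self-refinement pass."""
    prompt = build_zero_shot_prompt(context, strategies, events)
    draft = llm(prompt)
    if not refine:
        return draft
    # Self-refinement (illustrative wording): show the model its own draft
    # and ask it to revise against the same clickstream.
    refine_prompt = (
        f"{prompt}\n\nDraft interpretation:\n{draft}\n\n"
        "Revise the draft so that every claim is supported by the clickstream."
    )
    return llm(refine_prompt)
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM provider, and makes it easy to substitute a stub when checking the prompt construction.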