TutorLLM: Customizing Learning Recommendations with Knowledge Tracing and Retrieval-Augmented Generation
Zhaoxing Li, Vahid Yazdanpanah, Jindi Wang, Wen Gu, Lei Shi, Alexandra I. Cristea, Sarah Kiden, Sebastian Stein
TL;DR
Problem: LLM-based educational tools risk hallucinations and insufficient personalization. Approach: TutorLLM combines MLFBK Knowledge Tracing, a Scraper, and Retrieval-Augmented Generation via GPT-4 to deliver context-aware explanations and personalized learning recommendations, with a Chrome plugin interface. Contributions: first integration of KT with LLMs for personalized learning; practical Chrome plugin; two-week field study with 30 undergraduates showing a 10% increase in SUS and a 5% increase in quiz scores, alongside higher engagement. Significance: demonstrates KT-informed retrieval and state-aware personalization in real educational tasks and highlights the need for larger-scale validation and privacy-aware deployment.
Abstract
The integration of AI in education offers significant potential to enhance learning efficiency. Large Language Models (LLMs), such as ChatGPT, Gemini, and Llama, allow students to query a wide range of topics, providing unprecedented flexibility. However, LLMs face challenges such as inconsistent content relevance and a lack of personalization. To address these challenges, we propose TutorLLM, a personalized learning recommender LLM system based on Knowledge Tracing (KT) and Retrieval-Augmented Generation (RAG). The novelty of TutorLLM lies in its combination of KT and RAG techniques with LLMs, which enables dynamic retrieval of context-specific knowledge and provides personalized learning recommendations based on each student's learning state. Specifically, this integration allows TutorLLM to tailor responses to individual learning states predicted by the Multi-Features with Latent Relations BERT-based KT (MLFBK) model and to enhance response accuracy with a Scraper model. The evaluation includes user assessment questionnaires and performance metrics, and shows a 10% improvement in user satisfaction and a 5% increase in quiz scores compared to using general LLMs alone.
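To make the described combination concrete, the following is a minimal sketch of how a KT-predicted learning state and retrieved course context might be merged into a single LLM prompt. It is not the authors' implementation: the names `Passage`, `retrieve`, `weakest_concepts`, and `build_prompt` are hypothetical, the word-overlap retriever is a stand-in for a real RAG retriever, the 0.5 mastery threshold is an arbitrary assumption, and the mastery probabilities stand in for the output of a KT model such as MLFBK.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    """A piece of scraped course material (hypothetical structure)."""
    concept: str
    text: str


def tokenize(s: str) -> set:
    """Lowercase whitespace tokenization; a toy stand-in for real embeddings."""
    return set(s.lower().split())


def retrieve(question: str, passages: list, k: int = 2) -> list:
    """Return the k passages sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p.text)), reverse=True)
    return ranked[:k]


def weakest_concepts(mastery: dict, threshold: float = 0.5) -> list:
    """Concepts whose predicted mastery falls below the (assumed) threshold."""
    return sorted(c for c, prob in mastery.items() if prob < threshold)


def build_prompt(question: str, mastery: dict, passages: list, k: int = 2) -> str:
    """Combine retrieved context and the student's learning state into one prompt."""
    context = retrieve(question, passages, k)
    weak = weakest_concepts(mastery)
    lines = ["Course context:"]
    lines += [f"- {p.text}" for p in context]
    lines.append("Concepts the student is likely to struggle with: "
                 + (", ".join(weak) or "none"))
    lines.append("Question: " + question)
    return "\n".join(lines)


# Example inputs; the mastery values are made up, not model output.
passages = [
    Passage("recursion", "Recursion is when a function calls itself"),
    Passage("loops", "A for loop repeats a block of code"),
    Passage("sorting", "Merge sort splits a list in half"),
]
mastery = {"recursion": 0.3, "loops": 0.8, "sorting": 0.45}
prompt = build_prompt("How does a recursion function work", mastery, passages)
```

The resulting `prompt` string would then be sent to the LLM. In a real system the retriever would typically use dense embeddings over the scraped course content, and the mastery estimates would be refreshed as the student answers new questions.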
