EduBot -- Can LLMs Solve Personalized Learning and Programming Assignments?
Yibin Wang, Jiaxi Xie, Lakshminarayanan Subramanian
TL;DR
EduBot tackles the problem of solving personalized programming assignments by combining conceptual knowledge teaching with recursive prompt-driven programming and automated debugging, all without fine-tuning LLMs. The approach uses a scenario-based workflow, an LSTM-based classifier to separate conceptual from coding sub-tasks, and buildup prompts to iteratively refine code and resolve runtime issues, guiding users toward an optimal solution $A^*(Q)$. A benchmark of 20 scenarios totaling 79 sub-tasks across algorithms, ML, and real-world problems demonstrates robust performance, with GPT-4-o1 completing all sub-tasks and recording the fastest times in many categories. The findings suggest that pre-trained LLMs can be leveraged for scalable, multi-step reasoning and personalized programming assistance in educational and real-world settings, reducing the need for extensive fine-tuning.
Abstract
The prevalence of Large Language Models (LLMs) is revolutionizing the process of writing code. General-purpose and code-specific LLMs have shown impressive performance in generating standalone functions and in code-completion tasks with one-shot queries. However, their ability to solve comprehensive programming tasks that involve recursive requests and bug fixes remains questionable. In this paper, we propose EduBot, an intelligent automated assistant system powered by LLMs that combines conceptual knowledge teaching, end-to-end code development, personalized programming through recursive prompt-driven methods, and debugging with limited human intervention. We show that EduBot can solve complicated programming tasks composed of sub-tasks of increasing difficulty, ranging from conceptual to coding questions, using a recursive, automatic, prompt-driven system without fine-tuning the LLMs themselves. To further evaluate EduBot's performance, we design a benchmark suite consisting of 20 scenarios in algorithms, machine learning, and real-world problems. The results show that EduBot can complete most scenarios in less than 20 minutes. Using this benchmark suite, we perform a comparative study with different LLMs as the backbone to verify EduBot's compatibility and robustness across LLMs with varying capabilities. We believe that EduBot is an exploratory step toward understanding the potential of pre-trained LLMs in multi-step reasoning and code generation for solving personalized assignments that require both knowledge learning and coding.
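To make the workflow summarized above more concrete, the following is a minimal Python sketch of a recursive prompt-driven loop with automated debugging. It is an illustration under stated assumptions, not EduBot's actual implementation: the names `classify`, `try_run`, and `solve_scenario`, the prompt wording, and the `max_fixes` limit are all hypothetical, the LLM is modeled as a plain function from prompt to text, and a simple keyword rule stands in for the LSTM-based classifier.

```python
import traceback
from typing import Callable, List, Tuple

# Assumption: the backbone LLM is modeled as a function from prompt to text.
LLM = Callable[[str], str]

# Hypothetical keyword list; the paper uses an LSTM-based classifier instead.
CODING_HINTS = ("implement", "write", "code", "function", "program")


def classify(sub_task: str) -> str:
    """Keyword stand-in that labels a sub-task 'coding' or 'conceptual'."""
    text = sub_task.lower()
    return "coding" if any(hint in text for hint in CODING_HINTS) else "conceptual"


def try_run(code: str) -> str:
    """Run code in a fresh namespace; return '' on success, else the traceback.

    A real system should execute untrusted code in a sandbox, not with exec.
    """
    try:
        exec(code, {})
        return ""
    except Exception:
        return traceback.format_exc()


def solve_scenario(
    sub_tasks: List[str], llm: LLM, max_fixes: int = 3
) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Walk through sub-tasks in order, building up code and fixing runtime errors."""
    code_so_far = ""
    log: List[Tuple[str, str, str]] = []
    for task in sub_tasks:
        kind = classify(task)
        if kind == "conceptual":
            # Conceptual sub-tasks only need an explanation.
            log.append((task, kind, llm(f"Explain: {task}")))
            continue
        # Buildup prompt: pass the accumulated code together with the new request.
        candidate = llm(f"Existing code:\n{code_so_far}\nExtend it to: {task}")
        error = try_run(candidate)
        fixes = 0
        while error and fixes < max_fixes:
            # Debugging step: feed the traceback back to the model.
            candidate = llm(f"Fix this code:\n{candidate}\nError:\n{error}")
            error = try_run(candidate)
            fixes += 1
        if not error:
            code_so_far = candidate  # keep only code that ran cleanly
        log.append((task, kind, candidate))
    return code_so_far, log
```

In this sketch, code that still fails after `max_fixes` repair attempts is logged but not carried forward, so later sub-tasks build only on code that executed without a runtime error; whether EduBot handles unresolved failures this way is not stated in the abstract.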
