Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever
Hang Li, Tianlong Xu, Jiliang Tang, Qingsong Wen
TL;DR
This paper tackles automatic knowledge tagging for math questions in intelligent education by introducing KnowTS, a large-language-model–based system that excels in zero-shot and few-shot settings using only knowledge definition text. It further advances performance and efficiency with FlexSDR, an RL-based demonstration retriever that adaptively selects and sequences demonstrations and employs an early-stop mechanism to minimize input length. Through extensive experiments on the MathKnowCT dataset, KnowTS achieves strong zero-shot results and substantial few-shot gains, while FlexSDR consistently enhances demonstration efficiency and tagging accuracy across multiple LLMs. The findings demonstrate that adaptive demonstration retrieval, guided by reinforcement learning, can greatly reduce annotation requirements and enable scalable, high-quality knowledge tagging in educational technologies.
Abstract
Knowledge tagging for questions plays a crucial role in contemporary intelligent educational applications, including learning progress diagnosis, practice question recommendation, and course content organization. Traditionally, this annotation has been performed by pedagogical experts, as the task requires not only a strong semantic understanding of both question stems and knowledge definitions but also deep insight into how question-solving logic connects with the corresponding knowledge concepts. With the recent emergence of advanced text encoding algorithms, such as pre-trained language models, many researchers have developed automatic knowledge tagging systems that compute the semantic similarity between knowledge and question embeddings. In this paper, we explore automating the task with Large Language Models (LLMs), in response to the inability of prior encoding-based methods to handle hard cases that involve strong domain knowledge and complicated concept definitions. By showing strong zero- and few-shot performance on math question knowledge tagging tasks, we demonstrate LLMs' potential to overcome the challenges faced by prior methods. Furthermore, by proposing a reinforcement learning-based demonstration retriever, we improve the performance of LLMs of different sizes while keeping the use of in-context demonstrations efficient.
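To make the encoding-based baseline mentioned above concrete, the following is a minimal sketch of tagging by embedding similarity. It is not the authors' system or any specific prior method: the `embed` function is a toy bag-of-words stand-in for a pre-trained encoder, and the names `tag_question`, `knowledge`, and the `threshold` value are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a pre-trained encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_question(question: str, knowledge: dict[str, str], threshold: float = 0.5) -> list[str]:
    """Return knowledge names whose definition is similar enough to the question.

    `threshold` is a hypothetical cut-off; a real system would tune it on labeled data.
    """
    q_vec = embed(question)
    scores = {name: cosine(q_vec, embed(definition)) for name, definition in knowledge.items()}
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [name for name, score in ranked if score >= threshold]


if __name__ == "__main__":
    # Hypothetical knowledge definitions, not taken from MathKnowCT.
    knowledge = {
        "fraction_addition": "add two fractions with the same denominator",
        "area_rectangle": "compute the area of a rectangle from length and width",
    }
    print(tag_question("add the fractions 1/4 and 2/4 with the same denominator", knowledge))
```

Because a matcher of this kind relies on surface overlap between the question and the definition text, it has no way to check whether the solution steps actually satisfy a definition's constraints, which is the kind of hard case the abstract attributes to encoding-based methods.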
