Knowledge Tagging with Large Language Model based Multi-Agent System

Hang Li, Tianlong Xu, Ethan Chang, Qingsong Wen

TL;DR

The paper tackles automatic knowledge tagging for educational questions by introducing an LLM-based multi-agent system (MAS) that decomposes complex knowledge definitions into sub-tasks handled by specialized agents. The four agents—task planner, semantic judger, numerical judger, and question solver—collaborate, with numerical constraints executed via Python tooling to ensure precise judgments. Experiments on MathKnowCT show that the MAS can achieve competitive accuracy and precision, particularly for base-sized LLMs, and offer cost-efficient advantages over large single-model baselines. Industrial deployment in K-12 settings demonstrates substantial labeling cost savings and improved content quality and diagnostic capabilities, underscoring the practical value of automated, scalable knowledge tagging in education.
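The four-agent decomposition described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `SubTask` structure, the toy knowledge concept ("addition within 20"), and the rule-based stand-ins for LLM calls are all assumptions made for illustration. The one idea taken from the paper is that numerical constraints are checked by executing Python rather than by asking a model to judge them in text.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubTask:
    kind: str                      # "semantic" or "numerical"
    description: str               # human-readable constraint
    check: Callable[[Dict], bool]  # stand-in for the judging agent's logic


def task_planner(definition: str) -> List[SubTask]:
    # Hypothetical planner: a real system would prompt an LLM to decompose
    # `definition`. Here the decomposition is hard-coded for one toy concept.
    return [
        SubTask("semantic", "question asks for an addition",
                lambda ctx: "+" in ctx["stem"]),
        SubTask("numerical", "sum does not exceed 20",
                lambda ctx: ctx["answer"] <= 20),
    ]


def question_solver(stem: str) -> Dict:
    # Hypothetical solver: stands in for an LLM that works out the answer.
    # It simply extracts the integers in the stem and adds them.
    operands = [int(n) for n in re.findall(r"\d+", stem)]
    return {"stem": stem, "operands": operands, "answer": sum(operands)}


def semantic_judger(sub: SubTask, ctx: Dict) -> bool:
    # Hypothetical: a real semantic judger would be an LLM prompt.
    return sub.check(ctx)


def numerical_judger(sub: SubTask, ctx: Dict) -> bool:
    # The numerical constraint is evaluated as ordinary Python code.
    return sub.check(ctx)


def tag(stem: str, definition: str) -> bool:
    # A question matches the knowledge concept only if every sub-task passes.
    ctx = question_solver(stem)
    judgers = {"semantic": semantic_judger, "numerical": numerical_judger}
    return all(judgers[sub.kind](sub, ctx) for sub in task_planner(definition))
```

Under these assumptions, `tag("7 + 8 = ?", "Addition within 20")` passes both sub-tasks, while `tag("17 + 8 = ?", "Addition within 20")` fails the numerical check and `tag("9 - 4 = ?", "Addition within 20")` fails the semantic one.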

Abstract

Knowledge tagging for questions is vital in modern intelligent educational applications, including learning progress diagnosis, practice question recommendations, and course content organization. Traditionally, these annotations have been performed by pedagogical experts, as the task demands not only a deep semantic understanding of question stems and knowledge definitions but also a strong ability to link problem-solving logic with relevant knowledge concepts. With the advent of advanced natural language processing (NLP) algorithms, such as pre-trained language models and large language models (LLMs), pioneering studies have explored automating the knowledge tagging process using various machine learning models. In this paper, we investigate the use of a multi-agent system to address the limitations of previous algorithms, particularly in handling complex cases involving intricate knowledge definitions and strict numerical constraints. By demonstrating its superior performance on the publicly available math question knowledge tagging dataset, MathKnowCT, we highlight the significant potential of an LLM-based multi-agent system in overcoming the challenges that previous methods have encountered. Finally, through an in-depth discussion of the implications of automating knowledge tagging, we underscore the promising results of deploying LLM-based algorithms in educational contexts.

Paper Structure

This paper contains 22 sections, 1 equation, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: A summary of the existing algorithms for the automatic tagging task.
  • Figure 2: An overview of the proposed LLM-based multi-agent system for knowledge tagging. The semantic and numerical constraints in the knowledge definition and the decomposed sub-tasks are marked with corresponding colors.
  • Figure 3: An example of step-wise outputs from different LLM-based agents.