QACP: An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners

Rui Xiao; Lu Han; Xiaoying Zhou; Jiong Wang; Na Zong; Pengyu Zhang

TL;DR

QACP tackles data scarcity in Chinese Python education by producing a large, annotated, single-turn Q&A dataset drawn from real learner questions. It provides 10,960 annotated Q&A pairs drawn from 50,247 collected questions, with quality-controlled annotations that include accessible explanations, analogies, and code examples. The authors benchmark a range of Chinese-capable LLMs on two tasks and find that general-purpose LLMs struggle with professional Python content: GPT-4 performs best but still leaves meaningful gaps. The work lays a foundation for developing a specialized Chinese Python teaching assistant and highlights how strongly model performance depends on high-quality educational data.

Abstract

On online learning platforms, particularly in rapidly growing computer programming courses, answering thousands of students' learning queries requires considerable human effort. Building intelligent assistants based on large language models (LLMs) and tailored to programming education requires dedicated data support. However, in real application scenarios, the data resources for training such LLMs are relatively scarce. To address this data scarcity in intelligent educational systems for programming, this paper proposes a new Chinese question-and-answer dataset for Python learners. To ensure that the questions are authentic and reliable, we collected them from actual students and categorized them along several dimensions, such as the type of question and the type of learner. This annotation principle is designed to enhance the effectiveness and quality of online programming education, providing a solid data foundation for developing programming teaching assistants (TAs). Furthermore, we conducted comprehensive evaluations of various LLMs proficient in processing and generating Chinese content, highlighting the potential limitations of general LLMs as intelligent teaching assistants in computer programming courses.

Paper Structure

This paper contains 16 sections, 1 equation, 3 figures, and 4 tables.

Introduction
Related Work
Enhancing LLMs' Generation Capabilities with Data Support
LLMs as Programming Teaching Assistants
Dataset Construction
Query Collection
Annotation Principle and Quality Control
Data Analysis
Experiment
Benchmark Models
Data Sampling
Evaluation Principles
Experiment Results and Analysis
Python Problem-Solving Ability Evaluation of LLMs
Python Answer Reasoning Capability Analysis of LLMs
...and 1 more section

Figures (3)

Figure 1: An example of GPT-4 responding to Python-related questions (tested on January 8, 2024).
Figure 2: The data construction pipeline of the QACP dataset.
Figure 3: The distribution of knowledge points for the collected Python questions.
