AI-Driven Grading and Moderation for Collaborative Projects in Computer Science Education

Songmei Yu, Andrew Zagula

TL;DR

Collaborative group projects in CS education are hard to grade fairly at scale. The paper introduces an AI-driven grading system that jointly evaluates overall project quality and individual contributions using repository mining, communication analytics, and machine learning. The architecture has three parts: the Project Quality Assessment Module (PQAM), the Individual Contribution Analyzer (ICA), and GE for final grade computation. The system was validated in a Fall 2024 pilot with 20 students. Results show strong alignment with instructor assessments ($r = 0.91$), high student-perceived fairness and transparency, and a 45% reduction in instructor grading effort. The paper also discusses ethical considerations and future enhancements.
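
To make the alignment figure concrete, the following is a minimal sketch of how a value such as $r = 0.91$ could be computed, assuming $r$ denotes the Pearson correlation coefficient between system-generated and instructor-assigned grades (the summary does not name the statistic). The function name `pearson_r` and the grade lists are hypothetical and purely illustrative; they are not data from the pilot.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical grades for illustration only (NOT the pilot data).
system_grades = [88.0, 92.5, 75.0, 81.0, 95.0, 68.5]
instructor_grades = [85.0, 94.0, 78.0, 80.0, 93.0, 72.0]

if __name__ == "__main__":
    r = pearson_r(system_grades, instructor_grades)
    print(f"Pearson r between system and instructor grades: {r:.2f}")
```

A value near 1 means the system ranks and spaces students much as the instructor does. Note that a high correlation alone does not rule out a constant offset between the two sets of grades.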

Abstract

Collaborative group projects are integral to computer science education, as they foster teamwork, problem-solving skills, and industry-relevant competencies. However, assessing individual contributions within group settings has long been a challenge. Traditional assessment strategies, such as the equal distribution of grades or subjective peer assessments, often fall short in terms of fairness, objectivity, and scalability, particularly in large classrooms. This paper introduces a semi-automated, AI-assisted grading system that evaluates both project quality and individual effort using repository mining, communication analytics, and machine learning models. The system comprises modules for project evaluation, contribution analysis, and grade computation, integrating seamlessly with platforms like GitHub. A pilot deployment in a senior-level course demonstrated high alignment with instructor assessments, increased student satisfaction, and reduced instructor grading effort. We conclude by discussing implementation considerations, ethical implications, and proposed enhancements to broaden applicability.

Paper Structure

This paper contains 38 sections, 1 equation, 3 figures.

Figures (3)

  • Figure 1: Structure of the Project Quality Assessment Module (PQAM), highlighting submodules contributing to the Project Quality Score.
  • Figure 2: Architecture of the Individual Contribution Analyzer (ICA) illustrating how repository and communication artifacts are processed through analytical components to produce the normalized individual contribution score.
  • Figure 3: Workflow of the AI-assisted grading system. Steps 1–4 (blue) form the automated pipeline; anomalies (amber) trigger manual review and dashboard oversight (gray-blue), while normal cases proceed directly.