PyGress: Tool for Analyzing the Progression of Code Proficiency in Python OSS Projects

Rujiphart Charatvaraphan, Bunradar Chatchaiyadech, Thitirat Sukijprasert, Chaiyong Ragkhitwetsagul, Morakot Choetkiertikul, Raula Gaikovina Kula, Thanwadee Sunetnanta, Kenichi Matsumoto

TL;DR

PyGress addresses the lack of automated tooling to measure how Python code proficiency evolves in OSS by mapping code constructs to CEFR levels using the pycefr analyzer. The method analyzes complete GitHub commit histories, computes per-commit and per-contributor proficiency deltas, and visualizes project-wide and individual progress with interactive spider charts and timeline sliders. The paper demonstrates the approach on multiple OSS projects, revealing distributions skewed toward $A1$/$A2$ with period-specific high-proficiency contributions, and discusses implications for maintenance and bus-factor risk. This work provides a practical framework for understanding skill distribution in Python OSS and informs contributor mentoring and project sustainability.

Abstract

Assessing developer proficiency in open-source software (OSS) projects is essential for understanding project dynamics, especially how expertise is distributed among contributors. This paper presents PyGress, a web-based tool designed to automatically evaluate and visualize Python code proficiency using pycefr, a Python code proficiency analyzer. By submitting a GitHub repository link, the system extracts commit histories, analyzes source code proficiency across CEFR-aligned levels (A1 to C2), and generates visual summaries of individual and project-wide proficiency. The PyGress tool visualizes per-contributor proficiency distribution and tracks project code proficiency progression over time. PyGress offers an interactive way to explore contributor coding levels in Python OSS repositories. The video demonstration of the PyGress tool can be found at https://youtu.be/hxoeK-ggcWk, and the source code of the tool is publicly available at https://github.com/MUICT-SERU/PyGress.
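
To make the idea of per-commit and per-contributor proficiency deltas concrete, the following is a minimal sketch and not PyGress's actual implementation. It assumes each commit has already been analyzed into counts of constructs per CEFR level before and after the change; the record layout (`author`, `before`, `after`), the function names, and the choice to credit only positive deltas to a contributor are all hypothetical and do not reflect the real pycefr output format.

```python
from collections import Counter, defaultdict

# CEFR-aligned levels named in the abstract (A1 to C2).
LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")


def commit_delta(before, after):
    """Per-level change in construct counts introduced by one commit.

    `before` and `after` are hypothetical dicts mapping a level to the
    number of constructs at that level; missing levels count as 0.
    """
    return {lvl: after.get(lvl, 0) - before.get(lvl, 0) for lvl in LEVELS}


def aggregate_by_contributor(commits):
    """Sum each author's positive per-commit deltas, level by level.

    Assumption: only additions are credited, so deleting code does not
    lower a contributor's profile.
    """
    totals = defaultdict(Counter)
    for commit in commits:
        delta = commit_delta(commit["before"], commit["after"])
        for lvl, change in delta.items():
            if change > 0:
                totals[commit["author"]][lvl] += change
    # Return plain dicts with every level present, ready for a spider chart.
    return {
        author: {lvl: counts.get(lvl, 0) for lvl in LEVELS}
        for author, counts in totals.items()
    }
```

Given a chronologically ordered list of such commit records, `aggregate_by_contributor` yields one six-value profile per author, which is the kind of data a per-contributor spider chart needs. Slicing the commit list by date before aggregating would give the input for a timeline view.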

Paper Structure

This paper contains 14 sections, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Current state-of-the-art only depicts contributions over time. This example is from the django-silk project.
  • Figure 2: Approach for analyzing code proficiency changes
  • Figure 3: System architecture of PyGress
  • Figure 4: django-silk: Aggregated proficiencies (project level)
  • Figure 5: django-silk: Individual contributor's proficiency