Visual Analysis of GitHub Issues to Gain Insights
Rifat Ara Proma, Paul Rosen
TL;DR
The paper tackles the gap in GitHub’s textual issue–commit presentation by introducing a D3.js-based visualization prototype that links issue timelines, related commits, and updated files. It presents three interconnected interfaces (Timeline View, Issue Graph, Summary of Updated Files) and details data collection from the GitHub API to enable these visuals. Through case studies of freeCodeCamp, Hyprland, and Javascript, plus a small user study, it demonstrates that the visualizations can improve interpretation speed and clarity of development patterns. The work highlights practical benefits for issue management and bottleneck identification, while outlining avenues for future enhancement such as comment analysis and support for reopened issues. Overall, the tool offers a concrete, interactive approach to enhance repository understanding and planning decisions for developers.
Abstract
Version control systems are integral to software development, and GitHub has emerged as a popular online platform thanks to its comprehensive project management tools, including issue tracking and pull requests. However, GitHub lacks a direct link between issues and commits, making it difficult for developers to understand how specific issues are resolved. Although GitHub's Insights page provides some visualizations of repository data, issue- and commit-related data is presented in a textual format, which hampers quick evaluation of issue management. This paper presents a prototype web application that generates visualizations to offer insights into issue timelines and reveal different factors related to issues. It focuses on the lifecycle of issues and depicts vital information to enhance users' understanding of development patterns in their projects. We demonstrate the effectiveness of our approach through case studies involving three open-source GitHub repositories. Furthermore, we conducted a user evaluation to validate the efficacy of our prototype in conveying crucial repository information more quickly and efficiently.
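To make the idea of linking issues to commits concrete, the following is a minimal sketch and not the authors' implementation. It assumes issue and commit records have already been fetched and flattened into plain dictionaries. The field names `number`, `created_at`, `closed_at`, and `sha` mirror the GitHub REST API, while the flat `message` field, the helper names, and the sample data are hypothetical. It also assumes that a commit relates to an issue when its message mentions `#<issue number>`, which is only one possible heuristic.

```python
import re
from datetime import datetime

# Assumption: a commit is related to an issue if its message mentions "#<number>".
ISSUE_REF = re.compile(r"#(\d+)")


def parse_ts(ts):
    """Parse a UTC timestamp in the form GitHub returns, e.g. 2023-01-01T00:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def link_commits_to_issues(issues, commits):
    """Map each issue number to its record and the SHAs of commits that mention it."""
    by_number = {i["number"]: {"issue": i, "commits": []} for i in issues}
    for commit in commits:
        # Deduplicate so a commit that repeats "#12" is listed once for issue 12.
        for ref in set(ISSUE_REF.findall(commit["message"])):
            entry = by_number.get(int(ref))
            if entry is not None:
                entry["commits"].append(commit["sha"])
    return by_number


def issue_lifetime_days(issue):
    """Return days between creation and closing, or None if the issue is still open."""
    if issue.get("closed_at") is None:
        return None
    delta = parse_ts(issue["closed_at"]) - parse_ts(issue["created_at"])
    return delta.total_seconds() / 86400


# Hypothetical sample data, for illustration only.
issues = [
    {"number": 12, "created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-03T12:00:00Z"},
    {"number": 15, "created_at": "2023-01-02T00:00:00Z", "closed_at": None},
]
commits = [
    {"sha": "a1b2c3", "message": "Fix crash on empty input (closes #12)"},
    {"sha": "d4e5f6", "message": "Refactor parser"},
    {"sha": "0719aa", "message": "Follow-up for #12 and #15"},
]

linked = link_commits_to_issues(issues, commits)
lifetimes = {n: issue_lifetime_days(e["issue"]) for n, e in linked.items()}
```

A structure like `linked`, together with per-issue lifetimes, is the kind of input a timeline or issue-graph view could be drawn from. A real pipeline would also need to handle pagination, rate limits, and pull requests, which this sketch omits.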
