Visual Analysis of GitHub Issues to Gain Insights

Rifat Ara Proma, Paul Rosen

TL;DR

The paper tackles the gap in GitHub’s textual issue–commit presentation by introducing a D3.js-based visualization prototype that links issue timelines, related commits, and updated files. It presents three interconnected interfaces (Timeline View, Issue Graph, Summary of Updated Files) and details data collection from the GitHub API to enable these visuals. Through case studies of freeCodeCamp, Hyprland, and Javascript, plus a small user study, it demonstrates that the visualizations can improve interpretation speed and clarity of development patterns. The work highlights practical benefits for issue management and bottleneck identification, while outlining avenues for future enhancement such as comment analysis and support for reopened issues. Overall, the tool offers a concrete, interactive approach to enhance repository understanding and planning decisions for developers.

Abstract

Version control systems are integral to software development, with GitHub emerging as a popular online platform due to its comprehensive project management tools, including issue tracking and pull requests. However, GitHub lacks a direct link between issues and commits, making it difficult for developers to understand how specific issues are resolved. Although GitHub's Insights page provides some visualization for repository data, the textual representation of issue- and commit-related data hampers quick evaluation of issue management. This paper presents a prototype web application that generates visualizations to offer insights into issue timelines and reveal different factors related to issues. It focuses on the lifecycle of issues and depicts vital information to enhance users' understanding of development patterns in their projects. We demonstrate the effectiveness of our approach through case studies involving three open-source GitHub repositories. Furthermore, we conducted a user evaluation to validate the efficacy of our prototype in conveying crucial repository information more efficiently and rapidly.

Paper Structure (19 sections, 7 figures)

Figures (7)

  • Figure 1: Timeline View illustrating the status of recent issues from the freeCodeCamp repository. The horizontal axis represents the date, while the bars' width corresponds to the issues' duration. Open issues are colored purple, while closed issues are colored green. Furthermore, open issues are accompanied by an arrow on the right to signify their ongoing nature and continuity. Additional information is revealed through tooltips when hovering the mouse over the bars.
  • Figure 2: Visualization of the issues created between May 24, 2023, and June 18, 2023, in the freeCodeCamp repository using our approach. The Timeline View, depicted in (a), showcases the associated label colors of the issues through alternating striped bars. (b) displays the Issue Graph corresponding to the closed issue with the longest resolution time. (c) shows the Summary of Updated Files view. On the left is a donut chart illustrating the top 5 files with the highest number of updated lines, while the sixth wedge represents the cumulative sum of updated lines in the remaining files. In the middle, a histogram provides insights into the distribution of files across different ranges of updates. Finally, the interface on the right enables users to filter or modify the donut chart based on their preferences. Gray boxes represent annotations in all the figures, while a mouse icon and light yellow boxes indicate tooltips.
  • Figure 3: Visualizations of the issues created between May 15, 2023, and June 18, 2023, in the Hyprland repository. The Timeline View shows a significant number of recent issues, where (a) indicates that the majority of the issues remain open, while (b) highlights that a significant portion of them are classified as bug reports. The top updated files are shown in (c) and (d). Finally, the Issue Graph, in (e), provides insight into the closed issue with the longest resolution time. Analyzing the commit messages made it possible to identify the specific commit responsible for fixing the issue, the contributor who made the commit, and the associated updated file.
  • Figure 4: Issue Graph of an open issue titled "search bar at top of page unable to find challenge" from the freeCodeCamp repository. The graph includes the "closed by" and "assignee" nodes for demonstration purposes. It should be noted that the issue did not have an assignee, and since it was open, the "closed by" information is not applicable. All other elements of the graph reflect real data.
  • Figure 5: Summary of Updated Files for the freeCodeCamp repository. The default view shows a donut chart (top left) with the most updated files. After a bar of the histogram (bottom) is clicked, the donut chart updates (top right).
  • ...and 2 more figures
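
The captions for Figures 1 and 2 describe two data-shaping rules: a timeline bar whose width encodes issue duration and whose color encodes open or closed status, and a donut chart with the top 5 files plus a sixth wedge summing the rest. The following is a minimal Python sketch of those two rules only, not the authors' D3.js implementation. The names `Issue`, `timeline_bar`, `donut_wedges`, and the "other files" label are hypothetical, and measuring an open issue up to a caller-supplied `today` is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Colors taken from the Figure 1 caption.
OPEN_COLOR = "purple"
CLOSED_COLOR = "green"


@dataclass
class Issue:
    """Hypothetical minimal issue record; not the GitHub API schema."""
    number: int
    created: date
    closed: Optional[date] = None  # None means the issue is still open


def timeline_bar(issue: Issue, today: date) -> dict:
    """Derive the bar attributes described for the Timeline View."""
    is_open = issue.closed is None
    # Assumption: an open issue's bar extends to the current date.
    end = today if is_open else issue.closed
    return {
        "start": issue.created,
        "duration_days": (end - issue.created).days,  # drives bar width
        "color": OPEN_COLOR if is_open else CLOSED_COLOR,
        "arrow": is_open,  # open issues get an arrow on the right
    }


def donut_wedges(updated_lines: dict, top_n: int = 5) -> list:
    """Return the top_n files by updated lines plus one aggregate wedge."""
    ranked = sorted(updated_lines.items(), key=lambda kv: kv[1], reverse=True)
    wedges = ranked[:top_n]
    remainder = sum(count for _, count in ranked[top_n:])
    if remainder > 0:
        # The extra wedge is the cumulative sum over all remaining files.
        wedges.append(("other files", remainder))
    return wedges
```

Keeping the aggregation separate from rendering means the same wedge list could feed any charting layer, and the `top_n` parameter mirrors the caption's note that users can filter or modify the donut chart.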