Characterising Developer Sentiment in Software Components: An Exploratory Study of Gentoo

Tien Rahayu Tulili, Ayushi Rastogi, Andrea Capiluppi

TL;DR

The study addresses how developer sentiment, captured at sentence level, relates to the evolution of software components in Gentoo over a 23-year span (2001–2023). It combines sentence-level sentiment analysis of mailing-list messages with path-level and grain-level commit data to identify DWNs and DWPs and to quantify emotion across components, revealing a long-term decrease in negative emotional expression. The results show that certain grains and paths (e.g., core development areas) experience persistent negativity and that higher activity on negative paths correlates with greater commit counts, while overall sentiment has shifted toward more constructive communication. The findings offer practical guidance for fostering healthier OSS collaboration and provide a framework for extending sentiment–development analyses to other projects and richer emotion taxonomies in future work.

Abstract

Collaborative software development happens in teams that cooperate on shared artefacts and discuss development on online platforms. Due to the complexity of development and the variety of teams, software components often act as effective containers for parallel work and teams. Past research has shown how communication between team members, especially in an open-source environment, can become extremely toxic and lead to members leaving the development team. This has a direct effect on the evolution and maintenance of the project in which the former members were active. The purpose of our study is two-fold: first, we propose an approach to evaluate, at a finer granularity, the positive and negative emotions in the communication between developers; and second, we aim to characterise a project's development paths, or components, as more or less impacted by those emotions. Our analysis evaluates single sentences, rather than whole messages, as the finest granularity of communication. A previous study found that high positivity or negativity at the sentence level may indirectly impact the writer or the reader. In this way, we could highlight specific paths of Gentoo as the most affected by negative emotions, and show how negative emotions have evolved and changed along the same paths. By joining the analysis of the mailing lists, from which we derive the sentiment of the developers, with the information derived from the development logs, we obtained a longitudinal picture of how development paths have been historically affected by positive or negative emotions. Our study shows that, in recent years, negative emotions have generally decreased in the communication between Gentoo developers. We also show how file paths, as collaborative software development artefacts, were more or less impacted by the emotions of the developers.
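
The join between sentence-level sentiment and development logs described in the abstract can be illustrated with a minimal sketch. It assumes that sentences have already been labelled by some classifier, and that messages are linked to paths through the author and the year. Both are assumptions made for illustration, since the abstract does not state the linking rule. All names and sample rows below are hypothetical.

```python
from collections import defaultdict

# Hypothetical labelled sentences: (year, author, message_id, sentence_label)
sentences = [
    (2005, "alice", "m1", "negative"),
    (2005, "alice", "m1", "neutral"),
    (2005, "bob", "m2", "positive"),
    (2005, "alice", "m3", "negative"),
    (2006, "bob", "m4", "negative"),
]

# Hypothetical commit log rows: (year, author, path)
commits = [
    (2005, "alice", "sys-apps"),
    (2005, "bob", "dev-lang"),
    (2005, "alice", "dev-lang"),
    (2006, "bob", "dev-lang"),
]


def messages_with_label(sentences, label):
    """Count, per (year, author), the messages with at least one sentence of `label`."""
    seen = defaultdict(set)
    for year, author, message_id, sentence_label in sentences:
        if sentence_label == label:
            seen[(year, author)].add(message_id)
    return {key: len(ids) for key, ids in seen.items()}


def attribute_to_paths(message_counts, commits):
    """Add each author's yearly message count to every path they committed to that year.

    Assumption: author and year are the join keys between the two data sources.
    """
    paths_by_author = defaultdict(set)
    for year, author, path in commits:
        paths_by_author[(year, author)].add(path)

    totals = defaultdict(int)
    for key, count in message_counts.items():
        year, _author = key
        for path in paths_by_author.get(key, ()):
            totals[(year, path)] += count
    return dict(totals)


negative_by_path = attribute_to_paths(
    messages_with_label(sentences, "negative"), commits
)
```

A message counts once per label however many matching sentences it contains, which mirrors the "messages containing negative sentences" wording used in the figure captions. Other linking rules, such as matching on thread topic or file mentions, would change the totals.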

Paper Structure (23 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Total number of messages containing positive sentences (blue), and messages containing negative sentences (red).
  • Figure 2: Boxplots for the top 10 active developers, DWNs, and DWPs, showing the relative number of commits made (a) and of positive and negative sentences written (b).
  • Figure 3: Heatmaps of the number of messages containing negative or positive sentences, by grain, with standard normalisation (z-score) applied yearly
  • Figure 4: Heatmaps of the number of messages containing negative or positive sentences, by path, with standard normalisation applied yearly
  • Figure 5: Bar graphs of the number of commits for the top ten negative paths (red bars) and the top ten positive paths (blue bars)
  • ...and 2 more figures