The Strength of Weak Ties Between Open-Source Developers

Hongbo Fang, Patrick Park, James Evans, James Herbsleb, Bogdan Vasilescu

TL;DR

This study extends Granovetter's strength-of-weak-ties theory to open-source software by examining how developers' past interactions across three GitHub-based networks relate to the innovativeness of their subsequent code. It introduces a software-innovativeness measure based on atypical package recombinations, estimates knowledge-access diversity via Node2Vec embeddings, and uses PCA to separate strong-tie and weak-tie effects. The key finding is that the topical diversity accessed through weak ties, not the volume of interaction, is positively associated with future project innovativeness; this result is robust to alternative definitions of core developers and to alternative interaction windows. The work has practical implications for platform design and highlights lurking (starring) as a meaningful channel for creative inspiration in software projects, while contributing a scalable methodology for measuring creative potential in large OSS ecosystems.
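As a rough illustration of how PCA can separate a shared diversity signal from a weak-versus-strong contrast, the sketch below runs a two-variable PCA on synthetic data. The function name `diversity_components`, the inputs `div_strong` and `div_weak`, and the synthetic data are all assumptions made for illustration; the paper's actual variables and preprocessing may differ. For two standardized, positively correlated variables, the first component loads equally on both (an average) and the second loads with opposite signs (a contrast).

```python
import numpy as np

def diversity_components(div_strong, div_weak):
    """Two-variable PCA on standardized inputs; returns (scores, loadings).

    Hypothetical helper: div_strong and div_weak are per-project diversity
    values for strong and weak ties (names assumed, not from the paper).
    """
    X = np.column_stack([div_strong, div_weak]).astype(float)
    # Standardize each column so both measures are on the same scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Eigen-decompose the 2x2 correlation matrix (symmetric, so eigh applies).
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest variance first
    loadings = eigvecs[:, order]
    # Resolve sign ambiguity: make the weak-tie loading positive on both PCs.
    loadings = loadings * np.sign(loadings[1, :])
    return Z @ loadings, loadings

# Synthetic, positively correlated diversity values (illustrative only).
rng = np.random.default_rng(0)
shared = rng.normal(size=500)
div_strong = shared + 0.5 * rng.normal(size=500)
div_weak = shared + 0.5 * rng.normal(size=500)

scores, loadings = diversity_components(div_strong, div_weak)
print(loadings)  # column 0: average-like component; column 1: weak-minus-strong contrast
```

Under these assumptions, a high score on the second component marks a project whose weak ties are more diverse than its strong ties, which is the kind of contrast the TL;DR attributes to the PCA step.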

Abstract

In a real-world social network, weak ties (reflecting low-intensity, infrequent interactions) act as bridges and connect people to different social circles, giving them access to diverse information and opportunities that are not available within one's immediate, close-knit vicinity. Weak ties can be crucial for creativity and innovation, as they introduce ideas and approaches that people can then combine in novel ways, leading to innovative solutions. Do weak ties facilitate creativity in software in similar ways? This paper suggests that the answer is "yes." Concretely, we study the correlation between developers' knowledge acquisition through three distinct interaction networks on GitHub and the innovativeness of the projects they develop, across over 37,000 Python projects hosted on GitHub. Our findings suggest that the topical diversity of projects in which developers engage, rather than the volume, correlates positively with the innovativeness of their future code. Notably, exposure through weak interactions (e.g., starring) emerges as a stronger predictor of future novelty than exposure through strong ones (e.g., committing).

Paper Structure

This paper contains 18 sections, 7 figures, 9 tables.

Figures (7)

  • Figure 1: t-SNE visualization of the embedding space for weak ties (details in Section \ref{sec:network-variables}), depicting all the weak ties of the https://github.com/opengeoscience/geonotebook Python project in our sample. We highlight some that seem influential for the design of the focal project.
  • Figure 2: Two core developers Green and Orange started contributing to a focal project $A$ on different dates. We record an edge from $A$ to $B$ (i.e., $B$ could be a source of knowledge for $A$), because Green interacted with project $B$ in the previous year. We don't record an edge from $A$ to $C$, because Orange interacted with $C$ too far into the past.
  • Figure 3: Illustration of the four possible combinations of low-high values for $PC1_{\text{diversity}}$ (corresponding to average tie diversity) and $PC2_{\text{diversity}}$ (corresponding to the relative diversity attributable to weak ties).
  • Figure 4: "Awesome" projects have statistically significantly higher atypicality scores than the rest of our sample. The red dots represent the distribution means.
  • Figure 5: The effects of the two diversity of knowledge variables ($\textit{Div}_{ave}$ testing H$_{\ref{hyp2}}$ and $\textit{Div}_{weakness}$ testing H$_{\ref{hyp3}}$) are generally robust. Error bars denote 95% confidence intervals.
  • ...and 2 more figures
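
The edge-recording rule described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `knowledge_edges`, the record layout, the example dates, the 365-day reading of "the previous year", and the treatment of the window boundaries are all assumptions.

```python
from datetime import date, timedelta

# Assumption: "the previous year" is taken as the 365 days before a developer's join date.
WINDOW = timedelta(days=365)

def knowledge_edges(focal, join_dates, interactions, window=WINDOW):
    """Return the set of (focal, source) edges for one focal project.

    join_dates:   {developer: date they started contributing to the focal project}
    interactions: iterable of (developer, project, date) records
    All names and the record layout are hypothetical.
    """
    edges = set()
    for dev, project, when in interactions:
        joined = join_dates.get(dev)
        if joined is None or project == focal:
            continue  # not a core developer of the focal project, or a self-loop
        # Keep only interactions inside the window preceding this developer's join date.
        if joined - window <= when < joined:
            edges.add((focal, project))
    return edges

# Hypothetical dates mirroring the Green/Orange example in the caption.
join_dates = {"green": date(2020, 6, 1), "orange": date(2021, 3, 1)}
interactions = [
    ("green", "B", date(2020, 1, 15)),   # within the year before Green joined A
    ("orange", "C", date(2018, 5, 10)),  # too far in the past relative to Orange's join date
]

edges = knowledge_edges("A", join_dates, interactions)
print(edges)
```

Note that the window is anchored on each developer's own join date rather than on a single project-level date, which matches the caption's point that Green and Orange started contributing on different dates.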