GitHub Stargazers | Building Graph- and Edge-level Prediction Algorithms for Developer Social Networks
Karishma Thakrar, Aniket Chauhan
TL;DR
The paper segments GitHub developer networks into web development and machine learning communities and predicts potential collaborations between developers. It uses Graph Convolutional Networks (GCNs) for graph classification and GraphSAGE for edge-level link prediction, with a Random Forest applied to the GCN embeddings to improve classification performance. Results indicate moderate classification performance (AUC ≈ 0.74) and effective edge recommendations across graphs, highlighting the utility of graph-based analysis for open-source communities. The study provides a scalable framework for market analysis and engagement recommendations within developer networks, with clear paths for incorporating richer features and temporal dynamics in future work.
Abstract
Analyzing social networks formed by developers supports market segmentation, trend analysis, and community engagement. In this study, we use the GitHub Stargazers dataset to classify developer communities and predict potential collaborations with graph neural networks (GNNs). We model 12,725 developer networks and segment them by whether they focus on web development or machine learning repositories, leveraging graph attributes and node embeddings. We also propose an edge-level recommendation algorithm that predicts new connections between developers using similarity measures. Our experimental results demonstrate the effectiveness of the approach in segmenting communities and improving connection predictions, providing a basis for understanding open-source developer networks.
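To make the edge-level recommendation idea concrete, the following is a minimal sketch of similarity-based link prediction. It assumes that each developer already has a node embedding (for example, one produced by a GNN) and that candidate connections are ranked by cosine similarity. The abstract only says "similarity measures", so the choice of cosine similarity, the function names `cosine` and `recommend_edges`, and the toy embeddings are all illustrative assumptions rather than the authors' implementation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_edges(embeddings, edges, top_k=3):
    """Rank currently unconnected node pairs by embedding similarity.

    embeddings: dict mapping node id -> list of floats (assumed precomputed)
    edges: iterable of undirected (u, v) pairs already present in the graph
    Returns up to top_k tuples (u, v, score), highest score first.
    """
    existing = {frozenset(e) for e in edges}
    candidates = [
        (cosine(embeddings[u], embeddings[v]), u, v)
        for u, v in combinations(sorted(embeddings), 2)
        if frozenset((u, v)) not in existing
    ]
    candidates.sort(key=lambda t: t[0], reverse=True)
    return [(u, v, score) for score, u, v in candidates[:top_k]]


# Hypothetical toy graph: four developers with 2-dimensional embeddings.
embeddings = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.1, 0.9],
}
edges = [(0, 2), (1, 3)]  # connections that already exist

recommendations = recommend_edges(embeddings, edges, top_k=2)
for u, v, score in recommendations:
    print(f"suggest {u} -- {v} (similarity {score:.3f})")
```

In this toy setup the two suggested pairs are the developers whose embeddings point in nearly the same direction but who are not yet connected. Scoring every unconnected pair is quadratic in the number of nodes, so a sketch like this is only practical for small graphs such as the individual developer networks described here.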
