Knowledge Islands: Visualizing Developers Knowledge Concentration
Otávio Cury, Guilherme Avelino
TL;DR
Knowledge Islands addresses the risk of knowledge concentration in software projects. It is an open-source, web-based tool that analyzes development history to identify experts and compute the Truck Factor at the repository, module, and file levels. It uses the Degree of Expertise score $DOE(d,f(v))$ and the greedy AVL Truck Factor algorithm to generate hierarchical visualizations of knowledge islands, helping managers mitigate risk and guide onboarding. The tool is built on React (frontend), Spring Boot (backend), PostgreSQL, and JGit, and exposes REST endpoints for asynchronous cloning and analysis of GitHub repositories. Planned future work includes GitHub OAuth integration, enhanced visualizations, and support for additional knowledge models, with the aim of improving usability and coverage. Overall, Knowledge Islands offers a data-driven way to strengthen project continuity by pinpointing critical knowledge holders and files.
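For readers unfamiliar with the Degree of Expertise score, the form usually quoted in the Truck Factor literature is given below. The coefficients are not stated in this summary; they are reproduced from prior work on the model and should be treated as an assumption here rather than as the tool's confirmed configuration.

$$
DOE(d, f(v)) = 3.293 + 1.098 \cdot FA + 0.164 \cdot DL - 0.321 \cdot \ln(1 + AC)
$$

Here $FA$ is 1 if developer $d$ created file $f$ and 0 otherwise, $DL$ is the number of changes $d$ made to $f$ up to version $v$, and $AC$ is the number of changes made to $f$ by all other developers. Under this form, a developer who created a file and changed it 10 times, while others changed it 4 times, would score about $3.293 + 1.098 + 1.640 - 0.321 \cdot \ln 5 \approx 5.51$.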
Abstract
Modern software development is often a collaborative activity in which various situations can put a project's survival at risk. One common issue, extensively studied in the software engineering literature, is the concentration of a significant portion of knowledge about the source code in a few developers on a team. In this scenario, the departure of one of these key developers could make it impossible to continue the project. This work presents Knowledge Islands, a tool that visualizes the concentration of knowledge in a software repository using a state-of-the-art knowledge model. Key features of Knowledge Islands include user authentication, cloning and asynchronous analysis of user repositories, identification of the expertise of the team's developers, calculation of the Truck Factor for all folders and source code files, and identification of the main developers and files in the repository. This open-source tool enables practitioners to analyze GitHub projects, determine where knowledge is concentrated within the development team, and take measures to maintain project health. The source code of Knowledge Islands is available in a public repository, and a video presentation of the tool is also available.
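The Truck Factor calculation for folders and files can be illustrated with a minimal sketch of a greedy heuristic: repeatedly remove the developer who authors the most files until more than half of the files have no author left. This is not the tool's actual code. The function names `truck_factor` and `truck_factor_by_folder`, the 50% threshold parameter, and the sample data are assumptions for illustration, and the mapping from files to authors is taken as a given input rather than derived from expertise scores.

```python
from collections import Counter


def truck_factor(file_authors, orphan_threshold=0.5):
    """Greedy Truck Factor estimate (illustrative sketch).

    file_authors maps each file path to the set of developers
    considered authors of that file.
    Returns (truck_factor, developers_removed_in_order).
    """
    # Copy the sets so the caller's data is not mutated.
    remaining = {path: set(devs) for path, devs in file_authors.items()}
    total = len(remaining)
    removed = []
    if total == 0:
        return 0, removed
    while True:
        orphaned = sum(1 for devs in remaining.values() if not devs)
        if orphaned > orphan_threshold * total:
            break  # more than the threshold share of files has no author
        counts = Counter(d for devs in remaining.values() for d in devs)
        if not counts:
            break  # nobody left to remove
        # Developer authoring the most files; ties broken alphabetically.
        top = max(sorted(counts), key=counts.get)
        removed.append(top)
        for devs in remaining.values():
            devs.discard(top)
    return len(removed), removed


def truck_factor_by_folder(file_authors):
    """Truck Factor for the repository root (".") and every folder."""
    folders = {}
    for path, devs in file_authors.items():
        parts = path.split("/")[:-1]
        # Register the file under the root and each ancestor folder.
        for depth in range(len(parts) + 1):
            folder = "/".join(parts[:depth]) or "."
            folders.setdefault(folder, {})[path] = devs
    return {folder: truck_factor(files)[0] for folder, files in folders.items()}


# Hypothetical sample data, not taken from any real repository.
example = {
    "src/core/a.py": {"alice"},
    "src/core/b.py": {"alice", "bob"},
    "src/ui/c.py": {"carol"},
    "docs/d.md": {"alice"},
}
print(truck_factor(example))
print(truck_factor_by_folder(example))
```

In this sample, removing `alice` orphans exactly half of the files, which does not exceed the threshold, so a second developer must be removed and the repository-level estimate is 2. Folders with a single author, such as `src/ui`, have a Truck Factor of 1 and are the natural candidates for knowledge-sharing measures.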
