A Hierarchical-DBSCAN Method for Extracting Microservices from Monolithic Applications
Khaled Sellami, Mohamed Aymen Saied, Ali Ouni
TL;DR
The paper tackles the challenge of migrating monolithic applications to microservices by automating the extraction of microservices from source code. It introduces a Hierarchical-DBSCAN approach that fuses static-structure signals (call relations) with semantic domain signals (TF-IDF of domain terms) into a Class Similarity score, and applies an $\epsilon$-DBSCAN variant to generate a hierarchical decomposition with outlier detection. Empirical evaluation on multiple open-source projects shows the method achieves cohesive microservices with fewer inter-service interactions, often matching or surpassing human-designed decompositions and outperforming several baselines on key metrics, though some cases exhibit higher Non-Extreme Distribution values. The hierarchical output provides a tunable, interpretable view of candidate microservices that supports developer-driven refinement, making automated decomposition more practical and scalable for real-world migrations.
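As a rough illustration of how a structural signal and a semantic signal might be fused into one class-to-class score, here is a minimal sketch. The weighting parameter `alpha`, the call-ratio formula in `structural_similarity`, the smoothed idf, and the toy class names and terms are all assumptions made for this example; they are not the paper's exact definitions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each class name to a {term: tf-idf weight} dict (smoothed idf)."""
    n = len(docs)
    df = Counter(t for terms in docs.values() for t in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            t: (count / len(terms)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def structural_similarity(calls, a, b):
    """Assumed formula: calls between a and b over all calls touching a or b."""
    between = calls.get((a, b), 0) + calls.get((b, a), 0)
    involved = sum(c for (src, dst), c in calls.items() if {src, dst} & {a, b})
    return between / involved if involved else 0.0

def class_similarity(a, b, vectors, calls, alpha=0.5):
    """Weighted blend of structural and semantic similarity (alpha is hypothetical)."""
    structural = structural_similarity(calls, a, b)
    semantic = cosine(vectors[a], vectors[b])
    return alpha * structural + (1 - alpha) * semantic

# Toy input: domain terms per class and call counts between classes (made up).
docs = {
    "OrderService": ["order", "cart", "payment"],
    "OrderRepository": ["order", "cart"],
    "UserService": ["user", "login"],
}
calls = {("OrderService", "OrderRepository"): 3, ("UserService", "OrderService"): 1}
vectors = tfidf_vectors(docs)
```

In this toy input, `OrderService` and `OrderRepository` share both calls and vocabulary, so they score higher than either does with `UserService`.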
Abstract
The microservices architectural style offers many advantages such as scalability, reusability, and maintainability. As such, microservices have become a common architectural choice when developing new applications. To benefit from these advantages, monolithic applications need to be redesigned in order to migrate to a microservice-based architecture. Due to the inherent complexity and high costs of this process, it is crucial to automate this task. In this paper, we propose a method that can identify potential microservices from a given monolithic application. Our method takes as input the source code of the monolithic application and measures the similarities and dependencies between all of the classes in the system using their interactions and the domain terminology employed within the code. These similarity values are then used with a variant of a density-based clustering algorithm to generate a hierarchical structure of the recommended microservices while identifying potential outlier classes. We provide an empirical evaluation of our approach through different experimental settings, including a comparison with existing human-designed microservices and a comparison with five baselines. The results show that our method succeeds in generating microservices that are overall more cohesive and that have fewer interactions between them, with a precision of up to 0.9 when compared to human-designed microservices.
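To make the clustering step concrete, the sketch below runs a plain DBSCAN over a precomputed class distance matrix at several increasing epsilon values and records one layer of labels per value, with -1 marking outlier classes. This is only an approximation of the idea: the paper uses its own DBSCAN variant, and the function names, the `min_samples` value, the epsilon schedule, and the toy distances here are assumptions for illustration.

```python
def dbscan(dist, eps, min_samples):
    """Plain DBSCAN on a square distance matrix; returns labels, -1 = outlier."""
    n = len(dist)
    labels = [None] * n  # None means not yet visited
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional outlier; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former outlier joins as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels

def hierarchical_dbscan(dist, eps_values, min_samples=2):
    """One layer of labels per epsilon, from fine-grained to coarse-grained."""
    return [(eps, dbscan(dist, eps, min_samples)) for eps in sorted(eps_values)]

# Toy distances (for example 1 - similarity) between five classes A..E.
dist = [
    [0.0, 0.1, 0.5, 0.6, 0.9],
    [0.1, 0.0, 0.5, 0.6, 0.9],
    [0.5, 0.5, 0.0, 0.2, 0.9],
    [0.6, 0.6, 0.2, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.9, 0.0],
]
layers = hierarchical_dbscan(dist, [0.15, 0.25, 0.55])
for eps, labels in layers:
    print(eps, labels)
```

Reading the layers from small to large epsilon gives a coarse picture of how candidate microservices merge, which a developer could cut at whichever granularity fits the migration.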
