An Empirical Study on Challenges of Event Management in Microservice Architectures
Rodrigo Laigner, Ana Carolina Almeida, Wesley K. G. Assunção, Yongluan Zhou
TL;DR
This study examines the challenges of event management in event-driven microservice architectures through a two-pronged empirical analysis: large-scale mining of Stack Overflow questions to characterize state-of-the-practice patterns, and a manual qualitative study of 628 questions to uncover concrete challenges. It finds that practitioners frequently employ patterns such as Messaging, Event Sourcing, Domain Event, and CQRS, while contending with non-functional requirements like Consistency, Decoupling, and Scalability, alongside functional needs like state propagation and multi-service workflows. The identified challenges span Safety and Liveness, Event Schema Management, Performance, Observability, and Security, including issues with publishing guarantees, event replay, schema evolution, large payloads, and weak delivery semantics. The paper offers actionable implications for developers, tool providers, and researchers, and contributes a publicly available dataset to spur further study of event management in microservices.
Abstract
Microservices emerged as a popular architectural style over the last decade. Although microservices are designed to be self-contained, they must communicate to realize business capabilities, creating dependencies among their data and functionalities. Developers therefore resort to asynchronous, event-based communication to fulfill such dependencies while reducing coupling. However, developers are often oblivious to the inherent challenges of the asynchronous and event-based paradigm, leading to frustration and ultimately making them reconsider the adoption of microservices. To make matters worse, there is a scarcity of literature on the practices and challenges of designing, implementing, testing, monitoring, and troubleshooting event-based microservices. To fill this gap, this paper provides the first comprehensive characterization of event management practices and challenges in microservices based on a repository mining study of over 8,000 Stack Overflow questions. Moreover, 628 relevant questions were randomly sampled for an in-depth manual investigation of challenges. We find that developers encounter many problems, including large event payloads, modeling event schemas, auditing event flows, and ordering constraints in processing events. This suggests that developers are not sufficiently served by state-of-the-practice technologies. We provide actionable implications to developers, technology providers, and researchers to advance event management in microservices.
