Siren Federate: Bridging document, relational, and graph models for exploratory graph analysis
Georgeta Bordea, Stephane Campinas, Matteo Catena, Renaud Delbru
TL;DR
Siren Federate tackles the challenge of interactive exploratory analysis over billions of heterogeneous knowledge graph items by unifying document-oriented information retrieval, relational processing, and graph processing within Elasticsearch. It introduces distributed join algorithms, adaptive query planning, query plan folding, semantic caching, and a novel semi-join decomposition technique that mitigates the explosion of intermediate results in path queries. The authors validate these ideas through large-scale benchmarks, the LDBC FinBench benchmark, and insights from real-world deployments, reporting scalable, sub-second to second-scale latency under heavy data volumes and concurrency. By bridging multiple data models and enabling efficient path and graph analytics at scale, Siren Federate provides a practical pathway for investigative intelligence platforms that require rich search, graph exploration, and multi-model data handling.
Abstract
Investigative workflows require interactive exploratory analysis on large heterogeneous knowledge graphs. Current databases are limited in their ability to support such tasks. This paper discusses the architecture of Siren Federate, a system that efficiently supports exploratory graph analysis by bridging document-oriented, relational, and graph models. Technical contributions include distributed join algorithms, adaptive query planning, query plan folding, semantic caching, and semi-join decomposition for path queries. Semi-join decomposition addresses the exponential growth of intermediate results in path-based queries. Experiments show that Siren Federate exhibits low latency and scales well with the amount of data, the number of users, and the number of computing nodes.
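
The abstract does not describe how semi-join decomposition works, so the following is only a minimal sketch of the general semi-join idea, not Siren Federate's algorithm. It assumes a small in-memory edge list and uses hypothetical helper names (`build_adjacency`, `full_join_paths`, `semi_join_frontier`). It contrasts a naive evaluation that materializes every path tuple with a semi-join style evaluation that keeps only the distinct set of nodes reached at each hop.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Index a list of (source, target) edges by source node."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
    return adj


def full_join_paths(adj, sources, hops):
    """Naive join: materialize every path of exactly `hops` edges."""
    paths = [(s,) for s in sources]
    for _ in range(hops):
        paths = [p + (n,) for p in paths for n in adj.get(p[-1], ())]
    return paths


def semi_join_frontier(adj, sources, hops):
    """Semi-join style: carry only the distinct nodes reached at each hop."""
    frontier = set(sources)
    for _ in range(hops):
        frontier = {n for node in frontier for n in adj.get(node, ())}
    return frontier


# Illustrative data (an assumption, not from the paper): four layers of
# two nodes each, with every node linked to every node in the next layer.
layers = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]]
edges = [(u, v) for i in range(3) for u in layers[i] for v in layers[i + 1]]
adj = build_adjacency(edges)

paths = full_join_paths(adj, ["a0"], 3)        # grows as width ** hops
reached = semi_join_frontier(adj, ["a0"], 3)   # bounded by the layer width
```

In this toy graph the number of materialized paths multiplies by the layer width at every hop, while the semi-join frontier never exceeds the number of distinct nodes in a layer. Both evaluations agree on which end nodes are reachable; the semi-join variant simply does not retain the individual paths.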
