Topological relations in water quality monitoring
Bruno Chaves Figueiredo, Maria Alexandra Oliveira, João Nuno Silva
TL;DR
This work addresses the challenge of integrating heterogeneous water-quality data within complex river and infrastructure networks under the EU Water Framework Directive. It introduces a graph-based framework and a metagraph data model that connect water systems, water nodes, quality stations, treatments, land use, and basins, enabling topological and hydrological analyses through graph traversals. Implemented on Neo4j with a Python/Flask backend and a React frontend, the system provides data ingestion, visualization, and 42 API endpoints for querying, with sub-10 ms response times for path-finding and network queries on the EFMA dataset (tens of thousands of nodes and edges). The results indicate that graph databases offer greater flexibility and performance than relational databases for interconnected hydrological data, supporting rapid contamination-source tracing and watershed-level monitoring in practical water-quality management and decision-making.
Abstract
The Alqueva Multi-Purpose Project (EFMA) is a large water conveyance and storage infrastructure system in the Alentejo. Its water quality monitoring network comprises almost a thousand stations distributed across three subsystems: Alqueva, Pedrogão, and Ardila. Identifying pollution sources in complex infrastructure systems such as the EFMA requires knowing the direction of water flow and delimiting the areas drained to specific sampling points. The transfer channels in the EFMA infrastructure artificially connect several water bodies that do not share drainage basins, which further complicates the interpretation of water quality data: water does not flow exclusively downstream and is not restricted to specific basins. Existing user-friendly GIS tools do not facilitate the exploration and visualisation of water quality data in the spatial and temporal dimensions, such as defining temporal relationships between monitoring campaigns, nor do they allow topological and hydrological relationships to be established between different sampling points. This thesis proposes a framework capable of aggregating many types of information in a GIS environment and visualising large water quality-related datasets, together with a graph data model that integrates and relates water quality between monitoring stations and land use. The graph model makes it possible to exploit the relationship between water quality in a watercourse and in reservoirs associated with infrastructure. The graph data model and the developed framework showed encouraging results and proved preferable to relational databases.
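
To illustrate why flow direction and transfer channels matter when tracing pollution sources, the following minimal sketch performs an upstream traversal over a small directed flow graph. It is not the thesis implementation, which is built on Neo4j. It uses only the Python standard library, and every node name, edge, and function name below is hypothetical and chosen for illustration.

```python
from collections import deque

# Hypothetical flow edges (upstream -> downstream); not taken from the EFMA dataset.
FLOW_EDGES = [
    ("stream_a", "reservoir_1"),
    ("stream_b", "reservoir_1"),
    ("reservoir_1", "station_x"),
    ("reservoir_2", "station_x"),  # assumed transfer channel from another basin
    ("stream_c", "reservoir_2"),
    ("station_x", "outlet"),
]


def build_reverse_adjacency(edges):
    """Map each node to the set of nodes that flow directly into it."""
    reverse = {}
    for upstream, downstream in edges:
        reverse.setdefault(downstream, set()).add(upstream)
        reverse.setdefault(upstream, set())
    return reverse


def upstream_nodes(edges, start):
    """Return every node whose water can reach `start` (breadth-first search)."""
    reverse = build_reverse_adjacency(edges)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in reverse.get(node, ()):
            # Skip already visited nodes so that cycles cannot loop forever.
            if parent not in seen and parent != start:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(upstream_nodes(FLOW_EDGES, "station_x")))
```

In this toy network, the traversal from `station_x` reaches `stream_c` only through the assumed transfer channel, which is the kind of cross-basin contribution that a purely basin-based delineation would miss.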
