CAWAL: A novel unified analytics framework for enterprise web applications and multi-server environments
Özkan Canay, Ümit Kocabıçak
TL;DR
CAWAL addresses data ownership, privacy, and cross-domain tracking challenges in web analytics by delivering an on-premises framework that unifies application logging with web analytics in multi-server environments. It combines a lean server-side data collection model, a dedicated data schema (log_session, log_page, log_analytics), and nightly ETL-based analytics stored in a data warehouse, enabling robust session reconciliation across domains and across SPA, PWA, and IoT interactions. In a real enterprise deployment, CAWAL demonstrated tangible performance advantages over Open Web Analytics and Matomo, reducing response times by approximately 24% and 85%, respectively, while enhancing data governance and multi-server scalability. These results suggest CAWAL's practical value for organizations prioritizing data sovereignty and integrated software diagnostics, though broader multi-domain validation is needed to generalize findings.
Abstract
In web analytics, cloud-based solutions have limitations in data ownership and privacy, whereas client-side user tracking tools face challenges such as limited data accuracy and a lack of server-side metrics. This paper presents the Combined Analytics and Web Application Log (CAWAL) framework as an alternative: an on-premises framework that integrates web analytics with application logging. CAWAL enables precise data collection and cross-domain tracking in web farms while complying with data ownership and privacy regulations. The framework also improves software diagnostics and troubleshooting by incorporating application-specific data into analytical processes. Integrated into an enterprise-grade web application, CAWAL achieved approximately 24% and 85% lower response times than Open Web Analytics (OWA) and Matomo, respectively. The empirical evaluation demonstrates that the framework eliminates certain limitations in existing tools and provides a robust data infrastructure for enhanced web analytics.
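
To make the server-side logging and nightly ETL idea more concrete, the following is a minimal sketch in Python with SQLite, not the framework's actual implementation. The table names log_session and log_page come from the summary above; every column name, the warehouse table dw_daily_domain, and the functions build_schema, nightly_etl, and demo are hypothetical and chosen only for illustration. The log_analytics table is omitted because its contents are not described here. The sketch assumes that a session keeps one identifier across domains, so page views from different servers can be reconciled with a simple grouping.

```python
import sqlite3


def build_schema(conn):
    """Create hypothetical log tables and a hypothetical warehouse table."""
    conn.executescript(
        """
        CREATE TABLE log_session (
            session_id TEXT PRIMARY KEY,   -- assumed: shared across domains
            user_id    TEXT,
            started_at TEXT
        );
        CREATE TABLE log_page (
            page_id    INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT REFERENCES log_session(session_id),
            domain     TEXT,
            url        TEXT,
            viewed_at  TEXT
        );
        -- Hypothetical aggregate table standing in for the data warehouse.
        CREATE TABLE dw_daily_domain (
            day        TEXT,
            domain     TEXT,
            page_views INTEGER,
            sessions   INTEGER,
            PRIMARY KEY (day, domain)
        );
        """
    )


def nightly_etl(conn, day):
    """Aggregate one day of raw page logs into the warehouse table."""
    conn.execute(
        """
        INSERT OR REPLACE INTO dw_daily_domain (day, domain, page_views, sessions)
        SELECT date(viewed_at), domain, COUNT(*), COUNT(DISTINCT session_id)
        FROM log_page
        WHERE date(viewed_at) = ?
        GROUP BY date(viewed_at), domain
        """,
        (day,),
    )
    conn.commit()


def demo():
    """Load a few illustrative rows, run the ETL, and return the aggregates."""
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    conn.executemany(
        "INSERT INTO log_session VALUES (?, ?, ?)",
        [
            ("s1", "u1", "2024-01-01 09:00:00"),
            ("s2", "u2", "2024-01-01 10:00:00"),
        ],
    )
    # Session s1 visits two domains under the same session_id.
    conn.executemany(
        "INSERT INTO log_page (session_id, domain, url, viewed_at) "
        "VALUES (?, ?, ?, ?)",
        [
            ("s1", "a.example.org", "/home", "2024-01-01 09:00:05"),
            ("s1", "b.example.org", "/app", "2024-01-01 09:01:00"),
            ("s2", "a.example.org", "/home", "2024-01-01 10:00:05"),
            ("s2", "a.example.org", "/docs", "2024-01-01 10:02:00"),
        ],
    )
    nightly_etl(conn, "2024-01-01")
    return conn.execute(
        "SELECT day, domain, page_views, sessions "
        "FROM dw_daily_domain ORDER BY domain"
    ).fetchall()
```

Calling demo() returns one aggregate row per domain for the chosen day. Running the aggregation as a nightly batch, rather than at request time, keeps the per-request logging path short, which is the kind of design choice a lean server-side collection model implies.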
