Understanding Web Application Workloads and Their Applications: Systematic Literature Review and Characterization

Roozbeh Aghili, Qiaolin Qin, Heng Li, Foutse Khomh

TL;DR

The paper addresses the lack of a systematic understanding of web application workloads in two steps: a systematic literature review of studies that use public web workloads, and a characterization of those workloads. The review identifies 78 articles and 12 publicly available datasets. The characterization reveals three daily and three weekly workload patterns that are non-monotonic and best captured by polynomial models. The authors' characterization pipeline covers data extraction, aggregation to daily and weekly granularity, standardization, smoothing, variability analysis, and K-Means clustering; it yields centroid models and insights into time dependence across days and seasons. These findings inform realistic workload generation and proactive resource provisioning, and the authors advocate sharing newer datasets to reflect current web dynamics.
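To make the pipeline steps named above more concrete, here is a minimal sketch of standardization, smoothing, K-Means clustering, and a polynomial fit to each centroid. It is an illustration under stated assumptions, not the authors' implementation: the data is synthetic, the function names (`standardize`, `smooth`, `farthest_point_init`, `kmeans`) are hypothetical, and the smoothing window, the farthest-point initialization, and the polynomial degree are choices made here for the example rather than values taken from the paper.

```python
import numpy as np

def standardize(curves):
    """Z-score each daily curve so that shape, not request volume, drives clustering."""
    mean = curves.mean(axis=1, keepdims=True)
    std = curves.std(axis=1, keepdims=True)
    return (curves - mean) / np.where(std == 0, 1.0, std)

def smooth(curves, window=3):
    """Centered moving average that wraps around midnight (window must be odd, >= 3)."""
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    padded = np.concatenate([curves[:, -half:], curves, curves[:, :half]], axis=1)
    kernel = np.ones(window) / window
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the first
    point, then repeatedly add the point farthest from all chosen centroids."""
    centroids = [points[0]]
    for _ in range(k - 1):
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[dists.argmax()])
    return np.array(centroids, dtype=float)

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return assign(points, centroids), centroids

# Synthetic stand-in for hourly request counts: 3 daily shapes, 20 days each.
hours = np.arange(24)
rng = np.random.default_rng(42)
shapes = [np.sin(2 * np.pi * (hours - phase) / 24) for phase in (0, 8, 16)]
curves = np.array([
    rng.uniform(100, 1000) * (2 + shape) + rng.normal(0, 5, size=24)
    for shape in shapes
    for _ in range(20)
])

prepared = smooth(standardize(curves))
labels, centroids = kmeans(prepared, k=3)

# Fit one polynomial per centroid; degree 4 is an arbitrary choice for this sketch.
models = np.array([np.polyfit(hours, centroid, deg=4) for centroid in centroids])

print("cluster sizes:", np.bincount(labels, minlength=3))
print("polynomial coefficients per centroid:", models.shape)
```

Standardizing before clustering means that two days with the same shape but very different traffic volume end up in the same cluster, which matches the goal of grouping workloads by pattern rather than by scale.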

Abstract

Web applications, accessible via web browsers over the Internet, facilitate complex functionalities without local software installation. In the context of web applications, a workload refers to the number of requests that users or applications send to the underlying system. Existing studies have leveraged web application workloads to achieve various objectives, such as workload prediction and auto-scaling. However, these studies are conducted in an ad hoc manner and lack a systematic understanding of the characteristics of web application workloads. In this study, we first conduct a systematic literature review to identify and analyze existing studies leveraging web application workloads. Our analysis sheds light on their workload utilization, analysis techniques, and high-level objectives. We further systematically analyze the characteristics of the web application workloads identified in the literature review. Our analysis centers on characterizing these workloads at two distinct temporal granularities: daily and weekly. We identify and categorize three daily and three weekly patterns within the workloads. By providing a statistical characterization of these workload patterns, our study highlights the uniqueness of each pattern, paving the way for the development of realistic workload generation and resource provisioning techniques that can benefit a range of applications and research areas.

Paper Structure (40 sections, 6 equations, 10 figures, 5 tables)

Figures (10)

  • Figure 1: Overview of our study
  • Figure 2: Comparison of literature publication years and corresponding workload datasets years
  • Figure 3: Cumulative number of papers per objective over the years
  • Figure 4: Daily granularity of Wikipedia and Worldcup98 workloads over a one-day time span
  • Figure 5: Weekly granularity of Wikipedia and Worldcup98 workloads over a one-week time span
  • ...and 5 more figures