Software Testing at the Network Layer: Automated HTTP API Quality Assessment and Security Analysis of Production Web Applications

Ali Hassaan Mughal, Muhammad Bilal

TL;DR

This study addresses the lack of systematic testing of production HTTP API call quality and its security implications by capturing and analyzing production traffic from 18 sites with Playwright to produce 108 HAR files. It introduces eight heuristic detectors that quantify anti-patterns in API call patterns and aggregates them into a $0$-$100$ composite quality score, demonstrated across diverse architectures from minimal server-rendered to heavy JS SPAs. Key findings reveal a wide quality spectrum, pervasive redundant calls and missing cache headers, and substantial third-party overhead with clear security consequences such as supply-chain exposure and cache-poisoning risks. The work provides an open, reproducible framework and dataset for researchers and practitioners to audit, extend, and monitor HTTP API call quality in production environments, with practical guidance for improving client-side caching, deduplication, and third-party governance. Future directions include longitudinal analyses, integration with UI-layer tests, and expansion of security detectors aligned with OWASP API Security Top 10.

Abstract

Modern web applications rely heavily on client-side API calls to fetch data, render content, and communicate with backend services. However, the quality of these network interactions (redundant requests, missing cache headers, oversized payloads, and excessive third-party dependencies) is rarely tested in a systematic way. Moreover, many of these quality deficiencies carry security implications: missing cache headers enable cache poisoning, excessive third-party dependencies expand the supply-chain attack surface, and error responses risk leaking server internals. In this study, we present an automated software testing framework that captures and analyzes the complete HTTP traffic of 18 production websites spanning 11 categories (e-commerce, news, government, developer tools, travel, and more). Using automated browser instrumentation via Playwright, we record 108 HAR (HTTP Archive) files across 3 independent runs per page, then apply 8 heuristic-based anti-pattern detectors to produce a composite quality score (0-100) for each site. Our results reveal a wide quality spectrum: minimalist server-rendered sites achieve perfect scores of 100, while content-heavy commercial sites score as low as 56.8. We identify redundant API calls and missing cache headers as the two most pervasive anti-patterns, each affecting 67% of sites, while third-party overhead exceeds 20% on 72% of sites. One utility site makes 2,684 requests per page load, roughly 447 times as many as the most minimal site. To protect site reputations, all identities are anonymized using category-based pseudonyms. We provide all analysis scripts, anonymized results, and reproducibility instructions as an open artifact. This work establishes an empirical baseline for HTTP API call quality across the modern web and offers a reproducible testing framework that researchers and practitioners can apply to their own applications.
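To make the detector idea concrete, the following is a minimal sketch of how three of the anti-patterns named in the abstract (redundant calls, missing cache headers, third-party overhead) might be measured over the `log.entries` array of a HAR file. It is not the authors' implementation: the function names, the definition of "redundant" as a repeated (method, URL) pair, the set of headers treated as cache headers, and the subdomain rule for first-party hosts are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def redundant_calls(entries):
    """Count requests that repeat an earlier (method, URL) pair.

    Assumption: exact method + URL equality defines a redundant call.
    """
    seen = Counter((e["request"]["method"], e["request"]["url"]) for e in entries)
    return sum(n - 1 for n in seen.values() if n > 1)


def missing_cache_headers(entries):
    """Count 200 GET responses carrying none of Cache-Control, ETag, Expires.

    Assumption: these three headers stand in for "cache headers".
    """
    cache_names = {"cache-control", "etag", "expires"}
    missing = 0
    for e in entries:
        if e["request"]["method"] != "GET" or e["response"]["status"] != 200:
            continue
        names = {h["name"].lower() for h in e["response"]["headers"]}
        if not names & cache_names:
            missing += 1
    return missing


def third_party_share(entries, first_party_host):
    """Percentage of requests sent to hosts outside the first-party domain.

    Assumption: the page host and its subdomains count as first party.
    """
    if not entries:
        return 0.0
    fp = first_party_host.lower()
    third = 0
    for e in entries:
        host = (urlsplit(e["request"]["url"]).hostname or "").lower()
        if not (host == fp or host.endswith("." + fp)):
            third += 1
    return 100.0 * third / len(entries)
```

Each function takes the list found at `har["log"]["entries"]` after loading a HAR file with `json.load`. How such counts would be weighted into the 0-100 composite score is not specified in this section, so the sketch stops at the raw measurements.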

Paper Structure (41 sections, 1 equation, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Overview of the automated testing pipeline. Five phases transform 30 candidate websites into quality scores and security assessments. A validation pass (dashed arrows) independently verifies data integrity across all phases.
  • Figure 2: API call quality scores for 18 production websites. Colors indicate score tiers: red ($<$60), orange (60--75), green (75--90), blue (90+). The dashed line marks the median (74.8).
  • Figure 3: Anti-pattern prevalence heatmap. Darker cells indicate more severe issues. Each value represents the average count or percentage across all captures.
  • Figure 4: Request volume vs. quality score. Point color indicates third-party request percentage (blue = low, red = high). High-quality sites cluster in the bottom-left (few requests, low third-party).
  • Figure 5: Third-party request distribution by category. Analytics and ads dominate the third-party ecosystem for most commercial sites.
  • ...and 1 more figure