The Web unpacked: a quantitative analysis of global Web usage

Henrique S. Xavier

TL;DR

This paper analyzes global web usage patterns using data from SimilarWeb, a leading source for estimating web traffic. It reveals a significant concentration of web traffic, with a small number of top websites capturing the majority of visits.

Abstract

This paper presents a comprehensive analysis of global web usage patterns based on data from SimilarWeb, a leading source for estimating web traffic. Using a dataset of over 250,000 websites, we estimate the total web traffic and investigate its distribution among domains and industry sectors. We detail the characteristics of the top 116 domains, which account for an estimated one-third of all web traffic, examining their content sources and types, access requirements, offline presence, and ownership features. The analysis reveals a significant concentration of web traffic, with a small number of top websites capturing the majority of visits. Search engines, news and media, social networks, streaming, and adult content emerge as the primary attractors of web traffic, which is also highly concentrated on platforms and USA-owned websites. Much of the traffic goes to for-profit but mostly free-of-charge websites, highlighting the dominance of business models not based on paywalls.

Paper Structure

This paper contains 9 sections, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Histogram of MoM for the 8,000 most visited websites. The values were clipped at 200% to improve plot readability.
  • Figure 2: MoM vs. rank position. The dots represent individual domains, and lines represent moving statistical measures. Outliers beyond $\pm 40\%$ are not shown.
  • Figure 3: Average monthly visits as a function of the domain's position in the rank. Light bands represent a $2\sigma$ variation in the visits from month to month.
  • Figure 4: Best estimate of the cumulative traffic share of domains as a function of position in the rank (red line). The violet band represents the systematic uncertainty.
  • Figure 5: Cumulative share of monthly visits aggregated by web industry, from the most to the least popular.
  • ...and 5 more figures
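
The cumulative traffic share described in the Figure 4 caption can be illustrated with a minimal sketch. This is not the paper's code: the function names `cumulative_share` and `domains_needed` are hypothetical, the visit counts are made up for illustration, and the total traffic is simply assumed to be the sum of the listed visits unless an external estimate is supplied.

```python
from itertools import accumulate


def cumulative_share(visits, total=None):
    """Return the cumulative traffic share by rank position.

    visits: monthly visit counts per domain (any order).
    total:  estimated total web traffic; assumed here to default
            to the sum of `visits` when no estimate is given.
    """
    ranked = sorted(visits, reverse=True)  # rank 1 = most visited
    total = sum(ranked) if total is None else total
    return [running / total for running in accumulate(ranked)]


def domains_needed(shares, target):
    """Smallest number of top-ranked domains whose cumulative share reaches target."""
    for rank, share in enumerate(shares, start=1):
        if share >= target:
            return rank
    return None  # target never reached with the listed domains


# Made-up visit counts, for illustration only (not SimilarWeb data).
toy_visits = [500, 250, 120, 60, 40, 20, 10]
shares = cumulative_share(toy_visits)
print(domains_needed(shares, 1 / 3))
print(domains_needed(shares, 0.9))
```

Passing an explicit `total` larger than the sum of the listed visits models the case where the ranked list covers only part of the web, so the curve levels off below 100%.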