
A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web

Stephan Wiefling, Marian Hönscheid, Luigi Lo Iacono

TL;DR

HTTP Client Hints (CHs) were proposed to replace the browser's high-entropy user agent string with controlled disclosures. This study conducts the first long-term, large-scale analysis of CH usage on the Web, combining historical start-page data with login-page crawls to assess adoption, the level of detail requested, and third-party involvement, including in risk-based authentication (RBA) contexts. The findings reveal generally low CH adoption on start pages but higher usage on login pages and by third parties, with RBA websites extracting greater detail and trackers enabling cross-site profiling. Privacy concerns are amplified by the lack of user controls and transparency. The work highlights practical implications for browser vendors, policymakers, and researchers, and provides open data to support replication and further investigation into CH-related privacy leakage and remediation strategies.

Abstract

HTTP client hints are a set of standardized HTTP request headers designed to modernize and potentially replace the traditional user agent string. While the user agent string exposes a wide range of information about the client's browser and device, client hints provide a controlled and structured approach for clients to selectively disclose their capabilities and preferences to servers. Essentially, client hints aim at more effective and privacy-friendly disclosure of browser or client properties than the user agent string. We present a first long-term study of the use of HTTP client hints in the wild. We found that despite being implemented in almost all web browsers, server-side usage of client hints remains generally low. However, in the context of third-party websites, which are often linked to trackers, the adoption rate is significantly higher. This is concerning because client hints allow the retrieval of more data from the client than the user agent string provides, and there are currently no mechanisms for users to detect or control this potential data leakage. Our work provides valuable insights for web users, browser vendors, and researchers by exposing potential privacy violations via client hints and providing help in developing remediation strategies as well as further research.
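As a rough illustration of the opt-in mechanism the abstract describes, a server asks for client hints by sending an `Accept-CH` response header, and the browser then attaches the requested hints to later requests. The sketch below parses such a header and groups the requested hints by how much detail they reveal. It is a minimal sketch under stated assumptions, not the authors' measurement code: the function names are hypothetical, the hint sets are a small illustrative subset of User-Agent Client Hints rather than the paper's full list, and the comma split is a simplification of structured-field parsing.

```python
# Illustrative subset of User-Agent Client Hints (assumption: NOT the paper's full list).
# Low-entropy hints are sent by default by supporting browsers.
LOW_ENTROPY = {"sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform"}
# High-entropy hints are only sent after the server opts in via Accept-CH.
HIGH_ENTROPY = {
    "sec-ch-ua-arch",
    "sec-ch-ua-bitness",
    "sec-ch-ua-full-version-list",
    "sec-ch-ua-model",
    "sec-ch-ua-platform-version",
}


def parse_accept_ch(header_value: str) -> list[str]:
    """Split an Accept-CH header value into lowercase hint names.

    Simplification: treats the value as a plain comma-separated token list.
    """
    return [token.strip().lower() for token in header_value.split(",") if token.strip()]


def classify_hints(header_value: str) -> dict[str, list[str]]:
    """Group the requested hints into high-entropy, low-entropy, and other."""
    requested = set(parse_accept_ch(header_value))
    return {
        "high_entropy": sorted(requested & HIGH_ENTROPY),
        "low_entropy": sorted(requested & LOW_ENTROPY),
        "other": sorted(requested - HIGH_ENTROPY - LOW_ENTROPY),
    }


# Hypothetical header value a website might send in its response.
example_header = "Sec-CH-UA-Platform-Version, Sec-CH-UA-Model, Sec-CH-UA-Mobile, DPR"
result = classify_hints(example_header)
print(result)
```

A site that lists several high-entropy hints in `Accept-CH` is asking for more detail than the browser would send by default, which is the kind of server-side behavior the study measures.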

Paper Structure (38 sections, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Overview of the collected data and the results of our data analysis, based on the data from December 2023.
  • Figure 2: Observed HTTP CH adoption rates on start pages over time, grouped by rankings inside the Tranco list. The data is taken from the HTTP Archive, which crawled the whole Web each month.
  • Figure 3: (a) Percentage of websites showing identical HTTP CH behavior on their login page compared to the start page. (b) Percentage of websites that use HTTP CHs based on their rankings inside the Tranco list. We calculated the usage based on the data from December 2023.
  • Figure 4: Top 20 HTTP CHs used among all websites and trackers that used HTTP CHs in the study. (*: Experimental HTTP CH)
  • Figure 5: Top 25 categories of the Tranco 8M websites that used HTTP CHs, their percentage, their number of valid HTTP CHs requested, and the median level of detail they requested with the HTTP CHs. There is a maximum of 19 possible HTTP CHs. We also counted deprecated HTTP CHs as valid here, as they can still be valid in some browsers.
  • ...and 1 more figure