Hey there! You are using WhatsApp: Enumerating Three Billion Accounts for Security and Privacy

Gabriel K. Gegenhuber, Philipp É. Frenzel, Maximilian Günther, Johanna Ullrich, Aljosha Judmayer

TL;DR

WhatsApp’s presence-discovery mechanism based on phone numbers enables large-scale enumeration with substantial privacy and security implications despite end-to-end encryption. The authors develop a global, end-to-end measurement using reverse-engineered XMPP endpoints to enumerate 63B candidate numbers and verify 3.5B active accounts, collecting public keys and limited metadata. They analyze the resulting census for OS distribution, device activity, profile data availability, and extensive X25519 key reuse, revealing privacy risks and potential fraud. The study also contrasts with the 2021 Facebook leak to illustrate data longevity and documents remediation progress by WhatsApp, including cardinality-based rate limiting and data-visibility restrictions. Taken together, the work informs operators, researchers, and policymakers about practical mitigations needed to curb enumeration risks in large-scale encrypted messaging systems.

Abstract

WhatsApp, with 3.5 billion active accounts as of early 2025, is the world's largest instant messaging platform. Given its massive user base, WhatsApp plays a critical role in global communication. To initiate conversations, users must first discover whether their contacts are registered on the platform. This is achieved by querying WhatsApp's servers with mobile phone numbers extracted from the user's address book (if they allowed access). This architecture inherently enables phone number enumeration, as the service must allow legitimate users to query contact availability. While rate limiting is a standard defense against abuse, we revisit the problem and show that WhatsApp remains highly vulnerable to enumeration at scale. In our study, we were able to probe over a hundred million phone numbers per hour without encountering blocking or effective rate limiting. Our findings demonstrate not only the persistence but also the severity of this vulnerability. We further show that nearly half of the phone numbers disclosed in the 2021 Facebook data leak are still active on WhatsApp, underlining the enduring risks associated with such exposures. Moreover, we were able to perform a census of WhatsApp users, providing a glimpse of the macroscopic insights a large messaging service is able to generate even though the messages themselves are end-to-end encrypted. Using the gathered data, we also discovered the re-use of certain X25519 keys across different devices and phone numbers, indicating either insecure (custom) implementations or fraudulent activity. In this updated version of the paper, we also provide insights into the collaborative remediation process through which we confirmed that the underlying rate-limiting issue had been resolved.

Paper Structure

This paper contains 19 sections, 15 figures, 10 tables.

Figures (15)

  • Figure 1: WhatsApp users with respect to continent, Android vs. iOS use, and profile picture for 3.5 B users.
  • Figure 2: WhatsApp adoption per continent. Percentage shares are calculated by dividing the number of discovered WhatsApp accounts by the respective population size (per capita) of each continent.
  • Figure 3: WhatsApp Use per Capita: At 95 % in South America and 80 % in Europe, a majority of citizens have an active WhatsApp account.
  • Figure 4: The OS-specific characteristic initialization values (Table \ref{tab:characteristic-key-ids}) can be observed at scale. The empirical cumulative distribution functions (eCDFs) are left-skewed, reflecting the 0-based initialization patterns of the different IDs which are typical for the operating systems Android (0-initialized signed prekey ID) and iOS (0-initialized one-time prekey ID). The eCDF in the top figure corresponds to Android devices, which account for 81 % of the observed user base, while the bottom figure represents iOS devices, comprising the remaining 19 %. The spike in this figure is investigated in Section \ref{sec:prekey_bundle_analysis}.
  • Figure 5: Distribution of prekey bundle age across our retrieved data. Over 90 % of users have updated their keys within the past month. Consistent with WhatsApp's account deletion policy, most accounts with keys older than 120 days have been automatically removed.
  • ...and 10 more figures