Exploring Hidden Geographic Disparities in Android Apps

M. Alecci, P. Jiménez, J. Samhi, T. Bissyandé, J. Klein

TL;DR

This paper examines how Android apps vary by geographic region, revealing GeoTwins (region-specific variants released under different package names) and unexpected base.apk differences across regions. The authors built a global, multi-region crawling pipeline and a static-analysis framework to compare thousands of APKs, and released a public GeoTwins dataset (81,963 pairs). GeoTwins are largely similar but show nontrivial differences in their Smali code and files; separately, base.apk can vary by region despite expectations of uniformity. These results point to potential geographic bias in security and privacy research, bear on regulatory transparency, and give developers a resource for understanding regional differences.

Abstract

While mobile app evolution has been widely studied, geographical variation in app behavior remains largely unexplored. This paper presents a large-scale study of location-based Android app differentiation, uncovering two important and underexamined phenomena with security and fairness implications. First, we introduce GeoTwins: apps that are functionally similar and share branding but are released under different package names across countries. Despite their similarity, GeoTwins often diverge in requested permissions, third-party libraries, and privacy disclosures. Second, we examine the Android App Bundle ecosystem and reveal unexpected regional differences in supposedly consistent base.apk files. Contrary to common assumptions, even base.apk files vary by region, exposing hidden customizations that may affect app behavior or security. These discrepancies have concrete consequences. Geographically distinct variants can lead the same app to be labeled benign in one malware study but suspicious in another, depending on the region of download. Such hidden variation undermines reproducibility and introduces geographic bias into assessments of security, privacy, and functionality. It also raises ethical concerns about transparency and consent: visually identical Google Play listings may mask subtle but important differences. To study these issues, we built a distributed app collection pipeline spanning multiple regions and analyzed thousands of apps. We also release a dataset of 81,963 GeoTwins to support future work. Our findings reveal systemic regional disparities in mobile software, with implications for researchers, developers, platform architects, and policymakers.
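
To make the idea of a regional base.apk discrepancy concrete, the following minimal Python sketch compares two APK archives entry by entry. It is an illustration only, not the authors' static-analysis framework: the function names (`apk_entry_digests`, `diff_apks`, `make_toy_apk`) are hypothetical, and it assumes that comparing the CRC-32 values stored in the ZIP directory is enough to flag entries that differ.

```python
import io
import zipfile


def apk_entry_digests(apk_bytes: bytes) -> dict[str, int]:
    """Map each archive entry name to the CRC-32 recorded in the ZIP directory."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        return {info.filename: info.CRC for info in archive.infolist()}


def diff_apks(apk_a: bytes, apk_b: bytes) -> dict[str, list[str]]:
    """Report entries present in only one APK and entries whose content differs."""
    a = apk_entry_digests(apk_a)
    b = apk_entry_digests(apk_b)
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }


def make_toy_apk(entries: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP that stands in for a downloaded APK (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Two toy "regional" variants of the same app; real inputs would be base.apk
# files downloaded from different regions.
apk_region_a = make_toy_apk({
    "AndroidManifest.xml": b"<manifest/>",
    "classes.dex": b"dex-eu",
    "assets/eu.json": b"{}",
})
apk_region_b = make_toy_apk({
    "AndroidManifest.xml": b"<manifest/>",
    "classes.dex": b"dex-us",
})

result = diff_apks(apk_region_a, apk_region_b)
print(result)
```

A comparison at this level only shows which files differ; understanding what a difference means (for example, a changed permission or library) would require decoding the manifest and bytecode, which this sketch does not attempt.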

Paper Structure

This paper contains 15 sections, 1 equation, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Unison League Google Play pages.
  • Figure 2: Google Play app page differences: (a) “Install” button present when the app is available in the region, (b) “Install” button absent when the app is not available.
  • Figure 3: Cisana TV+ app icons associated with different countries.
  • Figure 4: WhatsApp icon from an older version compared to the latest version.
  • Figure 5: Distribution of apps available exclusively in one location.
  • ...and 5 more figures