I Can Tell Your Secrets: Inferring Privacy Attributes from Mini-app Interaction History in Super-apps

Yifeng Cai, Ziqi Zhang, Mengyu Yao, Junlin Liu, Xiaoke Zhao, Xinyi Fu, Ruoyu Li, Zhe Li, Xiangqun Chen, Yao Guo, Ding Li

TL;DR

This work reveals a new privacy leakage channel in super-apps: mini-app interaction history, which consists of mini-app usage history (Mini-H) and operation history (Op-H), two data streams that vendors collect naturally. It proposes THEFT, a Transformer-based inference attack that is trained on a small set of leaked privacy attributes, predicts multiple attributes, and is calibrated so that its confidence aligns with its accuracy. On AliPay data, THEFT achieves an average attribute inference accuracy of 65.2%, and the high-confidence subset ($t_d = 0.9$) reaches 95.5% accuracy on 16.1% of samples, underscoring the practical risk from insider access to non-sensitive logs. The authors also engage industry stakeholders, revealing real-world awareness gaps and prompting privacy-policy updates, which highlights the need for proactive privacy safeguards and standards in the evolving super-app ecosystem.

Abstract

Super-apps have emerged as comprehensive platforms that integrate various mini-apps to provide diverse services. While super-apps offer convenience and enriched functionality, they can also introduce new privacy risks. This paper reveals a new source of privacy leakage in super-apps: mini-app interaction history, which comprises mini-app usage history (Mini-H) and operation history (Op-H). Mini-H records which mini-apps a user has accessed, including access frequency and mini-app categories. Op-H captures user interactions within mini-apps, including button clicks, bar drags, and image views. Because mini-apps are web-based, super-apps can collect these data naturally, without instrumentation. We identify these data types as novel and unexplored privacy risks through a literature review of 30 papers and an empirical analysis of 31 super-apps. To exploit this new vulnerability, we design THEFT, a mini-app interaction history-oriented inference attack. Using THEFT, an insider adversary in a low-privilege business department of the super-app vendor can infer privacy attributes with more than 95.5% accuracy for over 16.1% of users. THEFT requires only a small training dataset of 200 users, drawn from breached databases publicly available on the Internet. We also engage with super-app vendors and a standards association to increase industry awareness and commitment to protecting this data. Our contributions are significant in identifying overlooked privacy risks, demonstrating the effectiveness of a new attack, and influencing industry practices toward better privacy protection in the super-app ecosystem.
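
The headline result pairs an accuracy figure with a coverage figure: the attack is highly accurate only on the fraction of users for whom the model is confident. The following Python sketch illustrates how such a pair could be computed from calibrated confidence scores. It is not the authors' implementation: the function name `high_confidence_subset`, the `>=` comparison against the threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def high_confidence_subset(
    confidences: Sequence[float],
    predictions: Sequence[str],
    labels: Sequence[str],
    threshold: float = 0.9,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) over predictions whose confidence >= threshold.

    coverage: fraction of all users that fall in the high-confidence subset.
    accuracy: fraction of that subset whose prediction matches the label.
    The inclusive comparison (>=) is an assumption, not taken from the paper.
    """
    if not (len(confidences) == len(predictions) == len(labels)):
        raise ValueError("inputs must have equal length")
    total = len(confidences)
    if total == 0:
        return 0.0, 0.0

    # Keep only (prediction, label) pairs the model is confident about.
    kept = [
        (pred, label)
        for conf, pred, label in zip(confidences, predictions, labels)
        if conf >= threshold
    ]
    coverage = len(kept) / total
    accuracy = sum(pred == label for pred, label in kept) / len(kept) if kept else 0.0
    return coverage, accuracy


# Hypothetical toy data: five users, one binary attribute.
confidences = [0.95, 0.60, 0.92, 0.40, 0.91]
predictions = ["F", "M", "M", "F", "F"]
labels = ["F", "F", "M", "F", "M"]

coverage, accuracy = high_confidence_subset(confidences, predictions, labels)
print(f"coverage={coverage:.1%}, accuracy={accuracy:.1%}")
```

Raising the threshold generally trades coverage for accuracy, which is why the two numbers are reported together.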

Paper Structure

This paper contains 22 sections, 1 equation, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Samples of mini-app interaction history.
  • Figure 2: The pipeline of THEFT consists of three steps: attack model training, confidence calibration, and online inference.
  • Figure 3: Attack effectiveness of THEFT. (a) Overall inference accuracy for each type of privacy. (b)-(h) For each privacy type and confidence interval, the proportion of users predicted to be within the interval ($P_{int}$), the proportion of correctly predicted users ($P_{conf}$), and inference accuracy ($acc_{int} = P_{conf} / P_{int} \times 100\%$).
  • Figure 4: Detailed analysis on the privacy of gender.
  • Figure 5: Detailed analysis on the privacy of location.
  • ...and 5 more figures
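
The caption of Figure 3 defines a per-interval accuracy, $acc_{int} = P_{conf} / P_{int} \times 100\%$. A minimal Python sketch of that bookkeeping is given below. It assumes that both $P_{int}$ and $P_{conf}$ are proportions of all users, and that $P_{conf}$ counts only correctly predicted users inside the interval. The function name `interval_stats`, the bin edges, and the toy data are hypothetical and not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple

# One row per interval: (low, high, P_int, P_conf, acc_int in percent or None).
Row = Tuple[float, float, float, float, Optional[float]]


def interval_stats(
    confidences: Sequence[float],
    correct: Sequence[bool],
    edges: Sequence[float],
) -> List[Row]:
    """Compute P_int, P_conf and acc_int for each confidence interval.

    Intervals are [low, high), except the last one, which also includes
    its upper edge so that a confidence of exactly 1.0 is counted.
    """
    if len(confidences) != len(correct):
        raise ValueError("inputs must have equal length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one sample is required")

    rows: List[Row] = []
    last = len(edges) - 2
    for i, (low, high) in enumerate(zip(edges[:-1], edges[1:])):
        in_bin = [
            ok
            for conf, ok in zip(confidences, correct)
            if low <= conf < high or (i == last and conf == high)
        ]
        p_int = len(in_bin) / total   # share of users predicted in this interval
        p_conf = sum(in_bin) / total  # share of users in the interval and correct
        acc_int = 100.0 * p_conf / p_int if p_int > 0 else None
        rows.append((low, high, p_int, p_conf, acc_int))
    return rows


# Hypothetical toy data: six users with calibrated confidences.
confidences = [0.55, 0.65, 0.72, 0.95, 0.97, 1.0]
correct = [False, True, True, True, False, True]
rows = interval_stats(confidences, correct, edges=[0.5, 0.7, 0.9, 1.0])

for low, high, p_int, p_conf, acc_int in rows:
    print(f"[{low}, {high}): P_int={p_int:.3f} P_conf={p_conf:.3f} acc_int={acc_int}")
```

For a well-calibrated model, `acc_int` in each interval should land close to the confidence values that define the interval.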