User Interaction Data in Apps: Comparing Policy Claims to Implementations

Feiyang Tang, Bjarte M. Østvold

TL;DR

This work addresses the mismatch between what privacy policies claim and how mobile apps actually collect user interaction data. It introduces a two-part automated framework: NLP-based extraction and classification of policy claims, and static analysis of app code to identify UI-driven data collection via analytics libraries. The approach is validated on 100 Android apps and examined in depth through four case studies, revealing substantial transparency gaps (Interaction Consistency 58%, Context Consistency 32%, with coverage up to 86% and 71%, respectively) and frequent non-specific policy language. The findings emphasize that anonymized interaction data can still enable profiling or re-identification, underscoring the need for clearer privacy disclosures and better governance of non-personal data practices. The proposed pipeline and taxonomy offer a scalable method to audit policy-practice alignment and promote trust in mobile app privacy governance.

Abstract

As mobile app usage continues to rise, so does the generation of extensive user interaction data, which includes actions such as swiping, zooming, or the time spent on a screen. Apps often collect a large amount of this data and claim to anonymize it, yet concerns arise regarding the adequacy of these measures. In many cases, the so-called anonymized data still has the potential to profile and, in some instances, re-identify individual users. This situation is compounded by a lack of transparency, leading to potential breaches of user trust. Our work investigates the gap between privacy policies and actual app behavior, focusing on the collection and handling of user interaction data. We analyzed the top 100 apps across diverse categories using static analysis methods to evaluate the alignment between policy claims and implemented data collection techniques. Our findings highlight the lack of transparency in data collection and the associated risk of re-identification, raising concerns about user privacy and trust. This study emphasizes the importance of clear communication and enhanced transparency in privacy practices for mobile app development.
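
The comparison between policy claims and implemented collection can be illustrated with a minimal sketch. This is not the authors' pipeline or their metric definitions: the `AppFindings` record, the interaction labels, and the rule that an app counts as consistent when every interaction type detected in code is also disclosed in its policy are all assumptions made for illustration. In practice the `claimed` set would come from policy classification and the `detected` set from static analysis.

```python
from dataclasses import dataclass, field


@dataclass
class AppFindings:
    """Hypothetical per-app record; field names are illustrative only."""
    name: str
    # Interaction types the privacy policy says are collected.
    claimed: set = field(default_factory=set)
    # Interaction types found being sent to analytics code.
    detected: set = field(default_factory=set)


def undisclosed(app: AppFindings) -> set:
    """Interaction types collected in code but absent from the policy."""
    return app.detected - app.claimed


def is_consistent(app: AppFindings) -> bool:
    """Assumed rule: consistent if nothing detected is undisclosed."""
    return not undisclosed(app)


def consistency_rate(apps: list) -> float:
    """Fraction of apps whose detected collection is fully disclosed."""
    if not apps:
        return 0.0
    return sum(is_consistent(a) for a in apps) / len(apps)


# Made-up example data, not results from the study.
apps = [
    AppFindings("app_a", {"tap", "swipe"}, {"tap"}),
    AppFindings("app_b", {"tap"}, {"tap", "screen_time"}),
    AppFindings("app_c", set(), {"zoom"}),
    AppFindings("app_d", {"swipe", "zoom"}, {"swipe", "zoom"}),
]

rate = consistency_rate(apps)
print(f"{rate:.0%}")  # prints "50%"
for app in apps:
    gap = undisclosed(app)
    if gap:
        print(app.name, "collects but does not disclose:", sorted(gap))
```

A set difference keeps the check simple and makes the undisclosed types directly reportable per app. A policy with vague, non-specific language would need an extra category that this sketch does not model.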

Paper Structure (22 sections, 4 figures, 5 tables)