Appformer: A Novel Framework for Mobile App Usage Prediction Leveraging Progressive Multi-Modal Data Fusion and Feature Extraction
Chuike Sun, Junzhou Chen, Yue Zhao, Hao Han, Ruihai Jing, Guang Tan, Di Wu
TL;DR
Appformer tackles mobile app usage prediction by fusing multi-modal data with a progressive fusion scheme and extracting temporal features via a Transformer-like encoder–decoder. It introduces a privacy-conscious preprocessing pipeline, a Cross-Modal Data Fusion Module, and a time-augmented feature extraction strategy, augmented by POI clustering with K-Modes. Through extensive experiments on PAULCI and DUGN benchmarks, it achieves state-of-the-art results and validates the contribution of each module via ablations, time encoding improvements, and POI clustering analyses. The work advances practical personalized app recommendations by offering a modular, scalable framework capable of integrating diverse signals while maintaining user privacy and enabling future architectural enhancements.
Abstract
This article presents Appformer, a novel mobile application usage prediction framework inspired by the efficiency with which Transformer-like architectures process sequential data through self-attention. Appformer combines a Multi-Modal Data Progressive Fusion Module with a Feature Extraction Module, drawing on multi-modal data fusion and data mining techniques while maintaining user privacy. The framework uses the Points of Interest (POIs) associated with base stations, and comparative experiments identify the most effective method for clustering them. These refined inputs are integrated into the initial phases of cross-modal data fusion, where temporal units are encoded via word embeddings and subsequently merged in later stages. The Feature Extraction Module, built on Transformer-like architectures specialized for time series analysis, refines the outputs of the fusion module to extract comprehensive multi-modal features. Extensive experiments confirm Appformer's effectiveness: it attains state-of-the-art (SOTA) results in mobile app usage prediction.
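As an illustration of the POI clustering step, the summary above names K-Modes, a clustering method for categorical records that assigns each record to the nearest mode by Hamming distance and then updates each mode attribute by attribute. The sketch below is a minimal, generic K-Modes loop and is not the authors' implementation. The record layout (three categorical POI descriptors per base station), the attribute values, and the caller-supplied initial modes are all assumptions made for the example.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most frequent category of each attribute across the given records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, init_modes, max_iter=20):
    """Toy K-Modes: assign by Hamming distance, update each mode per attribute.

    init_modes is supplied by the caller (an assumption of this sketch);
    real implementations usually choose the initial modes automatically.
    """
    modes = [tuple(m) for m in init_modes]
    k = len(modes)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode, ties go to the lowest cluster index.
        new_labels = [
            min(range(k), key=lambda c: hamming(r, modes[c])) for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: recompute each cluster's mode from its members.
        for c in range(k):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:  # keep the previous mode if a cluster is empty
                modes[c] = column_modes(members)
    return labels, modes


# Hypothetical base stations, each described by three categorical POI attributes.
stations = [
    ("food", "retail", "high"),
    ("food", "retail", "high"),
    ("food", "retail", "mid"),
    ("office", "transit", "low"),
    ("office", "transit", "low"),
    ("office", "transit", "mid"),
]
labels, modes = k_modes(stations, init_modes=[stations[0], stations[3]])
print(labels)
print(modes)
```

The resulting cluster label of each base station could then serve as a compact categorical input to a downstream fusion stage. How Appformer actually encodes and consumes the clusters is not specified in this summary.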
