Forward Once for All: Structural Parameterized Adaptation for Efficient Cloud-coordinated On-device Recommendation

Kairui Fu, Zheqi Lv, Shengyu Zhang, Fan Wu, Kun Kuang

TL;DR

Forward-OFA addresses the privacy and bandwidth challenges of cloud-centric recommendations by constructing device-specific subnetworks for on-device reranking. It jointly learns where to assemble network blocks (structure) and how to parameterize those blocks (parameters) through a structure controller and a structural mapper, enabling adaptation in a single forward pass without on-device backpropagation. A sparsity constraint and a hypernetwork-based parameter mapper mitigate gradient conflicts and enforce efficiency, achieving strong recommendations with reduced FLOPs and model size. The approach demonstrates clear advantages across four real-world datasets, offering a practical, privacy-preserving path toward architecture-aware on-device recommendation with potential for broader neural architecture search integration.

Abstract

In cloud-centric recommender systems, regular data exchanges between user devices and the cloud can increase bandwidth demands and privacy risks. On-device recommendation has emerged as a viable solution that performs reranking locally to alleviate these concerns. Existing methods primarily focus on developing locally adaptive parameters, while potentially neglecting the critical role of a tailor-made model architecture. Insights from broader research domains suggest that varying data distributions might favor distinct architectures for better fitting. In addition, imposing a uniform model structure across heterogeneous devices risks inefficacy on less capable devices or sub-optimal performance on those with sufficient capabilities. In response to these gaps, our paper introduces Forward-OFA, a novel approach for the dynamic construction of device-specific networks (both structure and parameters). Forward-OFA employs a structure controller to selectively determine whether each block needs to be assembled for a given device. However, during the training of the structure controller, these assembled heterogeneous structures are jointly optimized, and the co-adaptation among blocks might encounter gradient conflicts. To mitigate this, Forward-OFA establishes a structure-guided mapping from real-time behaviors to the parameters of the assembled networks. Structure-related parameters and parallel components within the mapper prevent each part from receiving heterogeneous gradients from the others, thus bypassing the gradient conflicts of coupled optimization. In addition, direct mapping enables Forward-OFA to achieve adaptation through only one forward pass, allowing swift adaptation to changing interests and eliminating the need for on-device backpropagation. Experiments on real-world datasets demonstrate the effectiveness and efficiency of Forward-OFA.
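The single-forward-pass idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ForwardOnceSketch`, the mean-pooling extractor, the 0.5 keep threshold, the residual ReLU blocks, and the form of `sparsity_penalty` are all assumptions made for illustration, and the weights are random rather than trained. It only shows the general shape of the approach: a controller decides which blocks to assemble, and independent mapper heads generate the parameters of the kept blocks from recent behaviors, with no backpropagation at adaptation time.

```python
import math
import random

def linear(weights, x):
    """Multiply a (rows x len(x)) weight matrix by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class ForwardOnceSketch:
    """Hypothetical toy model; names, shapes, and layers are assumptions."""

    def __init__(self, dim=4, num_blocks=3, seed=0):
        rng = random.Random(seed)
        self.dim, self.num_blocks = dim, num_blocks
        # Structure controller: one keep-logit per candidate block.
        self.controller = rand_matrix(rng, num_blocks, dim)
        # Mapper: one independent ("parallel") head per block. Each head maps
        # [user representation ; keep mask] to a flattened dim x dim matrix.
        self.heads = [rand_matrix(rng, dim * dim, dim + num_blocks)
                      for _ in range(num_blocks)]

    def extract(self, behaviors):
        """Mean-pool recent behavior embeddings (an assumed extractor)."""
        n = len(behaviors)
        return [sum(b[i] for b in behaviors) / n for i in range(self.dim)]

    def adapt(self, behaviors):
        """One forward pass: pick blocks, then generate their parameters."""
        user = self.extract(behaviors)
        probs = [1.0 / (1.0 + math.exp(-z)) for z in linear(self.controller, user)]
        mask = [1 if p >= 0.5 else 0 for p in probs]
        params = {}
        for k, keep in enumerate(mask):
            if keep:  # parameters for skipped blocks are never generated
                flat = linear(self.heads[k], user + mask)
                params[k] = [flat[r * self.dim:(r + 1) * self.dim]
                             for r in range(self.dim)]
        return probs, mask, params

    def rerank_score(self, params, user, item):
        """Score one candidate item with the assembled sub-network."""
        h = user
        for k in sorted(params):
            update = linear(params[k], h)
            h = [a + max(0.0, b) for a, b in zip(h, update)]  # residual + ReLU
        return sum(a * b for a, b in zip(h, item))

def sparsity_penalty(probs, lam=0.1):
    """Assumed training-time regularizer: lam times the mean keep probability."""
    return lam * sum(probs) / len(probs)

# Usage with made-up behavior and item embeddings.
model = ForwardOnceSketch()
behaviors = [[0.2, -0.1, 0.4, 0.0], [0.1, 0.3, -0.2, 0.5]]
probs, mask, params = model.adapt(behaviors)
score = model.rerank_score(params, model.extract(behaviors), [0.3, 0.1, -0.4, 0.2])
penalty = sparsity_penalty(probs)
```

Feeding the keep mask into every mapper head is one simple way to make the generated parameters depend on the chosen structure; the mapper described in the paper may condition on structure differently.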

Paper Structure (31 sections, 12 equations, 7 figures, 8 tables, 2 algorithms)

This paper contains 31 sections, 12 equations, 7 figures, 8 tables, 2 algorithms.

Figures (7)

  • Figure 1: (a): Brief comparison between Forward-OFA and other methods used in on-device recommendation. (b): Each device has its own specific behaviors, which change frequently, while the cloud has access to the historical data of all devices. Distribution shifts among devices and within each device cause models trained on cloud data to degrade on some devices. (c): Large networks are conducive to exploring complex user interests, while simple networks suffice for users with straightforward interests. (d): Most users own mobile devices with limited computing resources, and the resources available for the current recommendation task change in real time due to the presence of other apps.
  • Figure 2: During backpropagation, conflicting gradients from two sequences with different interests prevent shared blocks (red blocks) from being updated correctly.
  • Figure 3: Illustrations of all components in Forward-OFA. (a): At the beginning of each session, when interest changes dramatically, the cloud sends candidate item embeddings to the device, and those embeddings are cached for recommendation during this session. (b): The structure controller consists of an extractor and a lightweight layer for searching for a suitable path (distribution vector). The vector is used in (c) to produce structure-related parameters and alleviate the gradient conflict. (c): A mapper that assigns personalized and structural parameters, aiming to remove gradient conflicts during training. (d): Each device does not necessarily own the whole network, but only a sub-model, for efficiency.
  • Figure 4: Analysis conducted on Movielens-10M using a 6-layer SASRec. (a) The number of times each block is selected by the device in the test set. (b) Distribution of users with different numbers of blocks.
  • Figure 5: The influence of the coefficient $\lambda$ of the sparsity constraint. The first and second columns are the experimental results on Movielens and Amazon-game respectively. (a) and (b) report the results on NDCG and Hit while (c) and (d) represent the results on FLOPs and params.
  • ...and 2 more figures