Adaptive Weighted Loss for Sequential Recommendations on Sparse Domains

Akshay Mittal, Vinay Venkatesh, Krishna Kandi, Shalini Sudarshan

TL;DR

This paper tackles the problem of catering to power users in sparse domains within sequential recommendation by introducing a data-driven Dynamic Weighted Loss. It replaces a fixed, uniform domain weight with per-domain weights computed from a sparsity-aware score, applied through a two-stage process that includes Domain Sparsity Measurement and Adaptive Loss Application, while preserving a log-Q corrected sampled softmax. The authors provide theoretical guarantees (convergence, stability, and complexity) and validate the approach on four datasets, showing substantial gains in sparse domains (e.g., Film-Noir: Recall@10 +52.4%; NDCG@10 +74.5%) and robust performance in dense domains, with negligible overhead. The work advances practical recommender systems by enabling a single model to learn effectively across heterogeneous domains without heavy data augmentation or cross-domain transfer, offering a scalable path to more personalized, domain-aware recommendations.

Abstract

The effectiveness of single-model sequential recommendation architectures, while scalable, is often limited when catering to "power users" in sparse or niche domains. Our previous research, PinnerFormerLite, addressed this by using a fixed weighted loss to prioritize specific domains. However, this approach can be sub-optimal, as a single, uniform weight may not be sufficient for domains with very few interactions, where the training signal is easily diluted by the vast, generic dataset. This paper proposes a novel, data-driven approach: a Dynamic Weighted Loss function with comprehensive theoretical foundations and extensive empirical validation. We introduce an adaptive algorithm that adjusts the loss weight for each domain based on its sparsity in the training data, assigning a higher weight to sparser domains and a lower weight to denser ones. This ensures that even rare user interests contribute a meaningful gradient signal, preventing them from being overshadowed. We provide rigorous theoretical analysis including convergence proofs, complexity analysis, and bounds analysis to establish the stability and efficiency of our approach. Our comprehensive empirical validation across four diverse datasets (MovieLens, Amazon Electronics, Yelp Business, LastFM Music) with state-of-the-art baselines (SIGMA, CALRec, SparseEnNet) demonstrates that this dynamic weighting system significantly outperforms all comparison methods, particularly for sparse domains, achieving substantial lifts in key metrics like Recall@10 and NDCG@10 while maintaining performance on denser domains and introducing minimal computational overhead.
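
The two-stage idea described in the abstract (measure each domain's sparsity, then weight its loss accordingly) can be illustrated with a minimal sketch. The abstract does not give the exact sparsity score, so the code below assumes one for illustration: the negative log of a domain's share of training interactions, normalized to mean 1 and clipped to a bounded range. The function names `domain_weights` and `weighted_loss` and the bounds `w_min` and `w_max` are hypothetical, and the per-example losses are assumed to come from whatever base loss the model uses (for example, the log-Q corrected sampled softmax mentioned in the TL;DR).

```python
import math
from collections import Counter


def domain_weights(domain_labels, w_min=0.5, w_max=5.0):
    """Stage 1 (illustrative): turn per-domain interaction counts into loss weights.

    ASSUMPTION: the sparsity score is -log(share of interactions). This is a
    stand-in chosen for the sketch, not the paper's formula.
    """
    counts = Counter(domain_labels)
    total = sum(counts.values())
    # Sparser domains have a smaller share, hence a larger raw score.
    raw = {d: -math.log(c / total) for d, c in counts.items()}
    mean_raw = sum(raw.values()) / len(raw)
    if mean_raw == 0:  # only one domain present: nothing to rebalance
        return {d: 1.0 for d in raw}
    # Normalize to mean 1, then clip so no domain dominates or vanishes.
    return {d: min(w_max, max(w_min, s / mean_raw)) for d, s in raw.items()}


def weighted_loss(per_example_losses, domain_labels, weights):
    """Stage 2 (illustrative): weighted average of per-example losses."""
    num = sum(weights[d] * loss for loss, d in zip(per_example_losses, domain_labels))
    den = sum(weights[d] for d in domain_labels)
    return num / den


if __name__ == "__main__":
    # Toy data: one dense domain and one sparse domain.
    labels = ["Drama"] * 900 + ["Film-Noir"] * 100
    w = domain_weights(labels)
    print(w)  # the sparse domain receives the larger weight
    print(weighted_loss([1.0] * len(labels), labels, w))
```

Clipping the weights is one simple way to keep the weighted objective bounded. Whether the paper uses clipping, a different normalization, or weights updated during training is not stated in this section.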

Paper Structure

This paper contains 28 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: PinnerFormerLite Architecture with Dynamic Domain-Specific Weighting. The architecture processes user interaction sequences (s1-s4) through an embedding layer with multiple attention mechanisms, generating user representations that are fed to an output layer. The dynamic weighted loss component (containing adaptive weights $\hat{w}_1$ and $\hat{w}_2$) creates a feedback loop that adjusts the embedding layer based on domain sparsity, ensuring balanced learning across dense and sparse domains.