Multi-Head Self-Attending Neural Tucker Factorization

Yikai Hou; Peng Tang

TL;DR

This work addresses QoS prediction under dynamic temporal patterns by proposing MSNTucF, a neural Tucker factorization framework augmented with a multi-head self-attention module that learns nonlinear spatiotemporal interactions in high-dimensional and incomplete (HDI) tensors. The model builds mode embeddings for the indices (i, j, k), forms a Tucker interaction tensor from them, flattens that tensor into a vector, and applies multi-head self-attention to refine the interactions; a sigmoid-activated linear layer then produces the final prediction. On two real-world QoS datasets, MSNTucF consistently outperforms state-of-the-art baselines at tensor completion, and the analysis indicates that both the number of attention heads and the number of self-attention loops contribute to this gain. The findings suggest that combining tensor factorization with neural attention improves prediction on HDI tensors, and they point to future work that incorporates temporal models such as LSTMs to better capture temporal dynamics.
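The pipeline summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `MSNTucFSketch`, the rank and width parameters, the random initialization, the step that lifts each flattened interaction entry to a `d_model`-wide token, and the residual connection are all hypothetical choices made only so the forward pass is concrete. Training is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MSNTucFSketch:
    """Forward pass only; every name and shape here is an illustrative assumption."""

    def __init__(self, n_users, n_services, n_times,
                 rank=4, d_model=8, n_heads=2, n_loops=1, seed=0):
        assert d_model % n_heads == 0
        rng = np.random.default_rng(seed)
        init = lambda *shape: rng.normal(0.0, 0.1, size=shape)
        self.n_heads, self.n_loops = n_heads, n_loops
        # Mode embeddings for the indices (i, j, k).
        self.U, self.S, self.T = init(n_users, rank), init(n_services, rank), init(n_times, rank)
        # Assumption: each flattened interaction entry becomes one d_model-wide token.
        self.W_in = init(1, d_model)
        # One set of attention weights per self-attention loop.
        self.Wq = init(n_loops, d_model, d_model)
        self.Wk = init(n_loops, d_model, d_model)
        self.Wv = init(n_loops, d_model, d_model)
        self.Wo = init(n_loops, d_model, d_model)
        # Final linear layer feeding the sigmoid.
        self.w_out = init(rank ** 3 * d_model)
        self.b_out = 0.0

    def _mhsa(self, x, loop):
        n, d = x.shape
        h, dh = self.n_heads, d // self.n_heads
        split = lambda m: m.reshape(n, h, dh).transpose(1, 0, 2)  # (h, n, dh)
        q, k, v = (split(x @ W[loop]) for W in (self.Wq, self.Wk, self.Wv))
        attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))    # (h, n, n)
        heads = (attn @ v).transpose(1, 0, 2).reshape(n, d)       # concatenate heads
        return heads @ self.Wo[loop]

    def predict(self, i, j, k):
        # Tucker-style interaction tensor: outer product of the three embeddings.
        inter = np.einsum("p,q,r->pqr", self.U[i], self.S[j], self.T[k])
        x = inter.reshape(-1, 1) @ self.W_in       # flatten, then lift to tokens
        for loop in range(self.n_loops):
            x = x + self._mhsa(x, loop)            # residual connection (assumption)
        return sigmoid(x.reshape(-1) @ self.w_out + self.b_out)
```

Calling `MSNTucFSketch(5, 6, 7).predict(1, 2, 3)` returns a single value in (0, 1), which presumes the QoS targets have been scaled to that range. The `n_heads` and `n_loops` arguments correspond to the two quantities the analysis varies.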

Abstract

Quality-of-service (QoS) data exhibit dynamic temporal patterns that are crucial for accurately predicting missing values. These patterns arise from the evolving interactions between users and services, so capturing the temporal dynamics inherent in such data is essential for improved prediction performance. As the size and complexity of QoS datasets increase, existing models struggle to provide accurate predictions, highlighting the need for more flexible and dynamic methods that better capture the underlying patterns in large-scale QoS data. To address this issue, we introduce a neural network-based tensor factorization approach tailored for learning spatiotemporal representations of high-dimensional and incomplete (HDI) tensors, namely the Multi-head Self-attending Neural Tucker Factorization (MSNTucF). The model is designed to capture the intricate nonlinear spatiotemporal feature interaction patterns hidden in real-world data, and it rests on a two-fold idea. It first employs a neural network structure to generalize the traditional framework of Tucker factorization, and it then leverages a multi-head self-attending module to learn nonlinear latent interactions. In empirical studies on two dynamic QoS datasets from real applications, the proposed MSNTucF model outperforms state-of-the-art benchmark models in estimating missing observations. This highlights its ability to learn nonlinear spatiotemporal representations of HDI tensors.
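For reference, the traditional Tucker factorization that the abstract says the model generalizes can be written element-wise as follows. The symbols are illustrative assumptions rather than the paper's notation: u, s, and t denote the user, service, and time factor matrices, g the core tensor, and P, Q, R the mode ranks.

```latex
\hat{y}_{ijk} \;=\; \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
  g_{pqr}\, u_{ip}\, s_{jq}\, t_{kr}
```

Each estimate is a fixed multilinear combination of the products u_{ip} s_{jq} t_{kr}, weighted by the core tensor. One reading of the neural generalization is that a learned nonlinear map takes the place of this fixed weighted sum.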

