Label Attention Network for Temporal Sets Prediction: You Were Looking at a Wrong Self-Attention

Elizaveta Kovtun, Galina Boeva, Andrey Shulga, Alexey Zaytsev

TL;DR

This work proposes the Label-Attention NETwork (LANET) for temporal sets prediction, studies how each LANET component influences performance, examines causal relationships between labels, and provides an implementation to encourage wider usage.

Abstract

Most user-related data can be represented as a sequence of events, each associated with a timestamp and a collection of categorical labels. For example, the purchased basket of goods and the time of purchase fully characterize a store visit. Anticipating the label set of a future event, a problem known as temporal sets prediction, holds significant value, especially in high-stakes industries such as finance and e-commerce. A fundamental challenge of this task is the joint consideration of the temporal nature of events and the label relations within sets. Existing models fail to capture complex time and label dependencies because their initial representation of historical information is ineffective. We aim to address this shortcoming by presenting a framework that aggregates the observed information into time-aware and set-structure-aware views before passing it to the main architecture blocks. Our strong emphasis on input arrangement facilitates the subsequent efficient learning of label interactions. The proposed model is called Label-Attention NETwork, or LANET. We conducted experiments on four different datasets and compared LANET with four established models in this area, including the state of the art. The experimental results suggest that LANET provides significantly better quality than any other model, achieving an improvement of up to 65% in weighted F1 over the closest competitor. Moreover, we examine causal relationships between labels and present a thorough study of how LANET components influence performance. We provide an implementation of LANET to encourage its wider usage.
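To make the problem statement concrete, the following is a minimal sketch of how a temporal sets prediction example could be represented. It is not the authors' data pipeline: the names `Event`, `multi_hot`, and `make_example` are hypothetical, and the choice to encode the target as a multi-hot vector over a fixed label vocabulary is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    """One event: a timestamp plus an unordered set of categorical labels."""
    timestamp: float
    labels: FrozenSet[int]


def multi_hot(labels: FrozenSet[int], num_labels: int) -> List[int]:
    """Encode a label set as a 0/1 vector of length num_labels (assumed encoding)."""
    return [1 if k in labels else 0 for k in range(num_labels)]


def make_example(history: Sequence[Event], num_labels: int) -> Tuple[List[Event], List[int]]:
    """Split a user's history into observed events and the next-event target.

    The last event in time order plays the role of the "future" event whose
    label set has to be predicted from all earlier events.
    """
    if len(history) < 2:
        raise ValueError("need at least one observed event and one target event")
    ordered = sorted(history, key=lambda e: e.timestamp)
    *observed, target = ordered
    return observed, multi_hot(target.labels, num_labels)


# Toy history: three store visits over a vocabulary of four product labels.
events = [
    Event(1.0, frozenset({0, 2})),
    Event(2.0, frozenset({1})),
    Event(3.5, frozenset({0, 1, 3})),
]
observed, target = make_example(events, num_labels=4)
```

Here `observed` holds the first two visits and `target` marks labels 0, 1, and 3 as members of the set to predict. Keeping labels in a `frozenset` reflects that order within a single event carries no meaning, while order across events is given by the timestamps.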

Paper Structure

This paper contains 28 sections, 7 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Visual representation of the temporal sets prediction problem. A sequence of events is characterized by timestamps $t_1, t_2, t_3$ and an arbitrary number of labels, denoted with colored circles. Our goal is to predict the label set of the next event based on the previous sets.
  • Figure 2: LANET architecture for temporal sets prediction. The key part is to aggregate historical information into representative views that are passed to the Transformer encoder block. The output of the model is a vector of confidence scores, where each component reflects how likely the corresponding label is to be a member of the next-event set.
  • Figure 3: The dependence of LANET quality on the embedding size.
  • Figure 4: The dependence of LANET quality on the number of heads.
  • Figure 5: The dependence of LANET quality on the number of encoder layers.
  • ...and 1 more figure
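
The Figure 2 caption describes the model output as a vector of per-label confidence scores, and the abstract reports quality in terms of weighted F1. The sketch below shows one common way to connect the two. It rests on assumptions not stated in this text: the 0.5 threshold and the function names are hypothetical, and the metric is the usual support-weighted average of per-label F1 scores, which may differ in detail from the evaluation protocol used in the paper.

```python
from typing import List, Sequence, Set


def scores_to_label_set(scores: Sequence[float], threshold: float = 0.5) -> Set[int]:
    """Turn per-label confidence scores into a predicted label set.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return {k for k, s in enumerate(scores) if s >= threshold}


def weighted_f1(y_true: List[Set[int]], y_pred: List[Set[int]], num_labels: int) -> float:
    """Support-weighted average of per-label F1 over a batch of label sets."""
    weighted_sum = 0.0
    total_support = 0
    for k in range(num_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if k in t and k in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if k not in t and k in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if k in t and k not in p)
        support = tp + fn  # number of events where label k is truly present
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Two evaluated events over a vocabulary of three labels.
true_sets = [{0, 1}, {1}]
pred_sets = [scores_to_label_set([0.9, 0.1, 0.2]), scores_to_label_set([0.3, 0.8, 0.6])]
score = weighted_f1(true_sets, pred_sets, num_labels=3)
```

Weighting by support means frequent labels dominate the score, so a label that never truly occurs (label 2 here) contributes nothing even when it is predicted by mistake.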