TIMeSynC: Temporal Intent Modelling with Synchronized Context Encodings for Financial Service Applications

Dwipam Katariya, Juan Manuel Origgi, Yage Wang, Thomas Caputo

TL;DR

The paper addresses predicting user intent in financial services from heterogeneous, multi-domain temporal data. It introduces TIMeSynC, an encoder–decoder transformer that uses TimeAliBi and a multi-dimensional time encoder to synchronize context across dynamic and static features, flattening data across domains, fields, and time for joint learning. Experiments on large-scale financial data show TIMeSynC outperforming SASRec baselines and tabular-context variants, with ablations highlighting the importance of field-name and product embeddings. The approach enables more accurate next-action predictions, targeted marketing, and improved user experiences in multi-channel financial services. The authors note that the encoder context window may grow and that broader validation is needed.

Abstract

Users engage with financial services companies through multiple channels, often interacting with mobile applications, web platforms, call centers, and physical locations to service their accounts. The resulting interactions are recorded at heterogeneous temporal resolutions across these domains. This multi-channel data can be combined and encoded to create a comprehensive representation of the customer's journey for accurate intent prediction, which demands sequential learning solutions. NMT transformers achieve state-of-the-art sequential representation learning by encoding context and decoding for the next best action to represent long-range dependencies. However, three major challenges arise when combining multi-domain sequences within an encoder-decoder transformer architecture for intent prediction applications: a) aligning sequences with different sampling rates; b) learning temporal dynamics across multi-variate, multi-domain sequences; and c) combining dynamic and static sequences. We propose an encoder-decoder transformer model to address these challenges for contextual and sequential intent prediction in financial servicing applications. Our experiments show significant improvement over the existing tabular method.

Paper Structure

This paper contains 13 sections, 3 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Model Architecture. Field Value embeddings are summed with Product and Field Name embeddings before entering the encoder block. For both the encoder and the decoder, a Time encoder and TimeAliBi are applied to learn absolute and relative time dynamics. Depending on whether the attention is self-attention or cross-attention, a causal mask(K(t), Q(t)) is applied with respect to the times corresponding to the Key (K) and the Query (Q). Color coding: blue, the encodings used as input to the Encoder block; red, the TimeAliBi self-attention module; purple, the input encodings to the Decoder module; yellow and green, the decoder TimeAliBi cross-attention modules; white, the head layers. The figure shows only the changes relative to the vanilla transformer (Vaswani et al.).
  • Figure 2: Illustrative customer journey across domains at their respective timelines
  • Figure 3: Illustrative Example. Data Flattening and Tokenization of the Encoder Context
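
The Figure 1 caption mentions a causal mask defined on the times of the Key and the Query, together with a relative-time bias (TimeAliBi). This excerpt does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the mask lets a query attend to keys whose timestamp is not later than its own, and the bias is a linear penalty on the elapsed time, in the spirit of ALiBi's linear distance penalty. The function names, the single `slope` parameter, and the toy timestamps are hypothetical and are not taken from the paper.

```python
import math


def time_causal_mask(query_times, key_times):
    """mask[i][j] is True when key j is not later than query i (assumed rule)."""
    return [[k_t <= q_t for k_t in key_times] for q_t in query_times]


def time_linear_bias(query_times, key_times, slope=0.5):
    """ALiBi-style bias: penalty grows linearly with |t_query - t_key|.

    `slope` is a hypothetical fixed constant; the paper may use per-head
    or learned slopes.
    """
    return [[-slope * abs(q_t - k_t) for k_t in key_times] for q_t in query_times]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out positions."""
    weights = []
    for row, mask_row in zip(scores, mask):
        allowed = [s for s, m in zip(row, mask_row) if m]
        if not allowed:  # no visible key for this query
            weights.append([0.0] * len(row))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy cross-attention: encoder events (keys) and decoder steps (queries),
# with timestamps in days on a shared clock.
key_times = [0.0, 2.0, 5.0, 9.0]
query_times = [3.0, 10.0]

mask = time_causal_mask(query_times, key_times)
bias = time_linear_bias(query_times, key_times, slope=0.5)

# Content scores (Q·K) are set to zero here, so the weights show the effect
# of the time bias and the mask alone.
weights = masked_softmax(bias, mask)
```

Because the mask and bias are computed from timestamps and not from token positions, sequences sampled at different rates can share one attention matrix. In this toy setup, the query at day 3 ignores the events at days 5 and 9 and weights the day-2 event above the day-0 event.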