Towards Financially Inclusive Credit Products Through Financial Time Series Clustering

Tristan Bester; Benjamin Rosman

TL;DR

The work tackles financial inclusion by enabling customer segmentation without annotated data through time-series clustering of transaction histories. It develops a taxonomy of four component classes for deep representation learning-based clustering (autoencoder architecture, dimensionality reduction, pretext loss, clustering loss) and systematically evaluates their combinations on the Berka dataset to identify strong configurations. The authors introduce Financial Transaction History Clustering (FTHC), a CNN-based autoencoder with a Deep Temporal Clustering objective using Euclidean distance, which outperforms state-of-the-art methods on key clustering metrics. The study demonstrates that stable, well-tuned clustering can yield meaningful, human-interpretable segments, supporting tailored financial products for marginalised groups and enhancing the practical impact of financial inclusion efforts.
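As an illustration of the clustering objective named above, the following is a minimal sketch of a Deep Temporal Clustering style loss computed on latent vectors with Euclidean distance. It is not the authors' implementation: the function names, the `alpha` parameter, the Student's t kernel form and the toy data are all assumptions made for illustration, and the CNN autoencoder that would produce the latent vectors is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def soft_assignments(latents, centroids, alpha=1.0):
    """Soft cluster memberships from a Student's t kernel over distances.

    Assumed form: q_ij is proportional to (1 + d(z_i, w_j) / alpha) ** (-(alpha + 1) / 2).
    """
    q = []
    for z in latents:
        weights = [
            (1.0 + euclidean(z, c) / alpha) ** (-(alpha + 1.0) / 2.0)
            for c in centroids
        ]
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """Sharpened targets: square q, divide by soft cluster frequency, renormalise."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_clustering_loss(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
    )


# Toy latent vectors (hypothetical encoder outputs) and two centroids.
latents = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assignments(latents, centroids)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
hard_labels = [row.index(max(row)) for row in q]
```

In a full training loop, a loss of this kind would be minimised jointly with the autoencoder's pretext (reconstruction) loss, updating both the encoder weights and the centroids.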

Abstract

Financial inclusion ensures that individuals have access to financial products and services that meet their needs. As a key contributing factor to economic growth and investment opportunity, financial inclusion increases consumer spending and consequently business development. It has been shown that institutions are more profitable when they provide marginalised social groups access to financial services. Customer segmentation based on consumer transaction data is a well-known strategy used to promote financial inclusion. While the required data is available to modern institutions, the challenge remains that segment annotations are usually difficult and/or expensive to obtain. This prevents the usage of time series classification models for customer segmentation based on domain expert knowledge. As a result, clustering is an attractive alternative to partition customers into homogeneous groups based on the spending behaviour encoded within their transaction data. In this paper, we present a solution to one of the key challenges preventing modern financial institutions from providing financially inclusive credit, savings and insurance products: the inability to understand consumer financial behaviour, and hence risk, without the introduction of restrictive conventional credit scoring techniques. We present a novel time series clustering algorithm that allows institutions to understand the financial behaviour of their customers. This enables unique product offerings to be provided based on the needs of the customer, without reliance on restrictive credit practices.

Paper Structure (38 sections, 6 equations, 9 figures, 2 tables)

Introduction
Background
Clustering
Specificity of the Time Dimension
Neural Network Architectures
Fully Connected Neural Network Layers (FCNN)
Convolutional Neural Network Layers (CNN)
Recurrent Neural Network Layers (RNN)
Deep Representation Learning
Autoencoder Architectures
Autoencoder Training
Encoder Optimisation for Clustering
Dimensionality Reduction
Principal Component Analysis (PCA)
Uniform Manifold Approximation and Projection (UMAP)
...and 23 more sections

Figures (9)

Figure 1: Percentage of invalid clusterings produced across all combinations. Results are shown for each clustering layer variant.
Figure 2: The effect of varied learning rates in the cluster optimisation phase. In the lower plot, it can be seen that the cluster centroids remain in their initial positions while the latent representations all converge to a single representation. Consequently, all data points are assigned to the same cluster. The stability of the upper model is clear from the converged latent space representation.
Figure 3: Average clustering performance associated with each autoencoder architecture.
Figure 4: Average clustering performance associated with each pretext loss function.
Figure 5: Average clustering performance associated with each clustering loss function.
...and 4 more figures
