Towards Financially Inclusive Credit Products Through Financial Time Series Clustering
Tristan Bester, Benjamin Rosman
TL;DR
The work tackles financial inclusion by enabling customer segmentation without annotated data through time-series clustering of transaction histories. It develops a taxonomy of four component classes for deep representation learning-based clustering (autoencoder architecture, dimensionality reduction, pretext loss, clustering loss) and systematically evaluates their combinations on the Berka dataset to identify strong configurations. The authors introduce Financial Transaction History Clustering (FTHC), a CNN-based autoencoder with a Deep Temporal Clustering objective using Euclidean distance, which outperforms state-of-the-art methods on key clustering metrics. The study demonstrates that stable, well-tuned clustering can yield meaningful, human-interpretable segments, supporting tailored financial products for marginalised groups and enhancing the practical impact of financial inclusion efforts.
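The clustering objective named above can be illustrated with a minimal sketch. It assumes the Student's t soft-assignment kernel with squared Euclidean distance and the sharpened target distribution familiar from DEC-style and DTC-style methods; the exact form used by FTHC may differ. The CNN autoencoder is omitted, and the latent vectors, centroids, and all function names here are hypothetical, chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def soft_assign(latents, centroids, alpha=1.0):
    """Soft assignment q[i][j] of latent i to centroid j.

    Assumption: Student's t kernel on squared Euclidean distance,
    normalised so that each row sums to 1.
    """
    q = []
    for z in latents:
        weights = [
            (1.0 + euclidean(z, c) ** 2 / alpha) ** (-(alpha + 1.0) / 2.0)
            for c in centroids
        ]
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j]: square q, divide by cluster frequency, renormalise."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_clustering_loss(p, q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
    )


# Hypothetical latent codes standing in for autoencoder outputs.
latents = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(latents, centroids)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a full training loop this loss would be minimised jointly with the autoencoder's reconstruction (pretext) loss, so that the encoder and the centroids are updated together.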
Abstract
Financial inclusion ensures that individuals have access to financial products and services that meet their needs. As a key contributing factor to economic growth and investment opportunity, financial inclusion increases consumer spending and, consequently, business development. It has been shown that institutions are more profitable when they provide marginalised social groups with access to financial services. Customer segmentation based on consumer transaction data is a well-known strategy used to promote financial inclusion. While the required data is available to modern institutions, the challenge remains that segment annotations are usually difficult or expensive to obtain. This prevents the use of time series classification models for customer segmentation based on domain expert knowledge. As a result, clustering is an attractive alternative for partitioning customers into homogeneous groups based on the spending behaviour encoded within their transaction data. In this paper, we present a solution to one of the key challenges preventing modern financial institutions from providing financially inclusive credit, savings and insurance products: the inability to understand consumer financial behaviour, and hence risk, without the introduction of restrictive conventional credit scoring techniques. We present a novel time series clustering algorithm that allows institutions to understand the financial behaviour of their customers. This enables unique product offerings to be provided based on the needs of the customer, without reliance on restrictive credit practices.
