
Generating In-store Customer Journeys from Scratch with GPT Architectures

Taizo Horikomi, Takayuki Mizuno

TL;DR

The paper tackles generating realistic in-store customer journeys that couple trajectories with purchasing actions. It adopts a Transformer-based approach by training a GPT-2 small model from scratch on text-like encodings of indoor locations and purchases, using a six-level hierarchical grid to encode positions and DBSCAN-based zone mapping for purchases. Across Store A for pre-training and Store B for cross-store fine-tuning, the method achieves accurate trajectory generation and zone-level purchase distributions, with a low divergence from real data (e.g., JS divergence ~0.0097) and superior performance to LSTM and SVM baselines. Importantly, fine-tuning enables substantial data efficiency, where only about 100 samples can match the performance of training on tens of thousands of samples, signaling potential reductions in data collection costs for retail analytics and simulation. The work lays a foundation for scalable, data-efficient simulation of in-store dynamics and suggests future enhancements via additional background tokens and multimodal outputs.
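The divergence figure quoted above compares the zone-level purchase distribution of generated journeys with that of real data. As a rough illustration of how such a score can be computed, here is a minimal sketch of the Jensen-Shannon divergence between two count vectors. The function name `js_divergence`, the base-2 logarithm, and the example counts are assumptions made for illustration; the summary does not state which logarithm base or normalization the authors used.

```python
import math


def js_divergence(p_counts, q_counts, base=2.0):
    """Jensen-Shannon divergence between two count vectors over the same zones.

    Both inputs are normalized to probability distributions first.
    The log base is an assumption; with base 2 the result lies in [0, 1].
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("both vectors must cover the same zones")
    p_total, q_total = sum(p_counts), sum(q_counts)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each vector needs a positive total count")

    p = [c / p_total for c in p_counts]
    q = [c / q_total for c in q_counts]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log(a / b, base) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical purchase counts for four zones (not taken from the paper).
real_counts = [120, 80, 40, 10]
generated_counts = [115, 85, 38, 12]
score = js_divergence(real_counts, generated_counts)
```

A value near zero means the generated purchases are distributed across zones almost exactly as the real ones are; identical distributions give exactly zero.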

Abstract

We propose a method that simultaneously generates customer trajectories and purchasing behaviors in retail stores using a Transformer-based deep learning architecture. Using customer trajectory data, layout diagrams, and retail scanner data obtained from a retail store, we trained a GPT-2 architecture from scratch to generate indoor trajectories and purchase actions. We also explored the effectiveness of fine-tuning the pre-trained model with data from another store. The results demonstrate that our method reproduces in-store trajectories and purchase behaviors more accurately than LSTM and SVM models, and that fine-tuning significantly reduces the amount of training data required.

Paper Structure

This paper contains 12 sections, 3 equations, and 8 figures.

Figures (8)

  • Figure 1: Textualization of Location Information and Modeling of Purchase Behavior. (a) Division of the entire store into a 50 cm grid mesh. $\blacktriangle$ denotes the starting point (entrance), × denotes the ending point (cash register). (b) DBSCAN clustering. $\blacksquare$ denotes a location belonging to a cluster. (c), (d) Identification of zones in which the purchased items were placed using the layout diagram. The rectangles surrounding the trajectories represent zones. (e) Estimation of purchase location. $\bullet$ denotes a location where a purchase action was taken. (f) Textualization of the whole in-store customer journey.
  • Figure 2: Example of Generated Trajectory with Purchase Behavior: $\blacktriangle$ denotes the starting point (entrance), $\blacklozenge$ denotes an input location, "." denotes a generated location, $\bullet$ denotes a purchase, and × denotes the ending point (checkout register).
  • Figure 3: Comparison of In-Store Traffic Heatmaps (Left: Actual Data, Right: Generated Results).
  • Figure 4: Examples of three customer trajectories generated by LSTM. $\blacklozenge$ denotes an input location, "." denotes a generated location.
  • Figure 5: Comparison of Purchase Counts by Zone. The numbers of purchased items per visit in each of the 61 zones in Store A are compared between the test data and the generated results.
  • ...and 3 more figures
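The Figure 1 caption describes dividing the store into a 50 cm grid and turning locations into text, and the TL;DR mentions a six-level hierarchical grid. One plausible way to realize this is a quadtree-style code, where each level picks one of four quadrants of the current cell. The sketch below is only an illustration under that assumption: the quadrant scheme, the token format `L<level>_<quadrant>`, and the helper names `position_to_cell`, `encode_cell`, and `decode_cell` are all hypothetical, and the paper's actual encoding may differ.

```python
def position_to_cell(x_m, y_m, cell_size_m=0.5):
    """Map a position in meters to integer (col, row) indices on a 50 cm grid."""
    return int(x_m // cell_size_m), int(y_m // cell_size_m)


def encode_cell(col, row, levels=6):
    """Encode a grid cell as one token per hierarchy level (assumed quadtree).

    With `levels` levels the grid has 2**levels cells per side, so six
    levels address a 64 x 64 grid. Level 1 is the coarsest split.
    """
    size = 1 << levels
    if not (0 <= col < size and 0 <= row < size):
        raise ValueError("cell lies outside the grid")
    tokens = []
    for level in range(levels):
        shift = levels - 1 - level
        qx = (col >> shift) & 1  # 0 = left half, 1 = right half
        qy = (row >> shift) & 1  # 0 = lower half, 1 = upper half
        tokens.append(f"L{level + 1}_{2 * qy + qx}")
    return tokens


def decode_cell(tokens):
    """Invert encode_cell: rebuild (col, row) from the per-level tokens."""
    col = row = 0
    for token in tokens:
        quadrant = int(token.split("_")[1])
        col = (col << 1) | (quadrant & 1)
        row = (row << 1) | (quadrant >> 1)
    return col, row


# Example: a customer standing 2.6 m east and 1.7 m north of the grid origin.
cell = position_to_cell(2.6, 1.7)
tokens = encode_cell(*cell)
```

Under this scheme, nearby cells share their leading tokens, so a sequence model sees coarse location first and fine location last. A journey could then be written as a concatenation of such token groups, with separate tokens marking purchase actions.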