Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Letian Gong; Huaiyu Wan; Shengnan Guo; Xiucheng Li; Yan Lin; Erwen Zheng; Tianyi Wang; Zeyu Zhou; Youfang Lin

TL;DR

STCCR addresses the challenge of learning universal representations for check-in sequences under temporal uncertainty and spatial diversity. It introduces a Spatial Topic Module with contrastive clustering, a Temporal Intention Module with an angular-margin contrastive loss, and a Spatial-Temporal Cross-View module that fuses the two kinds of semantics into a unified space. The method outperforms the compared baselines on three downstream tasks, location prediction (LP), trajectory user link (TUL), and time prediction (TP), across three real-world datasets, and ablations confirm the contribution of each component. The resulting pre-trained representations are intended to transfer to downstream location-based services and mobility analytics.

Abstract

The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.
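To make the idea of an angular margin in a contrastive loss concrete, the following is a minimal sketch, not the paper's exact formulation (which is not given in this section). It assumes a generic InfoNCE loss in which the positive pair's angle is enlarged by an additive margin before the softmax. The function name `angular_margin_info_nce` and the parameters `margin` and `temperature` are hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_margin_info_nce(anchor, positive, negatives,
                            margin=0.1, temperature=0.1):
    """InfoNCE with an additive angular margin on the positive pair.

    Assumption: a generic margin formulation, not taken from the paper.
    The positive logit uses cos(theta + margin) instead of cos(theta),
    which penalizes the positive pair unless its angle is small.
    """
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_pos = max(-1.0, min(1.0, cosine(anchor, positive)))
    theta = math.acos(cos_pos)
    # Cap at pi so the shifted cosine stays monotonically decreasing.
    shifted = min(theta + margin, math.pi)
    pos_logit = math.cos(shifted) / temperature
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over the positive and all negatives.
    logits = [pos_logit] + neg_logits
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - pos_logit


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    positive = [0.9, 0.1]            # e.g. the same sequence with jittered timestamps
    negatives = [[0.0, 1.0], [-1.0, 0.2]]
    print(angular_margin_info_nce(anchor, positive, negatives, margin=0.0))
    print(angular_margin_info_nce(anchor, positive, negatives, margin=0.2))
```

With the same inputs, a larger margin yields a larger loss, so the encoder must pull positives closer to the anchor to reach the same loss value. This is one plausible way such a margin could reduce sensitivity to small timestamp perturbations.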

Paper Structure (30 sections, 20 equations, 9 figures, 4 tables)

Introduction
Related work
Mobility Data Mining
Pretraining and Contrastive Learning
Preliminaries
Definitions
Problem Statement
Spatial-Temporal Cross-View Contrastive Framework
Spatial Topic Module
Geographical Location Information Encoding
POI Category Representation
Spatial Cluster Contrastive Block
Temporal Intention Module
Timestamp Embedding
Temporal Angular Contrastive Block
...and 15 more sections

Figures (9)

Figure 1: Temporal uncertainty is influenced by both subjective intention and objective factors. Users' arrival times tend to fall within a range of intervals rather than at a precisely planned time.
Figure 2: Check-in sequences of a user on working days and weekends; distinct icons indicate different POIs. Although the POIs visited on the four days differ, the semantics reflected by the check-in sequences on the two working days (or the two weekend days) are similar.
Figure 3: The model architecture of STCCR. (a) The Spatial Topic Module employs contrastive clustering to encode each POI's ID, latitude, longitude, and category. (b) The Temporal Intention Module combines user and time information to obtain users' temporal intention patterns. (c) The ST Cross-View Contrastive Module aligns spatial and temporal information in a unified semantic space using projection heads, facilitating the integration of spatial-temporal information at the macroscopic semantic level.
Figure 4: Spatial cluster contrastive block. We capture the spatial topic of user activity through reweighted contrast and cluster consistency. To improve clustering quality, we maintain a queue of historical sequences that participate in the computation for the current batch.
Figure 5: Angular Margin. Better Uniformity refers to the ability of a model to learn shared representations among similar samples, resulting in improved consistency within the feature space. Better Alignment signifies the model's capability to map different views or variations of the same sample to nearby positions in the feature space, achieving enhanced alignment.
...and 4 more figures
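
The cross-view alignment described in the Figure 3(c) caption can be illustrated with a small sketch. It assumes a symmetric, in-batch InfoNCE objective in which the spatial and temporal embeddings of the same check-in sequence form the positive pair and the other sequences in the batch act as negatives. This is an assumed formulation for illustration only; the projection heads are omitted (inputs are treated as already projected), and the names `cross_view_loss` and `temperature` are hypothetical.

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _mean_diag_nll(sim):
    """Mean over rows i of -log softmax(sim[i])[i]."""
    total = 0.0
    for i, row in enumerate(sim):
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(sim)


def cross_view_loss(spatial, temporal, temperature=0.1):
    """Symmetric in-batch contrast between two views of the same sequences.

    Assumption: spatial[i] and temporal[i] come from the same check-in
    sequence and have already been mapped into a shared space.
    """
    s = [_normalize(v) for v in spatial]
    t = [_normalize(v) for v in temporal]
    # sim[i][j]: scaled cosine similarity of spatial view i and temporal view j.
    sim = [[sum(a * b for a, b in zip(si, tj)) / temperature for tj in t]
           for si in s]
    sim_transposed = [list(col) for col in zip(*sim)]
    # Average the spatial-to-temporal and temporal-to-spatial directions.
    return 0.5 * (_mean_diag_nll(sim) + _mean_diag_nll(sim_transposed))


if __name__ == "__main__":
    spatial = [[1.0, 0.0], [0.0, 1.0]]
    temporal = [[0.9, 0.1], [0.1, 0.9]]
    print(cross_view_loss(spatial, temporal))
```

Under this objective the loss is low when each sequence's two views are nearest neighbors of each other within the batch, which is the sense in which the views share one semantic space.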

Theorems & Definitions (2)

Definition 1
Definition 2
