PTCL: Pseudo-Label Temporal Curriculum Learning for Label-Limited Dynamic Graph

Shengtao Zhang, Haokai Zhang, Shiqi Lou, Zicheng Wang, Zinan Zeng, Yilin Wang, Minnan Luo

TL;DR

This work tackles label-limited dynamic node classification where only final-timestamp labels are accessible. It introduces PTCL, a variational EM framework with a time-aware backbone and a decoder, augmented by Temporal Curriculum Learning to weight pseudo-labels according to their temporal distance to final labels. The authors also contribute the CoOAG dataset and the FLiD framework to standardize evaluation and experimentation. Empirical results across Wikipedia, Reddit, Dsub, and CoOAG show consistent improvements over strong baselines, with pseudo-labels and curriculum learning driving notable gains and efficient convergence. Overall, PTCL demonstrates how temporal pseudo-labels can reveal evolving node behavior under realistic annotation constraints, offering a practical path for dynamic graph learning in real-world scenarios.

Abstract

Dynamic node classification is critical for modeling evolving systems such as financial transactions and academic collaborations. Capturing how node information changes over time is central to this task, and doing so usually requires labels at every timestamp. However, collecting all dynamic labels is difficult in real-world scenarios because of high annotation costs and label uncertainty (e.g., ambiguous or delayed labels in fraud detection). In contrast, final-timestamp labels are easier to obtain: they rely on complete temporal patterns, and many open platforms maintain a single label per user without tracking its history. To bridge this gap, we propose PTCL (Pseudo-label Temporal Curriculum Learning), a pioneering method for label-limited dynamic node classification where only final labels are available. PTCL introduces: (1) a temporal decoupling architecture that separates the backbone (which learns time-aware representations) from the decoder (which is strictly aligned with final labels), which together generate pseudo-labels, and (2) a Temporal Curriculum Learning strategy that prioritizes pseudo-labels closer to the final timestamp by assigning them higher weights through an exponentially decaying function. We contribute a new academic dataset (CoOAG) that captures long-range research interests in a dynamic graph. Experiments across real-world scenarios demonstrate PTCL's consistent superiority over other methods adapted to this task. Beyond methodology, we propose a unified framework, FLiD (Framework for Label-Limited Dynamic Node Classification), which consists of a complete preparation workflow, training pipeline, and evaluation standards, and supports various models and datasets. The code can be found at https://github.com/3205914485/FLiD.
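The weighting idea behind Temporal Curriculum Learning can be illustrated with a minimal sketch. The abstract only states that weights decay exponentially with distance from the final timestamp; the exact form used here, `exp(-decay * (T - t) / T)`, along with the function name `temporal_weights` and the `decay` parameter, is an assumption made for illustration and is not taken from the paper or the FLiD code.

```python
import math


def temporal_weights(timestamps, final_time, decay=1.0):
    """Weight pseudo-labels by their temporal distance to the final timestamp.

    Hypothetical form: w(t) = exp(-decay * (final_time - t) / final_time).
    Pseudo-labels at the final timestamp get weight 1.0; earlier ones get less.
    """
    if final_time <= 0:
        raise ValueError("final_time must be positive")
    weights = []
    for t in timestamps:
        if t < 0 or t > final_time:
            raise ValueError("timestamps must lie in [0, final_time]")
        # Normalized distance to the final timestamp, in [0, 1].
        gap = (final_time - t) / final_time
        weights.append(math.exp(-decay * gap))
    return weights


# Example: three pseudo-labels at early, middle, and final timestamps.
example = temporal_weights([0, 50, 100], final_time=100, decay=2.0)
```

Under these assumptions, a larger `decay` concentrates training on pseudo-labels near the final timestamp, while `decay=0` treats all pseudo-labels equally.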

Paper Structure

This paper contains 45 sections, 12 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of a financial system. The graph represents a dynamic financial system in which nodes are entities such as users, payment cards, and financial institutions, and edges are transactional relationships. Over time, user behavior is tracked through a sequence of transactions, and some users' labels (account status) may eventually be identified as fraudulent.
  • Figure 2: Overview of our proposed method. PTCL consists of a Variational EM process with a dynamic graph backbone and a decoder. During the warmup phase, the dynamic graph backbone is trained on a link prediction task, where the dynamic graph structure serves as the target. After warmup, in each M-step, the backbone receives final timestamp labels, pseudo-labels, and the dynamic graph structure as input, while the decoder is trained in the E-step to refine pseudo-labels. Additionally, the Temporal Curriculum Learning strategy prioritizes pseudo-labels based on their temporal proximity to the final timestamp labels, ensuring higher-quality training.
  • Figure 3: Architecture of different baselines and PTCL. "B" stands for backbone, and "D" stands for decoder.
  • Figure 4: Histogram of pseudo-label consistency.
  • Figure 5: Convergence curves for 5 backbones. Star markers ($\star$) denote peak performance; circled points ($\bullet$) indicate where the baselines are surpassed. Dashed lines show baseline AUC.
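The training flow described in the Figure 2 caption (a warmup phase followed by alternating E- and M-steps) can be sketched as a simple schedule. This is only a schematic of the phase ordering, not the authors' implementation: the function name `training_schedule`, the tuple layout, and the choice to run the E-step before the M-step within each round are all assumptions.

```python
def training_schedule(warmup_epochs, em_rounds):
    """Return the ordered (phase, trained_module, supervision) steps.

    Hypothetical schematic based on the Figure 2 caption:
    - warmup: the backbone is trained on link prediction;
    - E-step: the decoder is trained to refine pseudo-labels;
    - M-step: the backbone is trained with final labels, pseudo-labels,
      and the dynamic graph structure.
    The E-before-M order inside a round is an assumption.
    """
    if warmup_epochs < 0 or em_rounds < 0:
        raise ValueError("warmup_epochs and em_rounds must be non-negative")
    steps = [("warmup", "backbone", "link prediction")] * warmup_epochs
    for _ in range(em_rounds):
        steps.append(("E-step", "decoder", "final labels"))
        steps.append(
            ("M-step", "backbone",
             "final labels + weighted pseudo-labels + graph structure")
        )
    return steps


# Example: two warmup epochs followed by one EM round.
for phase, module, supervision in training_schedule(2, 1):
    print(f"{phase}: train {module} with {supervision}")
```

In a real pipeline each tuple would be replaced by an actual training call; the sketch only makes the alternation between the two modules explicit.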