GraphMatch: Fusing Language and Graph Representations in a Dynamic Two-Sided Work Marketplace

Mikołaj Sacha, Hammad Jafri, Mattie Terzolo, Ayan Sinha, Andrew Rabinovich

TL;DR

GraphMatch addresses the challenge of recommending matches in text-rich, dynamic two-sided marketplaces by unifying pre-trained language models with graph neural networks. It introduces a two-stage TextMatch/Text-graph pipeline, where domain-adapted textual embeddings feed a temporal, subgraph-based GNN, enhanced by adversarial negative mining and in-batch contrastive learning. Large-scale experiments on interaction data from Upwork show GraphMatch outperforms LM-only, GNN-only, and non-temporal fusion baselines, with strong results on both typical and cold-start scenarios. The work also outlines a practical real-time deployment blueprint, including feature stores, graph databases, and inference services, demonstrating the approach’s viability in production settings.

Abstract

Recommending matches in a text-rich, dynamic two-sided marketplace presents unique challenges due to evolving content and interaction graphs. We introduce GraphMatch, a new large-scale recommendation framework that fuses pre-trained language models with graph neural networks to overcome these challenges. Unlike prior approaches centered on standalone models, GraphMatch is a comprehensive recipe built on powerful text encoders and GNNs working in tandem. It employs adversarial negative sampling alongside point-in-time subgraph training to learn representations that capture both the fine-grained semantics of evolving text and the time-sensitive structure of the graph. We evaluate extensively, at large scale, on interaction data from Upwork, a leading labor marketplace, and discuss our approach to low-latency inference suitable for real-time use. In our experiments, GraphMatch outperforms language-only and graph-only baselines on matching tasks while being efficient at runtime. These results demonstrate that unifying language and graph representations yields a highly effective solution to text-rich, dynamic two-sided recommendations, bridging the gap between powerful pre-trained LMs and large-scale graphs in practice.

Paper Structure

This paper contains 40 sections, 7 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Flowchart of the multi-stage training of TextMatch and GraphMatch, with models and datasets at different stages.
  • Figure 2: GraphMatch embeds freelancers, clients or job posts using a sampled text-attributed graph representing their work history. In the illustration, we predict the embeddings of the emboldened job post (left) and freelancer (right) nodes using the surrounding graph. We compare GraphMatch embedding vectors using cosine similarity to predict the match probability between two entities.
  • Figure 3: We store dynamic node features in two tables. The main table (left) contains a single row per node, with its history start index and the number of versions. The node history table (right) stores all available feature versions for each node, grouped by node and sorted by timestamp within each node. Given any timestamp, node type, and node ID, we first query the main table to retrieve the history start index and the number of versions. Next, we run a binary search over the relevant rows in the node history table, finding the point-in-time correct feature values in $O(\log_{2} n)$ time, where $n$ is the number of versions stored for that node.
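
The two-table lookup described in the Figure 3 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the in-memory containers `main_table`, `history_ts`, and `history_features`, the `MainRow` type, the function name `lookup_point_in_time`, and the sample values are all hypothetical. The sketch also assumes that "point-in-time correct" means the latest version whose timestamp is at or before the query timestamp.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MainRow:
    start: int  # index of the node's first row in the history table
    count: int  # number of feature versions stored for the node


# Hypothetical stand-in for the main table: one row per (node type, node ID).
main_table: Dict[Tuple[str, int], MainRow] = {
    ("freelancer", 7): MainRow(start=0, count=3),
    ("job_post", 42): MainRow(start=3, count=2),
}

# Hypothetical stand-in for the node history table, stored as parallel
# columns: rows are grouped by node and sorted by timestamp within a node.
history_ts: List[int] = [100, 200, 300, 150, 250]
history_features: List[str] = ["f7_v1", "f7_v2", "f7_v3", "j42_v1", "j42_v2"]


def lookup_point_in_time(node_type: str, node_id: int, timestamp: int) -> Optional[str]:
    """Return the latest feature version with timestamp <= the query (assumed semantics)."""
    row = main_table.get((node_type, node_id))
    if row is None:
        return None  # unknown node
    lo, hi = row.start, row.start + row.count
    # Binary search restricted to this node's slice of the history table.
    pos = bisect_right(history_ts, timestamp, lo, hi)
    if pos == lo:
        return None  # no version existed yet at the query timestamp
    return history_features[pos - 1]
```

Restricting the search to the slice `[start, start + count)` keeps the cost logarithmic in the number of versions of a single node rather than in the size of the whole history table. Returning `None` for a timestamp that precedes a node's first version is one possible convention; a training pipeline might instead fall back to default features.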