The Value of Graph-based Encoding in NBA Salary Prediction

Junhao Su; David Grimsman; Christopher Archibald

TL;DR

This paper shows that building a knowledge graph from on- and off-court data, embedding that graph in a vector space, and including the resulting vectors in the tabular data allows a supervised learner to better capture the landscape of factors that affect salary.

Abstract

Market valuation of professional athletes is a difficult problem, given how much performance and location vary from year to year. In the National Basketball Association (NBA), a straightforward way to address this problem is to build a tabular data set and use supervised machine learning to predict a player's salary from the player's performance in the previous year. For younger players, whose contracts are mostly determined by draft position, this approach works well; however, it can fail for veterans or for players whose salaries lie in the upper tail of the distribution. In this paper, we show that building a knowledge graph from on- and off-court data, embedding that graph in a vector space, and including the resulting vectors in the tabular data allows the supervised learner to better capture the landscape of factors that affect salary. We compare several graph embedding algorithms and show that such a process is vital to NBA salary prediction.
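The pipeline described in the abstract (graph, then embedding, then augmented tabular features, then supervised regression) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy nodes, edges, statistics, and salaries are invented, a truncated SVD of the adjacency matrix stands in for the graph embedding algorithms the paper compares, and closed-form ridge regression stands in for the supervised learner.

```python
import numpy as np

# Hypothetical toy graph: 4 player-season nodes plus 3 entity nodes.
# None of these names or values come from the paper.
players = ["p0", "p1", "p2", "p3"]
entities = ["team_A", "team_B", "agent_X"]
edges = [("p0", "team_A"), ("p1", "team_A"), ("p2", "team_B"),
         ("p3", "team_B"), ("p0", "agent_X"), ("p2", "agent_X")]


def adjacency(nodes, edge_list):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    index = {name: i for i, name in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for u, v in edge_list:
        A[index[u], index[v]] = 1.0
        A[index[v], index[u]] = 1.0
    return A


def spectral_embedding(A, dim):
    """Embed every node via a truncated SVD of the adjacency matrix.

    This is a simple stand-in, not one of the paper's embedding methods.
    """
    U, S, _ = np.linalg.svd(A)
    return U[:, :dim] * np.sqrt(S[:dim])


def add_bias(X):
    """Append a constant column so the model can learn an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression (the bias term is penalized too)."""
    Xb = add_bias(X)
    gram = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(gram, Xb.T @ y)


def predict(weights, X):
    """Apply fitted ridge weights to a feature matrix."""
    return add_bias(X) @ weights


# Hypothetical tabular features (points and assists per game) and
# salaries in millions of dollars.
tabular = np.array([[20.1, 5.0], [8.3, 2.1], [15.7, 7.4], [4.2, 1.0]])
salary = np.array([30.0, 5.0, 22.0, 2.0])

A = adjacency(players + entities, edges)
# Keep only the rows belonging to player nodes.
player_emb = spectral_embedding(A, dim=2)[:len(players)]
# Concatenate the graph embedding onto the tabular features.
X_aug = np.hstack([tabular, player_emb])

weights = fit_ridge(X_aug, salary)
preds = predict(weights, X_aug)
```

The step that matters here is the concatenation: the regressor sees each player's box-score features alongside a vector summarizing that player's position in the graph, so relational signals such as shared teams or agents become available to an otherwise purely tabular model.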

Paper Structure

This paper contains 25 sections, 2 equations, 2 figures, 4 tables.

INTRODUCTION
Related Work
Economic Valuation and Relational Capital in Sports
Graph Representation Learning and Structural Vulnerabilities
METHODOLOGY
Data Sources and Baselines
Data and Knowledge Graph Construction
Graph Embedding Methods
Tabular Baselines & Static Embeddings
Graph Neural Networks
Evaluation Protocols
Tri-State Rescue and Misguidance Protocol
Qualitative Analysis and Example Selection
Quantitative Feature Profiling of Outliers
Implementation Details
...and 10 more sections

Figures (2)

Figure 1: Schema of the Heterogeneous NBA Knowledge Graph. The graph connects PlayerSeason anchor nodes (center) to diverse entities including Team, Agent, Award, and Injury. Temporal edges (e.g., Won_Previously, Has_Injury_History) are strictly masked by the admissibility function $A(e,s)$ to prevent look-ahead bias.
Figure 2: Tri-State Evaluation on Eligible Outliers. (a) vs. Weak Baseline: Static embeddings provide a favorable rescue--misguidance trade-off. (b) vs. Strong Baseline: Dynamic architectures incur a "Generalization Tax," reflecting sensitivity to historical networks.
