
Big City Bias: Evaluating the Impact of Metropolitan Size on Computational Job Market Abilities of Language Models

Charlie Campanella, Rob van der Goot

TL;DR

This work aims to quantify the metropolitan size bias encoded within large language models, evaluating zero-shot salary, employer presence, and commute duration predictions in 384 of the United States’ metropolitan regions.

Abstract

Large language models (LLMs) have emerged as a useful technology for job matching, for both candidates and employers. Job matching is often based on a particular geographic location, such as a city or region. However, LLMs have known biases, commonly derived from their training data. In this work, we aim to quantify the metropolitan size bias encoded within large language models, evaluating zero-shot salary, employer presence, and commute duration predictions in 384 of the United States' metropolitan regions. Across all benchmarks, we observe negative correlations between metropolitan size and the performance of the LLMs, indicating that smaller regions are indeed underrepresented. More concretely, the smallest 10 metropolitan regions show upwards of 300% worse benchmark performance than the largest 10.

Paper Structure (14 sections, 3 figures, 2 tables)


Figures (3)

  • Figure 1: For each variable, the cumulative number of metropolitan areas that meet a given threshold (y-axis). Note that the target variables (all except population) are averages over multiple instances within the region, and are thus not directly comparable (i.e. they could be computed over different sets of jobs, employers, or commutes).
  • Figure 2: Pearson correlations between MSA population and each target variable. Note that the bottom-left and top-right triangles are mirrored because the Pearson correlation is symmetric.
  • Figure 3: The average error plotted against the log of the population for each task (shown on the right) and each language model (top).
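
The correlation analysis referred to in the abstract and in Figures 2 and 3 can be illustrated with a minimal sketch. The population and error values below are made up for illustration and are not taken from the paper, and `pearson` is a hypothetical helper rather than the authors' code. The sketch computes the Pearson correlation between the log of the population and a per-region error, and checks that swapping the arguments gives the same value, which is why the two triangles in Figure 2 mirror each other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: five regions with invented populations and errors.
populations = [50_000, 200_000, 1_000_000, 5_000_000, 20_000_000]
errors = [0.42, 0.35, 0.22, 0.15, 0.08]

# Figure 3 plots error against the log of the population.
log_pop = [math.log10(p) for p in populations]

r = pearson(log_pop, errors)
print(f"Pearson r between log10(population) and error: {r:.3f}")

# Swapping the arguments gives the same coefficient (symmetry).
print(pearson(errors, log_pop) == r)
```

With these invented numbers the coefficient is negative, meaning error falls as population grows, which is the direction of the effect the abstract reports for real benchmark data.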