Social Construction of Urban Space: Using LLMs to Identify Neighborhood Boundaries From Craigslist Ads

Adam Visokay, Ruth Bagley, Ian Kennedy, Chris Hess, Kyle Crowder, Rob Voigt, Denis Peskoff

TL;DR

This study investigates how urban space is socially constructed by analyzing Craigslist rental ads in Chicago from 2018 to 2024. It combines manual labeling and zero-shot large language models to extract explicit neighborhood claims, compares these to a string-matching baseline, and anchors findings with geospatial centroids to define a 'social center' for each neighborhood. Through topic modeling (LDA) and regression analyses, the authors identify three linguistic substitution patterns—Conflicting Conceptions, Border Stretching, and Reputation Laundering—and show that listing language systematically shifts with distance from the social center. The work demonstrates scalable, transferable methods for examining contested urban boundaries and highlights how language shapes neighborhood reputation and property outcomes in the urban housing market.
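The 'social center' and distance measure described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the social center is the arithmetic mean of the coordinates of listings that claim a neighborhood, and it measures distance with the haversine formula. The function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def social_center(points):
    """Return the mean (lat, lon) of listings claiming a neighborhood.

    Assumption: a plain arithmetic mean of coordinates, which is adequate
    over city-scale distances; the paper may compute centroids differently.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical coordinates for three listings claiming the same neighborhood.
claims = [(41.857, -87.656), (41.855, -87.660), (41.859, -87.664)]
center = social_center(claims)
distances = [haversine_km(p, center) for p in claims]
```

Each listing's distance from the center could then serve as a predictor in a regression on listing language, which is the kind of analysis the summary describes.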

Abstract

Rental listings offer a window into how urban space is socially constructed through language. We analyze Chicago Craigslist rental advertisements from 2018 to 2024 to examine how listing agents characterize neighborhoods, identifying mismatches between institutional boundaries and neighborhood claims. Through manual and large language model annotation, we classify unstructured listings from Craigslist according to their neighborhood. Further geospatial analysis reveals three distinct patterns: properties with conflicting neighborhood designations due to competing spatial definitions, border properties with valid claims to adjacent neighborhoods, and "reputation laundering" where listings claim association with distant, desirable neighborhoods. Through topic modeling, we identify patterns that correlate with spatial positioning: listings further from neighborhood centers emphasize different amenities than centrally-located units. Natural language processing techniques reveal how definitions of urban spaces are contested in ways that traditional methods overlook.

Paper Structure

This paper contains 25 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Extracting neighborhoods from unstructured rental listings with LLMs (RQ1, Section \ref{sec:labeling}) provides insight into the social construction of space (RQ2, top) and allows us to study how language changes relative to distance from neighborhood centers (RQ3, bottom).
  • Figure 2: The city of Chicago and its constituent neighborhoods, as defined by Zillow, colored to represent the number of posts for properties located in each area.
  • Figure 3: Contested boundaries for the Pilsen neighborhood according to Zillow's definition (red). Orange points depict listings claiming to be in Pilsen; purple points depict listings claiming to be elsewhere. The green star depicts the Pilsen neighborhood centroid. This is an example of what we call border stretching, demonstrating how static boundary systems may not capture neighborhood identities as experienced by the people living in them.
  • Figure 4: Contested boundaries for Logan Square neighborhood according to the official City of Chicago limits (blue), Zillow's definition (red), and claims in Craigslist rental listings (orange points for listings claiming to be in the neighborhood, purple if claiming to be elsewhere); the center of our Craigslist-defined neighborhood is represented by a green star. Logan Square represents a popular neighborhood with many postings and a number of nearby listings claiming to be within the neighborhood.