CityVerse: A Unified Data Platform for Multi-Task Urban Computing with Large Language Models

Yaqiao Zhu, Hongkai Wen, Mark Birkin, Man Luo

TL;DR

CityVerse addresses the lack of a unified infrastructure for evaluating large language models in urban computing by introducing a coordinate-based data platform, a capability-based task taxonomy, and a dynamic simulation with visualization. It unifies ten urban data categories into a common spatiotemporal substrate and defines 43 tasks across four cognitive levels, enabling standardized inputs, outputs, and metrics across cities. By decoupling data access, task semantics, and evaluation from specific cities or models, CityVerse enables reproducible cross-city assessment of multimodal reasoning and decision-making. Empirical validation on NYC using seven mainstream LLMs reveals scaling trends in perception tasks and notable gaps in decision-related tasks, illustrating the platform's potential to accelerate progress and identify limitations in urban AI research.

Abstract

Large Language Models (LLMs) show remarkable potential for urban computing, from spatial reasoning to predictive analytics. However, evaluating LLMs across diverse urban tasks faces two critical challenges: lack of unified platforms for consistent multi-source data access and fragmented task definitions that hinder fair comparison. To address these challenges, we present CityVerse, the first unified platform integrating multi-source urban data, capability-based task taxonomy, and dynamic simulation for systematic LLM evaluation in urban contexts. CityVerse provides: 1) coordinate-based Data APIs unifying ten categories of urban data (including spatial features, temporal dynamics, demographics, and multi-modal imagery) with over 38 million curated records; 2) Task APIs organizing 43 urban computing tasks into a four-level cognitive hierarchy: Perception, Spatial Understanding, Reasoning and Prediction, and Decision and Interaction, enabling standardized evaluation across capability levels; 3) an interactive visualization frontend supporting real-time data retrieval, multi-layer display, and simulation replay for intuitive exploration and validation. We validate the platform's effectiveness through evaluations on mainstream LLMs across representative tasks, demonstrating its capability to support reproducible and systematic assessment. CityVerse provides a reusable foundation for advancing LLMs and multi-task approaches in the urban computing domain.

Paper Structure

This paper contains 9 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: CityVerse three-layer architecture.
  • Figure 2: Unified taxonomy organizing 43 urban computing tasks into a four-level capability hierarchy.