Urban Complexity through Vision Intelligence: Variance, Gradients, and Correlations across Six Italian Cities
Mirko Degli Esposti, Armando Bazzani, Chiara Dellacasa, Matteo Falcioni, Mario Massimon, Martino Pietropoli
TL;DR
The paper quantifies urban quality and morphology across six Italian cities using vision-based sensing. It introduces UrbIA Vision Intelligence, which deploys 500 Humarels per city to sample Street View imagery and score metrics such as the Pavement Condition Index (PCI) and the Façade Degradation Score (FDS) via GPT-4 Vision prompts. The result is a georeferenced visual census that supports analyses of Spatial Variance, Urban Gradient, and Cross-Metric Correlation. Key findings include pronounced spatial heterogeneity (e.g., Milan $\sigma^2_{\mathrm{PCI}} = 1.52$), consistently weak urban gradients ($R^2 < 0.03$), and a modest positive association between façade quality and greenery ($\rho \approx 0.35$). These results demonstrate the diagnostic potential of vision intelligence for scalable urban analytics and motivate a national-scale expansion that can integrate additional contextual data for planning and policy applications.
Abstract
This paper introduces a scalable methodology for the objective analysis of urban quality metrics across six major Italian metropolitan areas: Rome, Bologna, Florence, Milan, Naples, and Palermo. Leveraging georeferenced Street View imagery and an advanced Urban Vision Intelligence system, we systematically classify the visual environment, focusing on key metrics such as the Pavement Condition Index (PCI) and the Façade Degradation Score (FDS). The findings quantify Structural Heterogeneity (Spatial Variance), revealing significant quality dispersion (e.g., Milan $\sigma^2_{\mathrm{PCI}} = 1.52$), and confirm that the classical Urban Gradient (quality variation as a function of distance from the core) is consistently weak across all sampled cities ($R^2 < 0.03$), suggesting a complex, polycentric, and fragmented morphology. In addition, a Cross-Metric Correlation Analysis highlights stable but modest interdependencies among visual dimensions, most notably a consistent positive association between façade quality and greenery ($\rho \approx 0.35$), indicating that structural and contextual urban qualities co-vary in weak yet interpretable ways. Together, these results underscore the diagnostic potential of Vision Intelligence for capturing the integrated spatial and morphological structure of Italian cities and motivate a large-scale national analysis.
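
To make the three summary statistics concrete, the following is a minimal sketch of how they could be computed from a georeferenced table of scores. It is an illustration under stated assumptions, not the authors' pipeline: the sample points, the city-centre coordinates, and the function names (`haversine_km`, `pearson`, `gradient_r2`) are hypothetical, the variance is taken as the population variance, the gradient is a simple linear fit of score on distance from the centre, and $\rho$ is computed as a Pearson coefficient (the paper may use a rank-based coefficient instead).

```python
import math
from statistics import pvariance


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def gradient_r2(distances, scores):
    """R^2 of a one-variable least-squares fit (with intercept) of score on distance.

    For simple linear regression, R^2 equals the squared Pearson correlation.
    """
    return pearson(distances, scores) ** 2


# Hypothetical sample: (lat, lon, pci_score, facade_score, greenery_score).
# The values are invented for illustration and are not data from the paper.
centre = (44.4949, 11.3426)  # assumed city-centre coordinates
points = [
    (44.4950, 11.3430, 4.0, 3.5, 2.0),
    (44.5010, 11.3500, 2.5, 2.0, 1.5),
    (44.4880, 11.3300, 3.5, 4.0, 3.0),
    (44.5100, 11.3600, 4.5, 3.0, 2.5),
    (44.4800, 11.3200, 1.5, 2.5, 2.0),
]

pci = [p[2] for p in points]
facade = [p[3] for p in points]
greenery = [p[4] for p in points]
dist = [haversine_km(centre[0], centre[1], p[0], p[1]) for p in points]

print("Spatial variance of PCI:", pvariance(pci))
print("Urban gradient R^2 (PCI vs distance):", gradient_r2(dist, pci))
print("Facade/greenery correlation:", pearson(facade, greenery))
```

Under this reading, an $R^2$ close to zero means that distance from the centre explains almost none of the variation in a score, which is how the weak-gradient result above would show up numerically.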
