CrossVIT-augmented Geospatial-Intelligence Visualization System for Tracking Economic Development Dynamics

Yanbing Bai, Jinhua Su, Bin Qiao, Xiaoran Ma

TL;DR

Senseconomic tackles the challenge of producing timely, high-resolution geospatial economic indicators by fusing satellite and street-view imagery through a Vision Transformer-based cross-attention framework, with nighttime-light data serving as weak supervision. The system is deployed with scalable Spark-based distributed computing and a Vue-based frontend for interactive county-level visualization, enabling efficient data processing and decision support for policymakers. The authors report an $R^2$ of $0.8363$ for county-level economic predictions and state that distributed computing halved processing time to 23 minutes, highlighting practical implications for rapid economic monitoring. Overall, the work contributes an end-to-end, multimodal, scalable pipeline for geospatial economic analysis with policy-relevant visualization capabilities.

Abstract

Timely and accurate economic data is crucial for effective policymaking. Current challenges in data timeliness and spatial resolution can be addressed with advancements in multimodal sensing and distributed computing. We introduce Senseconomic, a scalable system for tracking economic dynamics via multimodal imagery and deep learning. Built on the Transformer framework, it integrates remote sensing and street view images using cross-attention, with nighttime light data as weak supervision. The system achieved an R-squared value of 0.8363 in county-level economic predictions and halved processing time to 23 minutes using distributed computing. Its user-friendly design includes a Vue3-based front end with Baidu maps for visualization and a Python-based back end automating tasks like image downloads and preprocessing. Senseconomic empowers policymakers and researchers with efficient tools for resource allocation and economic planning.
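
As a rough illustration of the cross-attention fusion mentioned in the abstract, the following minimal sketch lets one set of image tokens (standing in for satellite patches) attend to another (standing in for street-view patches). It is single-head scaled dot-product cross-attention in NumPy, not the authors' implementation: the function name `cross_attention`, the token counts, the dimensions, and the random weights are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper).

    Queries come from one modality, keys and values from the other.
    """
    q = query_tokens @ w_q          # (n_query, d_head)
    k = context_tokens @ w_k        # (n_context, d_head)
    v = context_tokens @ w_v        # (n_context, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Illustrative sizes only; the paper's real dimensions are not given here.
rng = np.random.default_rng(0)
d_model, d_head = 16, 8
satellite_tokens = rng.normal(size=(4, d_model))  # assumed satellite patch tokens
street_tokens = rng.normal(size=(6, d_model))     # assumed street-view patch tokens
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

fused, weights = cross_attention(satellite_tokens, street_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` describes how strongly one satellite token draws on each street-view token, so `fused` carries street-view information aligned to the satellite tokens. A full model would add multiple heads, learned projections, and a regression head trained against the nighttime-light signal.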

Paper Structure

This paper contains 19 sections, 2 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: The structure of ViT.
  • Figure 2: The structure of Cross-Attention.
  • Figure 3: Example of sampled street view locations in Beijing.
  • Figure 4: Example of Level-2A satellite images.
  • Figure 5: A satellite-street view pair.
  • ...and 3 more figures