GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM

Kyeongjin Ahn; Sungwon Han; Seungeon Lee; Donghyun Ahn; Hyoshin Kim; Jungwon Kim; Jihee Kim; Sangyoon Park; Meeyoung Cha

Abstract

Socio-economic indicators such as regional GDP, population, and education levels are crucial to shaping policy decisions and fostering sustainable development. This research introduces GeoReg, a regression model that integrates diverse data sources, including satellite imagery and web-based geospatial information, to estimate these indicators even in data-scarce regions such as developing countries. Our approach leverages the prior knowledge of a large language model (LLM) to address the scarcity of labeled data, with the language model functioning as a data engineer that extracts informative features to enable effective estimation in few-shot settings. Specifically, our model obtains contextual relationships between data features and the target indicator, categorizing their correlations as positive, negative, mixed, or irrelevant. These features are then fed into a linear estimator with weight constraints tailored to each category. To capture nonlinear patterns, the model also identifies meaningful feature interactions and integrates them along with nonlinear transformations. Experiments across three countries at different stages of development demonstrate that our model outperforms baselines in estimating socio-economic indicators, even for low-income countries with limited data availability.
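To make the idea of a category-constrained linear estimator concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the constraints, so the mapping used here is an assumption: "positive" features get non-negative weights, "negative" features get non-positive weights, "mixed" features are unconstrained, and "irrelevant" features are pinned to zero. The function name `fit_constrained_linear`, the `BOUNDS` table, the toy data, and the choice of projected gradient descent are all hypothetical and chosen only to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple

INF = float("inf")

# Assumed mapping from LLM-assigned category to a (lower, upper) weight bound.
BOUNDS = {
    "positive": (0.0, INF),     # weight forced to be >= 0
    "negative": (-INF, 0.0),    # weight forced to be <= 0
    "mixed": (-INF, INF),       # sign left free
    "irrelevant": (0.0, 0.0),   # feature effectively dropped
}


def fit_constrained_linear(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    categories: Sequence[str],
    lr: float = 0.05,
    steps: int = 5000,
) -> Tuple[List[float], float]:
    """Least squares via projected gradient descent with per-feature bounds."""
    n, d = len(X), len(categories)
    w = [0.0] * d
    b = 0.0
    for _ in range(steps):
        # Residuals of the current linear model on every sample.
        res = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
               for row, yi in zip(X, y)]
        # Gradient of the mean squared error.
        grad_w = [2.0 / n * sum(r * row[j] for r, row in zip(res, X))
                  for j in range(d)]
        grad_b = 2.0 / n * sum(res)
        # Gradient step, then projection back onto each feature's bound.
        for j, cat in enumerate(categories):
            lo, hi = BOUNDS[cat]
            w[j] = min(max(w[j] - lr * grad_w[j], lo), hi)
        b -= lr * grad_b  # the intercept is unconstrained
    return w, b


# Toy few-shot data: five regions, four features scaled to [0, 1].
X = [
    [0.2, 0.9, 0.1, 0.5],
    [0.8, 0.3, 0.4, 0.1],
    [0.5, 0.5, 0.9, 0.7],
    [0.9, 0.1, 0.2, 0.3],
    [0.1, 0.7, 0.6, 0.9],
]
# Synthetic target that ignores the last feature.
y = [2.0 * r[0] - 1.0 * r[1] + 0.5 * r[2] for r in X]
categories = ["positive", "negative", "mixed", "irrelevant"]

w, b = fit_constrained_linear(X, y, categories)
print("weights:", w, "intercept:", b)
```

With only a handful of labeled samples, constraints of this kind shrink the space of admissible models, which is one plausible reading of why category information from the language model could help in few-shot settings.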
