Bird's-Eye View to Street-View: A Survey
Khawlah Bajbaa, Muhammad Usman, Saeed Anwar, Ibrahim Radwan, Abdul Bais
TL;DR
This survey analyzes satellite-to-street-view synthesis, framing it as a cross-view translation problem with substantial viewpoint and appearance gaps. It catalogs a taxonomy of methods, from conditional GANs and multi-GAN architectures to representation- and geometry-guided, Siamese, and MLP approaches, and notes a recent shift toward diffusion and transformer-based techniques. It highlights publicly available datasets (Dayton, CVUSA, CVACT) and standard image-quality metrics (IS, FID, PSNR, SSIM, KL), while arguing that current metrics and data diversity limit robust evaluation and generalization. The paper emphasizes ethical considerations, practical deployment challenges, and the need for open benchmarks, multimodal data, and interdisciplinary collaboration to advance realistic, geometry-consistent street-view synthesis for urban analytics and navigation tasks.
Abstract
In recent years, street-view imagery has become one of the most important sources of geospatial data for urban analytics, yielding meaningful insights and supporting decision-making. Synthesizing a street-view image from its corresponding satellite image is challenging because the two domains differ significantly in appearance and viewpoint. In this study, we screened 20 recent research papers to provide a thorough review of state-of-the-art methods for synthesizing street-view images from their satellite counterparts. The main findings are: (i) novel deep learning techniques are required to synthesize more realistic and accurate street-view images; (ii) more datasets need to be collected and made publicly available; and (iii) more task-specific evaluation metrics need to be investigated so that generated images can be assessed appropriately. We conclude that, because it relies on outdated deep learning techniques, the recent literature has failed to generate detailed and diverse street-view images.
