A Scoping Review of Deep Learning for Urban Visual Pollution and Proposal of a Real-Time Monitoring Framework with a Visual Pollution Index
Mohammad Masudur Rahman, Md. Rashedur Rahman, Ashraful Islam, Saadia B Alam, M Ashraful Amin
TL;DR
The paper tackles the fragmentation of deep learning approaches to urban visual pollution (UVP) by conducting a PRISMA-ScR-based scoping review of 26 publications (2016–2025) and proposing a real-time UVP monitoring framework. It analyzes state-of-the-art models (notably YOLO variants, Faster R-CNN, and EfficientDet) and datasets (e.g., TACO, UVPD, MOMRAH VP), and identifies gaps such as inconsistent taxonomies and limited cross-city benchmarks. A core contribution is a layered framework that integrates data acquisition, detection on edge devices, server-side segmentation, and a Visual Pollution Index (VPI) that quantifies severity, formally defined as $\mathrm{VPI} = 100 \times (\alpha D + \beta S + \gamma R) \times W$. The work argues for open, interoperable benchmarks and human-in-the-loop deployment to enable scalable, actionable UVP management for urban planning and public well-being, and discusses the potential role of LLMs in reporting and contextualization.
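The VPI formula above can be evaluated directly once its inputs are fixed. The following is a minimal sketch, not the authors' implementation: this summary does not define $D$, $S$, $R$, or $W$, so the code assumes each is a score already normalized to $[0, 1]$ and that the weights $\alpha$, $\beta$, $\gamma$ sum to 1, which would keep the result in $[0, 100]$. The function name and the default weight values are hypothetical placeholders.

```python
def visual_pollution_index(d, s, r, w, alpha=0.4, beta=0.4, gamma=0.2):
    """Evaluate VPI = 100 * (alpha*D + beta*S + gamma*R) * W.

    Assumptions (not stated in the source summary):
      - d, s, r, w are already normalized to the range [0, 1];
      - alpha + beta + gamma == 1, so the result lies in [0, 100];
      - the default weights are placeholders, not values from the paper.
    """
    # Reject inputs outside the assumed normalized range.
    for name, value in (("d", d), ("s", s), ("r", r), ("w", w)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Reject weight sets that would break the assumed 0-100 scale.
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")

    return 100.0 * (alpha * d + beta * s + gamma * r) * w


if __name__ == "__main__":
    # Hypothetical scores for a single street segment.
    print(visual_pollution_index(d=0.5, s=0.5, r=0.5, w=1.0))
```

Under these assumptions, $W$ acts as a multiplicative scaling factor: setting it to 0 drives the index to 0 regardless of the other terms, while setting it to 1 leaves the weighted sum unchanged.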
Abstract
Urban Visual Pollution (UVP) has emerged as a critical concern, yet research on its automatic detection and practical application remains fragmented. This scoping review maps existing deep learning-based approaches for detecting and classifying visual pollutants and uses the findings to design a comprehensive application framework for visual pollution management. Following the PRISMA-ScR guidelines, seven academic databases (Scopus, Web of Science, IEEE Xplore, ACM DL, ScienceDirect, Springer Nature Link, and Wiley) were systematically searched, and 26 articles were included. Most studies focus on specific pollutant categories and employ variants of the YOLO, Faster R-CNN, and EfficientDet architectures. Although several datasets exist, they are limited to specific geographic areas and lack a standardized taxonomy. Few studies integrate detection into real-time application systems, and those that do tend to be geographically skewed. We propose a framework for monitoring visual pollution that integrates a Visual Pollution Index to assess the severity of visual pollution in a given area. This review highlights the need for a unified UVP management system that incorporates a pollutant taxonomy, a cross-city benchmark dataset, a generalized deep learning model, and an assessment index, in support of sustainable urban aesthetics and the well-being of urban dwellers.
