Monitoring Social-distance in Wide Areas during Pandemics: a Density Map and Segmentation Approach
Javier A. González-Trejo, Diego A. Mercado-Ravell
TL;DR
This work tackles monitoring social distancing in wide areas during pandemics by reframing the Visual Social Distancing problem into density-map and segmentation tasks. It introduces a ground-truth generation pipeline that projects head annotations to a head plane and removes conforming crowds, enabling end-to-end learning with multi-view data and occlusions. Two learning paths are evaluated: density-map based detection via per-camera FCN_7 networks with a late fusion stage, and segmentation-based detection using FCN_7 and U-Net architectures; densities are projected to a common plane for aggregation. Across CityStreet and PETS2009 datasets, the segmentation approach, particularly U-Net, generally achieves higher F1 and competitive specificity, demonstrating effective NSDC detection in challenging wide-area scenarios and heavy occlusions with practical implications for public safety monitoring.
Abstract
With the relaxation of containment measures around the globe, monitoring social distancing in crowded public places is of great importance to prevent a new massive wave of COVID-19 infections. Recent works on this problem are limited to settings ranging from corridors to small crowds, because they detect each person individually using the full body in the image. In this work, we propose a new framework for monitoring social distancing using end-to-end Deep Learning, to detect crowds violating the social-distance constraint in wide areas where significant occlusions may be present. Our framework consists of the creation of a new ground truth based on the ground-truth density maps, and the proposal of two different solutions, one density-map-based and one segmentation-based, to detect the crowds violating the social-distance constraint. We assess both approaches using the ground truth generated from the PETS2009 and CityStreet datasets. We show that our framework performs well at identifying the zones where people are not following social distancing, even when they are heavily occluded or far away from a camera.
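To make the ground-truth idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes head annotations are already expressed in metres on a common plane, and it uses a hypothetical 2 m distance threshold, grid cell size, and Gaussian width, none of which are taken from this text. The helper names `violating_heads` and `density_map` are illustrative only. The sketch keeps only the heads that have a neighbour closer than the threshold and renders them as a density map that integrates to the number of violating people.

```python
import numpy as np

def violating_heads(points_m, min_dist_m=2.0):
    """Boolean mask: True where a head has a neighbour closer than min_dist_m.

    min_dist_m is an assumed threshold, not a value from the paper.
    """
    pts = np.asarray(points_m, dtype=float).reshape(-1, 2)
    if len(pts) == 0:
        return np.zeros(0, dtype=bool)
    # Pairwise Euclidean distances via broadcasting.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore the distance of a head to itself
    return dist.min(axis=1) < min_dist_m

def density_map(points_m, shape, cell_m=0.5, sigma_cells=1.5):
    """Sum one unit-mass Gaussian per head on a grid of the given shape.

    cell_m (metres per grid cell) and sigma_cells are assumed parameters.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape, dtype=float)
    for x_m, y_m in np.asarray(points_m, dtype=float).reshape(-1, 2):
        cx, cy = x_m / cell_m, y_m / cell_m
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma_cells ** 2))
        total = g.sum()
        if total > 0:
            dmap += g / total  # each head contributes a total mass of 1
    return dmap

# Three heads on the plane: two close together, one isolated.
heads = np.array([(1.0, 1.0), (2.0, 1.5), (8.0, 8.0)])
mask = violating_heads(heads)
gt_density = density_map(heads[mask], shape=(32, 32))
# A segmentation-style target can be obtained by thresholding the density.
gt_segmentation = gt_density > 1e-3
```

In this sketch the density-map target and the segmentation target come from the same filtered set of heads, which mirrors the two solution paths described above. The threshold used for the segmentation mask is again an arbitrary choice for illustration.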
