Multi Camera Connected Vision System with Multi View Analytics: A Comprehensive Survey

Muhammad Munsif, Waqas Ahmad, Amjid Ali, Mohib Ullah, Adnan Hussain, Sung Wook Baik

TL;DR

This survey addresses the challenge of building robust, real-time multi-view multi-camera (MVMC) connected vision systems (CVS) by unifying tracking, cross-camera re-identification, and multi-view action understanding within a single framework. It introduces a four-part taxonomy, systematically reviews state-of-the-art datasets, methods, and evaluation metrics, and highlights the shift from isolated tasks to integrated CVS pipelines. The authors identify core challenges such as scalability, cross-domain generalization, privacy, and real-time multi-modal fusion, and propose future directions including lifelong and zero-shot learning, federated approaches, and bird's-eye-view (BEV) based integration. The work aims to guide researchers and practitioners toward end-to-end, privacy-conscious, and scalable MVMC CVS for smart cities, autonomous systems, and collaborative robotics.

Abstract

Connected Vision Systems (CVS) are transforming a variety of applications, including autonomous vehicles, smart cities, surveillance, and human-robot interaction. These systems harness multi-view multi-camera (MVMC) data to provide enhanced situational awareness through the integration of MVMC tracking, re-identification (Re-ID), and action understanding (AU). However, deploying CVS in real-world, dynamic environments presents several challenges, particularly in addressing occlusions, diverse viewpoints, and environmental variability. Existing surveys have focused primarily on isolated tasks such as tracking, Re-ID, and AU, often neglecting their integration into a cohesive system. These reviews typically emphasize single-view setups, overlooking the complexities and opportunities provided by multi-camera collaboration and multi-view data analysis. To the best of our knowledge, this survey is the first to offer a comprehensive and integrated review of MVMC analytics that unifies MVMC tracking, Re-ID, and AU into a single framework. We propose a unique taxonomy to better understand the critical components of CVS, dividing it into four key parts: MVMC tracking, Re-ID, AU, and combined methods. We systematically arrange and summarize the state-of-the-art datasets, methodologies, results, and evaluation metrics, providing a structured view of the field's progression. Furthermore, we identify and discuss the open research questions and challenges, along with emerging topics such as lifelong learning, privacy preservation, and federated learning, that need to be addressed for future advancements. The paper concludes by outlining key research directions for enhancing the robustness, efficiency, and adaptability of CVS in complex, real-world applications. We hope this survey will inspire innovative solutions and guide future research toward the next generation of intelligent and adaptive CVS.

Paper Structure

This paper contains 47 sections, 17 equations, 12 figures, and 15 tables.

Figures (12)

  • Figure 1: Demonstration of a Multi-Camera Connected Vision System with synchronized views and a central diagram showing camera placement and overlapping views. Colored bounding boxes track the same individual across multiple angles, showcasing person tracking and multi-view analysis. The top-left corner image is captured with Camera 1, the top-right with Camera 4, the middle-left with Camera 8, the middle-right with Camera 10, the bottom-left with Camera 3, and the bottom-right with Camera 2.
  • Figure 2: Annual publication trend (2015–August 2025) for multi-camera multi-view research, segmented into three core tasks: multi-view tracking, multi-camera re-identification, and multi-view action understanding. The 2025 data is extrapolated from data available for January–August to ensure comparability with prior years.
  • Figure 3: Distribution of MVMC-related publications across major academic venues. The 2025 data is extrapolated from data available for January–August to ensure comparability with prior years.
  • Figure 4: Advancements in multi-view, multi-camera tracking, Re-ID, and action understanding (2015 to 2025), highlighting key developments and innovations in each area over the past decade.
  • Figure 5: Samples from publicly available datasets of MVMC, multi-object tracking, and Re-ID in a CVS. It presents various tracking and Re-ID scenarios across multiple camera views.
  • ...and 7 more figures