Adaptive Visual Perception for Robotic Construction Process: A Multi-Robot Coordination Framework

Jia Xu, Manish Dixit, Xi Wang

TL;DR

This work tackles visual perception on unstructured construction sites by introducing a multi-robot coordination framework where supervising robots adaptively reposition cameras to monitor the upcoming motion of a primary construction robot. The core method combines BIM-driven task planning with a viewpoint-selection module that samples camera poses, applies initial filtering, and then uses NSGA-II-based multi-objective optimization to balance coverage and proximity, with a final visibility-based ranking when targets are involved. The framework is validated through a case study on prefabricated wooden frame installation and a suite of simulations that vary space, material placement, and environmental complexity, demonstrating robust viewpoint selection and improved visibility of large construction elements. The results indicate that two supervising robots typically yield better coverage and robustness against occlusions, while revealing current limitations related to visibility during picking and the need for dynamic site adaptation and more comprehensive data fusion. Overall, the approach advances real-time, adaptive perception for construction robotics, enabling safer operation and paving the way for integration with advanced vision models and perception-driven decision-making.
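To make the coverage-versus-proximity trade-off concrete, the following is a minimal sketch of the non-dominated (Pareto) filtering step that sits at the core of NSGA-II-style multi-objective selection. It is not the authors' implementation: the `Candidate` class, the `coverage` and `distance` fields, and the example values are all hypothetical, and a full NSGA-II run would add crowding distance, crossover, and mutation on top of this step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A sampled camera viewpoint (hypothetical representation)."""
    name: str
    coverage: float  # assumed metric: fraction of the task space seen, higher is better
    distance: float  # assumed metric: camera-to-target distance in metres, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on at least one."""
    no_worse = a.coverage >= b.coverage and a.distance <= b.distance
    strictly_better = a.coverage > b.coverage or a.distance < b.distance
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative values only, not results from the paper.
samples = [
    Candidate("A", coverage=0.9, distance=4.0),
    Candidate("B", coverage=0.7, distance=2.0),
    Candidate("C", coverage=0.6, distance=3.0),  # dominated by B
    Candidate("D", coverage=0.9, distance=5.0),  # dominated by A
]
front = pareto_front(samples)
print([c.name for c in front])
```

In this sketch, viewpoints A and B survive because neither beats the other on both objectives at once; a later ranking step, such as the visibility-based ranking mentioned above, would then choose among the survivors.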

Abstract

Construction robots operate in unstructured construction sites, where effective visual perception is crucial for ensuring safe and seamless operations. However, construction robots often handle large elements and perform tasks across expansive areas, resulting in occluded views from onboard cameras and necessitating the use of multiple environmental cameras to capture the large task space. This study proposes a multi-robot coordination framework in which a team of camera-equipped supervising robots adaptively adjust their poses to visually perceive the operation of the primary construction robot and its surrounding environment. A viewpoint selection method is proposed to determine each supervising robot's camera viewpoint, optimizing visual coverage and proximity while considering the visibility of the upcoming construction robot operation. A case study on prefabricated wooden frame installation demonstrates the system's feasibility, and further experiments are conducted to validate the performance and robustness of the proposed viewpoint selection method across various settings. This research advances visual perception of robotic construction processes and paves the way for integrating computer vision techniques to enable real-time adaptation and responsiveness. Such advancements contribute to the safe and efficient operation of construction robots in inherently unstructured construction sites.

Paper Structure

This paper contains 35 sections, 14 equations, 18 figures, 6 tables.

Figures (18)

  • Figure 1: Overall framework of multi-robot coordination
  • Figure 2: Flowchart for candidate viewpoints selection process
  • Figure 3: Illustration of camera view modeling
  • Figure 4: Illustration of robot motion envelope representation
  • Figure 5: Illustration of generated OGM and ray-casting-based visibility. (a) OGM representation of related entities with different robot joint states. (b) Visibility status determination by ray-casting from the sensor to the measured point.
  • ...and 13 more figures
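
The ray-casting visibility check named in the Figure 5 caption can be illustrated with a small sketch. This is a simplified 2D version under stated assumptions, not the paper's method: the occupancy grid map (OGM) is a list of rows with 0 for free and 1 for occupied cells, the ray is traced with Bresenham's line algorithm, and the function names `bresenham` and `is_visible` are hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (x, y) grid indices


def bresenham(x0: int, y0: int, x1: int, y1: int) -> List[Cell]:
    """Return the grid cells on the line from (x0, y0) to (x1, y1), endpoints included."""
    cells = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        cells.append((x, y))
        if (x, y) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
    return cells


def is_visible(grid: List[List[int]], sensor: Cell, point: Cell) -> bool:
    """A point counts as visible if no occupied cell lies strictly between sensor and point."""
    ray = bresenham(sensor[0], sensor[1], point[0], point[1])
    # Skip the sensor cell and the measured cell themselves.
    return all(grid[y][x] == 0 for x, y in ray[1:-1])


# Illustrative 5x5 grid with a single occupied cell at (x=2, y=2).
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1

blocked = is_visible(grid, sensor=(0, 2), point=(4, 2))  # ray passes through (2, 2)
clear = is_visible(grid, sensor=(0, 0), point=(4, 0))    # ray stays on the free top row
print(blocked, clear)
```

A 3D map would follow the same idea with a voxel traversal in place of the 2D line trace.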