SlumpGuard: An AI-Powered Real-Time System for Automated Concrete Slump Prediction via Video Analysis
Youngmin Kim, Giyeong Oh, Kwangsoo Youm, Youngjae Yu
TL;DR
SlumpGuard is a fixed-camera, vision-based system for real-time concrete slump prediction built on a three-stage pipeline: chute detection, identification of the pouring onset and the active chute, and video-based slump classification. It uses oriented bounding-box detection (YOLOv8) and optical-flow-driven timing to isolate pouring events, then applies a ResNet-3D-based video classifier, trained with advanced augmentations and label smoothing, to predict slump categories. On a site-replicated dataset of over 6,000 clips, the approach achieves near-perfect chute localization, robust pouring detection, and slump prediction accuracy of around 0.82. An accompanying study of expert human evaluators shows substantial subjectivity in visual slump estimation. The work demonstrates practical deployability, non-intrusive operation, and a path toward automated quality control in concrete construction workflows.
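As a brief illustration of the label smoothing mentioned above, the following is a minimal sketch of the standard technique, not the authors' training code. The smoothing factor `eps`, the number of slump classes, and the function names `smooth_labels` and `cross_entropy` are all assumptions made for this example; the summary does not state these values.

```python
import math
from typing import List, Sequence


def smooth_labels(target: int, num_classes: int, eps: float = 0.1) -> List[float]:
    """Return a soft target: (1 - eps) on the true class plus eps spread uniformly.

    eps = 0.1 is an illustrative default, not a value taken from the paper.
    """
    base = eps / num_classes
    return [base + (1.0 - eps) if k == target else base for k in range(num_classes)]


def cross_entropy(probs: Sequence[float], soft_target: Sequence[float]) -> float:
    """Cross-entropy between predicted class probabilities and a soft target."""
    return -sum(q * math.log(p) for p, q in zip(probs, soft_target) if q > 0.0)


# Hypothetical example: 4 slump categories, true class index 1.
soft = smooth_labels(target=1, num_classes=4, eps=0.1)
loss = cross_entropy([0.1, 0.7, 0.1, 0.1], soft)
```

Because the soft target never assigns full probability to a single class, the classifier is discouraged from becoming overconfident, which can help when neighbouring slump categories look visually similar.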
Abstract
Concrete workability is essential for construction quality, with the slump test being the most widely used on-site method for its assessment. However, traditional slump testing is manual, time-consuming, and highly operator-dependent, making it unsuitable for continuous or real-time monitoring during placement. To address these limitations, we present SlumpGuard, an AI-powered vision system that analyzes the natural discharge flow from a mixer-truck chute using a single fixed camera. The system performs automatic chute detection, pouring-event identification, and video-based slump classification, enabling quality monitoring without sensors, hardware installation, or manual intervention. We introduce the system design, construct a site-replicated dataset of over 6,000 video clips, and report extensive evaluations demonstrating reliable chute localization, accurate pouring detection, and robust slump prediction under diverse field conditions. An expert study further reveals significant disagreement in human visual estimates, highlighting the need for automated assessment.
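To make the pouring-event identification step more concrete, here is a minimal sketch under stated assumptions. It supposes that an upstream stage has already reduced each frame to one mean optical-flow magnitude per detected chute region, and it declares a pouring onset when one region stays above a threshold for several consecutive frames. The function name `detect_pouring_onset`, the threshold, and the persistence rule are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple


def detect_pouring_onset(
    motion: Sequence[Sequence[float]],
    threshold: float = 1.0,
    min_frames: int = 3,
) -> Optional[Tuple[int, int]]:
    """Find the first sustained burst of motion in any chute region.

    motion[t][c] is assumed to be the mean optical-flow magnitude inside
    chute region c at frame t. Returns (onset_frame, chute_index), or None
    if no region exceeds `threshold` for `min_frames` consecutive frames.
    Both parameters are illustrative defaults, not values from the paper.
    """
    if not motion:
        return None
    num_chutes = len(motion[0])
    run = [0] * num_chutes  # consecutive above-threshold frames per chute
    for t, frame in enumerate(motion):
        for c in range(num_chutes):
            run[c] = run[c] + 1 if frame[c] > threshold else 0
            if run[c] >= min_frames:
                # Onset is the first frame of the sustained run.
                return t - min_frames + 1, c
    return None


# Hypothetical two-chute sequence: chute 1 starts moving at frame 1.
example = [[0.1, 0.2], [0.1, 1.5], [0.2, 1.6], [0.1, 1.7], [0.1, 1.8]]
onset = detect_pouring_onset(example)
```

Requiring several consecutive frames above the threshold is one simple way to avoid triggering on brief motion such as a worker passing in front of the chute; the clip passed to the slump classifier would then start at the returned onset frame.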
