Multi-View Camera System for Variant-Aware Autonomous Vehicle Inspection and Defect Detection

Yash Kulkarni, Raman Jha, Renu Kachhoria

TL;DR

AVI tackles variant-aware quality control by integrating eleven synchronized cameras with task-specific detectors and a semantic rule engine to deliver a real-time pass/fail verdict. The pipeline routes views to dedicated models (YOLOv8 for parts, EfficientNet for ICE/EV classification, Gemini-1.5 Flash OCR for the mascot, YOLOv8-Seg for defects) and fuses evidence through a view-aware layer before VIN-conditioned reasoning. Key contributions include targeted view-to-task routing, lightweight decision fusion, and a data-efficient defect segmentation module, all validated on a hybrid OEM/public dataset with 93% verification accuracy, 86% defect-detection recall, and a throughput of 3.3 vehicles per minute. The work demonstrates a deployable, end-to-end solution that unifies multi-camera feature validation with defect detection, enabling real-time, explainable QC in automotive manufacturing.

Abstract

Ensuring that every vehicle leaving a modern production line is built to the correct \emph{variant} specification and is free from visible defects is an increasingly complex challenge. We present the \textbf{Automated Vehicle Inspection (AVI)} platform, an end-to-end, \emph{multi-view} perception system that couples deep-learning detectors with a semantic rule engine to deliver \emph{variant-aware} quality control in real time. Eleven synchronized cameras capture a full 360° sweep of each vehicle; task-specific views are then routed to specialised modules: YOLOv8 for part detection, EfficientNet for ICE/EV classification, Gemini-1.5 Flash for mascot OCR, and YOLOv8-Seg for scratch-and-dent segmentation. A view-aware fusion layer standardises evidence, while a VIN-conditioned rule engine compares detected features against the expected manifest, producing an interpretable pass/fail report in \(\approx\! 300\,\text{ms}\). On a mixed dataset combining Original Equipment Manufacturer (OEM) data for four distinct vehicle models with public scratch/dent images, AVI achieves \textbf{93\%} verification accuracy and \textbf{86\%} defect-detection recall, and sustains \(\mathbf{3.3}\) vehicles/min, surpassing single-view and no-segmentation baselines by large margins. To our knowledge, this is the first publicly reported system that unifies multi-camera feature validation with defect detection in a deployable industrial automotive setting.
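
The manifest comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the manifest layout, the feature names, the VIN keys, the `verify` helper, and the rule that any reported defect fails the vehicle are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical manifest: VIN -> features the variant is expected to have.
# Keys and feature names are illustrative, not taken from the paper.
EXPECTED_MANIFEST = {
    "VIN-0001": {"powertrain": "EV", "roof_rails": True, "rear_wiper": True},
    "VIN-0002": {"powertrain": "ICE", "roof_rails": False, "rear_wiper": True},
}


@dataclass
class InspectionReport:
    vin: str
    mismatches: list = field(default_factory=list)  # (feature, expected, detected)
    defects: list = field(default_factory=list)     # e.g. ["scratch"]

    @property
    def passed(self) -> bool:
        # Assumption: a vehicle passes only with no mismatches and no defects.
        return not self.mismatches and not self.defects


def verify(vin: str, detected: dict, defects: list) -> InspectionReport:
    """Compare fused detections against the manifest entry for this VIN."""
    report = InspectionReport(vin=vin, defects=list(defects))
    for feature, expected in EXPECTED_MANIFEST[vin].items():
        got = detected.get(feature)  # None if no view reported the feature
        if got != expected:
            report.mismatches.append((feature, expected, got))
    return report


ok = verify("VIN-0001",
            {"powertrain": "EV", "roof_rails": True, "rear_wiper": True}, [])
bad = verify("VIN-0002",
             {"powertrain": "ICE", "roof_rails": True, "rear_wiper": True},
             ["scratch"])
print(ok.passed)
print(bad.passed, bad.mismatches)
```

Keeping each mismatch as a `(feature, expected, detected)` tuple is one simple way to make the verdict explainable: the report records which feature failed and why, rather than only a boolean.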

Paper Structure

This paper contains 22 sections, 4 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: The Automated Vehicle Inspection (AVI) camera setup and synchronization. The synchronized images from the 11 cameras are routed to specialized perception modules, each handling a specific task.
  • Figure 2: The AVI system uses an 11-camera array and deep learning to perform a comprehensive, multi-angle scan of each vehicle. It automatically identifies the vehicle, detects features and anomalies, and cross-references the results against the expected specification to generate an instant pass/fail report.
  • Figure 3: Qualitative results. Top: Stage 1 branding detection (logo, mascot, front_grille) across backbones. Second: grille crops for ICE/EV classification. Third: Stage 2 variant features (roof rails, rear antenna, rear wiper, wheel type). Bottom: scratch/dent instance segmentation masks.