A New Lightweight Hybrid Graph Convolutional Neural Network -- CNN Scheme for Scene Classification using Object Detection Inference
Ayman Beghdadi, Azeddine Beghdadi, Mohib Ullah, Faouzi Alaya Cheikh, Malik Mallem
TL;DR
The paper addresses indoor/outdoor scene classification for autonomous systems by coupling a CNN-based object detector with a lightweight GCNN that operates on a space-semantic graph derived from the detected objects. The method constructs a graph whose nodes encode object labels and sizes and whose edges encode spatial proximity, enabling graph-based reasoning through GCN, GIN, and a Learnable Aggregation Function (LAF). Key contributions include a truly lightweight GCNN-CNN framework whose performance is close to that of traditional CNNs with far fewer parameters, a novel space-semantic graph construction strategy, and open-source code with demonstrated compatibility as a detector head (e.g., YOLACT-GCNN). Experiments on a CD-COCO-derived dataset show high accuracy (over 90% in some configurations) with substantially lower parameter counts and faster inference than CNN/ViT baselines. The approach offers a flexible pathway for GCNN-based non-satellite scene classification and broader integration with detection systems.
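The graph construction summarized above can be illustrated with a minimal sketch. It assumes each detection is a class index plus a pixel bounding box, encodes a node as a one-hot label concatenated with the box area relative to the image, and links two nodes when their box centres lie within a normalized distance threshold. The names `Detection`, `build_graph`, `mean_aggregate`, and the `radius` parameter are hypothetical and chosen for illustration. The paper's actual node features, edge rule, and aggregation (GCN, GIN, LAF) may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    label: int   # class index from the detector (assumed encoding)
    x: float     # top-left corner of the box, in pixels
    y: float
    w: float     # box width and height, in pixels
    h: float


def build_graph(detections, img_w, img_h, num_classes, radius=0.5):
    """Build node features and proximity edges from detector output.

    Node feature: one-hot label + box area relative to the image area.
    Edge (i, j, d): added when the centre distance d, normalized by the
    image diagonal, is at most `radius` (an illustrative threshold).
    """
    diag = math.hypot(img_w, img_h)
    nodes, centres = [], []
    for det in detections:
        one_hot = [0.0] * num_classes
        one_hot[det.label] = 1.0
        rel_size = (det.w * det.h) / (img_w * img_h)
        nodes.append(one_hot + [rel_size])
        centres.append((det.x + det.w / 2, det.y + det.h / 2))

    edges = []
    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            d = math.dist(centres[i], centres[j]) / diag
            if d <= radius:
                edges.append((i, j, d))
    return nodes, edges


def mean_aggregate(nodes, edges):
    """One unweighted neighbour-averaging step with self-loops.

    This is a simplified stand-in for a graph convolution: it has no
    learned weights and ignores the edge distances.
    """
    neighbours = [[i] for i in range(len(nodes))]
    for i, j, _ in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    return [
        [sum(nodes[k][f] for k in nb) / len(nb) for f in range(len(nodes[i]))]
        for i, nb in enumerate(neighbours)
    ]


# Toy example: three detections in a 640x480 image, three classes.
dets = [
    Detection(label=0, x=10, y=10, w=100, h=200),
    Detection(label=2, x=120, y=30, w=50, h=50),
    Detection(label=1, x=580, y=420, w=40, h=40),
]
nodes, edges = build_graph(dets, 640, 480, num_classes=3, radius=0.3)
smoothed = mean_aggregate(nodes, edges)
```

In this toy example only the first two boxes are close enough to be linked, so the third node keeps its own features after aggregation. A scene classifier would then pool the node features into a single vector and pass it to a small classification layer.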
Abstract
Scene understanding plays an important role in several high-level computer vision applications, such as autonomous vehicles, intelligent video surveillance, and robotics. However, few solutions have been proposed for indoor/outdoor scene classification that ensure scene context adaptability for computer vision frameworks. We propose the first Lightweight Hybrid Graph Convolutional Neural Network (LH-GCNN)-CNN framework as an add-on to object detection models. The proposed approach uses the output of the CNN object detection model to predict the observed scene type by generating a coherent GCNN representing the semantic and geometric content of the observed scene. Applied to natural scenes, this new method achieves an accuracy of over 90% for scene classification on a COCO-derived dataset containing a large number of different scenes, while requiring fewer parameters than traditional CNN methods. For the benefit of the scientific community, we will make the source code publicly available: https://github.com/Aymanbegh/Hybrid-GCNN-CNN.
