Towards Commonsense Knowledge based Fuzzy Systems for Supporting Size-Related Fine-Grained Object Detection
Pu Zhang, Tianhua Chen, Bin Liu
TL;DR
This work tackles size-related fine-grained object detection by augmenting a lightweight coarse-grained detector with a commonsense knowledge inference module (CKIM). It introduces two CKIM variants, crisp-rule and fuzzy-rule, both grounded in two size-related knowledge rules, operating on the BoxS and DtoC features, and learned from minimal data. Empirical results on CLEVR-derived datasets show that CKIM-enhanced detectors achieve higher mAP@0.5 while reducing model size and latency, with fuzzy CKIM offering advantages for multi-class labeling. The approach promises practical gains for edge-facing detection tasks where fine-grained annotations are scarce, and suggests avenues for knowledge acquisition via human expertise or LLMs.
Abstract
Deep learning (DL) has become the dominant approach to object detection. Accurate fine-grained detection, however, typically requires a sufficiently large model and a vast amount of annotated data. In this paper, we propose a commonsense knowledge inference module (CKIM) that leverages commonsense knowledge to help a lightweight, deep-neural-network-based coarse-grained object detector achieve accurate fine-grained detection. Specifically, we focus on a scenario where a single image contains objects of similar categories but varying sizes, and we establish a size-related CKIM that maps the coarse-grained labels produced by the DL detector to size-related fine-grained labels. Because rule-based systems are a popular method of knowledge representation and reasoning, we explore two types of rule-based CKIM, implemented using crisp-rule and fuzzy-rule approaches, respectively. Experimental results demonstrate that, compared with baseline methods, our approach achieves accurate fine-grained detection with less annotated data and a smaller model. Our code is available at: https://github.com/ZJLAB-AMMI/CKIM.
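
To make the label-mapping idea concrete, the following is a minimal sketch of how a rule-based module might turn a coarse-grained detection into a size-related fine-grained label. It is not the authors' CKIM: the `Detection` fields (`box_area` and `distance`, assumed here to play the role of size and distance cues), the thresholds, the membership functions, and the rule base are all hypothetical choices made for illustration. The actual implementation is in the linked repository.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Detection:
    coarse_label: str  # output of the coarse-grained detector, e.g. "cube"
    box_area: float    # assumed feature: box area as a fraction of the image
    distance: float    # assumed feature: normalized distance to camera, [0, 1]


def ramp(x: float, lo: float, hi: float) -> float:
    """Linear membership rising from 0 at `lo` to 1 at `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def crisp_size(det: Detection, threshold: float = 0.02) -> str:
    """Crisp rule: compensate apparent area for distance, then threshold."""
    # A far object covers fewer pixels, so scale its area up with distance.
    # The compensation formula and threshold are illustrative assumptions.
    compensated = det.box_area * (1.0 + det.distance) ** 2
    return "large" if compensated >= threshold else "small"


def fuzzy_size(det: Detection) -> Tuple[str, float]:
    """Zero-order TSK-style inference over two fuzzy inputs (hypothetical)."""
    big = ramp(det.box_area, 0.01, 0.05)  # degree to which the box is "big"
    far = ramp(det.distance, 0.2, 0.8)    # degree to which the object is "far"
    # Each rule: (firing strength, consequent); 1.0 = large, 0.0 = small.
    rules = [
        (min(big, far), 1.0),          # big box and far away  -> large
        (min(1 - big, 1 - far), 0.0),  # small box and close   -> small
        (min(big, 1 - far), 0.5),      # big box but close     -> ambiguous
        (min(1 - big, far), 0.5),      # small box but far     -> ambiguous
    ]
    total = sum(w for w, _ in rules)   # at least 0.5 for this rule base
    score = sum(w * c for w, c in rules) / total
    return ("large" if score >= 0.5 else "small"), score


def fine_label(det: Detection, size: str) -> str:
    """Combine the inferred size with the coarse label."""
    return f"{size} {det.coarse_label}"


if __name__ == "__main__":
    det = Detection("cube", box_area=0.06, distance=0.9)
    size, score = fuzzy_size(det)
    print(fine_label(det, size), score)
```

In this sketch the crisp variant makes a hard decision from a single threshold, while the fuzzy variant returns a graded score alongside the label, which is one way a fuzzy system can expose uncertainty near class boundaries.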
