Improving Classification of Occluded Objects through Scene Context
Courtney M. King, Daniel D. Leeds, Damian Lyons, George Kalaitzis
TL;DR
Occlusions challenge robust object recognition; the authors propose two scene-context fusion strategies, MNF and SCU, to integrate scene information into RPN-DCNN pipelines. MNF selects a scene-specific detector per image while SCU adjusts predictions post hoc using object-scene co-occurrence priors, with scene labels predicted by a CNN. Experiments on Occluded Groceries and TACO show improvements in average recall and precision, and training on a mix of occluded and unoccluded data yields the strongest gains in many settings. The work offers interpretable, adaptable methods to enhance occlusion robustness in real-world scenarios.
Abstract
Occlusions pose substantial challenges to otherwise powerful object recognition algorithms. Additional sources of information can be extremely valuable for reducing the errors that occlusions cause. Scene context is known to aid object recognition in biological vision. In this work, we attempt to add robustness to existing Region Proposal Network-Deep Convolutional Neural Network (RPN-DCNN) object detection networks through two distinct scene-based information fusion techniques. We present one algorithm for each technique: the first operates prior to prediction, selecting a custom object network based on the identified background scene, and the second operates after detection, fusing scene knowledge into the initial object scores output by the RPN. We demonstrate our algorithms on challenging datasets featuring partial occlusions, where they show overall improvement in both recall and precision over baseline methods. In addition, our experiments contrast multiple training methodologies for occlusion handling and find that training on a combination of occluded and unoccluded images outperforms the alternatives. Our methods are interpretable and can easily be adapted to other datasets, offering many future directions for research and practical applications.
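The two fusion points described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `select_detector` and `rescore_with_scene_prior`, the dictionary-based data layout, and in particular the multiplicative reweighting rule, which is only one plausible way to combine object scores with object-scene co-occurrence priors. The paper's actual algorithms may differ.

```python
from typing import Callable, Dict

Scores = Dict[str, float]


def select_detector(scene_probs: Scores,
                    detectors: Dict[str, Callable],
                    fallback: Callable) -> Callable:
    """Pre-prediction fusion (hypothetical): pick the detector trained for
    the most probable scene, or a general fallback if none exists."""
    best_scene = max(scene_probs, key=scene_probs.get)
    return detectors.get(best_scene, fallback)


def rescore_with_scene_prior(object_scores: Scores,
                             scene_probs: Scores,
                             cooccurrence: Dict[str, Scores]) -> Scores:
    """Post-detection fusion (hypothetical): multiply each object score by
    a scene-conditioned prior, then renormalize.

    cooccurrence[scene][obj] is an assumed estimate of p(obj | scene).
    """
    fused = {}
    for obj, score in object_scores.items():
        # Marginalize the prior over the predicted scene distribution.
        prior = sum(p * cooccurrence.get(scene, {}).get(obj, 0.0)
                    for scene, p in scene_probs.items())
        fused[obj] = score * prior
    total = sum(fused.values())
    if total <= 0:
        # No usable scene evidence: keep the original scores unchanged.
        return dict(object_scores)
    return {obj: value / total for obj, value in fused.items()}


if __name__ == "__main__":
    # Made-up numbers: an occluded object is ambiguous between two classes.
    scores = {"milk": 0.5, "bottle": 0.5}
    scenes = {"kitchen": 1.0, "beach": 0.0}
    priors = {"kitchen": {"milk": 0.9, "bottle": 0.1},
              "beach": {"milk": 0.2, "bottle": 0.8}}
    print(rescore_with_scene_prior(scores, scenes, priors))
```

In this toy case the detector cannot separate the two classes on its own, and the scene prior breaks the tie. The fallback branch reflects one possible design choice: when the scene gives no usable evidence, the original detector scores are left untouched rather than zeroed out.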
