
OASIC: Occlusion-Agnostic and Severity-Informed Classification

Kay Gijzen, Gertjan J. Burghouts, Daniël M. Pelt

Abstract

Severe occlusions of objects pose a major challenge for computer vision. We show that two root causes are (1) the loss of visible information and (2) the distracting patterns introduced by the occluders. Our approach addresses both causes at once. First, the distracting patterns are removed at test time by masking the occluding regions. This masking is independent of the type of occlusion, because it treats occlusions as visual anomalies w.r.t. the object of interest. Second, to cope with reduced visual detail, we follow standard practice and mask random parts of the object during training, for various degrees of occlusion. We discover that (a) the degree of occlusion (i.e., its severity) can be estimated at test time, and (b) a model optimized for a specific degree of occlusion also performs best on a similar degree at test time. Combining these two insights yields a severity-informed classification model called OASIC: Occlusion-Agnostic and Severity-Informed Classification. We estimate the severity of occlusion for a test image, mask the occluder, and select the model optimized for that degree of occlusion. This strategy outperforms any single model optimized for a smaller or broader range of occlusion severities. Experiments show that combining gray masking with adaptive model selection improves $\text{AUC}_\text{occ}$ by +18.5 over standard training on occluded images and by +23.7 over finetuning on unoccluded images.
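
Read as pseudocode, the test-time strategy in the abstract amounts to four steps: score the occlusion, gray-mask it, estimate severity, and pick the matching classifier. The sketch below is our own minimal rendering, not the paper's implementation; the names `occlusion_scorer` and `model_pool`, the severity binning, and the mid-gray fill value are all illustrative assumptions.

```python
import numpy as np

def oasic_predict(image, occlusion_scorer, model_pool, tau=0.5):
    """Classify an occluded image with an OASIC-style test-time strategy.

    All names here are illustrative:
      image            -- (H, W, 3) uint8 array
      occlusion_scorer -- callable returning a per-pixel occlusion map in [0, 1]
      model_pool       -- dict mapping a severity upper bound to a classifier,
                          e.g. {0.2: f_low, 0.5: f_mid, 1.0: f_high}
    """
    # 1. Infer the occlusion map by anomaly scoring w.r.t. the object class,
    #    then binarize it at threshold tau.
    occ_map = occlusion_scorer(image)          # (H, W), values in [0, 1]
    occ_mask = occ_map > tau

    # 2. Suppress the distracting occluder patterns by painting them gray.
    masked = image.copy()
    masked[occ_mask] = 128                     # mid-gray fill

    # 3. Estimate severity as the fraction of the image that is occluded.
    severity = occ_mask.mean()

    # 4. Select the classifier optimized for this severity range.
    f_star = model_pool[min(b for b in model_pool if severity <= b)]
    return f_star(masked)
```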



Figures (5)

  • Figure 1: Occlusion handling by OASIC. At test time, an occlusion map is inferred by scoring against the memory bank $\mathcal{M}$. The segmented occluder is masked in gray to suppress distraction, while the estimated occlusion severity informs the selection of the most suitable classification model $f^\ast$ from the pool $\mathcal{F}$, to better handle reduced visual information (a code sketch of this scoring and masking follows the figure list).
  • Figure 2: Comparison of an occluded image and its masked versions at different thresholds $\tau$. From left to right: the original occluded image, and masks applied with thresholds 0.3, 0.5, and 0.7.
  • Figure 3: Larger degrees of occlusion (higher severity) deteriorate the performance of fine-grained classification trained without occlusions. Textured occlusions (vegetation, rubble) are more problematic than dull occlusion (smoke) or gray occlusion.
  • Figure 4: Textured occluders draw attention away from the object. The first column shows the original (unoccluded) images, and the second column displays their corresponding saliency maps. The occlusion mask applied to each row is shown in column 3 and remains the same across all occlusion types. Columns 4–6 present the occluded images: gray, vegetation, and rubble. Each is overlaid with its respective attention map.
  • Figure 5: OASIC is much more robust to severe occlusions, improving 5x compared to standard training. Both the gray-masking of occlusion (to suppress distraction) and the severity-informed model selection (to handle limited visibility) contribute to OASIC's performance.
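
The captions of Figures 1 and 2 suggest how the occlusion map and the gray masking fit together: patch features are scored against the memory bank $\mathcal{M}$, and the resulting map is thresholded at $\tau$. The sketch below is a plausible nearest-neighbor instantiation of that scoring, not the paper's exact formulation; the normalization step and all function names are our assumptions.

```python
import numpy as np

def occlusion_map(patch_feats, memory_bank, grid_hw):
    """Score test patches against a memory bank M of patch features
    collected from unoccluded training objects (hypothetical scoring;
    the paper's exact formulation may differ).

    patch_feats -- (N, D) features of the test image's patches
    memory_bank -- (M, D) features of unoccluded object patches
    grid_hw     -- (h, w) patch grid with N == h * w
    """
    # Nearest-neighbor distance: patches unlike anything in the bank
    # (i.e., occluder textures) receive high scores. The full (N, M, D)
    # broadcast is fine for a sketch; use batched distances at scale.
    d = np.linalg.norm(patch_feats[:, None, :] - memory_bank[None, :, :], axis=-1)
    score = d.min(axis=1)

    # Normalize to [0, 1] so a single threshold tau is meaningful (Figure 2).
    score = (score - score.min()) / (score.max() - score.min() + 1e-8)
    return score.reshape(grid_hw)

def gray_mask(image, occ_map, tau=0.5):
    """Upsample the patch-level map to pixel resolution and paint
    everything above tau mid-gray, as in Figure 2."""
    h, w = image.shape[:2]
    ys = (np.arange(h) * occ_map.shape[0]) // h   # nearest-neighbor upsampling
    xs = (np.arange(w) * occ_map.shape[1]) // w
    mask = occ_map[ys][:, xs] > tau
    out = image.copy()
    out[mask] = 128
    return out
```

Varying `tau` trades missed occluders against over-masking of the object itself, which is the trade-off Figure 2 illustrates at thresholds 0.3, 0.5, and 0.7.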