Connectivity-Inspired Network for Context-Aware Recognition

Gianluca Carloni, Sara Colantonio

TL;DR

This paper presents a novel biologically motivated neural network for image classification and a new plug-and-play module for modeling context awareness. Both focus on the effect of incorporating circuit motifs found in biological brains to address visual recognition.

Abstract

The aim of this paper is threefold. We inform the AI practitioner about the human visual system with an extensive literature review; we propose a novel biologically motivated neural network for image classification; and, finally, we present a new plug-and-play module to model context awareness. We focus on the effect of incorporating circuit motifs found in biological brains to address visual recognition. Our convolutional architecture is inspired by the connectivity of human cortical and subcortical streams, and we implement bottom-up and top-down modulations that mimic the extensive afferent and efferent connections between visual and cognitive areas. Our Contextual Attention Block is simple and effective and can be integrated with any feed-forward neural network. It infers weights that multiply the feature maps according to their causal influence on the scene, modeling the co-occurrence of different objects in the image. We place our module at different bottlenecks to infuse a hierarchical context awareness into the model. We validated our proposals through image classification experiments on benchmark data and found a consistent improvement in both performance and the robustness of the explanations produced via class activation. Our code is available at https://github.com/gianlucarloni/CoCoReco.

Paper Structure

This paper contains 12 sections, 4 figures, and 1 table.

Figures (4)

  • Figure 1: The ventral and dorsal visual pathways in human vision. The brain models depicted in this image are adapted from https://www.brainfacts.org/ of the Society for Neuroscience (2017).
  • Figure 2: Overview of our Connectivity-inspired Context-aware Recognition network. The internals and rationale of the CAB module are presented in Section 3.2 and Fig. 3. Other abbreviations: lateral geniculate nucleus (LGN), superior colliculus (SC), pulvinar (PULV). Best seen in color.
  • Figure 3: Our Contextual Attention Block (CAB) integrated into a general feed-forward network. As shown, CAB is placed at the convolutional bottleneck of the model. Given intermediate feature maps, the module computes corresponding contextual attention scores through a rectified and rescaled version of the weights obtained from the co-occurrence map.
  • Figure 4: GradCAM activations for some test images. The left panel shows the output when the last convolutional layer before the classifier is chosen as the target layer for the GradCAM computation. The panel on the right shows the output for the same images when a combination of target layers is chosen.
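
The weighting described in the abstract and in the Figure 3 caption (a co-occurrence map over feature maps, turned into rectified and rescaled per-channel weights that multiply the features) can be illustrated with a small sketch. This is not the authors' implementation, which is available in the linked repository. The function names `contextual_attention` and `apply_attention`, the use of a product-sum as the co-occurrence measure, and the mean-centring step before rectification are all assumptions made for illustration. Plain Python lists stand in for the tensors a real network would use.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel: an h x w grid of activations


def contextual_attention(feature_maps: List[FeatureMap]) -> List[float]:
    """Return one score in [0, 1] per channel (hypothetical scheme, not the paper's)."""
    # Flatten each channel and clip negatives so values act like "presence" scores.
    flat = [[max(v, 0.0) for row in fmap for v in row] for fmap in feature_maps]
    k = len(flat)

    # Co-occurrence map: entry (i, j) is large when channels i and j
    # are active at the same spatial locations (assumed measure: product-sum).
    cooc = [[sum(a * b for a, b in zip(flat[i], flat[j])) for j in range(k)]
            for i in range(k)]

    # Raw weight per channel: mean co-occurrence with the other channels,
    # centred on the average (assumption: a crude stand-in for "influence").
    off_diag = [sum(cooc[i][j] for j in range(k) if j != i) / max(k - 1, 1)
                for i in range(k)]
    mean = sum(off_diag) / k
    raw = [v - mean for v in off_diag]

    # Rectify (ReLU), then rescale so the largest weight equals 1.
    rectified = [max(v, 0.0) for v in raw]
    peak = max(rectified)
    return [v / peak if peak > 0 else 0.0 for v in rectified]


def apply_attention(feature_maps: List[FeatureMap],
                    scores: List[float]) -> List[FeatureMap]:
    """Multiply every activation of channel i by its attention score."""
    return [[[v * s for v in row] for row in fmap]
            for fmap, s in zip(feature_maps, scores)]
```

In this sketch, channels that tend to fire together with others keep their activations, while channels that fire in isolation are scaled toward zero. A practical module would more likely add the weighted maps back to the input or learn part of the weighting, but those choices are outside what the abstract and captions specify.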