Seeing the Intangible: Survey of Image Classification into High-Level and Abstract Categories

Delfina Sol Martinez Pandiani; Valentina Presutti

TL;DR

This survey enhances the understanding of high-level visual reasoning in CV and lays the groundwork for future research endeavors, focusing particularly on Abstract Concepts in automatic image classification.

Abstract

The field of Computer Vision (CV) is increasingly shifting towards "high-level" visual sensemaking tasks, yet the exact nature of these tasks remains unclear and tacit. This survey addresses that ambiguity by systematically reviewing research on high-level visual understanding, focusing particularly on Abstract Concepts (ACs) in automatic image classification. The survey contributes in three main ways. First, it clarifies the tacit understanding of high-level semantics in CV through a multidisciplinary analysis and a categorization into distinct clusters, including commonsense, emotional, aesthetic, and inductive interpretative semantics. Second, it identifies and categorizes the CV tasks associated with high-level visual sensemaking, offering insights into the diverse research areas within this domain. Third, it examines how abstract concepts such as values and ideologies are handled in CV, revealing challenges and opportunities in AC-based image classification. Notably, our survey of AC image classification tasks highlights persistent challenges, such as the limited efficacy of massive datasets and the importance of integrating supplementary information and mid-level features. We emphasize the growing relevance of hybrid AI systems in addressing the multifaceted nature of AC image classification tasks. Overall, this survey enhances our understanding of high-level visual reasoning in CV and lays the groundwork for future research.

Paper Structure (35 sections, 4 figures, 6 tables)

Introduction
Defining High-Level Visual Semantics
Three-Tiered Semantics
Tip of the Iceberg: Upper Visual Semantics
Abstract Concepts and Visual Data
Survey Methodology
Automatic High-Level Visual CV Tasks
Clustering High-Level CV Tasks
Discussion on High-Level CV Tasks
Social and Sociocultural Emphasis in High-Level Visual Semantics
Diversifying Image Types
Task-Specific Dataset Creation
Research Output and Transformative Moments
In-Depth Survey of ACs in CV
Overlap in Abstract Concept Examples
...and 20 more sections

Figures (4)

Figure 1: The three tiers of the visual semantics hierarchy. Visual understanding is often depicted as a multi-layered process, revealing three distinct levels of semantics. The low-level involves raw or elemental features, while the mid-level encompasses individual objects, persons, and regions. In contrast, the high-level remains less defined and explored.
Figure 2: Tip of the iceberg: a deeper characterization of the top level of the visual semantic pyramid. Drawing from a multidisciplinary exploration of semantic entities associated with this upper semantic layer, we have identified four distinct clusters of knowledge.
Figure 3: Computer Vision tasks that deal with "high-level semantics" or "high-level visual understanding", also mapped to the preceding multidisciplinary characterization of high-level semantics. Circled in red are the tasks that were found to deal, implicitly or explicitly, with AC detection.
Figure 4: Two inflection points (2012 and 2017) that seem to correlate with the increasing interest in CV tasks dealing with high-level visual semantics.
