Abduction of Domain Relationships from Data for VQA
Al Mehdi Saadat Chowdhury, Paulo Shakarian, Gerardo I. Simari
TL;DR
This work tackles VQA when the image and query are represented as ASP programs lacking domain data. It introduces a domain-abduction framework that learns domain relationships via a practical heuristic, FAST-DAP, yielding a set of domain facts $\Pi^D$ of the form assign($A$, $D$). When added to the existing ASP programs, these facts substantially improve question answering accuracy on the GQA dataset, from $59.98\%$ to about $81.0\%$, using only a small amount of data. The method is orthogonal to knowledge-graph approaches, offering a neurosymbolic, scalable way to inject domain knowledge into reasoning. Key contributions include formalizing the domain abduction problem, proposing a fast and regularized learning algorithm, and demonstrating strong data efficiency and practical performance on real-world data. The work highlights the potential of combining logical representations with abductive inference to enhance VQA under domain uncertainty, while noting the lack of theoretical guarantees and pointing to meta-cognitive AI as a path for future improvement.
Abstract
In this paper, we study the problem of visual question answering (VQA) where the image and query are represented by ASP programs that lack domain data. We provide an approach that is orthogonal and complementary to existing knowledge augmentation techniques: we abduce domain relationships of image constructs from past examples. After framing the abduction problem, we provide a baseline approach and an implementation that significantly improves the accuracy of query answering yet requires few examples.
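To make the idea of abducing domain relationships from past examples concrete, the following is a minimal toy sketch in Python. It is not the paper's baseline or FAST-DAP: it swaps in a simple frequency heuristic, and every name in it (`abduce_assignments`, `answer_query`, `min_support`, and the example data) is a hypothetical chosen for illustration. Each past example is assumed to be a triple of scene attributes, the domain the query asks about, and the ground-truth answer; the output is a set of pairs read as assign($A$, $D$).

```python
from collections import Counter


def abduce_assignments(examples, min_support=1):
    """Toy abduction (an assumption, not FAST-DAP): hypothesize assign(A, D)
    whenever answer A appears in the scene of a query about domain D, and keep
    only hypotheses seen at least `min_support` times."""
    counts = Counter()
    for scene_attrs, domain, answer in examples:
        if answer in scene_attrs:
            counts[(answer, domain)] += 1
    return {pair for pair, n in counts.items() if n >= min_support}


def answer_query(scene_attrs, domain, assignments):
    """Return the sorted scene attributes that the abduced facts place in `domain`."""
    return sorted(a for a in scene_attrs if (a, domain) in assignments)


# Hypothetical past examples: (scene attributes, queried domain, true answer).
examples = [
    ({"red", "round"}, "color", "red"),
    ({"blue", "square"}, "color", "blue"),
    ({"red", "square"}, "shape", "square"),
    ({"red", "round"}, "color", "red"),
]

facts = abduce_assignments(examples, min_support=1)

# Render the abduced pairs as ASP-style facts that could be appended to a program.
as_asp = sorted(f"assign({a},{d})." for a, d in facts)
```

With these examples, a new scene containing `blue` and `square` can be answered for the `color` domain, whereas a scene containing `round` cannot be answered for `shape`, because no past example supports that assignment. Raising `min_support` acts as a crude stand-in for regularization by discarding weakly supported hypotheses.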
