Towards Generative Abstract Reasoning: Completing Raven's Progressive Matrix via Rule Abstraction and Selection
Fan Shi, Bin Li, Xiangyang Xue
TL;DR
The paper tackles abstract visual reasoning in Raven's Progressive Matrix tests by introducing RAISE, a deep latent-variable model that abstracts and selects atomic rules from a global knowledge set to generate missing RPM images. RAISE learns interpretable latent concepts for image attributes and decouples rule learning from perception, optimizing an ELBO that combines reconstruction, concept alignment, and rule-consistency terms, with optional auxiliary supervision for rule annotations. It demonstrates strong performance on RAVEN and I-RAVEN across bottom-right and arbitrary-position tasks, plus odd-one-out and held-out configurations, while offering interpretable latent concepts through concept-attribute mappings. The work advances generative abstract reasoning by enabling rule abstraction, per-concept rule selection, and compositional generalization, with practical implications for robust visual reasoning and systematic generalization in AI systems.
Abstract
Endowing machines with abstract reasoning ability has been a long-term research topic in artificial intelligence. Raven's Progressive Matrix (RPM) is widely used to probe abstract visual reasoning in machine intelligence: models analyze the underlying rules and select one image from a set of candidates to complete the image matrix. Participants in RPM tests can show powerful reasoning ability by inferring and combining attribute-changing rules and imagining the missing images at arbitrary positions of a matrix. However, existing solvers can hardly manifest such an ability in realistic RPM tests. In this paper, we propose a deep latent variable model for answer generation problems through Rule AbstractIon and SElection (RAISE). RAISE encodes image attributes into latent concepts and abstracts atomic rules that act on those latent concepts. When generating answers, RAISE selects one atomic rule from the global knowledge set for each latent concept, and the selected rules together constitute the underlying rule of an RPM. In experiments on bottom-right and arbitrary-position answer generation, RAISE outperforms the compared solvers in most configurations of realistic RPM datasets. In the odd-one-out task and two held-out configurations, RAISE can leverage the acquired latent concepts and atomic rules to find the rule-breaking image in a matrix and to handle problems with unseen combinations of rules and attributes.
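To make the idea of per-concept rule selection concrete, the following is a minimal toy sketch and not the RAISE model itself. It assumes each latent concept is a single number per image, that the atomic rules are three hand-written functions (RAISE learns its rules, and these names are hypothetical), and that a rule is selected by squared error on the two complete rows rather than by the probabilistic inference the paper describes.

```python
from typing import Callable, Dict, List

# Hypothetical atomic rules (assumption: hand-written, not learned).
# Each rule maps the first two concept values in a row to the third.
Rule = Callable[[float, float], float]

ATOMIC_RULES: Dict[str, Rule] = {
    "constant": lambda a, b: b,
    "progression": lambda a, b: b + (b - a),
    "arithmetic_sum": lambda a, b: a + b,
}


def select_rule(complete_rows: List[List[float]]) -> str:
    """Return the name of the rule with the lowest squared error on complete rows."""
    def error(rule: Rule) -> float:
        return sum((rule(r[0], r[1]) - r[2]) ** 2 for r in complete_rows)

    return min(ATOMIC_RULES, key=lambda name: error(ATOMIC_RULES[name]))


def complete_matrix(concepts: Dict[str, List[List[float]]]) -> Dict[str, float]:
    """For each concept, select one rule from rows 1-2 and apply it to row 3."""
    answer = {}
    for concept, rows in concepts.items():
        rule_name = select_rule(rows[:2])   # one rule per concept
        a, b = rows[2][0], rows[2][1]       # the two known values in the last row
        answer[concept] = ATOMIC_RULES[rule_name](a, b)
    return answer


# Toy 3x3 matrix with a missing bottom-right entry, described by two concepts.
toy_matrix = {
    "size": [[1, 2, 3], [2, 3, 4], [3, 4]],    # follows "progression"
    "number": [[1, 2, 3], [2, 1, 3], [3, 1]],  # follows "arithmetic_sum"
}
predicted = complete_matrix(toy_matrix)
```

In this sketch each concept is resolved independently, which mirrors the abstract's description of choosing one atomic rule per latent concept from a shared set; the predicted concept values would then be decoded into an image in a generative model.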
