Towards Generative Abstract Reasoning: Completing Raven's Progressive Matrix via Rule Abstraction and Selection

Fan Shi, Bin Li, Xiangyang Xue

TL;DR

The paper tackles abstract visual reasoning in Raven's Progressive Matrix tests by introducing RAISE, a deep latent-variable model that abstracts and selects atomic rules from a global knowledge set to generate missing RPM images. RAISE learns interpretable latent concepts for image attributes and decouples rule learning from perception, optimizing an ELBO that combines reconstruction, concept alignment, and rule-consistency terms, with optional auxiliary supervision for rule annotations. It demonstrates strong performance on RAVEN and I-RAVEN across bottom-right and arbitrary-position tasks, plus odd-one-out and held-out configurations, while offering interpretable latent concepts through concept-attribute mappings. The work advances generative abstract reasoning by enabling rule abstraction, per-concept rule selection, and compositional generalization, with practical implications for robust visual reasoning and systematic generalization in AI systems.

Abstract

Endowing machines with abstract reasoning ability has been a long-standing research topic in artificial intelligence. Raven's Progressive Matrix (RPM) is widely used to probe abstract visual reasoning in machine intelligence: a model analyzes the underlying rules and selects one image from a set of candidates to complete the image matrix. Participants in RPM tests can show powerful reasoning ability by inferring and combining attribute-changing rules and imagining the missing images at arbitrary positions of a matrix. However, existing solvers rarely exhibit this ability on realistic RPM tests. In this paper, we propose a deep latent variable model for answer generation problems through Rule AbstractIon and SElection (RAISE). RAISE encodes image attributes into latent concepts and abstracts atomic rules that act on those concepts. When generating answers, RAISE selects one atomic rule from the global knowledge set for each latent concept; together, the selected rules constitute the underlying rule of an RPM. In experiments on bottom-right and arbitrary-position answer generation, RAISE outperforms the compared solvers in most configurations of realistic RPM datasets. In the odd-one-out task and two held-out configurations, RAISE leverages the acquired latent concepts and atomic rules to find the rule-breaking image in a matrix and to handle problems with unseen combinations of rules and attributes.
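
The selection step described in the abstract, one atomic rule per latent concept chosen from a shared rule set, can be illustrated with a toy symbolic sketch. This is not the RAISE implementation: RAISE learns its concepts and rules as latent variables, whereas here the concepts are hand-made integers, the three rules in GLOBAL_RULES are hypothetical, and scoring rules by absolute error on the complete rows is an assumption made only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# An atomic rule predicts the third value of a row from the first two.
Rule = Callable[[int, int], int]

# Hypothetical, hand-written global rule set (RAISE learns its rules instead).
GLOBAL_RULES: Dict[str, Rule] = {
    "constant": lambda a, b: b,
    "progression": lambda a, b: b + (b - a),
    "arithmetic_sum": lambda a, b: a + b,
}


def select_rule(rows: List[Tuple[int, int, int]]) -> str:
    """Return the name of the rule with the lowest total error on complete rows."""
    def error(name: str) -> int:
        rule = GLOBAL_RULES[name]
        return sum(abs(rule(a, b) - c) for a, b, c in rows)

    # Ties are broken by insertion order of GLOBAL_RULES.
    return min(GLOBAL_RULES, key=error)


def complete_row(
    context: Dict[str, List[Tuple[int, int, int]]],
    partial: Dict[str, Tuple[int, int]],
) -> Dict[str, Tuple[str, int]]:
    """Select one rule per concept, then apply it to the incomplete row."""
    result = {}
    for concept, rows in context.items():
        name = select_rule(rows)
        a, b = partial[concept]
        result[concept] = (name, GLOBAL_RULES[name](a, b))
    return result


# Toy 3x3 matrix described by three independent concepts.
context = {
    "size": [(1, 2, 3), (2, 3, 4)],
    "color": [(1, 2, 3), (2, 2, 4)],
    "shape": [(4, 4, 4), (1, 1, 1)],
}
partial = {"size": (3, 4), "color": (3, 1), "shape": (2, 2)}

completion = complete_row(context, partial)
for concept, (name, value) in completion.items():
    print(f"{concept}: rule={name}, predicted={value}")
```

Because each concept picks its rule independently, a matrix that pairs rules and attributes in a combination absent from the context examples can still be completed, which is the intuition behind the held-out configurations mentioned above.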

Paper Structure

This paper contains 36 sections, 27 equations, 11 figures, 9 tables.

Figures (11)

  • Figure 1: An overview of RAISE. The graphical model in (a) displays the generative process (solid black lines) and inference process (dashed red lines). Panel (b) shows the computational details of the abstract reasoning process and highlights the rule selection, rule execution, and global knowledge with blue, yellow, and red backgrounds, respectively.
  • Figure 2: Selection accuracy at arbitrary positions for RAISE (purple), Transformer (orange), CLAP (green), ANP (blue), and LGPP (black). In each plot, the x-axis is the number of candidates and the y-axis is the selection accuracy.
  • Figure 3: Answer generation at arbitrary positions. The predictions on RAVEN are highlighted (red boxes) to illustrate the arbitrary-position generation ability. Because of noise, some predictions may differ from the original sample, but they still follow the correct rules.
  • Figure 4: Panel (a) shows the interpolation results of latent concepts and the correspondence between the concepts and attributes. Panel (b) provides an example of RPM-based odd-one-out tests and displays the prediction deviations in concepts of each image. Panel (c) illustrates the strategy to split rule-attribute combinations in held-out configurations.
  • Figure 5: Different configurations of RAVEN. In each figure, the top panel is an RPM where the target images are highlighted in red boxes; the middle panel is a candidate set with eight candidate images; and the bottom panel shows the attribute-changing rules in the RPM.
  • ...and 6 more figures
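
Selection accuracy for a generative solver, as plotted in Figure 2, can be measured by matching each generated answer against its candidate set. The sketch below assumes that answers and candidates are already represented as numeric vectors and picks the nearest candidate by squared Euclidean distance. The metric, the vector representation, and all function names are assumptions for illustration; this summary does not specify how the compared solvers score candidates.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def select_candidate(prediction: Vector, candidates: List[Vector]) -> int:
    """Return the index of the candidate closest to the generated answer."""
    return min(
        range(len(candidates)),
        key=lambda i: squared_distance(prediction, candidates[i]),
    )


def selection_accuracy(
    predictions: List[Vector],
    candidate_sets: List[List[Vector]],
    answers: List[int],
) -> float:
    """Fraction of problems where the nearest candidate is the true answer."""
    hits = sum(
        select_candidate(p, c) == a
        for p, c, a in zip(predictions, candidate_sets, answers)
    )
    return hits / len(answers)


# Two toy problems with three candidates each (made-up numbers).
predictions = [[1.0, 2.0], [0.0, 0.0]]
candidate_sets = [
    [[0.0, 0.0], [1.1, 1.9], [3.0, 3.0]],
    [[2.0, 2.0], [0.5, 0.5], [0.1, 0.0]],
]
answers = [1, 1]  # index of the correct candidate in each problem

print(selection_accuracy(predictions, candidate_sets, answers))
```

In this toy data the second generated answer lies closer to a distractor than to the correct candidate, so only one of the two problems counts as a hit. Adding more candidates gives a noisy prediction more chances to land nearer a distractor, which is why accuracy is reported against the number of candidates.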