The KANDY Benchmark: Incremental Neuro-Symbolic Learning and Reasoning with Kandinsky Patterns

Luca Salvatore Lorello; Marco Lippi; Stefano Melacci

TL;DR

KANDY addresses the evaluation of incremental neuro-symbolic learning with a benchmark, inspired by Kandinsky patterns, that generates curricula of binary visual classification tasks featuring hierarchical compositionality and sparse supervision. The framework combines a data-generation pipeline, ground-truth symbolic rules, and a curriculum design to test continual and semi-supervised learning, and it is released with open-source tooling and two curricula, KANDY-Easy and KANDY-Hard. Experiments show that both state-of-the-art neural models and purely symbolic systems struggle on these curricula, especially KANDY-Hard, which points to the potential of neuro-symbolic (NeSy) approaches that jointly learn perceptual features and logical reasoning. By providing interpretable ground-truth rules and varied supervision schedules, KANDY aims to foster advances in NeSy methods for curriculum-driven continual learning, with practical relevance for robust, explainable AI.

Abstract

Artificial intelligence is continuously seeking novel challenges and benchmarks to effectively measure performance and to advance the state of the art. In this paper we introduce KANDY, a benchmarking framework that can be used to generate a variety of learning and reasoning tasks inspired by Kandinsky patterns. By creating curricula of binary classification tasks with increasing complexity and sparse supervision, KANDY can be used to implement benchmarks for continual and semi-supervised learning, with a specific focus on symbol compositionality. Classification rules are also provided in the ground truth to enable the analysis of interpretable solutions. Together with the benchmark generation pipeline, we release two curricula, an easier and a harder one, that we propose as new challenges for the research community. With a thorough experimental evaluation, we show that both state-of-the-art neural models and purely symbolic approaches struggle to solve most of the tasks, thus calling for the application of advanced neuro-symbolic methods trained over time.
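
As an illustration of the idea of defining a binary task through a symbolic rule, the following is a minimal sketch, not the KANDY generator itself. It assumes a toy symbolic representation (a sample is a list of objects with a shape and a color), a hypothetical rule `rule_has_red_triangle`, and hypothetical helpers `random_sample`, `generate_task`, and `sparsify`; none of these names come from the paper. Positives and negatives are collected by drawing random samples and checking them against the rule, and sparse supervision is simulated by hiding a fraction of the labels.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Toy symbolic vocabulary (an assumption for illustration only).
SHAPES = ["triangle", "square", "circle"]
COLORS = ["red", "green", "blue"]

Obj = Dict[str, str]
Sample = List[Obj]


def random_sample(rng: random.Random, n_objects: int = 3) -> Sample:
    """Draw a symbolic sample: a list of objects with random shape and color."""
    return [
        {"shape": rng.choice(SHAPES), "color": rng.choice(COLORS)}
        for _ in range(n_objects)
    ]


def rule_has_red_triangle(sample: Sample) -> bool:
    """Hypothetical ground-truth rule: the sample contains a red triangle."""
    return any(o["shape"] == "triangle" and o["color"] == "red" for o in sample)


def generate_task(
    rule: Callable[[Sample], bool],
    n_per_class: int,
    rng: random.Random,
    max_tries: int = 10_000,
) -> Tuple[List[Sample], List[Sample]]:
    """Collect positives and negatives by testing random samples against the rule."""
    positives: List[Sample] = []
    negatives: List[Sample] = []
    tries = 0
    while len(positives) < n_per_class or len(negatives) < n_per_class:
        if tries >= max_tries:
            raise RuntimeError("could not fill both sets within max_tries")
        tries += 1
        sample = random_sample(rng)
        bucket = positives if rule(sample) else negatives
        if len(bucket) < n_per_class:  # discard samples for an already full set
            bucket.append(sample)
    return positives, negatives


def sparsify(
    labels: List[int], keep_fraction: float, rng: random.Random
) -> List[Optional[int]]:
    """Simulate sparse supervision: keep each label with probability keep_fraction."""
    return [y if rng.random() < keep_fraction else None for y in labels]


rng = random.Random(0)
pos, neg = generate_task(rule_has_red_triangle, n_per_class=5, rng=rng)
labels = sparsify([1] * len(pos) + [0] * len(neg), keep_fraction=0.3, rng=rng)
```

In a real pipeline the symbolic samples would then be rendered into images; that step is omitted here.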

Paper Structure (9 sections, 6 figures, 2 tables)

Introduction
Related works
The KANDY benchmark
Data generation: basic blocks and primitives
Released curricula
Experimental evaluation
Results on KANDY-Easy
Results on KANDY-Hard
Conclusions

Figures (6)

Figure 1: Top: overview of the KANDY generation pipeline, in which the user provides task specifications and data is generated. Positive and negative sets are defined via symbolic representations and rendered into synthetic images. The ground truth rule is a propositional clause that can be used to reject samples, and thus explains the task. Bottom: example of two tasks from a curriculum generated with KANDY.
Figure 2: KANDY-Easy (1st, 2nd plot) and KANDY-Hard (3rd, 4th plot), average accuracy over time (i.e., after having processed data of each of the sequentially streamed tasks). Left: task incremental learning. Right: continual online learning.
Figure 3: KANDY-Easy, per-task accuracies of the compared models in different learning settings. Neural (first four rows) and symbolic methods (last row) are represented with different colormaps. Models significantly under-performing (below 0.4) are not shown.
Figure 4: KANDY-Easy, average forgetting (the lower the better), backward transfer, and forward transfer over time (i.e., after having processed data of each of the sequentially streamed tasks). Top: task incremental learning. Bottom: continual online learning.
Figure 5: KANDY-Hard, per-task accuracies of the compared models in different learning settings. Neural (first four rows) and symbolic methods (last row) are represented with different colormaps. Models significantly under-performing (below 0.4) are not shown.
...and 1 more figure
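
The quantities named in the captions of Figures 2 and 4 (average accuracy, average forgetting, backward transfer, forward transfer) are commonly derived from a matrix of accuracies in the continual learning literature. The sketch below assumes those common definitions, which may differ in detail from the ones used in the paper. Here `acc[t][j]` is the accuracy on task `j` measured after training on task `t`, and `baseline[j]` is the accuracy of an untrained (e.g., randomly initialized) model on task `j`; the function names and the example numbers are illustrative only.

```python
from typing import List

Matrix = List[List[float]]


def average_accuracy(acc: Matrix, t: int) -> float:
    """Mean accuracy over tasks 0..t, measured after training on task t."""
    return sum(acc[t][: t + 1]) / (t + 1)


def average_forgetting(acc: Matrix, t: int) -> float:
    """Mean drop from the best past accuracy on each earlier task (lower is better)."""
    if t == 0:
        return 0.0
    drops = [
        max(acc[k][j] for k in range(j, t)) - acc[t][j]
        for j in range(t)
    ]
    return sum(drops) / t


def backward_transfer(acc: Matrix, t: int) -> float:
    """Mean change on earlier tasks relative to right after they were learned."""
    if t == 0:
        return 0.0
    return sum(acc[t][j] - acc[j][j] for j in range(t)) / t


def forward_transfer(acc: Matrix, baseline: List[float], t: int) -> float:
    """Mean gain on each task just before training on it, relative to a baseline."""
    if t == 0:
        return 0.0
    return sum(acc[j - 1][j] - baseline[j] for j in range(1, t + 1)) / t


# Made-up accuracies for three sequential tasks (not results from the paper).
acc = [
    [0.90, 0.50, 0.50],
    [0.80, 0.85, 0.60],
    [0.70, 0.80, 0.90],
]
baseline = [0.50, 0.50, 0.50]

final = len(acc) - 1
summary = {
    "average_accuracy": average_accuracy(acc, final),
    "average_forgetting": average_forgetting(acc, final),
    "backward_transfer": backward_transfer(acc, final),
    "forward_transfer": forward_transfer(acc, baseline, final),
}
```

Evaluating these functions for every `t` rather than only the final task gives curves over time of the kind the captions describe.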
