IID Relaxation by Logical Expressivity: A Research Agenda for Fitting Logics to Neurosymbolic Requirements
Maarten C. Stol, Alessandra Mileo
TL;DR
The paper addresses the mismatch between Neurosymbolic (NeSy) background knowledge and the Independent and Identically Distributed (IID) data assumptions of Machine Learning (ML) by proposing an IID-relaxation hierarchy of logics. In the learning setting considered, given a $d$-dimensional input space and $k$ labels, ML maps $x \in \mathbb{R}^d$ to $y \in \{0,1\}^k$. The hierarchy progresses from $L_G$ through $L_{G.Sol}$, $L_{Sol}$, and $L_{FP}$ to the Guarded Fragment ($GF$), with each logic adding operators that relax IID constraints. The authors discuss the implications for loss calculation and batch construction, as well as losses inspired by modal truth, all viewed through the lens of model-theoretic invariance. They outline a research agenda to develop dependency-aware ML routines and a semantic categorization of NeSy formalisms that align background knowledge with practical distribution constraints.
Abstract
Neurosymbolic background knowledge and the expressivity required of its logic can break Machine Learning assumptions about data Independence and Identical Distribution. In this position paper we propose to analyze IID relaxation in a hierarchy of logics that fit different use case requirements. We discuss the benefits of exploiting known data dependencies and distribution constraints for Neurosymbolic use cases and argue that the expressivity required for this knowledge has implications for the design of underlying ML routines. This opens a new research agenda with general questions about Neurosymbolic background knowledge and the expressivity required of its logic.
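For orientation, the IID assumption referred to above can be written out explicitly. The following is a minimal sketch using standard notation rather than the authors' own: the sample size $n$, joint density $p$, predictor $f$, per-sample loss $\ell$, and batch $B$ are symbols assumed here for illustration.

$$
p\big((x_1, y_1), \dots, (x_n, y_n)\big) \;=\; \prod_{i=1}^{n} p(x_i, y_i),
\qquad x_i \in \mathbb{R}^d,\; y_i \in \{0,1\}^k
$$

$$
\mathcal{L}(B) \;=\; \frac{1}{|B|} \sum_{i \in B} \ell\big(f(x_i),\, y_i\big)
$$

Under this assumption the joint distribution factorizes over samples and the batch loss is a plain average of per-sample terms. Background knowledge that relates different samples, for instance a constraint linking $y_i$ and $y_j$ for $i \neq j$, is the kind of dependency that this factorization does not express, which is presumably why loss calculation and batch construction may need to change once such knowledge is admitted.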
