Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies

Kaifu Wang, Efthymia Tsamoura, Dan Roth

TL;DR

This paper analyzes why learning imbalances arise in neurosymbolic learning (NSL) because of the symbolic component $σ$, even when the data are balanced, and derives class-specific risk bounds that reveal $σ$'s pivotal role. It then introduces a statistically consistent technique for estimating the marginal of the hidden labels and two mitigation strategies that use this marginal as a constraint: LP-based pseudolabeling at training time and RSOT-based score adjustment at testing time (CAROT). Experiments report improvements of up to 14% over strong NSL and long-tailed learning baselines, and examine how the quality of the marginal estimates affects the results. Overall, the work provides a principled framework for characterizing and mitigating class-specific biases in NSL systems.

Abstract

We study one of the most popular problems in **neurosymbolic learning** (NSL), that of learning neural classifiers given only the result of applying a symbolic component $σ$ to the gold labels of the elements of a vector $\mathbf x$. The gold labels of the elements in $\mathbf x$ are unknown to the learner. We make multiple contributions, theoretical and practical, to address a problem that has not been studied so far in this context, that of characterizing and mitigating *learning imbalances*, i.e., major differences in the errors that occur when classifying instances of different classes (aka **class-specific risks**). Our theoretical analysis reveals a unique phenomenon: that $σ$ can greatly impact learning imbalances. This result sharply contrasts with previous research on supervised and weakly supervised learning, which only studies learning imbalances under data imbalances. On the practical side, we introduce a technique for estimating the marginal of the hidden gold labels using weakly supervised data. Then, we introduce algorithms that mitigate imbalances at training and testing time by treating the marginal of the hidden labels as a constraint. We demonstrate the effectiveness of our techniques using strong baselines from NSL and long-tailed learning, suggesting performance improvements of up to 14%.
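To make the setting concrete, the following is a minimal sketch rather than the paper's method. It assumes a hypothetical symbolic component $σ = \max$ applied to `m` i.i.d. hidden labels; the function names are illustrative only. It shows that uniform gold labels still yield very different numbers of label vectors consistent with each observed output, and that for this particular $σ$ the hidden-label marginal can be recovered in closed form from the distribution of the outputs.

```python
from collections import Counter
from itertools import product


def sigma(labels):
    """Hypothetical symbolic component (an assumption): max of the hidden labels."""
    return max(labels)


def preimage_sizes(num_classes, m):
    """Number of hidden label vectors of length m that map to each output of sigma."""
    counts = Counter(sigma(ys) for ys in product(range(num_classes), repeat=m))
    return [counts[s] for s in range(num_classes)]


def partial_label_distribution(marginal, m):
    """Exact distribution of sigma(y_1, ..., y_m) for i.i.d. labels drawn from `marginal`."""
    classes = range(len(marginal))
    dist = Counter()
    for ys in product(classes, repeat=m):
        p = 1.0
        for y in ys:
            p *= marginal[y]
        dist[sigma(ys)] += p
    return [dist[s] for s in classes]


def recover_marginal_from_max(partial_dist, m):
    """Invert P(max <= k) = (r_0 + ... + r_k)^m.

    Valid only for sigma = max with i.i.d. labels; this is an illustration,
    not the estimation algorithm introduced in the paper.
    """
    marginal, cumulative, previous_cdf = [], 0.0, 0.0
    for p in partial_dist:
        cumulative += p
        cdf = cumulative ** (1.0 / m)
        marginal.append(cdf - previous_cdf)
        previous_cdf = cdf
    return marginal


if __name__ == "__main__":
    uniform = [0.1] * 10
    print("pre-image sizes:", preimage_sizes(10, 2))
    dist = partial_label_distribution(uniform, 2)
    print("output distribution:", [round(p, 3) for p in dist])
    print("recovered marginal:", [round(r, 3) for r in recover_marginal_from_max(dist, 2)])
```

In this toy setting, an output of 0 pins down every hidden label exactly, whereas an output of 9 leaves many candidate label vectors. This gives some intuition for why $σ$ alone can supervise some classes far more strongly than others.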

Paper Structure

This paper contains 21 sections, 7 theorems, 38 equations, 8 figures, 1 table, 2 algorithms.

Key Result

Proposition 3.0

For any $j \in \mathcal{Y}$, we have that $R_j(f) \le \Phi_{\sigma, j}({R}_\mathsf{P}(f;\sigma))$.
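As a rough illustration of the left-hand side of this bound, the sketch below computes empirical class-specific risks on an evaluation set whose gold labels are available. It assumes that $R_j(f)$ is the zero-one error of $f$ on instances whose gold label is $j$, which matches the abstract's informal description but is not defined in this excerpt. The function name is hypothetical.

```python
def class_specific_risks(gold, predicted, num_classes):
    """Empirical zero-one error per gold class (assumed reading of R_j(f)).

    Returns None for classes that never occur in `gold`.
    """
    errors = [0] * num_classes
    totals = [0] * num_classes
    for y, y_hat in zip(gold, predicted):
        totals[y] += 1
        if y_hat != y:
            errors[y] += 1
    return [e / t if t else None for e, t in zip(errors, totals)]


if __name__ == "__main__":
    gold = [0, 0, 1, 1, 1, 2]
    predicted = [0, 1, 1, 1, 0, 2]
    print(class_specific_risks(gold, predicted, 3))
```

A single aggregate error rate can hide large differences between these per-class values, which is the imbalance the proposition bounds class by class.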

Figures (8)

  • Figure 1: Class-specific accuracies of classifier $f$ (Example 1.1). Blue, red, and green curves show accuracy at 20, 40, and 100 epochs. Learning converges in 100 epochs.
  • Figure 2: Class-specific upper bounds obtained via \ref{eqn:optim}. (left) $\mathcal{D}_{Y}$ is uniform. (right) $\mathcal{D}_{\mathsf{P}_S}$ is uniform.
  • Figure 3: Impact of the label ratio quality on CAROT's performance.
  • Figure 4: Accuracy of the marginal estimates computed by Algorithm \ref{alg:solver}. Blue denotes the gold ratios, red the estimated ones, and green the absolute difference between the gold and estimated ratios.
  • Figure 5: Sensitivity of Algorithm \ref{alg:solver} to softmax reparameterization.
  • ...and 3 more figures

Theorems & Definitions (16)

  • Example 1.1: Example adapted from DBLP:journals/corr/abs-1907-08194
  • Proposition 3.0: Class-specific risk bound
  • Example 3.1: Continuation of Example 1.1
  • Proposition 3.1
  • Proposition 3.1
  • Example 4.1
  • Proposition B.0: Class-specific risk bound
  • proof
  • Proposition B.0
  • proof
  • ...and 6 more