Learning Aggregate Queries Defined by First-Order Logic with Counting

Steffen van Bergerem, Nicole Schweikardt

TL;DR

This paper extends learnability results from Boolean classification to multiclass classification, where the task is to assign input tuples to arbitrary integers. Such integer-valued classifiers are represented by aggregate queries specified in FOC1, an extension of first-order logic with counting terms.

Abstract

In the logical framework introduced by Grohe and Turán (TOCS 2004) for Boolean classification problems, the instances to classify are tuples from a logical structure, and Boolean classifiers are described by parametric models based on logical formulas. This is a specific scenario for supervised passive learning, where classifiers should be learned based on labelled examples. Existing results in this scenario focus on Boolean classification. This paper presents learnability results beyond Boolean classification. We focus on multiclass classification problems where the task is to assign input tuples to arbitrary integers. To represent such integer-valued classifiers, we use aggregate queries specified by an extension of first-order logic with counting terms called FOC1. Our main result shows the following: given a database of polylogarithmic degree, within quasi-linear time, we can build an index structure that makes it possible to learn FOC1-definable integer-valued classifiers in time polylogarithmic in the size of the database and polynomial in the number of training examples.
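
To make the setting concrete, here is a minimal Python sketch of an integer-valued classifier given by a counting term with one parameter. It is an illustration only, not the paper's algorithm: the graph encoding, the function names, and the specific term #(y).(E(x, y) ∧ E(w, y)) are assumptions chosen for the example, and the parameter search is a brute-force scan over the whole database rather than the index-based, polylogarithmic-time procedure the abstract describes.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Vertex = Hashable
# Hypothetical database encoding: a directed graph as adjacency sets.
Graph = Dict[Vertex, Set[Vertex]]


def count_common_neighbours(graph: Graph, x: Vertex, w: Vertex) -> int:
    """Evaluate the illustrative counting term #(y).(E(x, y) and E(w, y))."""
    return len(graph.get(x, set()) & graph.get(w, set()))


def learn_parameter(
    graph: Graph, examples: List[Tuple[Vertex, int]]
) -> Optional[Vertex]:
    """Return a parameter w whose classifier agrees with every labelled example.

    Brute force over all vertices (linear in the database size); returns
    None if no single parameter is consistent with the training set.
    """
    for w in sorted(graph, key=repr):  # fixed order for reproducibility
        if all(
            count_common_neighbours(graph, x, w) == label
            for x, label in examples
        ):
            return w
    return None


# Toy database: vertices "a" and "e" both point to "c" and "d"; "b" points to "c".
graph: Graph = {
    "a": {"c", "d"},
    "b": {"c"},
    "c": set(),
    "d": set(),
    "e": {"c", "d"},
}

# Training examples: (input vertex, integer label).
examples = [("a", 2), ("b", 1)]
learned = learn_parameter(graph, examples)
```

On this toy input both "a" and "e" are consistent parameters; the scan returns the first in its fixed order. The point of the sketch is only the shape of the problem: the hypothesis is a counting term plus parameters, and learning means finding parameters that reproduce the integer labels of the training examples.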

Paper Structure

This paper contains 4 sections, 4 theorems, 7 equations.

Key Result

Theorem 3

Let $\sigma$ be a signature, let $k\in\mathbb{N}_{ \geqslant 1}$, let $\ell,q\in\mathbb{N}$, let $I$ be a finite set of integers, and let $T$ be the set of all $\textup{FOC}_1[\sigma, k{+}\ell, q]$-terms that only use integers from $I$. There is an extension $\sigma'$ of $\sigma$ with relation symbo

Theorems & Definitions (6)

  • Example 1
  • Example 2
  • Theorem 3
  • Lemma 4
  • Theorem 7: Localisation Theorem for $\textup{FOC}_1$ (Grohe and Schweikardt)
  • Lemma 8