Improving Set Function Approximation with Quasi-Arithmetic Neural Networks

Tomas Tokar, Scott Sanner

TL;DR

This work tackles the fundamental challenge of learning permutation-invariant set functions by moving beyond fixed pooling to learnable aggregation. It introduces the Neuralized Kolmogorov Mean (NKM) and, building on it, Quasi-Arithmetic Neural Networks (QUANNs), which use an invertible neural generator to implement a trainable quasi-arithmetic mean as the pooling operator. The core aggregations are formalized as $M_{\psi}(\mathbf{X})=\psi^{-1}\left(\frac{1}{n}\sum_{i=1}^{n} \psi(\mathbf{x}_i)\right)$ and $\hat{F}(\mathbf{X})=\rho\left(\psi^{-1}\left(\frac{1}{|P_k(\mathbf{X})|}\sum_{\pi\in P_k(\mathbf{X})} \psi(\phi(\pi))\right)\right)$, which underpin the claimed universal approximation and improved latent structure. Theoretical results show that QUANNs can universally approximate a broad class of set functions and offer advantages in expressivity, regularization, and structured latent representations, particularly when the encoder or estimator is pre-trained or transferred. Empirically, QUANNs outperform state-of-the-art baselines across diverse tasks (synthetic, MNIST-sets, Omniglot-sets, ModelNet40, QM9) and demonstrate better encoder transferability to non-set tasks. Some limitations on sum-decomposable functions suggest room for further enhancements and extensions. The work points to practical implications for learning transferable representations in set-rich domains and beyond, including graphs and multi-view fusion.
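The quasi-arithmetic mean $M_{\psi}$ above can be illustrated with fixed, hand-picked generators in place of a learned invertible network. The following is a minimal sketch, not the authors' implementation: the function names `quasi_arithmetic_mean` and `power_mean` are hypothetical, and the generators (identity, logarithm, reciprocal, power) are classical textbook choices rather than anything specified in the paper.

```python
import math


def quasi_arithmetic_mean(xs, psi, psi_inv):
    """M_psi(X) = psi^{-1}((1/n) * sum_i psi(x_i)) for a non-empty list of scalars."""
    if not xs:
        raise ValueError("quasi-arithmetic mean is undefined for an empty set")
    return psi_inv(sum(psi(x) for x in xs) / len(xs))


def power_mean(xs, p):
    """Quasi-arithmetic mean with generator psi(x) = x**p (positive inputs assumed)."""
    return quasi_arithmetic_mean(xs, lambda x: x ** p, lambda y: y ** (1.0 / p))


xs = [1.0, 2.0, 4.0]

# Identity generator: the ordinary arithmetic mean.
arithmetic = quasi_arithmetic_mean(xs, lambda x: x, lambda y: y)

# Logarithmic generator: the geometric mean.
geometric = quasi_arithmetic_mean(xs, math.log, math.exp)

# Reciprocal generator: the harmonic mean.
harmonic = quasi_arithmetic_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)

# A large exponent pushes the power mean toward max(xs) without reaching it.
near_max = power_mean(xs, 50)

# Reordering the inputs leaves the result unchanged up to rounding.
reordered = quasi_arithmetic_mean([4.0, 1.0, 2.0], math.log, math.exp)
```

Swapping the generator changes which mean is computed while the pooling code stays the same, which is the property a trainable generator would exploit.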

Abstract

Sets represent a fundamental abstraction across many types of data. To handle the unordered nature of set-structured data, models such as DeepSets and PointNet rely on fixed, non-learnable pooling operations (e.g., sum or max) -- a design choice that can hinder the transferability of learned embeddings and limit model expressivity. More recently, learnable aggregation functions have been proposed as more expressive alternatives. In this work, we advance this line of research by introducing the Neuralized Kolmogorov Mean (NKM) -- a novel, trainable framework for learning a generalized measure of central tendency through an invertible neural function. We further propose quasi-arithmetic neural networks (QUANNs), which incorporate the NKM as a learnable aggregation function. We provide a theoretical analysis showing that QUANNs are universal approximators for a broad class of common set-function decompositions and, thanks to their invertible neural components, learn more structured latent representations. Empirically, QUANNs outperform state-of-the-art baselines across diverse benchmarks, while learning embeddings that transfer effectively even to tasks that do not involve sets.
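To make the encoder, invertible generator, and estimator pipeline concrete, here is a minimal sketch of a QUANN-style forward pass of the form $\rho(\psi^{-1}(\frac{1}{n}\sum_i \psi(\phi(\mathbf{x}_i))))$. Everything in it is an illustrative assumption: `phi`, `shift`, `rho`, and `quann_forward` are hypothetical fixed functions standing in for trained networks, and the additive coupling layer is only one simple way to obtain an exactly invertible $\psi$, not necessarily the architecture used in the paper.

```python
import math
import random


def phi(x):
    """Hypothetical encoder: maps a 2-d point to a 2-d latent vector."""
    return [math.tanh(x[0] + 0.5 * x[1]), math.tanh(x[0] - x[1])]


def shift(a):
    """Hypothetical translation function used inside the coupling layer."""
    return math.sin(2.0 * a) + 0.3 * a


def psi(z):
    """Additive coupling: keep z[0], shift z[1] by a function of z[0]."""
    return [z[0], z[1] + shift(z[0])]


def psi_inv(y):
    """Exact inverse of psi: undo the shift using the untouched coordinate."""
    return [y[0], y[1] - shift(y[0])]


def rho(z):
    """Hypothetical estimator: a fixed linear readout of the pooled latent."""
    return 1.5 * z[0] - 0.7 * z[1] + 0.1


def quann_forward(X):
    """Compute rho(psi^{-1}(mean_i psi(phi(x_i)))) for a non-empty list of points."""
    mapped = [psi(phi(x)) for x in X]
    n = len(mapped)
    pooled = [sum(v[k] for v in mapped) / n for k in range(2)]
    return rho(psi_inv(pooled))


random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(5)]

# The same set in a different order should give the same output up to rounding.
shuffled = X[:]
random.shuffle(shuffled)
out = quann_forward(X)
out_shuffled = quann_forward(shuffled)

# For a singleton set the pooling cancels out: psi_inv(psi(z)) == z.
single = quann_forward([X[0]])
```

Because the mean is taken in the $\psi$-transformed space and then mapped back through $\psi^{-1}$, the pooled vector lives in the same latent space as the encoder outputs, so a singleton set reduces to `rho(phi(x))`.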

Paper Structure (72 sections, 7 theorems, 53 equations, 8 figures, 12 tables)


Key Result

Theorem 5.1

Let $\mathcal{U}$ denote the set of all permutation-invariant set functions $F:\mathcal{P}_f(X) \to \mathcal{Y}$ that can be uniformly approximated by Quasi-Arithmetic Neural Networks of the form given in Eq. (eq:quann), where $\rho$, $\psi$ and $\phi$ are arbitrary neural networks. Let $\mathcal{U}_W$ denote the ...

Figures (8)

  • Figure 1: Generalized framework for set function learning, involving an encoder $\phi$, an estimator $\rho$, and a pooling operation (e.g., sum or max).
  • Figure 2: Experiment with aggregation of MNIST digits. The goal is to approximate the function $F$ by learning to estimate its outputs.
  • Figure 3: Omniglot experiment. The task is to identify which alphabets are represented in a set of images of hand-written characters.
  • Figure 4: Fraction of outcomes (cf. the performance table, tab:performance) in which the model in the given row outperformed the model in the given column. An asterisk and a double asterisk indicate statistical significance ($p < 0.05$) and very high significance ($p < 0.01$), respectively.
  • Figure 5: QUANN performance in the synthetic data experiment as a function of the $\psi$-network architecture (RevNet vs RealNVP coupling). The results show no consistent pattern of improvement or degradation in model performance attributable to the choice of architecture.
  • ...and 3 more figures

Theorems & Definitions (14)

  • Theorem 5.1: Universal Approximation of QUANNs
  • Corollary H.1: Permutation Invariance of Quasi-Arithmetic Neural Networks
  • proof
  • Corollary H.2: Kolmogorov Mean with Linear Generator is Equal to Arithmetic Mean
  • proof
  • proof: Proof of Theorem 5.1
  • Proposition H.3: Approximation of Mean-Decomposable Set Function
  • proof
  • Proposition H.4: Approximation of Max-Decomposable Set Function
  • proof
  • ...and 4 more
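
The statement of Corollary H.2 listed above can be checked with a short calculation. The paper's own proof is not reproduced on this page, so what follows is a standard derivation under the assumption of a scalar affine generator $\psi(x) = ax + b$ with $a \neq 0$ (the symbols $a$ and $b$ are introduced here for illustration only):

```latex
\begin{aligned}
\psi(x) &= a x + b, \qquad \psi^{-1}(y) = \frac{y - b}{a}, \qquad a \neq 0, \\
M_{\psi}(\mathbf{X})
  &= \psi^{-1}\!\left(\frac{1}{n}\sum_{i=1}^{n} \psi(x_i)\right)
   = \frac{1}{a}\left(\frac{1}{n}\sum_{i=1}^{n} (a x_i + b) - b\right) \\
  &= \frac{1}{a}\left(\frac{a}{n}\sum_{i=1}^{n} x_i + b - b\right)
   = \frac{1}{n}\sum_{i=1}^{n} x_i .
\end{aligned}
```

The offset $b$ cancels and the scale $a$ divides out, so any linear generator yields the arithmetic mean regardless of its coefficients.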