Polynomial Width is Sufficient for Set Representation with High-dimensional Features
Peihao Wang, Shenghao Yang, Shu Li, Zhangyang Wang, Pan Li
TL;DR
This work addresses the expressiveness of DeepSets-style architectures for set functions with high-dimensional features ($D>1$) by proving that the intermediate embedding width $L$ need only grow polynomially with the set size $N$ and feature dimension $D$. It introduces two set-element embedding layers, linear + power activation (LP) and linear + exponential activation (LE), and provides constructive proofs that a polynomial $L$ suffices for both, extending the classic $D=1$ results to the high-dimensional setting. The authors also extend the theory to permutation-equivariant functions and the complex domain, and they provide empirical validation supporting the polynomial scaling of the required embedding width. The findings have practical implications for DeepSets-based modules within GNNs and related architectures, where set processing can remain expressive with only polynomial resources.
Abstract
Set representation has become ubiquitous in deep learning for modeling the inductive bias of neural networks that are insensitive to the input order. DeepSets is the most widely used neural network architecture for set representation. It involves embedding each set element into a latent space of dimension $L$, followed by sum pooling to obtain a whole-set embedding, and finally mapping the whole-set embedding to the output. In this work, we investigate the impact of the dimension $L$ on the expressive power of DeepSets. Previous analyses either oversimplified high-dimensional features to be one-dimensional features or were limited to analytic activations, thereby diverging from practical use or resulting in $L$ that grows exponentially with the set size $N$ and feature dimension $D$. To investigate the minimal value of $L$ that achieves sufficient expressive power, we present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activation (LE). We demonstrate that $L$ being poly$(N, D)$ is sufficient for set representation using both embedding layers. We also provide a lower bound on $L$ for the LP embedding layer. Furthermore, we extend our results to permutation-equivariant set functions and the complex field.
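To make the three-stage pipeline described above concrete (per-element embedding of width $L$, sum pooling, output map), the following is a minimal sketch in plain Python. It is not the paper's construction: the random weights `W`, the power `degree`, the placeholder output map `rho`, and all function names are assumptions chosen only for illustration. The embedding is LP-flavoured in the loose sense that it applies a linear map followed by an elementwise power.

```python
import random

def lp_embed(x, W, degree):
    """Linear map followed by an elementwise power (an LP-style layer, illustrative only)."""
    # W has L rows of length D; x is one set element with D features.
    return [sum(w * xi for w, xi in zip(row, x)) ** degree for row in W]

def pooled_embedding(X, W, degree):
    """Sum-pool the per-element embeddings into one whole-set embedding of width L."""
    pooled = [0.0] * len(W)
    for x in X:
        for k, v in enumerate(lp_embed(x, W, degree)):
            pooled[k] += v
    return pooled

def rho(z):
    """Placeholder output map (hypothetical); in practice this would be a trained network."""
    return sum(z)

random.seed(0)
N, D, L = 4, 3, 8  # set size, feature dimension, embedding width (arbitrary values)
W = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(L)]
X = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(N)]
X_perm = X[::-1]  # same set, elements listed in a different order

out = rho(pooled_embedding(X, W, degree=2))
out_perm = rho(pooled_embedding(X_perm, W, degree=2))

# Sum pooling makes the output independent of element order (up to float rounding).
print(abs(out - out_perm) < 1e-9)
```

Because the only interaction between elements is the sum, reordering the rows of `X` cannot change the pooled embedding beyond floating-point rounding. The question studied in this work is how large `L` must be for such a pipeline to represent set functions, which this sketch does not attempt to demonstrate.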
